[ https://issues.apache.org/jira/browse/HAWQ-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082229#comment-15082229 ]
ASF GitHub Bot commented on HAWQ-314:
-------------------------------------
GitHub user realdawn opened a pull request:
https://github.com/apache/incubator-hawq/pull/241
HAWQ-314. Fix AO Read when a segment file larger than 4TB
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/realdawn/incubator-hawq HAWQ-314
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-hawq/pull/241.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #241
----
commit ca5cce8f1cf860d90bddd5970822f94cbb606516
Author: doli <[email protected]>
Date: 2016-01-05T02:11:03Z
HAWQ-314. Fix AO Read when a segment file larger than 4TB
----
> AO read error due to wrong init of bufferedRead->largeReadLen
> -------------------------------------------------------------
>
> Key: HAWQ-314
> URL: https://issues.apache.org/jira/browse/HAWQ-314
> Project: Apache HAWQ
> Issue Type: Bug
> Components: Storage
> Reporter: Dong Li
> Assignee: Dong Li
>
> {code}
> newperf=# create table os_order_item_1 as select * from os_order_item;
> ERROR: VarBlock is not valid. Valid block check error 10, detail 'offset 0
> at index 0 is bad -- must equal 8 (bytes_0_3 0x81007c60, bytes_4_7
> 0x000001a1)' (appendonlyam.c:747) (seg15 ****:5532 pid=561713)
> (dispatcher.c:1701)
> DETAIL:
> Append-Only storage Small Content header: smallcontent_bytes_0_3 0x1906840F,
> smallcontent_bytes_4_7 0xF5400000, headerKind = 1, executorBlockKind = 1,
> rowCount = 417, usingChecksums = false, header checksum 0x0, block checksum
> 0x0, dataLength 32682, compressedLength 0, overallBlockLen 32696
> Scan of Append-Only Row-Oriented relation 'os_order_item'. Append-Only
> segment file 'hdfs://***:9000/hawq/hawq-1451203332/16385/26367/26435/139',
> block header offset in file = 16088, bufferCount 4103
> {code}
> Debug information
> {code}
> (gdb) p ( * (AppendOnlyScanDescData *)0x33df218)->storageRead->bufferedRead->largeReadLen
> $24 = 48928
> {code}
> bufferedRead->largeReadLen should be 65536 (the value of bufferedRead->maxLargeReadLen).
> The bug is at cdbbufferedread.c:158:
> {code}
> int32 real_fileLen = fileLen - bufferedRead->largeReadPosition;
> if (real_fileLen > 0)
> {
>     /*
>      * Do the first read.
>      */
>     if (real_fileLen > bufferedRead->maxLargeReadLen)
>         bufferedRead->largeReadLen = bufferedRead->maxLargeReadLen;
>     else
>         bufferedRead->largeReadLen = (int32)real_fileLen;
>     BufferedReadIo(bufferedRead);
> }
> {code}
> real_fileLen should be an int64, because fileLen and
> bufferedRead->largeReadPosition are both int64 values.
> As written, the int64 difference is truncated to int32, which makes the
> computed length wrong.
> To *fix* this, declare real_fileLen as an int64:
> {code}
> int64 real_fileLen = fileLen - bufferedRead->largeReadPosition;
> {code}
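The truncation described above can be reproduced outside HAWQ. The following is a minimal standalone sketch, not the HAWQ source: the function names and the sample file lengths are hypothetical, and only the 65536 limit and the observed 48928 come from the report. It assumes the narrowing keeps the low 32 bits, as it does on common two's-complement platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for bufferedRead->maxLargeReadLen (65536 in the report). */
#define MAX_LARGE_READ_LEN 65536

/*
 * Mirrors the reported bug: the 64-bit difference is narrowed to 32 bits
 * before it is compared with the maximum read length.
 * Returns 0 when the narrowed value says there is nothing left to read.
 */
static int32_t first_read_len_narrow(int64_t file_len, int64_t position)
{
    assert(file_len >= 0 && position >= 0);

    /* Keep only the low 32 bits of the difference (assumed narrowing behavior). */
    int32_t remaining = (int32_t)(uint32_t)(file_len - position);

    if (remaining <= 0)
        return 0;
    return remaining > MAX_LARGE_READ_LEN ? MAX_LARGE_READ_LEN : remaining;
}

/*
 * Mirrors the proposed fix: the difference stays 64-bit until it is known
 * to fit in the 32-bit read length.
 */
static int32_t first_read_len_wide(int64_t file_len, int64_t position)
{
    assert(file_len >= 0 && position >= 0);

    int64_t remaining = file_len - position;

    if (remaining <= 0)
        return 0;
    return remaining > MAX_LARGE_READ_LEN ? MAX_LARGE_READ_LEN
                                          : (int32_t)remaining;
}
```

With a hypothetical file length of 2^32 + 48928 bytes read from position 0, the narrow variant returns 48928 while the wide variant returns 65536. That is consistent with the largeReadLen seen in gdb, though the report does not state the actual file length. A length of exactly 2^32 makes the narrow variant return 0, so the first read would be skipped entirely.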
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)