[
https://issues.apache.org/jira/browse/HADOOP-16836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062165#comment-17062165
]
Ctest commented on HADOOP-16836:
--------------------------------
I updated the description to make it more readable.
> Bug in widely-used helper function caused valid configuration value to fail
> on multiple tests, causing build failure
> --------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-16836
> URL: https://issues.apache.org/jira/browse/HADOOP-16836
> Project: Hadoop Common
> Issue Type: Bug
> Components: common
> Affects Versions: 3.3.0, 3.2.1
> Reporter: Ctest
> Priority: Blocker
> Labels: configuration, easyfix, patch, test
> Attachments: HADOOP-16836-000.patch, HADOOP-16836-000.patch
>
>
> {code:java}
> org.apache.hadoop.io.file.tfile.TestTFileStreams#testOneEntryMixedLengths1
> org.apache.hadoop.io.file.tfile.TestTFileStreams#testOneEntryUnknownLength
> org.apache.hadoop.io.file.tfile.TestTFileLzoCodecsStreams#testOneEntryMixedLengths1
> org.apache.hadoop.io.file.tfile.TestTFileLzoCodecsStreams#testOneEntryUnknownLength{code}
>
> The 4 actively used tests above call the helper function
> `TestTFileStreams#writeRecords()` to write key-value pairs (kv pairs), then
> call `TestTFileByteArrays#readRecords()` to assert that the key and the value
> (v) of each kv pair match what was written. The values of all kv pairs are
> hardcoded strings of length 6.
>
> `readRecords()` uses
> `org.apache.hadoop.io.file.tfile.TFile.Reader.Scanner.Entry#getValueLength()`
> to get the full length of v for each kv pair. But `getValueLength()` can
> only return the full length of v when that length is less than the value of
> the configuration parameter `tfile.io.chunk.size`; otherwise `readRecords()`
> will throw an exception. So, *when `tfile.io.chunk.size` is set to a
> value less than 6, these 4 tests fail because of the exception from
> `readRecords()`, even though such a value is valid for `tfile.io.chunk.size`.*
> The definition of `tfile.io.chunk.size` is "Value chunk size in bytes.
> Default to 1MB. Values of the length less than the chunk size is guaranteed
> to have known value length in read time (See also
> TFile.Reader.Scanner.Entry.isValueLengthKnown())".
> *Fixes*
> `readRecords()` should call
> `org.apache.hadoop.io.file.tfile.TFile.Reader.Scanner.Entry#getValue(byte[])`
> instead, which returns the correct full length of the `value` part regardless
> of whether the value's length is larger than `tfile.io.chunk.size`.
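
The failure mode and the proposed fix described above can be modeled with a small, self-contained sketch. It does not use Hadoop: `ToyEntry`, `ChunkSizeDemo`, `readValue`, the `IllegalStateException`, and the 6-byte string `"value0"` are all hypothetical stand-ins, and the strict "less than the chunk size" rule is taken from the quoted definition of `tfile.io.chunk.size` rather than from the real implementation.

```java
import java.nio.charset.StandardCharsets;

/** Toy stand-in for a TFile scanner entry. Hypothetical: NOT the Hadoop class. */
final class ToyEntry {
    private final byte[] value;
    private final int chunkSize;

    ToyEntry(byte[] value, int chunkSize) {
        this.value = value.clone();
        this.chunkSize = chunkSize;
    }

    /** Assumed rule from the quoted definition: known only if shorter than the chunk size. */
    boolean isValueLengthKnown() {
        return value.length < chunkSize;
    }

    /** Throws when the length is not known up front (exception type is an assumption). */
    int getValueLength() {
        if (!isValueLengthKnown()) {
            throw new IllegalStateException("Value length unknown.");
        }
        return value.length;
    }

    /** Copies the value into buf and returns the number of bytes copied. */
    int getValue(byte[] buf) {
        // The caller must supply a buffer at least as large as the value.
        System.arraycopy(value, 0, buf, 0, value.length);
        return value.length;
    }
}

final class ChunkSizeDemo {
    /** Read path in the spirit of the fix: learn the length from the copy itself. */
    static String readValue(ToyEntry entry, int maxLen) {
        byte[] buf = new byte[maxLen];
        int len = entry.getValue(buf);
        return new String(buf, 0, len, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] v = "value0".getBytes(StandardCharsets.UTF_8); // 6 bytes
        ToyEntry entry = new ToyEntry(v, 5);                  // chunk size below 6

        System.out.println("length known: " + entry.isValueLengthKnown());
        System.out.println("value read:   " + readValue(entry, 16));
    }
}
```

In this model, asking for the length up front fails once the chunk size drops to the value length or below, while the copy-based read path works for any chunk size as long as the buffer is large enough.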