[
https://issues.apache.org/jira/browse/HADOOP-16836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ctest updated HADOOP-16836:
---------------------------
Description:
{code:java}
org.apache.hadoop.io.file.tfile.TestTFileStreams#testOneEntryMixedLengths1
org.apache.hadoop.io.file.tfile.TestTFileStreams#testOneEntryUnknownLength
org.apache.hadoop.io.file.tfile.TestTFileLzoCodecsStreams#testOneEntryMixedLengths1
org.apache.hadoop.io.file.tfile.TestTFileLzoCodecsStreams#testOneEntryUnknownLength{code}
The 4 actively-used tests above call the helper function
`TestTFileStreams#writeRecords()` to write key-value (kv) pairs, then call
`TestTFileByteArrays#readRecords()` to assert that the key and the value of
each kv pair match what was written. Every value is a hardcoded string of
length 6.
`readRecords()` uses
`org.apache.hadoop.io.file.tfile.TFile.Reader.Scanner.Entry#getValueLength()`
to get the full length of each value. But `getValueLength()` can only return
the full length when it is less than the configuration parameter
`tfile.io.chunk.size`; otherwise `readRecords()` throws an exception. So,
*when `tfile.io.chunk.size` is set to a value less than 6, these 4 tests fail
because of the exception from `readRecords()`, even though such a value is
perfectly valid for `tfile.io.chunk.size`.*
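For illustration, a minimal reproduction sketch (the local file system, the
"memcmp" comparator, no compression, and the file path are all assumptions of
this sketch, not taken from the tests):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.file.tfile.TFile;

public class ChunkSizeRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // A valid setting, but smaller than the 6-byte test values.
    conf.setInt("tfile.io.chunk.size", 5);

    FileSystem fs = FileSystem.getLocal(conf);
    Path path = new Path("/tmp/chunk-size-repro.tfile");

    // Write a single kv pair whose value is 6 bytes long.
    try (FSDataOutputStream out = fs.create(path, true)) {
      TFile.Writer writer =
          new TFile.Writer(out, 64 * 1024, "none", "memcmp", conf);
      writer.append("key000".getBytes(), "value0".getBytes());
      writer.close();
    }

    long fileLen = fs.getFileStatus(path).getLen();
    try (FSDataInputStream in = fs.open(path)) {
      TFile.Reader reader = new TFile.Reader(in, fileLen, conf);
      TFile.Reader.Scanner scanner = reader.createScanner();
      TFile.Reader.Scanner.Entry entry = scanner.entry();
      // The 6-byte value does not fit in one 5-byte chunk, so its
      // length is unknown at read time ...
      System.out.println(entry.isValueLengthKnown()); // false
      // ... and getValueLength() throws, which is what fails the tests.
      entry.getValueLength();
    }
  }
}
{code}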
The definition of `tfile.io.chunk.size` is "Value chunk size in bytes. Default
to 1MB. Values of the length less than the chunk size is guaranteed to have
known value length in read time (See also
TFile.Reader.Scanner.Entry.isValueLengthKnown())".
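In line with that definition, a reader can consult `isValueLengthKnown()`
before calling `getValueLength()`. A defensive sketch (illustrative only; the
fix below avoids `getValueLength()` entirely):
{code:java}
import java.io.IOException;
import org.apache.hadoop.io.file.tfile.TFile;

// Defensive sketch: only call getValueLength() when the length is
// actually known, i.e. the value fits within tfile.io.chunk.size.
final class ValueLengthGuard {
  static int valueLengthOrUnknown(TFile.Reader.Scanner scanner)
      throws IOException {
    TFile.Reader.Scanner.Entry entry = scanner.entry();
    return entry.isValueLengthKnown() ? entry.getValueLength() : -1;
  }
}
{code}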
*Fix*
`readRecords()` should instead call
`org.apache.hadoop.io.file.tfile.TFile.Reader.Scanner.Entry#getValue(byte[])`,
which copies the value into a caller-supplied buffer and returns its full
length, regardless of whether that length exceeds `tfile.io.chunk.size`.
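A minimal sketch of the proposed pattern (the helper name, buffer sizing, and
charset handling are illustrative, not the exact patch):
{code:java}
import java.io.IOException;
import org.apache.hadoop.io.file.tfile.TFile;

// Illustrative sketch, not the exact patch: read each value through
// getValue(byte[]), which copies the value and returns its real length
// even when getValueLength() would throw because the length is unknown.
final class ReadRecordsSketch {
  static String readValue(TFile.Reader.Scanner scanner, int maxValueLen)
      throws IOException {
    TFile.Reader.Scanner.Entry entry = scanner.entry();
    byte[] buf = new byte[maxValueLen]; // must be large enough for any value
    int vlen = entry.getValue(buf);     // full value length, known or not
    return new String(buf, 0, vlen);
  }
}
{code}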
> Bug in widely-used helper function caused valid configuration value to fail
> on multiple tests, causing build failure
> --------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-16836
> URL: https://issues.apache.org/jira/browse/HADOOP-16836
> Project: Hadoop Common
> Issue Type: Bug
> Components: common
> Affects Versions: 3.3.0, 3.2.1
> Reporter: Ctest
> Priority: Blocker
> Labels: configuration, easyfix, patch, test
> Attachments: HADOOP-16836-000.patch, HADOOP-16836-000.patch
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]