Cyrille Chépélov created TEZ-2256:
-------------------------------------
Summary: Avoid use of BufferTooSmallException to signal end of
buffer in UnorderedPartitionedKVWriter
Key: TEZ-2256
URL: https://issues.apache.org/jira/browse/TEZ-2256
Project: Apache Tez
Issue Type: Improvement
Affects Versions: 0.6.0, 0.7.0
Reporter: Cyrille Chépélov
Priority: Minor
UnorderedPartitionedKVWriter delegates serialization to the application,
passing it a private ByteArrayOutputStream. When the buffer is exhausted,
ByteArrayOutputStream signals this by throwing a private
BufferTooSmallException, which the application can see but cannot handle.
As [~cwensel] pointed out, when the application is itself a complex
framework, it has no way to distinguish this exception from a real failure,
so it is forced to log the full stack trace even for routine events such as
"buffer complete".
Suggested approach: set a "complete" flag in ByteArrayOutputStream that
disables any further output once the buffer is full, and replace the
BufferTooSmallException (BTSE) handling with a check of that flag after
serialization returns.
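A minimal sketch of that idea follows. It is not the actual Tez code: the class name FlaggedBufferStream and the methods isComplete(), size() and rewindTo() are hypothetical, chosen only to illustrate the flag-based approach. The stream never throws on overflow; it marks itself complete and drops any further output.

```java
import java.io.OutputStream;

/**
 * Hypothetical stand-in for the writer's private buffer stream.
 * When a write does not fit, it sets a "complete" flag instead of
 * throwing, and silently ignores all output until it is rewound.
 */
class FlaggedBufferStream extends OutputStream {
  private final byte[] buf;
  private int count = 0;
  private boolean complete = false;

  FlaggedBufferStream(int capacity) {
    this.buf = new byte[capacity];
  }

  @Override
  public void write(int b) {
    if (complete) {
      return; // output is disabled once the buffer is marked complete
    }
    if (count >= buf.length) {
      complete = true; // no room: record the condition, do not throw
      return;
    }
    buf[count++] = (byte) b;
  }

  @Override
  public void write(byte[] src, int off, int len) {
    if (complete) {
      return;
    }
    if (len > buf.length - count) {
      complete = true; // partial writes are not attempted
      return;
    }
    System.arraycopy(src, off, buf, count, len);
    count += len;
  }

  /** True if some write since the last rewind did not fit. */
  boolean isComplete() {
    return complete;
  }

  /** Number of valid bytes currently in the buffer. */
  int size() {
    return count;
  }

  /** Discards bytes written after the given position and clears the flag. */
  void rewindTo(int position) {
    count = position;
    complete = false;
  }
}
```

Under these assumptions, the writer would record size() before handing the stream to the application's serializer, check isComplete() once serialization returns, and on overflow call rewindTo() with the recorded position before moving to a fresh buffer and serializing the record again. The application never sees an exception for the "buffer complete" case.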
[~sseth] suggested looking at SortedOutput as well, since the mechanisms
there should be similar.
I'll give this a go this week.