[ https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520212#comment-14520212 ]
Siddharth Seth commented on TEZ-2348:
-------------------------------------
[~gopalv] - I see the point of throwing an exception if this is accessed
incorrectly: fail the query fast with a specific exception rather than have the
app potentially go into a loop, which can be really difficult to debug,
especially with no logging or on large clusters.
Putting debugging aside, from an API perspective, I think an iterator-like
interface is a lot cleaner. Issues should ideally be found in smaller-scale
testing. In this particular case, the problem manifests as an exception from
IFile, which causes unnecessary confusion.
Making all next() implementations throw an exception is theoretically an
incompatible change. However, it's highly unlikely that anyone goes past a
next() invocation returning false, so the impact may not be huge (a reader is
unlikely to be used in different places, and is not safe for multiple threads).
If we're making this change, it should be for all readers and as early as
possible.
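To make the proposed contract concrete, here is a minimal sketch of a reader
whose next() returns false once at end of input and throws on any later call.
FailFastReader and its in-memory list source are hypothetical names used only
for illustration; this is not the UnorderedKVReader code or the attached patch.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

/** Hypothetical reader: fails fast if next() is called after end of input. */
final class FailFastReader<T> {
    private final Iterator<T> source;   // stand-in for the underlying file readers
    private T current;
    private boolean exhausted;          // set once next() has returned false

    FailFastReader(List<T> records) {
        this.source = records.iterator();
    }

    /** Returns true if a record was read, false exactly once at end of input. */
    boolean next() throws IOException {
        if (exhausted) {
            // Incorrect usage: surface it here with a specific message instead
            // of letting a lower layer report a confusing EOF error.
            throw new IOException("next() invoked after it already returned false");
        }
        if (source.hasNext()) {
            current = source.next();
            return true;
        }
        exhausted = true;
        current = null;
        return false;
    }

    /** Returns the record read by the last successful next() call. */
    T getCurrent() {
        return current;
    }
}
```

A caller that loops with while (reader.next()) never hits the exception; only
code that calls next() again after seeing false does, which is the incorrect
usage discussed above.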
A terrible option would be to have the behaviour configurable :)... Exception
when we think an issue may be caused by incorrect usage, loop otherwise.
> EOF exception during UnorderedKVReader.next()
> ---------------------------------------------
>
> Key: TEZ-2348
> URL: https://issues.apache.org/jira/browse/TEZ-2348
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.2
> Reporter: Jason Dere
> Assignee: Rajesh Balamohan
> Attachments: TEZ-2348.1.patch, TEZ-2348.2.patch, TEZ-2348.3.patch,
> _tez_session_dir.tgz
>
>
> {noformat}
> Caused by: java.lang.RuntimeException: java.io.IOException: Reached EOF. Completed reading 516605
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:278)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:184)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
> ... 13 more
> Caused by: java.io.IOException: Reached EOF. Completed reading 516605
> at org.apache.tez.runtime.library.common.sort.impl.IFile.checkState(IFile.java:817)
> at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:698)
> at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readRawKey(IFile.java:731)
> at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawKey(IFile.java:727)
> at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.readNextFromCurrentReader(UnorderedKVReader.java:151)
> at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next(UnorderedKVReader.java:112)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$KeyValuesFromKeyValue.next(ReduceRecordSource.java:439)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232)
> ... 15 more
> {noformat}