[ 
https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520212#comment-14520212
 ] 

Siddharth Seth commented on TEZ-2348:
-------------------------------------

[~gopalv] - I see the point of throwing an exception if this is accessed 
incorrectly: fail the query fast with a specific exception rather than letting 
the app potentially go into a loop, which can be really difficult to debug, 
especially when there is no logging or the cluster is large.

Putting debugging aside, from an API perspective, I think an iterator-like 
interface is a lot cleaner. Issues should ideally be found in smaller-scale 
testing. In this particular case, the problem manifests as an exception from 
IFile, which causes unnecessary confusion.

Changing all the next() implementations to throw an exception is theoretically 
an incompatible change. However, it's highly unlikely that anyone calls next() 
again after it has returned false, so the impact may not be huge (a reader is 
unlikely to be used in different places, and it is not safe for multiple 
threads). If we're making this change, it should be for all readers and as 
early as possible.
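
To make the proposed contract concrete, here is a minimal sketch of a reader 
whose next() returns false once at end of input and throws a specific exception 
on any later call. FailFastReader, its in-memory record list, and the exception 
message are hypothetical names used only for illustration; this is not the 
actual UnorderedKVReader code or the attached patch.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical reader for illustration; not the Tez UnorderedKVReader. */
public class FailFastReader {
    private final Iterator<String> source;
    private String current;
    // Set once next() has returned false; any later call is treated as misuse.
    private boolean exhausted = false;

    public FailFastReader(List<String> records) {
        this.source = records.iterator();
    }

    /**
     * Advances to the next record. Returns false exactly once, at end of
     * input, and throws if it is called again after that.
     */
    public boolean next() throws IOException {
        if (exhausted) {
            throw new IOException(
                "next() called after the reader already reported end of input");
        }
        if (source.hasNext()) {
            current = source.next();
            return true;
        }
        exhausted = true;
        current = null;
        return false;
    }

    public String getCurrent() {
        return current;
    }

    public static void main(String[] args) throws IOException {
        FailFastReader reader = new FailFastReader(Arrays.asList("a", "b"));
        while (reader.next()) {
            System.out.println(reader.getCurrent());
        }
        try {
            reader.next(); // incorrect usage: going past the false return
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With this shape, a caller that goes past the end sees an error naming the 
misuse directly, rather than a lower-level EOF error from the file layer.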

A terrible option would be to make the behaviour configurable :)... throw an 
exception when we think an issue may be caused by incorrect usage, and loop 
otherwise.


> EOF exception during UnorderedKVReader.next()
> ---------------------------------------------
>
>                 Key: TEZ-2348
>                 URL: https://issues.apache.org/jira/browse/TEZ-2348
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.5.2
>            Reporter: Jason Dere
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-2348.1.patch, TEZ-2348.2.patch, TEZ-2348.3.patch, 
> _tez_session_dir.tgz
>
>
> {noformat}
> Caused by: java.lang.RuntimeException: java.io.IOException: Reached EOF. Completed reading 516605
>       at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:278)
>       at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:184)
>       at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
>       ... 13 more
> Caused by: java.io.IOException: Reached EOF. Completed reading 516605
>       at org.apache.tez.runtime.library.common.sort.impl.IFile.checkState(IFile.java:817)
>       at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:698)
>       at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readRawKey(IFile.java:731)
>       at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawKey(IFile.java:727)
>       at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.readNextFromCurrentReader(UnorderedKVReader.java:151)
>       at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next(UnorderedKVReader.java:112)
>       at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$KeyValuesFromKeyValue.next(ReduceRecordSource.java:439)
>       at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232)
>       ... 15 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
