Siddharth Seth commented on TEZ-3439:

[~hitesh] - the initial validate was only meant to check whether the two sides 
were the same, not to generate a detailed report on the diff. (The 
variable names used are incorrect. I must have been trying to generate a full 
diff, and then decided that a single diff is enough to break out.)
The RHS check outside would have broken as a result of the change which causes 
an IOException to be thrown if a reader is accessed after reaching EOF (as 
opposed to returning false, which is what the behaviour used to be). In that 
respect, I think the initial patch posted solves the issue nicely.
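
As a rough illustration of the EOF behaviour described above (this is not the 
posted patch), the check can be arranged so that neither reader is touched again 
once it has reported EOF. KeyReader, readerOf and sidesMatch are hypothetical 
stand-ins made up for this sketch, not Tez classes; the list-backed reader only 
mimics the "throw on access after EOF" behaviour.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

final class EofSafeValidate {
  /** Hypothetical stand-in for a keyed reader; NOT the Tez API. */
  interface KeyReader {
    boolean next() throws IOException;
    String currentKey();
  }

  /** List-backed reader that throws if next() is called again after returning false. */
  static KeyReader readerOf(List<String> keys) {
    final Iterator<String> it = keys.iterator();
    return new KeyReader() {
      private String current;
      private boolean exhausted;

      @Override
      public boolean next() throws IOException {
        if (exhausted) {
          throw new IOException("next() invoked after it returned false");
        }
        if (it.hasNext()) {
          current = it.next();
          return true;
        }
        exhausted = true;
        return false;
      }

      @Override
      public String currentKey() {
        return current;
      }
    };
  }

  /** Returns true only if both sides yield the same keys in the same order. */
  static boolean sidesMatch(KeyReader lhs, KeyReader rhs) throws IOException {
    while (true) {
      boolean hasLhs = lhs.next();
      boolean hasRhs = rhs.next();
      if (hasLhs != hasRhs) {
        return false; // one side hit EOF first; stop without reading further
      }
      if (!hasLhs) {
        return true; // both sides hit EOF together
      }
      if (!lhs.currentKey().equals(rhs.currentKey())) {
        return false; // a single diff is enough to break out
      }
    }
  }
}
```

Each reader is advanced exactly once per iteration and the method returns as soon 
as either side reports EOF, so the bigger input never triggers the exception.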

In terms of diffs - it's a little more complicated to find the exact diff. Once 
there's a diff in a key, both iterators need to be moved forward till there's 
another match. Meanwhile, both the LHS and RHS counters need to be incremented for 
diffs. (The end result could be LHS_EXTRA=2, RHS_EXTRA=200.) The EOF case is 
simpler, i.e. count till the end of the other side, but that alone is not adequate 
to generate a full report of diffs.
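
One possible way to do that counting, assuming both sides deliver their keys in 
sorted order and contain no null keys (both are assumptions of this sketch, and 
it works on plain lists rather than Tez readers), is a merge-style walk. 
SortedDiffCount and countExtras are made-up names.

```java
import java.util.Iterator;
import java.util.List;

final class SortedDiffCount {
  /**
   * Returns {lhsExtra, rhsExtra}: the number of keys present on only one side.
   * Assumes both lists are sorted ascending and contain no nulls.
   */
  static long[] countExtras(List<String> lhsKeys, List<String> rhsKeys) {
    Iterator<String> lhs = lhsKeys.iterator();
    Iterator<String> rhs = rhsKeys.iterator();
    String l = advance(lhs);
    String r = advance(rhs);
    long lhsExtra = 0;
    long rhsExtra = 0;

    while (l != null && r != null) {
      int cmp = l.compareTo(r);
      if (cmp == 0) {
        // Keys match: move both sides forward.
        l = advance(lhs);
        r = advance(rhs);
      } else if (cmp < 0) {
        // LHS key has no partner on the RHS.
        lhsExtra++;
        l = advance(lhs);
      } else {
        // RHS key has no partner on the LHS.
        rhsExtra++;
        r = advance(rhs);
      }
    }
    // EOF case: whatever remains on the other side is extra.
    while (l != null) {
      lhsExtra++;
      l = advance(lhs);
    }
    while (r != null) {
      rhsExtra++;
      r = advance(rhs);
    }
    return new long[] {lhsExtra, rhsExtra};
  }

  /** Returns the next key, or null once the iterator is exhausted. */
  private static String advance(Iterator<String> it) {
    return it.hasNext() ? it.next() : null;
  }
}
```

For example, countExtras on [a, b, c, d] and [a, c, e, f] gives lhsExtra=2 
(b, d) and rhsExtra=2 (e, f). Without the sorted-order assumption the 
"move forward till there's another match" step would need lookahead or buffering.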

> Tez joinvalidate example failed when first input argument size is bigger than 
> the second
> ----------------------------------------------------------------------------------------
>                 Key: TEZ-3439
>                 URL: https://issues.apache.org/jira/browse/TEZ-3439
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Hui Cao
>            Assignee: Hui Cao
>         Attachments: TEZ-3439.1.patch
> When using joinvalidate in the Tez examples jar with the command
> {{"hadoop jar tez-examples-<version>.jar joinvalidate <input1> <input2>"}}
> if <input1> is bigger than <input2>, an IOException is thrown.
> {noformat}
> 16/09/21 00:07:53 INFO examples.JoinValidate: DAG diagnostics: [Vertex 
> failed, vertexName=joinvalidate, vertexId=vertex_1473073428528_0031_1_02, 
> diagnostics=[Task failed, taskId=task_1473073428528_0031_1_02_000000, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( 
> failure ) : attempt_1473073428528_0031_1_02_000000_0:java.io.IOException: 
> Please check if you are invoking moveToNext() even after it returned false.
>       at 
> org.apache.tez.runtime.library.common.ValuesIterator.hasCompletedProcessing(ValuesIterator.java:221)
>       at 
> org.apache.tez.runtime.library.common.ValuesIterator.moveToNext(ValuesIterator.java:103)
>       at 
> org.apache.tez.runtime.library.input.OrderedGroupedKVInput$OrderedGroupedKeyValuesReader.next(OrderedGroupedKVInput.java:321)
>       at 
> org.apache.tez.examples.JoinValidate$JoinValidateProcessor.run(JoinValidate.java:254)
>       at 
> org.apache.tez.runtime.library.processor.SimpleProcessor.run(SimpleProcessor.java:53)
>       at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>       at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>       at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>       at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>       at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>       at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> {noformat}

This message was sent by Atlassian JIRA
