[ https://issues.apache.org/jira/browse/TEZ-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514090#comment-15514090 ]
Siddharth Seth commented on TEZ-3439:
-------------------------------------

[~hitesh] - the initial validate was only meant to check whether the two sides were the same, not to generate a detailed report on the diff. (The variable names used are incorrect. I must've been trying to generate a full diff, and then decided that a single diff is enough to break out.) The RHS check outside the loop would have broken as a result of the change that causes an IOException to be thrown if a reader is accessed after reaching EOF (as opposed to returning false, which is what the behaviour used to be). In that respect, I think the initial patch posted solves the issue nicely.

In terms of diffs - it's a little more complicated to find the exact diff. Once there's a diff in a key, both iterators need to be moved forward until there's another match. Meanwhile, both the lhs and rhs counters need to be incremented for diffs (the end result could be LHS_EXTRA=2, RHS_EXTRA=200). The EOF case is simpler, i.e. count till the end of the other side, but it is not adequate to generate a full report of diffs.

> Tez joinvalidate example failed when first input argument size is bigger than the second
> ----------------------------------------------------------------------------------------
>
>                 Key: TEZ-3439
>                 URL: https://issues.apache.org/jira/browse/TEZ-3439
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Hui Cao
>            Assignee: Hui Cao
>         Attachments: TEZ-3439.1.patch
>
>
> When using joinvalidate in the Tez examples jar with the command
> {{"hadoop jar tez-examples-<version>.jar joinvalidate <input1> <input2>"}}
> an IOException is thrown if the size of <input1> is bigger than <input2>.
> {noformat}
> 16/09/21 00:07:53 INFO examples.JoinValidate: DAG diagnostics: [Vertex failed, vertexName=joinvalidate, vertexId=vertex_1473073428528_0031_1_02, diagnostics=[Task failed, taskId=task_1473073428528_0031_1_02_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1473073428528_0031_1_02_000000_0:java.io.IOException: Please check if you are invoking moveToNext() even after it returned false.
>     at org.apache.tez.runtime.library.common.ValuesIterator.hasCompletedProcessing(ValuesIterator.java:221)
>     at org.apache.tez.runtime.library.common.ValuesIterator.moveToNext(ValuesIterator.java:103)
>     at org.apache.tez.runtime.library.input.OrderedGroupedKVInput$OrderedGroupedKeyValuesReader.next(OrderedGroupedKVInput.java:321)
>     at org.apache.tez.examples.JoinValidate$JoinValidateProcessor.run(JoinValidate.java:254)
>     at org.apache.tez.runtime.library.processor.SimpleProcessor.run(SimpleProcessor.java:53)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>     at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>     at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>     at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>     at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
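
For illustration only, the "same or not" check described in the comment could be sketched roughly as below. This is not the code from TEZ-3439.1.patch and does not use the Tez reader API: `JoinValidateSketch`, `StrictReader` and `sameKeys` are hypothetical names, and `StrictReader` is merely a stand-in that, like the readers discussed above, throws an IOException when advanced again after returning false. The point is that the comparison stops at the first difference and never advances a reader that has already reported EOF.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class JoinValidateSketch {

  /** Hypothetical stand-in for a sorted key reader that throws if advanced after EOF. */
  static final class StrictReader {
    private final Iterator<String> keys;
    private boolean exhausted = false;
    private String current;

    StrictReader(List<String> sortedKeys) {
      this.keys = sortedKeys.iterator();
    }

    /** Moves to the next key. Returns false at EOF and throws if called again afterwards. */
    boolean next() throws IOException {
      if (exhausted) {
        throw new IOException("next() invoked after it returned false");
      }
      if (keys.hasNext()) {
        current = keys.next();
        return true;
      }
      exhausted = true;
      return false;
    }

    String currentKey() {
      return current;
    }
  }

  /** Returns true only if both readers yield the same keys in the same order. */
  static boolean sameKeys(StrictReader lhs, StrictReader rhs) throws IOException {
    while (lhs.next()) {
      // Advance rhs once per lhs key and stop at the first difference,
      // so a reader that returned false is never touched again.
      if (!rhs.next() || !lhs.currentKey().equals(rhs.currentKey())) {
        return false;
      }
    }
    // lhs is exhausted. rhs has only ever returned true so far,
    // so one more call is safe and tells us whether rhs has extra keys.
    return !rhs.next();
  }

  public static void main(String[] args) throws IOException {
    // First input larger than the second: reported as a mismatch, no exception.
    StrictReader lhs = new StrictReader(Arrays.asList("a", "b", "c"));
    StrictReader rhs = new StrictReader(Arrays.asList("a", "b"));
    System.out.println("inputs match: " + sameKeys(lhs, rhs));
  }
}
```

A full diff report (separate LHS_EXTRA and RHS_EXTRA counts) would need the resynchronisation step mentioned in the comment, which this sketch deliberately leaves out.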