[ https://issues.apache.org/jira/browse/HIVE-9976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380407#comment-14380407 ]
Gunther Hagleitner commented on HIVE-9976:
------------------------------------------
Not your fault, but there are two paths through HiveSplitGenerator: the class is
used once without calling init and once after being properly initialized. The
reason is that some other code needs to use the "group splits" method. Now that
you've moved init into the constructor, this has gotten even uglier. Could you
move the split grouper methods to a separate static util class and leave the
pruner to just prune?
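To illustrate the kind of split I have in mind, here is a rough sketch. It is not
the actual Hive code: SplitGroupingUtil, groupSplits and the fixed-size grouping
rule are all made-up placeholders, and the real grouping logic is more involved.
The only point is that a static helper needs no init and no pruner state, so both
callers can use it the same way.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical name and grouping rule; a stand-in for the real split grouper.
final class SplitGroupingUtil {
    private SplitGroupingUtil() {
        // Static-only helper: there is nothing to construct or initialize.
    }

    // Groups splits into consecutive chunks of at most groupSize, preserving order.
    static <T> List<List<T>> groupSplits(List<T> splits, int groupSize) {
        if (groupSize <= 0) {
            throw new IllegalArgumentException("groupSize must be positive");
        }
        List<List<T>> groups = new ArrayList<>();
        for (int i = 0; i < splits.size(); i += groupSize) {
            int end = Math.min(i + groupSize, splits.size());
            // Copy the sublist so each group is independent of the input list.
            groups.add(new ArrayList<>(splits.subList(i, end)));
        }
        return groups;
    }
}
```

Code that only needs grouping would call SplitGroupingUtil.groupSplits(...) directly
and never touch HiveSplitGenerator.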
Also, I think you've moved the initialization of the dynamic pruner to the
constructor of the input initializer in order to not miss any events. Can you
add a comment to the code explaining this?
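Something along these lines is what I would expect the comment to convey. This is
only a sketch under my reading of the patch: EagerPruner, addEvent and prune are
hypothetical stand-ins, not the real DynamicPartitionPruner API. The event queue
exists as soon as the object is constructed, so events that arrive before
processing starts are buffered instead of lost.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the pruner; names and types are illustrative only.
final class EagerPruner {
    private final int expectedEvents;

    // Created in the constructor (not in a later init step) so that addEvent()
    // is safe to call as soon as the object exists, even before prune() runs.
    private final BlockingQueue<String> events = new LinkedBlockingQueue<>();

    EagerPruner(int expectedEvents) {
        this.expectedEvents = expectedEvents;
    }

    // May be called from another thread at any time after construction.
    void addEvent(String event) {
        events.add(event);
    }

    // Blocks until the expected number of events has been consumed.
    int prune() {
        int processed = 0;
        try {
            while (processed < expectedEvents) {
                events.take(); // waits if the event has not arrived yet
                processed++;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for events", e);
        }
        return processed;
    }
}
```

With this shape, a fast upstream task can deliver its events before prune() is
called and the count still matches.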
Very cool to see a real unit test :-) thanks.
> Possible race condition in DynamicPartitionPruner for <200ms tasks
> ------------------------------------------------------------------
>
> Key: HIVE-9976
> URL: https://issues.apache.org/jira/browse/HIVE-9976
> Project: Hive
> Issue Type: Bug
> Components: Tez
> Affects Versions: 1.0.0
> Reporter: Gopal V
> Assignee: Siddharth Seth
> Fix For: 1.0.1
>
> Attachments: HIVE-9976.1.patch, llap_vertex_200ms.png
>
>
> Race condition in the DynamicPartitionPruner between
> DynamicPartitionPruner::processVertex() and
> DynamicPartitionPruner::addEvent() for tasks which respond with both the
> result and success in a single heartbeat sequence.
> {code}
> 2015-03-16 07:05:01,589 ERROR [InputInitializer [Map 1] #0]
> tez.DynamicPartitionPruner: Expecting: 1, received: 0
> 2015-03-16 07:05:01,590 ERROR [Dispatcher thread: Central] impl.VertexImpl:
> Vertex Input: store_sales initializer failed,
> vertex=vertex_1424502260528_1113_4_04 [Map 1]
> org.apache.tez.dag.app.dag.impl.AMUserCodeException:
> org.apache.hadoop.hive.ql.metadata.HiveException: Incorrect event count in
> dynamic parition pruning
> {code}
> !llap_vertex_200ms.png!
> All 4 upstream vertices of Map 1 need to finish within ~200ms to trigger
> this, which seems to be consistently happening with LLAP.