[ https://issues.apache.org/jira/browse/HIVE-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mithun Radhakrishnan updated HIVE-13756:
----------------------------------------
          Resolution: Fixed
       Fix Version/s: 2.0.0
    Target Version/s: 2.0.0, 1.2.1  (was: 1.2.1, 2.0.0)
              Status: Resolved  (was: Patch Available)

> Map failure attempts to delete reducer _temporary directory on multi-query pig query
> ------------------------------------------------------------------------------------
>
>                 Key: HIVE-13756
>                 URL: https://issues.apache.org/jira/browse/HIVE-13756
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 1.2.1, 2.0.0
>            Reporter: Chris Drome
>            Assignee: Chris Drome
>             Fix For: 2.0.0
>
>         Attachments: HIVE-13756-branch-1.patch, HIVE-13756.1-branch-1.patch,
>                      HIVE-13756.1.patch, HIVE-13756.patch
>
>
> A pig script, executed with multi-query enabled, that reads the source data
> and writes it as-is into TABLE_A, and also performs a group-by operation on
> the data and writes the result into TABLE_B, can produce erroneous results if
> any map fails. Such a script results in a single MR job that writes the map
> output to a scratch directory relative to TABLE_A and the reducer output to a
> scratch directory relative to TABLE_B.
> If one or more maps fail, the failed attempt's data relative to TABLE_A is
> deleted, but the _temporary directory relative to TABLE_B is deleted as well.
> This has the unintended side-effect of preventing subsequent maps from
> committing their data. As a result, any maps that completed successfully
> before the first map failure have their data committed as expected, while the
> remaining maps do not, resulting in an incomplete result set.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
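
The failure mode described in the issue can be illustrated with a small toy model. This is only a sketch under stated assumptions, not HCatalog's or Hadoop's actual committer code: the function name `run_job`, the `attempt_N` and `part-N` names, the sequential execution of maps, and the rule that a map commit succeeds only while both scratch directories exist are all hypothetical simplifications chosen to reproduce the symptom.

```python
import shutil
import tempfile
from pathlib import Path


def run_job(num_maps, failing_map=None):
    """Toy model: return the ids of maps whose output is committed to TABLE_A."""
    root = Path(tempfile.mkdtemp())
    try:
        tmp_a = root / "TABLE_A" / "_temporary"  # scratch dir for map output
        tmp_b = root / "TABLE_B" / "_temporary"  # scratch dir for reducer output
        tmp_a.mkdir(parents=True)
        tmp_b.mkdir(parents=True)

        # Maps run one after another here; a real job runs them in parallel.
        for map_id in range(num_maps):
            attempt = tmp_a / f"attempt_{map_id}"
            attempt.mkdir()
            part = attempt / "part"
            part.write_text(f"rows from map {map_id}\n")

            if map_id == failing_map:
                # Intended cleanup: drop the failed attempt's data under TABLE_A.
                shutil.rmtree(attempt)
                # The reported bug: TABLE_B's _temporary directory goes too.
                shutil.rmtree(tmp_b, ignore_errors=True)
                continue

            # ASSUMPTION: a commit needs both scratch directories to exist.
            if tmp_a.exists() and tmp_b.exists():
                part.rename(root / "TABLE_A" / f"part-{map_id}")

        committed = (root / "TABLE_A").glob("part-*")
        return sorted(int(p.name.split("-")[1]) for p in committed)
    finally:
        shutil.rmtree(root)
```

In this model, `run_job(5)` commits all five maps, while `run_job(5, failing_map=2)` commits only maps 0 and 1. That matches the incomplete result set described in the issue: maps that finish before the first failure are committed, later ones are not.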