Partitioner happily accepts negative int number and data gets lost in Hadoop 
framework
---------------------------------------------------------------------------------------

                 Key: HADOOP-3425
                 URL: https://issues.apache.org/jira/browse/HADOOP-3425
             Project: Hadoop Core
          Issue Type: Bug
            Reporter: Amir Youssefi


When a Partitioner returns a negative partition number, the framework accepts 
it without complaint. The data goes to the wrong location and (many) reducers 
get zero data. Suggested resolutions:

 1) Prevent the problem from the start: check the range of the partition 
number and throw an exception if it is out of range.

 2) Have a more generic check: compare counters to verify that all data gets 
past the Shuffle stage, with no leak. Per feedback we got from Owen, this idea 
gets a bit complicated once combiners are involved.
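
A minimal sketch of resolution 1, assuming the check is a small helper applied 
to whatever partition number the partitioner returns. The class name 
PartitionRangeCheck and the method checkPartition are hypothetical and not part 
of the Hadoop API; where the framework would actually call such a check is left 
open here.

```java
// Hypothetical helper for illustration only; not part of the Hadoop API.
final class PartitionRangeCheck {

    /**
     * Returns the partition unchanged if it lies in [0, numPartitions),
     * otherwise throws instead of silently accepting it.
     */
    static int checkPartition(int partition, int numPartitions) {
        if (partition < 0 || partition >= numPartitions) {
            throw new IllegalArgumentException(
                "Illegal partition " + partition
                    + " (expected 0.." + (numPartitions - 1) + ")");
        }
        return partition;
    }

    public static void main(String[] args) {
        // A valid partition passes through untouched.
        System.out.println(checkPartition(3, 10));
        // A negative partition is rejected loudly rather than losing data.
        try {
            checkPartition(-2, 10);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a check like this, the job would fail fast at the first bad partition 
number rather than finish with empty reducer output.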

 Example: using my_id.hashCode() % numPartitions produces a negative number 
whenever hashCode() is negative, and the data gets lost in the framework. 
Reducers get zero rows (while the data actually sits in partitions indexed 
with negative numbers).
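
The sign problem in the example can be reproduced outside Hadoop. The sketch 
below is illustrative only: the class NonNegativePartition and its methods are 
hypothetical names, and an Integer key stands in for my_id because its 
hashCode() is simply its own value. In Java, the result of % takes the sign of 
the dividend, so a negative hash code yields a negative partition. Masking off 
the sign bit before taking the remainder is one common way to keep the result 
in range.

```java
// Hypothetical demo class; names are for illustration only.
final class NonNegativePartition {

    /** The pattern from the report: negative whenever hashCode() is negative. */
    static int naive(Object key, int numPartitions) {
        return key.hashCode() % numPartitions;
    }

    /** Clears the sign bit first, so the result is always in [0, numPartitions). */
    static int masked(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        Object key = Integer.valueOf(-7);   // Integer.hashCode() returns the value itself
        System.out.println(naive(key, 4));  // prints -3
        System.out.println(masked(key, 4)); // prints 1
    }
}
```

Math.floorMod(key.hashCode(), numPartitions) would also give a non-negative 
result. Either way, the user-side fix does not replace a range check in the 
framework, since other partitioners can still return out-of-range values.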


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
