PigStorage may miss records when loading a file
-----------------------------------------------

                 Key: PIG-1080
                 URL: https://issues.apache.org/jira/browse/PIG-1080
             Project: Pig
          Issue Type: Bug
            Reporter: Richard Ding
            Assignee: Richard Ding


When a file is assigned to multiple mappers (one block per mapper), a block 
may not end exactly on a record boundary. Special care is taken to ensure that 
every record is loaded by exactly one mapper, including records that cross a 
block boundary. 

PigStorage, however, does not correctly handle the case where a block ends 
exactly at a record boundary, and as a result some records are missed.
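
To make the boundary case concrete, here is a minimal sketch of one common
convention for reading line records from byte-range splits. It is not
PigStorage's actual code: the names read_split, read_all and inclusive_end are
hypothetical, and newline-terminated records are assumed. Every split except
the first skips its first line, so each split must also read the record that
starts exactly at its end offset. The inclusive_end=False variant is only an
illustration of how an exact-boundary record can be lost, not a claim about
where the actual defect is.

```python
def read_split(data, start, end, inclusive_end=True):
    """Return the records owned by the byte range [start, end) of data."""
    pos = start
    if start != 0:
        # Skip the first (possibly partial) line: the previous split reads it.
        nl = data.find(b"\n", start)
        pos = len(data) if nl == -1 else nl + 1
    records = []
    while pos < len(data):
        # Own every record that starts at or before `end`. With the
        # hypothetical inclusive_end=False, a record starting exactly at
        # `end` is not read here and is also skipped by the next split.
        owned = pos <= end if inclusive_end else pos < end
        if not owned:
            break
        nl = data.find(b"\n", pos)
        stop = len(data) if nl == -1 else nl
        records.append(data[pos:stop])
        pos = stop + 1
    return records


def read_all(data, boundaries, inclusive_end=True):
    """Read every split defined by consecutive offsets in `boundaries`."""
    out = []
    for start, end in zip(boundaries, boundaries[1:]):
        out.extend(read_split(data, start, end, inclusive_end))
    return out


data = b"aaa\nbbb\nccc\n"          # record "bbb" starts at byte offset 4
exact = [0, 4, len(data)]          # first block ends exactly on a boundary
print(read_all(data, exact))                       # all three records
print(read_all(data, exact, inclusive_end=False))  # "bbb" is missing
```

With a boundary that falls inside a record (for example offset 5), both
variants return all three records, which is why this kind of defect only shows
up when a block happens to end exactly on a record boundary.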

 

-- 
This message is automatically generated by JIRA.