[jira] Updated: (PIG-1080) PigStorage may miss records when loading a file

2009-12-22 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1080:


Fix Version/s: 0.6.0

> PigStorage may miss records when loading a file
> ---
>
> Key: PIG-1080
> URL: https://issues.apache.org/jira/browse/PIG-1080
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.6.0
> Reporter: Richard Ding
> Assignee: Richard Ding
> Fix For: 0.6.0
>
> Attachments: PIG-1080.patch, PIG-1080.patch, PIG-1080.patch
>
>
> When a file is assigned to multiple mappers (one block per mapper), the
> blocks may not end exactly at a record boundary. Special care is taken to
> ensure that every record is loaded by exactly one mapper, even when a
> record crosses a block boundary.
> PigStorage, however, does not correctly handle the case where a block ends
> exactly at a record boundary, which results in missing records.
>  
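
The boundary rule described in the issue can be illustrated with a small sketch. This is not the PigStorage or LineRecordReader code; it assumes newline-terminated records and one common split convention (every split except the first skips its first line, and a split keeps reading any line that starts at or before its end offset). The names read_split, load, and read_past_end are hypothetical. Setting read_past_end=False shows one way records can go missing when a block ends exactly at a record boundary; it is only an illustration, not necessarily the exact defect fixed by the patch.

```python
from typing import List


def read_split(data: bytes, start: int, end: int,
               read_past_end: bool = True) -> List[bytes]:
    """Return the newline-terminated records owned by the split [start, end)."""
    pos = start
    if start != 0:
        # Non-first split: skip the line found at `start`. Under this
        # convention the previous split is responsible for reading it.
        nl = data.find(b"\n", start)
        pos = len(data) if nl == -1 else nl + 1

    records = []
    while pos < len(data):
        if read_past_end:
            # Assumed-correct rule: read every line starting at or before `end`.
            done = pos > end
        else:
            # Hypothetical faulty rule: stop as soon as `end` is reached.
            done = pos >= end
        if done:
            break
        nl = data.find(b"\n", pos)
        stop = len(data) if nl == -1 else nl + 1
        records.append(data[pos:stop])
        pos = stop
    return records


def load(data: bytes, boundaries: List[int],
         read_past_end: bool = True) -> List[bytes]:
    """Read every split in order and concatenate the records."""
    out = []
    for start, end in zip(boundaries, boundaries[1:]):
        out.extend(read_split(data, start, end, read_past_end))
    return out


if __name__ == "__main__":
    data = b"aa\nbb\ncc\n"  # records start at offsets 0, 3 and 6
    print(load(data, [0, 3, 6, 9]))         # all three records, each once
    print(load(data, [0, 3, 6, 9], False))  # only the first record survives
```

With splits that do not fall on record boundaries (for example [0, 4, 9]) both rules return all three records, which is why a fault of this kind only shows up when a block happens to end exactly where a record ends.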

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1080) PigStorage may miss records when loading a file

2009-11-11 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1080:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch committed, thanks Richard!

> PigStorage may miss records when loading a file
> ---
>
> Key: PIG-1080
> URL: https://issues.apache.org/jira/browse/PIG-1080
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.6.0
> Reporter: Richard Ding
> Assignee: Richard Ding
> Attachments: PIG-1080.patch, PIG-1080.patch, PIG-1080.patch



[jira] Updated: (PIG-1080) PigStorage may miss records when loading a file

2009-11-10 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1080:


Status: Patch Available  (was: Open)

> PigStorage may miss records when loading a file
> ---
>
> Key: PIG-1080
> URL: https://issues.apache.org/jira/browse/PIG-1080
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.6.0
> Reporter: Richard Ding
> Assignee: Richard Ding
> Attachments: PIG-1080.patch, PIG-1080.patch, PIG-1080.patch



[jira] Updated: (PIG-1080) PigStorage may miss records when loading a file

2009-11-10 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1080:
--

Attachment: PIG-1080.patch

This patch excludes the bzip and gzip files from the change.

> PigStorage may miss records when loading a file
> ---
>
> Key: PIG-1080
> URL: https://issues.apache.org/jira/browse/PIG-1080
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.6.0
> Reporter: Richard Ding
> Assignee: Richard Ding
> Attachments: PIG-1080.patch, PIG-1080.patch, PIG-1080.patch



[jira] Updated: (PIG-1080) PigStorage may miss records when loading a file

2009-11-10 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1080:
--

Status: Open  (was: Patch Available)

> PigStorage may miss records when loading a file
> ---
>
> Key: PIG-1080
> URL: https://issues.apache.org/jira/browse/PIG-1080
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.6.0
> Reporter: Richard Ding
> Assignee: Richard Ding
> Attachments: PIG-1080.patch, PIG-1080.patch



[jira] Updated: (PIG-1080) PigStorage may miss records when loading a file

2009-11-10 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1080:
--

Status: Patch Available  (was: Open)

> PigStorage may miss records when loading a file
> ---
>
> Key: PIG-1080
> URL: https://issues.apache.org/jira/browse/PIG-1080
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.6.0
> Reporter: Richard Ding
> Assignee: Richard Ding
> Attachments: PIG-1080.patch, PIG-1080.patch



[jira] Updated: (PIG-1080) PigStorage may miss records when loading a file

2009-11-10 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1080:
--

Attachment: PIG-1080.patch

> PigStorage may miss records when loading a file
> ---
>
> Key: PIG-1080
> URL: https://issues.apache.org/jira/browse/PIG-1080
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.6.0
> Reporter: Richard Ding
> Assignee: Richard Ding
> Attachments: PIG-1080.patch, PIG-1080.patch



[jira] Updated: (PIG-1080) PigStorage may miss records when loading a file

2009-11-10 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-1080:


Affects Version/s: 0.6.0

To be clear, this bug affects only trunk code, not any released version of Pig.
It is a result of the switch to using LineRecordReader (PIG-960).

> PigStorage may miss records when loading a file
> ---
>
> Key: PIG-1080
> URL: https://issues.apache.org/jira/browse/PIG-1080
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.6.0
> Reporter: Richard Ding
> Assignee: Richard Ding
> Attachments: PIG-1080.patch, PIG-1080.patch



[jira] Updated: (PIG-1080) PigStorage may miss records when loading a file

2009-11-10 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1080:


Status: Open  (was: Patch Available)

> PigStorage may miss records when loading a file
> ---
>
> Key: PIG-1080
> URL: https://issues.apache.org/jira/browse/PIG-1080
> Project: Pig
> Issue Type: Bug
> Reporter: Richard Ding
> Assignee: Richard Ding
> Attachments: PIG-1080.patch



[jira] Updated: (PIG-1080) PigStorage may miss records when loading a file

2009-11-10 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1080:
--

Status: Patch Available  (was: Open)

> PigStorage may miss records when loading a file
> ---
>
> Key: PIG-1080
> URL: https://issues.apache.org/jira/browse/PIG-1080
> Project: Pig
> Issue Type: Bug
> Reporter: Richard Ding
> Assignee: Richard Ding
> Attachments: PIG-1080.patch



[jira] Updated: (PIG-1080) PigStorage may miss records when loading a file

2009-11-10 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1080:
--

Attachment: PIG-1080.patch

This patch fixes the problem.

> PigStorage may miss records when loading a file
> ---
>
> Key: PIG-1080
> URL: https://issues.apache.org/jira/browse/PIG-1080
> Project: Pig
> Issue Type: Bug
> Reporter: Richard Ding
> Assignee: Richard Ding
> Attachments: PIG-1080.patch