[jira] [Updated] (MAPREDUCE-5656) bzip2 codec can drop records when reading data in splits

2013-12-09 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5656:
--

       Resolution: Fixed
    Fix Version/s: 2.4.0
                   3.0.0
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

Thanks to Nathan, Chris, and Vinay for the reviews!  I committed this to trunk 
and branch-2.

 bzip2 codec can drop records when reading data in splits
 

 Key: MAPREDUCE-5656
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5656
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.4-alpha, 0.23.8
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Fix For: 3.0.0, 2.4.0

 Attachments: HADOOP-9622-2.patch, HADOOP-9622-testcase.patch, 
 HADOOP-9622.patch, MAPREDUCE-5656-2.patch, MAPREDUCE-5656.patch, 
 blockEndingInCR.txt.bz2, blockEndingInCRThenLF.txt.bz2


 Bzip2Codec.BZip2CompressionInputStream can cause records to be dropped when 
 reading them in splits based on where record delimiters occur relative to 
 compression block boundaries.
 Thanks to [~knoguchi] for discovering this problem while working on PIG-3251.
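
 For readers unfamiliar with why delimiter placement matters when reading in
 splits, here is a minimal sketch of the usual convention for line-oriented
 split readers, written against plain uncompressed bytes. It is an
 illustration only and is not taken from the patch: the function name
 read_split and the sample data are hypothetical, and the sketch does not
 model bzip2 blocks. A reader whose split does not start at offset 0 discards
 its first line, and every reader keeps going until it has consumed the line
 that starts at or before its split end, so each record is read exactly once.

```python
def read_split(data: bytes, start: int, end: int) -> list[bytes]:
    """Return the newline-terminated records owned by the split [start, end).

    Hypothetical helper for illustration; it works on uncompressed bytes.
    """
    pos = start
    if start != 0:
        # Discard the first (possibly partial) line: the reader of the
        # previous split is responsible for it.
        nl = data.find(b"\n", start)
        if nl == -1:
            return []
        pos = nl + 1

    records = []
    # A record belongs to this split if it starts at or before `end`,
    # even when its bytes run past the split boundary.
    while pos <= end and pos < len(data):
        nl = data.find(b"\n", pos)
        stop = len(data) if nl == -1 else nl
        records.append(data[pos:stop])
        pos = stop + 1
    return records


if __name__ == "__main__":
    sample = b"a\nbb\nccc\ndddd\n"
    for cut in range(1, len(sample)):
        first = read_split(sample, 0, cut)
        second = read_split(sample, cut, len(sample))
        print(cut, first + second)
```

 The convention only holds if both readers agree on where a line starts
 relative to the split boundary. If the position a reader sees is off around a
 boundary, one reader can skip a line that the other never reads, which is the
 kind of dropped record this issue describes.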



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (MAPREDUCE-5656) bzip2 codec can drop records when reading data in splits

2013-12-02 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5656:
--

Attachment: MAPREDUCE-5656-2.patch

Slightly updated patch to fix the spacing issue in SplitLineReader.

 bzip2 codec can drop records when reading data in splits
 

 Key: MAPREDUCE-5656
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5656
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.4-alpha, 0.23.8
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Attachments: HADOOP-9622-2.patch, HADOOP-9622-testcase.patch, 
 HADOOP-9622.patch, MAPREDUCE-5656-2.patch, MAPREDUCE-5656.patch, 
 blockEndingInCR.txt.bz2, blockEndingInCRThenLF.txt.bz2


 Bzip2Codec.BZip2CompressionInputStream can cause records to be dropped when 
 reading them in splits based on where record delimiters occur relative to 
 compression block boundaries.
 Thanks to [~knoguchi] for discovering this problem while working on PIG-3251.





[jira] [Updated] (MAPREDUCE-5656) bzip2 codec can drop records when reading data in splits

2013-11-26 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5656:
--

Attachment: MAPREDUCE-5656.patch

Cleanup work in preparation for commit.  Moving this JIRA to MAPREDUCE since 
the changes are primarily in that project.  Also uploading a binary patch that 
can be applied with git-apply as a reference for what will be committed (same 
patch as before, with the binary test files added).

Awaiting Jenkins confirmation and commit of MAPREDUCE-5640 to avoid test name 
conflicts.

 bzip2 codec can drop records when reading data in splits
 

 Key: MAPREDUCE-5656
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5656
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.4-alpha, 0.23.8
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Attachments: HADOOP-9622-2.patch, HADOOP-9622-testcase.patch, 
 HADOOP-9622.patch, MAPREDUCE-5656.patch, blockEndingInCR.txt.bz2, 
 blockEndingInCRThenLF.txt.bz2


 Bzip2Codec.BZip2CompressionInputStream can cause records to be dropped when 
 reading them in splits based on where record delimiters occur relative to 
 compression block boundaries.
 Thanks to [~knoguchi] for discovering this problem while working on PIG-3251.





[jira] [Updated] (MAPREDUCE-5656) bzip2 codec can drop records when reading data in splits

2013-11-26 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated MAPREDUCE-5656:
--

Target Version/s: 2.3.0
          Status: Patch Available  (was: Open)

 bzip2 codec can drop records when reading data in splits
 

 Key: MAPREDUCE-5656
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5656
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.23.8, 2.0.4-alpha
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Attachments: HADOOP-9622-2.patch, HADOOP-9622-testcase.patch, 
 HADOOP-9622.patch, MAPREDUCE-5656.patch, blockEndingInCR.txt.bz2, 
 blockEndingInCRThenLF.txt.bz2


 Bzip2Codec.BZip2CompressionInputStream can cause records to be dropped when 
 reading them in splits based on where record delimiters occur relative to 
 compression block boundaries.
 Thanks to [~knoguchi] for discovering this problem while working on PIG-3251.


