[
https://issues.apache.org/jira/browse/PIG-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Koji Noguchi updated PIG-4779:
------------------------------
Attachment: pig-4779-v01.patch
Sorry for the delay.
In my test environment (on both Mac and Linux), the test was incorrectly passing by
throwing
{panel}
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0:
java.io.IOException: unexpected end of stream
{panel}
even when Hadoop's TextInputFormat was used.
Looking at the test, I found that the concatenated bzip file was corrupt.
I don't think we can use the character/line-based BufferedWriter/BufferedReader to
create a binary (bzip) file.
However, I don't know how the build mentioned in the description was correctly failing...
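For illustration only, here is a minimal sketch of a byte-based concatenation. It is not the code in the attached patch; the class name {{CatIntoSketch}} and the helper's signature are assumptions made for this example. The idea is that a Reader decodes bytes with a charset and line-based reading drops or rewrites line terminators, so binary data generally does not survive the round trip, whereas copying raw bytes leaves the bzip streams intact.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for the test's catInto helper (name and signature
// are assumptions, not taken from the patch).
class CatIntoSketch {

    // Appends the raw bytes of src to the end of dst.
    // No Reader/Writer is involved, so no charset decoding or
    // line-terminator handling can alter the binary content.
    static void catInto(Path src, Path dst) throws IOException {
        try (InputStream in = Files.newInputStream(src);
             OutputStream out = Files.newOutputStream(dst,
                     StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path first = Files.createTempFile("first", ".bin");
        Path second = Files.createTempFile("second", ".bin");
        // Sample bytes that include CR/LF and values that are not valid UTF-8.
        Files.write(first, new byte[] {0x42, 0x5A, (byte) 0xFF, 0x0D, 0x0A});
        Files.write(second, new byte[] {(byte) 0x80, 0x0A, 0x00});
        catInto(second, first);
        System.out.println("combined size: " + Files.size(first));
    }
}
```

With real inputs, {{first}} and {{second}} would be two separately compressed bzip files, and the result would be the concatenated file that the test reads.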
The patch fixes the test's {{catInto}} method and adjusts {{testBZ2Concatenation}} to
follow [~rohini]'s suggestion:
{quote}
Can you fix the testcase? I think it would be good to keep the original one
which throws the exception for Pig's bzipinputformat and another one for
hadoop's which passes and also verifies the output.
{quote}
> testBZ2Concatenation[pig.bzip.use.hadoop.inputformat = true] failing due to
> successful read
> -------------------------------------------------------------------------------------------
>
> Key: PIG-4779
> URL: https://issues.apache.org/jira/browse/PIG-4779
> Project: Pig
> Issue Type: Bug
> Reporter: Koji Noguchi
> Priority: Minor
> Attachments: pig-4779-v01.patch
>
>
> From
> [Pig-3251|https://issues.apache.org/jira/browse/PIG-3251?focusedCommentId=15096780#comment-15096780],
> {{testBZ2Concatenation [pig.bzip.use.hadoop.inputformat = true\]}} is
> failing.
> {quote}
> Koji Noguchi,
> https://builds.apache.org/job/Pig-trunk-commit/2278/testReport/org.apache.pig.test/TestBZip/testBZ2Concatenation_pig_bzip_use_hadoop_inputformat___true__/
> tests are failing. This should be because concatenated bzip works with hadoop's
> TextInputFormat. Can you fix the testcase? I think it would be good to keep
> the original one which throws the exception for Pig's bzipinputformat and
> another one for hadoop's which passes and also verifies the output.
> {quote}