[ http://issues.apache.org/jira/browse/HADOOP-779?page=all ]
Hairong Kuang updated HADOOP-779:
---------------------------------
Status: Patch Available (was: Open)
The patch fixes the described bug. In addition, it does the following:
1. cleans up the "next" function of StreamLineRecordReader
2. adds a JUnit test for gzipped input
3. restructures the JUnit test TestStreaming
4. turns on the debug option for streaming test cases in build-contrib.xml
> Hadoop streaming does not work with gzipped input
> -------------------------------------------------
>
> Key: HADOOP-779
> URL: http://issues.apache.org/jira/browse/HADOOP-779
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/streaming
> Affects Versions: 0.9.0
> Reporter: Hairong Kuang
> Assigned To: Hairong Kuang
> Fix For: 0.10.0
>
> Attachments: GzipIn.patch
>
>
> When input files are gzipped, StreamLineRecordReader does not take the correct
> InputStream to fetch the next record. Instead of using a GzipInputStream,
> it uses an FSInputStream. So input files are read as uncompressed plain text.
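The failure mode described above can be illustrated with a minimal, standalone sketch. It is not taken from GzipIn.patch and does not use any Hadoop classes: the class GzipLineReaderSketch and the helpers openLines and gzip are hypothetical names, and choosing the stream by a ".gz" file-name suffix is an assumption made for illustration. Only java.io and java.util.zip are used.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipLineReaderSketch {

    // Hypothetical helper: pick the stream that records are read from.
    // Assumption: gzipped inputs are recognized by a ".gz" suffix.
    static BufferedReader openLines(String fileName, InputStream raw) throws IOException {
        InputStream in = fileName.endsWith(".gz")
                ? new GZIPInputStream(raw)   // decompress before splitting into lines
                : raw;                       // plain text: read the bytes as they are
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Hypothetical helper: build gzipped bytes in memory for the demo.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("key1\tvalue1\nkey2\tvalue2\n");
        // Reading through the decompressing stream yields the original lines.
        try (BufferedReader reader =
                 openLines("part-00000.gz", new ByteArrayInputStream(compressed))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

Passing the same bytes with a name that lacks the ".gz" suffix makes the reader treat the compressed bytes as text, which is the symptom reported in this issue.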