StreamXMLRecordReader does not support gzipped files
----------------------------------------------------

                 Key: HADOOP-3562
                 URL: https://issues.apache.org/jira/browse/HADOOP-3562
             Project: Hadoop Core
          Issue Type: Bug
          Components: contrib/streaming
    Affects Versions: 0.17.0
            Reporter: Bo Adler


I am using Hadoop Streaming to analyze Wikipedia data files, which are in XML 
format and are compressed because they are so large.  In preliminary tests, I 
found that StreamXMLRecordReader does not work with gzipped data files: the 
data is fed into the mapper script as raw data rather than as XML records.
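
For illustration, the expected behavior can be sketched outside Hadoop. The
Python snippet below is not StreamXMLRecordReader code. It only assumes that
gzipped input can be recognized by its two leading magic bytes (0x1f 0x8b)
and has to be decompressed before scanning for record markers. The function
names and the <page>...</page> markers are hypothetical choices for this
example.

```python
import gzip

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def looks_gzipped(data: bytes) -> bool:
    """Return True if the buffer starts with the gzip magic bytes."""
    return data[:2] == GZIP_MAGIC


def extract_records(data: bytes, begin: bytes, end: bytes) -> list:
    """Decompress the buffer if it is gzipped, then collect every
    begin...end span (markers included) in order of appearance.

    Hypothetical helper for illustration only, not part of Hadoop.
    """
    if looks_gzipped(data):
        data = gzip.decompress(data)

    records = []
    pos = 0
    while True:
        start = data.find(begin, pos)
        if start == -1:
            break
        stop = data.find(end, start + len(begin))
        if stop == -1:
            break  # unterminated record: ignore the tail
        stop += len(end)
        records.append(data[start:stop])
        pos = stop
    return records


if __name__ == "__main__":
    sample = b"<page>a</page>junk<page>b</page>"
    for record in extract_records(gzip.compress(sample), b"<page>", b"</page>"):
        print(record.decode("utf-8"))
```

With the decompression step in place, plain and gzipped copies of the same
file yield the same records.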


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
