[jira] Created: (NUTCH-506) Nutch should delegate compression to Hadoop

JIRA Fri, 29 Jun 2007 05:46:28 -0700

Nutch should delegate compression to Hadoop
-------------------------------------------


                 Key: NUTCH-506
                 URL: https://issues.apache.org/jira/browse/NUTCH-506
             Project: Nutch
          Issue Type: Improvement
            Reporter: Doğacan Güney
             Fix For: 1.0.0


Some data structures within Nutch (such as Content and ParseText) handle their
own compression. We should delegate all compression to Hadoop.

Also, Nutch should respect the io.seqfile.compression.type setting. Currently,
even if io.seqfile.compression.type is BLOCK or RECORD, Nutch overrides it for
some structures and sets it to NONE. (However, IMO, ParseText should always be
compressed as RECORD, for performance reasons.)
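
To make the proposal concrete, here is a minimal sketch of the selection logic
being asked for. It is not Nutch or Hadoop code: the class CompressionChoice,
its local CompressionType enum, and the helpers configured, forContent and
forParseText are all hypothetical names, and java.util.Properties stands in for
the Hadoop configuration. The fallback to RECORD when the setting is absent is
also an assumption.

```java
import java.util.Locale;
import java.util.Properties;

public class CompressionChoice {
    // Local stand-in for the three values named in the issue; not Hadoop's own enum.
    enum CompressionType { NONE, RECORD, BLOCK }

    static final String KEY = "io.seqfile.compression.type";

    // Hypothetical helper: read the configured type, using the fallback if the key is unset.
    static CompressionType configured(Properties conf, CompressionType fallback) {
        String name = conf.getProperty(KEY);
        if (name == null) {
            return fallback;
        }
        return CompressionType.valueOf(name.trim().toUpperCase(Locale.ROOT));
    }

    // Structures such as Content honor the setting instead of forcing NONE.
    // The RECORD fallback is an assumption made for this sketch.
    static CompressionType forContent(Properties conf) {
        return configured(conf, CompressionType.RECORD);
    }

    // ParseText ignores the setting and always uses RECORD, as the issue suggests.
    static CompressionType forParseText(Properties conf) {
        return CompressionType.RECORD;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(KEY, "BLOCK");
        System.out.println("Content:   " + forContent(conf));
        System.out.println("ParseText: " + forParseText(conf));
    }
}
```

With this shape, a structure no longer compresses its own bytes; it only
reports which type the writer should use, and the actual compression would be
left to Hadoop.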

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
