[ https://issues.apache.org/jira/browse/NUTCH-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500603 ]
Doğacan Güney commented on NUTCH-392:
-------------------------------------

From what I understand of the MapFile.Writer code in Hadoop, passing a CompressionType argument to its constructor overrides whatever compression type is set in the config. Since Nutch hard-codes parse_text and parse_data to RECORD compression (and crawl_parse to NONE), we get none of the benefits of BLOCK compression even when it is enabled in the config. BLOCK compression works really well once the native libraries are in place, so IMHO it would be better not to set the CompressionType manually and instead let people pick whatever they want in the config. (Rough sketches of what I mean follow the quoted issue below.)

> OutputFormat implementations should pass on Progressable
> --------------------------------------------------------
>
>                 Key: NUTCH-392
>                 URL: https://issues.apache.org/jira/browse/NUTCH-392
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher
>            Reporter: Doug Cutting
>            Assignee: Andrzej Bialecki
>             Fix For: 1.0.0
>
>         Attachments: NUTCH-392.patch
>
>
> OutputFormat implementations should pass the Progressable they are passed to
> underlying SequenceFile implementations. This will keep reduce tasks from
> timing out when block writes are slow. This issue depends on
> http://issues.apache.org/jira/browse/HADOOP-636.
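To make the compression point concrete, here is a minimal sketch, assuming a MapFile.Writer constructor that takes both a CompressionType and a Progressable (exact signatures vary between Hadoop versions), of what ParseOutputFormat could do instead of hard-coding RECORD. SequenceFile.getCompressionType(conf) reads io.seqfile.compression.type, so a BLOCK setting in the config would actually take effect:

    // Inside ParseOutputFormat.getRecordWriter() -- "text" stands for the
    // parse_text output path. Today Nutch passes CompressionType.RECORD here
    // explicitly, which silently overrides the config value.
    SequenceFile.CompressionType cType = SequenceFile.getCompressionType(job);
    MapFile.Writer textOut =
        new MapFile.Writer(job, fs, text.toString(), Text.class,
                           ParseText.class, cType, progress);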
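And on the Progressable pass-through the issue itself asks for, the shape is roughly the following, using the old mapred API. The class name and the key/value types are illustrative, not the actual patch; the one thing that matters is forwarding the Progressable into SequenceFile.createWriter():

    import java.io.IOException;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RecordWriter;
    import org.apache.hadoop.mapred.Reporter;
    import org.apache.hadoop.util.Progressable;
    import org.apache.nutch.crawl.CrawlDatum;

    // Hypothetical output format: the Progressable handed to
    // getRecordWriter() is passed straight down to the SequenceFile writer,
    // so slow DFS block writes still count as progress and the reduce task
    // is not killed as hung.
    public class ProgressAwareOutputFormat
        extends FileOutputFormat<Text, CrawlDatum> {

      public RecordWriter<Text, CrawlDatum> getRecordWriter(
          FileSystem fs, JobConf job, String name, Progressable progress)
          throws IOException {
        Path out = new Path(FileOutputFormat.getOutputPath(job), name);
        final SequenceFile.Writer writer = SequenceFile.createWriter(
            fs, job, out, Text.class, CrawlDatum.class,
            SequenceFile.getCompressionType(job), progress); // pass it on
        return new RecordWriter<Text, CrawlDatum>() {
          public void write(Text key, CrawlDatum value) throws IOException {
            writer.append(key, value);
          }
          public void close(Reporter reporter) throws IOException {
            writer.close();
          }
        };
      }
    }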