[ https://issues.apache.org/jira/browse/NUTCH-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500603 ]
Doğacan Güney commented on NUTCH-392:
-------------------------------------

From what I understand of the MapFile.Writer code in Hadoop, passing a CompressionType argument to its constructor overrides whatever compression type is set in the config. Since Nutch hard-codes parse_text and parse_data to RECORD compression (and crawl_parse to NONE), we get none of the benefits of BLOCK compression even when it is enabled in the config. BLOCK compression works really well once the native libraries are in place, so IMHO it would be better not to set the CompressionType manually and instead let people pick whatever they want in the config. (Rough sketches of what I mean follow the quoted issue below.)

> OutputFormat implementations should pass on Progressable
> --------------------------------------------------------
>
>                 Key: NUTCH-392
>                 URL: https://issues.apache.org/jira/browse/NUTCH-392
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher
>            Reporter: Doug Cutting
>            Assignee: Andrzej Bialecki
>             Fix For: 1.0.0
>
>         Attachments: NUTCH-392.patch
>
>
> OutputFormat implementations should pass the Progressable they are passed to
> underlying SequenceFile implementations. This will keep reduce tasks from
> timing out when block writes are slow. This issue depends on
> http://issues.apache.org/jira/browse/HADOOP-636.
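To make the compression point concrete, here is a minimal sketch, assuming a MapFile.Writer constructor that takes both a CompressionType and a Progressable (exact signatures vary between Hadoop versions), of what ParseOutputFormat could do instead of hard-coding RECORD. SequenceFile.getCompressionType(conf) reads io.seqfile.compression.type, so a BLOCK setting in the config would actually take effect:

    // Inside ParseOutputFormat.getRecordWriter() -- "text" stands for the
    // parse_text output path. Today Nutch passes CompressionType.RECORD here
    // explicitly, which silently overrides the config value.
    SequenceFile.CompressionType cType = SequenceFile.getCompressionType(job);
    MapFile.Writer textOut =
        new MapFile.Writer(job, fs, text.toString(), Text.class,
                           ParseText.class, cType, progress);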
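And on the Progressable pass-through the issue itself asks for, the shape is roughly the following, using the old mapred API. The class name and the key/value types are illustrative, not the actual patch; the one thing that matters is forwarding the Progressable into SequenceFile.createWriter():

    import java.io.IOException;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RecordWriter;
    import org.apache.hadoop.mapred.Reporter;
    import org.apache.hadoop.util.Progressable;
    import org.apache.nutch.crawl.CrawlDatum;

    // Hypothetical output format: the Progressable handed to
    // getRecordWriter() is passed straight down to the SequenceFile writer,
    // so slow DFS block writes still count as progress and the reduce task
    // is not killed as hung.
    public class ProgressAwareOutputFormat
        extends FileOutputFormat<Text, CrawlDatum> {

      public RecordWriter<Text, CrawlDatum> getRecordWriter(
          FileSystem fs, JobConf job, String name, Progressable progress)
          throws IOException {
        Path out = new Path(FileOutputFormat.getOutputPath(job), name);
        final SequenceFile.Writer writer = SequenceFile.createWriter(
            fs, job, out, Text.class, CrawlDatum.class,
            SequenceFile.getCompressionType(job), progress); // pass it on
        return new RecordWriter<Text, CrawlDatum>() {
          public void write(Text key, CrawlDatum value) throws IOException {
            writer.append(key, value);
          }
          public void close(Reporter reporter) throws IOException {
            writer.close();
          }
        };
      }
    }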