Semyon Semyonov commented on NUTCH-1541:

But why not write directly to HDFS, skipping the local file system step? In 
other words, why not create a new file in HDFS for each reducer?
I understand that this would reduce I/O for the file, but it would give 
control over the distribution across multiple reducers.
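
To make the suggestion concrete, here is a minimal sketch of the "one file per 
reducer" idea. It is not taken from the attached patches: the class name, the 
method names and the part-r-NNNNN.csv naming scheme are assumptions for 
illustration, and java.nio.file is used as a stand-in for the HDFS client so 
that the example runs without a cluster.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PerReducerCsv {

    // Hypothetical naming scheme, modelled on Hadoop's part-r-NNNNN convention.
    static String partName(int reducerId) {
        return String.format(Locale.ROOT, "part-r-%05d.csv", reducerId);
    }

    // Writes the rows of one reducer to its own file under outputDir.
    // Assumption: java.nio.file stands in for the HDFS file system here.
    // Fields are joined with commas; no quoting or escaping is done.
    static Path writePart(Path outputDir, int reducerId, List<String[]> rows)
            throws IOException {
        Files.createDirectories(outputDir);
        Path part = outputDir.resolve(partName(reducerId));
        try (BufferedWriter writer =
                 Files.newBufferedWriter(part, StandardCharsets.UTF_8)) {
            for (String[] row : rows) {
                writer.write(String.join(",", row));
                writer.newLine();
            }
        }
        return part;
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempDirectory("csv-out");
        // Simulate three reducers, each writing its own part file.
        for (int reducer = 0; reducer < 3; reducer++) {
            List<String[]> rows = new ArrayList<>();
            rows.add(new String[] {"http://example.com/" + reducer, "title " + reducer});
            System.out.println(writePart(out, reducer, rows));
        }
    }
}
```

Since every reducer owns its file, no two tasks write to the same path, and 
the number of output files follows the number of reducers.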

> Indexer plugin to write CSV
> ---------------------------
>                 Key: NUTCH-1541
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1541
>             Project: Nutch
>          Issue Type: New Feature
>          Components: indexer
>    Affects Versions: 1.7
>            Reporter: Sebastian Nagel
>            Priority: Minor
>         Attachments: NUTCH-1541-v1.patch, NUTCH-1541-v2.patch
>
> With the new pluggable indexer a simple plugin would be handy to write 
> configurable fields into a CSV file - for further analysis or just for export.

This message was sent by Atlassian JIRA