It is great.

Thanks Julien

On Wed, Dec 16, 2015 at 4:54 PM, Julien Nioche <
[email protected]> wrote:

> Hi Tien
>
> Short answer: not yet.
> BTW the WARCExporter is more scalable than the CCDataDumper. As mentioned
> in [NUTCH-2102 <https://issues.apache.org/jira/browse/NUTCH-2102>] we
> could add an importer in the package org/apache/nutch/tools/warc.
>
> Julien
>
> On 16 December 2015 at 07:22, Nguyen Manh Tien <[email protected]>
> wrote:
>
>> Hi,
>>
>> We have a tool (CommonCrawlDataDumper) to convert Nutch data into Common
>> Crawl WARC format.
>>
>> Can we import WARC files back into Nutch data (segments), and is there
>> any tool to do that?
>>
>> Thanks,
>> Tien
>>
>
>
>
> --
>
> *Open Source Solutions for Text Engineering*
>
> http://www.digitalpebble.com
> http://digitalpebble.blogspot.com/
> #digitalpebble <http://twitter.com/digitalpebble>
>
