Hi Tien,

Short answer: not yet. BTW, the WARCExporter is more scalable than the CCDataDumper. As mentioned in [NUTCH-2102 <https://issues.apache.org/jira/browse/NUTCH-2102>], we could add an importer in the package org/apache/nutch/tools/warc.

Julien

On 16 December 2015 at 07:22, Nguyen Manh Tien <[email protected]> wrote:

> Hi,
>
> We have a tool (CommonCrawlDataDumper) to convert Nutch data into Common
> Crawl WARC format.
>
> Can we import WARC file back to Nutch data (segments) and is there any
> tool to do that?
>
> Thanks,
> Tien

--
*Open Source Solutions for Text Engineering*
http://www.digitalpebble.com
http://digitalpebble.blogspot.com/
#digitalpebble <http://twitter.com/digitalpebble>

