Merge branch 'ccdump-inlinks' of https://github.com/thammegowda/nutch into inlinks
Project: http://git-wip-us.apache.org/repos/asf/nutch/repo
Commit: http://git-wip-us.apache.org/repos/asf/nutch/commit/04d203ad
Tree: http://git-wip-us.apache.org/repos/asf/nutch/tree/04d203ad
Diff: http://git-wip-us.apache.org/repos/asf/nutch/diff/04d203ad

Branch: refs/heads/master
Commit: 04d203ad4d8fa66eab5f1bff53eedfe68b2d5310
Parents: 0e03daf f5adbcc
Author: Chris Mattmann <[email protected]>
Authored: Sat May 7 10:38:50 2016 -1000
Committer: Chris Mattmann <[email protected]>
Committed: Sat May 7 10:38:50 2016 -1000

----------------------------------------------------------------------
 .../nutch/tools/AbstractCommonCrawlFormat.java | 644 ++++++++++---------
 .../nutch/tools/CommonCrawlDataDumper.java     |  47 +-
 .../apache/nutch/tools/CommonCrawlFormat.java  |  15 +
 3 files changed, 387 insertions(+), 319 deletions(-)
----------------------------------------------------------------------
