Speed of linkDB Merge

2017-04-02 Thread Michael Coffey
In my situation, I find that linkdb merge takes much more time than fetch and parse combined, even though fetch is fully polite. What is the standard advice for making linkdb-merge go faster? I call invertlinks like this: __bin_nutch invertlinks "$CRAWL_PATH"/linkdb

[ANNOUNCE] Apache Nutch 1.13 Release

2017-04-02 Thread lewis john mcgibbney
Hello Folks, The Apache Nutch [0] Project Management Committee is pleased to announce the immediate release of Apache Nutch v1.13. We advise all current users and developers of the 1.X series to upgrade to this release. Nutch is a well-matured, production-ready Web crawler. Nutch 1.x enables

[RESULT] WAS Re: [VOTE] Release Apache Nutch 1.13 RC#1

2017-04-02 Thread lewis john mcgibbney
Hi Folks, Thank you to everyone who was able to review the RC and VOTE; it is greatly appreciated. The 72 hours have come and gone, so please see below for the RESULTs. [9] +1 Release this package as Apache Nutch 1.13. Lewis John McGibbney * Julien Nioche * Kevin Ratnasekera Chris A. Mattmann * Furkan KAMACI * Matei