Re: Custom IndexWriter never called on index command

2017-08-09 Thread Barnabás Balázs
the Indexer creates the ParseData perfectly? On 2017. 08. 09. 19:00:27, Barnabás Balázs <barnabas.bal...@impresign.com> wrote: Dear community! I'm relatively new to Nutch 1.x and got stumped on an indexing issue. I have a local Java application that sends Nutch jobs to a remote Hadoop depl

Crawl issues and Custom IndexWriter never called on index command solution

2017-08-17 Thread Barnabás Balázs
selenium.page.load.delay = "10" Some of the above issues could be caused by faulty data received from some of these sites; auto-skipping those could be an acceptable solution if we can predictably detect those cases. Anyway, I'd appreciate any help I could get with any of the above issues

Custom IndexWriter never called on index command

2017-08-09 Thread Barnabás Balázs
Dear community! I'm relatively new to Nutch 1.x and got stumped on an indexing issue. I have a local Java application that sends Nutch jobs to a remote Hadoop deployment for execution. The jobs are sent in the following order: Inject -> Generate -> Fetch -> Parse -> Index -> Update ->

Re: Crawl issues and Custom IndexWriter never called on index command solution

2017-08-22 Thread Barnabás Balázs
Hi, Sadly I haven't been able to make progress on these issues. By any chance, does anyone in the community know how any of these problems could be solved? On 2017. 08. 17. 12:00:32, Barnabás Balázs <barnabas.bal...@impresign.com> wrote: Dear Nutch Community, Thanks for the answer, as it