RE: Parsed segment has outlinks filtered

2019-10-17 Thread yossi.tamari
Hi Sachin,

I'm not sure what you are trying to achieve: if you don't want to filter the outlinks, why do you enable urlfilter-regex? Anyway, if you set the property parse.filter.urls to false, the parser will not filter outlinks at all.

Yossi.

-----Original Message----- From: Sachin
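
For reference, the parse.filter.urls setting suggested in the reply above could be overridden roughly as follows. This is a minimal sketch, assuming local overrides live in conf/nutch-site.xml as is usual for a Nutch installation; the description text is illustrative wording, not copied from nutch-default.xml.

```xml
<!-- conf/nutch-site.xml: sketch only; merge the <property> element into
     the existing <configuration> block rather than replacing the file. -->
<configuration>
  <property>
    <name>parse.filter.urls</name>
    <value>false</value>
    <!-- Illustrative description (assumption, not the official text). -->
    <description>Do not apply the configured URL filters (for example
    urlfilter-regex) to outlinks while parsing.</description>
  </property>
</configuration>
```

Segments that were already parsed would presumably have to be parsed again before their dumps reflect the change.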

Parsed segment has outlinks filtered

2019-10-17 Thread Sachin Mittal
Hi, I was a bit confused about the outlinks generated from a parsed URL. If I use the utility: bin/nutch parsechecker url the generated output has all the outlinks. However, if I check the dump of the parsed segment generated by the nutch crawl script, using the command: bin/nutch readseg -dump