Hi,

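The example in the code comments describes a tab-separated seed file:
 <url> <tab> <key>=<value>
where <tab> stands for a single real tab character, not the two
characters backslash and "t".

Judging by the invalid URL in your log, the seed list contained the
literal characters
  url  space  backslash  letter t  space  key=value
so the whole line was taken as the URL.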
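
In case it helps, here is a minimal sketch of generating such a seed line
with a real tab character. It is only an illustration: the helper names
(seed_line, has_literal_backslash_t) and the file name seed.txt are my own
assumptions, not part of Nutch.

```python
import os
import tempfile


def seed_line(url, metadata):
    """Build one seed line: the URL, then key=value pairs, joined by real tabs."""
    fields = [url] + [f"{key}={value}" for key, value in metadata.items()]
    return "\t".join(fields)  # "\t" in Python source is an actual tab character


def has_literal_backslash_t(line):
    """Return True if the line contains a backslash followed by the letter t."""
    return "\\t" in line


# Hypothetical seed entry, taken from the example in this thread.
line = seed_line("http://www.nutch.org/", {"key": "value"})

# Write the seed file into a temporary directory (the location is an assumption).
seed_dir = tempfile.mkdtemp()
seed_path = os.path.join(seed_dir, "seed.txt")
with open(seed_path, "w", encoding="utf-8") as seed_file:
    seed_file.write(line + "\n")
```

Running has_literal_backslash_t over each line of an existing seed file
is a quick way to spot entries that were typed with backslash-t instead of
a tab.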

Best,
Sebastian

On 03/28/2017 11:29 AM, Chaushu, Shani wrote:
> Hi,
> I'm trying to run a crawl with Nutch 1.12, and the seed file contains URLs in
> this form (like the example in the code comments):
> http://www.nutch.org/ \t key=value
> 
> When I try to crawl, the log shows an error for the invalid URL
> http://www.nutch.org/%20\t%20key=value - the tab and the key=value custom
> meta tags are considered part of the URL, so the injector didn't parse the
> meta tags.
> I tried adding urlmeta to the plugin.includes property and adding the key to
> urlmeta.tags.
> 
> Am I missing something? Is there something else I need to do to make it work?
> 
> Thanks,
> Shani
> 
> ---------------------------------------------------------------------
> Intel Electronics Ltd.
> 
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
> 
