Hi Folks,
I've implemented what Dave suggested. It is clean and easy, but it may not
be quite as ad hoc capable as one would always want. For my use cases it
was acceptable.
More responses inline.
On Thu, Sep 19, 2019 at 2:47 PM wrote:
> From: Jorge Betancourt
> To: user@nutch.apache.org
> Cc:
>
> [...] could be a good improvement
> for Nutch.
>
> Regards, Roannel
>
> - Original Message -
> > From: "Jorge Betancourt"
> > To: "user"
> > Sent: Monday, September 16, 2019 13:14:36
> > Subject: [MASSMAIL]Re: Injection from webservice
> Hi Roannel,
>
> The current implementation of the injector only accepts a path (actually an
> org.apache.hadoop.fs.Path); this means that there is no way to feed a URL
> directly unless you download the content first.
Or use a scheduled wget job to pull them from the remote server and store
them on a path that Nutch can access locally.
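
A minimal sketch of that download-then-inject approach, using Python's
urllib in place of wget. The function names, the seed.txt file name and the
crawl/crawldb and urls paths are placeholders made up for illustration, not
anything Nutch prescribes. The only Nutch piece assumed is the usual
bin/nutch inject <crawldb> <url_dir> invocation.

```python
import subprocess
import urllib.request
from pathlib import Path


def fetch_seed_file(url, seed_dir, name="seed.txt"):
    """Download the remote seed list into seed_dir and return the local path.

    The body is read fully before anything is written, so a failed download
    does not leave a half-written seed file behind for the injector.
    """
    target_dir = Path(seed_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=30) as response:
        data = response.read()
    target = target_dir / name
    target.write_bytes(data)
    return target


def inject_command(crawldb, seed_dir, nutch="bin/nutch"):
    """Build the inject command line: bin/nutch inject <crawldb> <url_dir>."""
    return [nutch, "inject", crawldb, seed_dir]


def refresh_and_inject(url, crawldb="crawl/crawldb", seed_dir="urls"):
    """Fetch the seed list, then hand the local directory to the injector."""
    fetch_seed_file(url, seed_dir)
    # Assumes the working directory is the Nutch runtime directory.
    subprocess.run(inject_command(crawldb, seed_dir), check=True)
```

Calling refresh_and_inject("http://example.org/seed") from a cron job would
give roughly the scheduled behaviour described above. Nutch itself only ever
sees the local urls directory.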
Regards,
Dave Beckstrom
Technical Delivery Manager / Senior Developer
em: dbeckst...@collectivefls.com
ph: 763.323.3499
On Mon, Sep 16, 2019 at 12:14 PM Jorge Betancourt <
be
Hi Roannel,
The current implementation of the injector only accepts a path (actually an
org.apache.hadoop.fs.Path); this means that there is no way to feed a URL
directly unless you download the content first.
If you use the REST API you can send the seed file using the API endpoint.
Otherwise, y
Hi folks,
Is there any way in Nutch 1.15 to inject a remote seed file (accessible via
http or https)?
I mean this, for instance:
bin/nutch inject crawl http://example.org/seed
Regards
1519-2019: 500th anniversary of the Villa de San Cristóbal de La Habana
For Havana, the greatest. #Habana500