Thanks for the response, Markus. Disabling urlnormalizer-basic works.
On Tue, Jan 9, 2024 at 3:43 PM Markus Jelsma wrote:
> Hello Steve,
>
> Having those spaces normalized/encoded is expected behaviour with
> urlnormalizer-basic active. I would recommend keeping it this way and having
> all URLs in
Hello Steve,
Having those spaces normalized/encoded is expected behaviour with
urlnormalizer-basic active. I would recommend keeping it this way and having
all URLs in Solr properly encoded. Having spaces in Solr IDs is also not
recommended, as it can lead to unexpected behaviour.
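
To illustrate the encoding discussed here, the following is a minimal sketch using Python's standard library. It is not Nutch's normalizer code, and the file path is a hypothetical example, not one taken from this thread. It only shows how a space in a path becomes %20 when the path is percent-encoded, and how decoding reverses it.

```python
from urllib.parse import quote, unquote

# Hypothetical path with a space in a directory name (assumption for illustration).
raw_path = "/data/my documents/report.txt"

# quote() percent-encodes characters that are not safe in a URL path.
# By default "/" is left untouched, so only the space changes.
encoded_path = quote(raw_path)
encoded_url = "file://" + encoded_path

# unquote() reverses the percent-encoding.
decoded_path = unquote(encoded_path)

print(encoded_url)
print(decoded_path)
```

Any consumer that compares IDs as plain strings will treat the encoded and unencoded forms as two different values, so it helps to pick one form and use it everywhere.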
If you really don'
On Tue, Jan 9, 2024 at 1:20 PM Steve Cohen wrote:
> Hello,
>
> I am updating a Nutch crawl that reads files in directories whose names
> contain spaces. The URLs show %20 instead of spaces. This doesn't seem to
> be how it behaved in the past.
>
> In Nutch 1.10 I get these results:
>
> Nu