Hi David,
There are a couple of options that configure how the crawler follows links,
especially:
db.max.outlinks.per.page
db.ignore.external.links
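
To illustrate what these two options control, here is a minimal Python sketch.
It is not Nutch's actual code (Nutch is written in Java). The function name
`select_outlinks` and its parameters are hypothetical, the limit of 100 is
only an example value, and the sketch assumes that "external" means "on a
different host than the source page" and that a negative limit means "no limit".

```python
from urllib.parse import urlsplit


def select_outlinks(page_url, outlinks, max_outlinks=100, ignore_external=False):
    """Hypothetical model of the two settings; not Nutch's implementation."""
    page_host = urlsplit(page_url).hostname
    kept = []
    for link in outlinks:
        # Models db.ignore.external.links=true: drop links to another host.
        if ignore_external and urlsplit(link).hostname != page_host:
            continue
        kept.append(link)
    # Models db.max.outlinks.per.page: a negative value means no cap (assumption).
    if max_outlinks >= 0:
        kept = kept[:max_outlinks]
    return kept


links = [
    "https://www.mysite.com/a.html",
    "https://assets0.mysite.com/asset/DB_product.pdf",
    "https://www.mysite.com/b.html",
]
print(select_outlinks("https://www.mysite.com/", links, ignore_external=True))
```

Under these assumptions the PDF on assets0.mysite.com is dropped when external
links are ignored, because its host differs from that of the page linking to it.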
Is the white space in the URLs intended?
> https://assets0.mysite.com/asset /DB_product.pdf
>>> +https://assets.*. mysite.com/asset
URLs normall
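
A quick way to see why the white space matters is to try the quoted rule as a
regular expression. This is only a sketch: it uses Python's `re.search` as a
stand-in for the URL filter's matching, which is an assumption, and the
variant without the space is simply the quoted rule with the blank removed.

```python
import re

# The rule as quoted above (without the leading "+"), including the blank.
rule_with_space = r"https://assets.*. mysite.com/asset"
# The same rule with the blank removed (assumed to be what was intended).
rule_without_space = r"https://assets.*.mysite.com/asset"

url = "https://assets0.mysite.com/asset/DB_product.pdf"

print(bool(re.search(rule_with_space, url)))     # False
print(bool(re.search(rule_without_space, url)))  # True
```

A blank inside the pattern is a literal character, so the rule can only match
URLs that contain a blank at that position.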
Hey Sebastian,
Thanks a lot. I already increased it to around 65 MB. All our PDFs are about 3 to
8 MB in size.
Any other suggestions?
;)
Thanks
David
> On 09.08.2017 at 18:50, Sebastian Nagel wrote:
>
> Hi David,
>
> for PDFs you usually need to increase the following property:
>
>
> http.conten
Hi David,
for PDFs you usually need to increase the following property:
name: http.content.limit
value: 65536
The length limit for downloaded content using the http
protocol, in bytes. If this value is nonnegative (>=0), content longer
than it will be truncated; otherwise, no truncation at all. Do
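
The truncation rule described here can be sketched in a few lines of Python.
This is an illustration of the stated semantics only, not Nutch's code; the
function name `apply_content_limit` and the 3 MB example body are hypothetical.

```python
def apply_content_limit(content: bytes, limit: int) -> bytes:
    """Truncate content to `limit` bytes if limit >= 0, else return it whole."""
    if limit >= 0:
        return content[:limit]
    return content


# Example: a 3 MB download (assumed size) against a limit of 65536 bytes.
body = b"x" * (3 * 1024 * 1024)
print(len(apply_content_limit(body, 65536)))  # 65536
print(len(apply_content_limit(body, -1)))     # 3145728
```

With a limit of 65536 bytes, only the first 64 KB of such a file would be kept,
which is usually not enough for a PDF to be parsed.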