Anyone?

I see better results with the fixing of embedded params disabled. No more junk
URLs, and I've never actually seen embedded params used in real life.
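
To make the proposal concrete, here is a minimal sketch of the kind of switch I
mean. It is not the actual Nutch code: the class name, the method and the
boolean flag are all made up for illustration, and the param handling is only
my reading of what the fix does (carry the base URL's ";params" over to a
relative link that has none).

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper, for illustration only; not part of Nutch.
class EmbeddedParamsSketch {

  /**
   * Resolves a link target against the page URL.
   * When fixEmbeddedParams is false, the target is resolved as-is.
   * When it is true, ";params" found in the base URL are appended to the
   * target path (before any query string) if the target has none itself.
   */
  static String resolve(String base, String target, boolean fixEmbeddedParams) {
    try {
      URL baseUrl = new URL(base);
      int start = base.indexOf(';');

      // Nothing to fix: feature disabled, no params in the base URL,
      // or the target already carries its own params.
      if (!fixEmbeddedParams || start < 0 || target.indexOf(';') >= 0) {
        return new URL(baseUrl, target).toString();
      }

      // Everything from the first ';' in the base URL is treated as params.
      String params = base.substring(start);

      // Insert the params after the path but before the query string.
      int query = target.indexOf('?');
      String fixed = query >= 0
          ? target.substring(0, query) + params + target.substring(query)
          : target + params;

      return new URL(baseUrl, fixed).toString();
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException("Cannot resolve " + target, e);
    }
  }
}
```

With the flag off, a link to other.html on a page whose URL contains a stray
";" resolves to a plain URL. With it on, whatever follows the ";" in the page
URL is copied onto every relative outlink, which is where the invalid URLs
come from when the ";" was never meant to introduce params.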

On Tuesday 13 September 2011 13:53:40 Markus Jelsma wrote:
> Hi,
> 
> Another complaint about Nutch's handling of outlinks. Since NUTCH-436 there is
> better support for embedded segment parameters. This exotic feature,
> however, causes a lot of invalid outlinks to be generated.
> 
> For some reason (most likely bad webmasters, as in my other thread) I see a
> lot of URLs with embedded params that are not actually meant to be
> embedded params, such as:
> 
> http://<HOST>.nl/webwinkel-tips.html;-plezier/55802-speelspiraal-van-baby-butt.html
> anchor: TIPS
> 
> I would propose an option to disable the fixing of embedded params in
> DOMContentUtils.
> 
> Thoughts?
> 
> Thanks,

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
