Use the normalizer.
 
If you look at regex-normalizer.xml (in the conf directory), it should already have a rule to remove jsessionids.
This one is a bit unusual because it contains a "-", so the existing rule may not match it. We may need to write another rule; it should be simple.
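
As a rough sketch of what such a rule might match (not the rule that ships with Nutch), the pattern below removes everything from ";jsessionid=" up to the next "?" or "#", so characters like "!" and "-" inside the id do not matter. It is written in Python only to make it easy to try out; the function name strip_jsessionid and the example.com host are made up for illustration, and the pattern would still need to be put into the normalizer's own rule format.

```python
import re

# Assumption: a session id runs from ";jsessionid=" to the next "?" or "#"
# (or to the end of the URL). Everything in that span is dropped, including
# "!" and "-" characters like the ones in the URL from the question.
SESSION_ID = re.compile(r";jsessionid=[^?#]*", re.IGNORECASE)


def strip_jsessionid(url: str) -> str:
    """Return the URL with any ;jsessionid=... path parameter removed."""
    return SESSION_ID.sub("", url)


if __name__ == "__main__":
    # Hypothetical URL built around the session id quoted in the question.
    sample = (
        "http://www.example.com/shop/item.jsp"
        ";jsessionid=Cdpd452kLqgrnjrvrCJjVjQdmLwWJjnG4JQ4KhPJq2ThQL4XbFzS!-1364780454"
        "?id=7"
    )
    print(strip_jsessionid(sample))
```

The character class and the case-insensitive flag used here behave the same way in Java regular expressions, so the pattern should carry over, but please check it against your own URLs first.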
 
CC-
 


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of LocalSearch.HK
Sent: Saturday, February 26, 2005 9:05 PM
To: [EMAIL PROTECTED]
Subject: [Nutch-dev] Re: [Nutch-general] Question about normalizing urls

Hi,
 
Can anyone help? I know it is a bit of a newbie question, but I'm stuck with it.
 
Shri
 
----- Original Message -----
Sent: Thursday, February 24, 2005 11:27 PM
Subject: [Nutch-general] Question about normalizing urls

Hi,
 
Can someone help me with this URL? How would I remove the session ids?
 
 ;jsessionid=Cdpd452kLqgrnjrvrCJjVjQdmLwWJjnG4JQ4KhPJq2ThQL4XbFzS!-1364780454?
 
Regards,
Shri
 
