Awesome, it works! Thanks a lot, Kevin.
-----Original Message-----
From: kevin chen [mailto:kevinc...@bdsing.com]
Sent: September 27, 2009 9:36
To: nutch-user@lucene.apache.org
Subject: Re: How can nutch crawl the content of a dynamic url with a query string?

By default, nutch skips URLs containing certain characters. To change this, open regex-urlfilter.txt and comment out the following line:

# skip URLs containing certain characters as probable queries, etc.
-[...@=]

On Sun, 2009-09-27 at 03:55 +0800, Shawn Young wrote:
> Hi all,
>
> I have a question: if a web page's URL looks like
> http://www.test.com/test.php?gid=1111111, how can nutch crawl its content?
> I've tried, but it seems that nutch ignores the query string
> 'gid=1111111' of the URL.
>
> Can someone help me?
> Thanks.
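
For anyone reading this thread later, here is a rough Python sketch of why commenting out that rule lets the URL through. It is not Nutch code. It assumes the filter reads rules top to bottom, that the first rule whose regex is found anywhere in the URL decides (+ accepts, - rejects), and that a URL matching no rule is rejected. The character class in the message above is partly elided, so the class `[?*!@=]` used here is an assumption, as are the trailing `+.` accept-all rule and the helper names parse_rules and accepts.

```python
import re

def parse_rules(text):
    """Parse filter lines of the form '+regex' or '-regex'; skip blanks and '#' comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with '+' or '-': " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def accepts(rules, url):
    """First rule whose pattern is found in the URL decides; no match means reject."""
    for accept, pattern in rules:
        if pattern.search(url):
            return accept
    return False

# Assumed rule set: the character class is a guess, since the thread elides it.
DEFAULT_RULES = """
# skip URLs containing certain characters as probable queries, etc.
-[?*!@=]
# accept anything else
+.
"""

# Same rules with the skip line commented out, as suggested in the reply.
EDITED_RULES = DEFAULT_RULES.replace("-[?*!@=]", "#-[?*!@=]")

url = "http://www.test.com/test.php?gid=1111111"
print(accepts(parse_rules(DEFAULT_RULES), url))  # False: '?' and '=' hit the skip rule
print(accepts(parse_rules(EDITED_RULES), url))   # True: only the accept-all rule is left
```

Commenting the rule out admits every query-string URL, which can make the crawl much larger on sites with many parameter combinations.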