According to zheng hong:
> I am confused about how htdig crawls .asp dynamic web pages. Some
> companies use Lotus Notes to create their web pages, and the pages
> have buttons (like ) that represent different sections. A button can
> be opened to show the hyperlinks to its subsections, or clicked closed,
> in which case the subsection hyperlinks are not shown. My question is:
> when we crawl these web sites, every section changes slightly between
> its open and closed modes, and at the same time the subsection
> hyperlinks appear or change slightly. This dynamic state produces an
> unlimited number of web pages, so our crawl keeps running and never
> finishes. How can we solve this problem? How can we keep these dynamic
> changes out of our web page crawl?
Well, if you can find specific patterns in the URLs or query strings that
you want to avoid, for instance to avoid the links to expanded sections,
then you can add these patterns to exclude_urls or bad_querystr. It's
hard to give an exact solution when you don't clearly define the problem,
i.e. what the URLs look like and how they change when htdig is continually
crawling the same pages, and you don't give us URLs that we can visit
ourselves to see what's going on.
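
As a rough sketch only, since I haven't seen your actual URLs: Notes/Domino
sites often put the expand/collapse state in the query string, with
arguments such as Expand=, Collapse=, ExpandView and CollapseView. Assuming
that is what your links look like (and assuming a version of htdig that
supports bad_querystr), something along these lines in htdig.conf would
keep htdig from following them. The patterns below are guesses for
illustration, not taken from your site, so check them against the URLs
htdig reports when run with -v.

```
# Reject any URL containing one of these substrings.
# /cgi-bin/ and .cgi are the usual defaults; keep whatever you already have.
exclude_urls:   /cgi-bin/ .cgi

# Reject any URL whose query string contains one of these substrings.
# These are assumed Domino-style expand/collapse arguments; adjust them
# to match what your site really generates.
bad_querystr:   Expand= Collapse= ExpandView CollapseView
```

Both attributes take a space-separated list of patterns, and a URL is
rejected if it contains any one of them, so start with the narrowest
pattern that stops the loop and widen it only if needed.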
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930