On Fri, 16 Nov 2001, Matt Sullivan wrote:
> On Thu, 15 Nov 2001, Kir Kolyshkin wrote:
>
> > Gregory Kozlovsky wrote:
> > >
> > > Here are some small problems I found that would be nice to have fixed.
> > >
> > > 1. ASPSeek tries to follow <IMG SRC ...> links to images. This wastes time
> > > and space.
> >
> > I believe you are wrong here. Please re-check this and prove ;)
>
> Images are only retrieved as a side effect of indexing dynamic content, since it
> is not known until the document headers are examined what the content type is. By
> this time the url must have been stored into urlword (via href discovery), yet
> it will not store the image content when index processes the url, a side
> effect of how the indexer works.

This is of course provided content-type is actually set correctly.

> It does not index IMG SRC tags directly.
>
>
> Matt.
>
> > > 2. There are a lot of messages like:
> > > mailto:somebody@somewhere
> > > Unsupported protocol
> > > The same thing for javascript:
> >
> > This is just a debugging message, it should not be considered as error.
> >
> > > 3. The French stopwords list does not include "de" which is one of the most
> > > common words in French.
> >
> > Ok, I will include it. Can somebody from France confirm it?
> >
> > --
> > [EMAIL PROTECTED] ICQ 7551596 Phone +7 903 6722750
> > Hard work may not kill you, but why take chances?
> > --
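
For reference, the two behaviours discussed in this thread can be sketched roughly as below: links with a scheme the crawler cannot retrieve (mailto:, javascript:) are skipped, and a fetched body is only kept once the Content-Type header shows it is indexable. This is a minimal illustration in Python, not ASPSeek's actual code. The function names and both tables are hypothetical, and it assumes links have already been resolved to absolute URLs.

```python
from urllib.parse import urlsplit

# Hypothetical policy tables; ASPSeek's real configuration is not shown here.
SUPPORTED_SCHEMES = {"http", "https"}
INDEXABLE_TYPES = {"text/html", "text/plain"}


def should_fetch(url: str) -> bool:
    """Return True if the URL uses a scheme this sketch knows how to retrieve."""
    scheme = urlsplit(url).scheme.lower()
    return scheme in SUPPORTED_SCHEMES


def should_store(content_type_header: str) -> bool:
    """Return True if the body is worth indexing, judged by its Content-Type."""
    # Drop parameters such as "; charset=iso-8859-1" before comparing.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type in INDEXABLE_TYPES


if __name__ == "__main__":
    for link in ("http://example.com/pic.php", "mailto:somebody@somewhere"):
        print(link, "fetch" if should_fetch(link) else "unsupported protocol")
    for header in ("text/html; charset=iso-8859-1", "image/gif"):
        print(header, "store" if should_store(header) else "discard")
```

As noted above, the second check is only as good as the server's headers: a dynamic URL that returns an image but labels it text/html would still be stored.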
