On Fri, 16 Nov 2001, Matt Sullivan wrote:

> On Thu, 15 Nov 2001, Kir Kolyshkin wrote:
> 
> > Gregory Kozlovsky wrote:
> > > 
> > > Here are some small problems I found that would be nice to have fixed.
> > > 
> > > 1. ASPSeek tries to follow <IMG SRC ...> links to images. This wastes time
> > > and space.
> > 
> > I believe you are wrong here. Please re-check this and prove ;)
> 
> Images are only retrieved as a side effect of indexing dynamic content, since it
> is not known until the document headers are examined what the content type is. By
> this time the URL must already have been stored in urlword (via href discovery),
> yet the indexer will not store the image's content when it processes the URL; this
> is a side effect of how the indexer works.

This is, of course, provided the Content-Type header is actually set correctly.
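
To make the caveat concrete, here is a rough sketch of the kind of check involved. This is not ASPSeek's actual code (ASPSeek is written in C++); the function name is_indexable and the policy of skipping responses with no header are assumptions made purely for illustration.

```python
def is_indexable(content_type_header):
    """Return True if the response body looks like text worth indexing.

    Hypothetical helper, not part of ASPSeek: it only inspects the
    Content-Type header value, so it trusts whatever the server sends.
    """
    if not content_type_header:
        # Assumed policy: with no header at all, skip rather than guess.
        return False
    # Drop parameters such as "; charset=ISO-8859-1" and normalise case.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type.startswith("text/")


# The decision can only be made after the headers have arrived, which is
# why the URL has already been fetched (and recorded) by this point.
for header in ("text/html; charset=ISO-8859-1", "image/gif", None):
    print(repr(header), "->", is_indexable(header))
```

A server that mislabels an image as text/html would pass this check, which is exactly the case where image data could end up being indexed.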


> It does not index IMG SRC tags directly. 
> 
> 
> Matt.
> 
> > > 2. There are a lot of messages like:
> > >    mailto:somebody@somewhere
> > >    Unsupported protocol
> > > The same thing for javascript:
> > 
> > This is just a debugging message, it should not be considered as error.
> > 
> > > 3. The French stopwords list does not include "de" which is one of the most
> > > common words in French.
> > 
> > Ok, I will include it. Can somebody from France confirm it?
> > 
> > -- 
> > [EMAIL PROTECTED]  ICQ 7551596  Phone +7 903 6722750
> > Hard work may not kill you,  but why take chances?
> > --
> > 
> 
> 
> 
> 
