On Wed, Jan 20, 2010 at 8:11 PM, axi <axi...@gmail.com> wrote:
>
> If you use an image as a link, it is commonly understood that the image's
> alt text is equivalent to the anchor text of a text link. But if you put an
> image with alt text inside a link, Nutch records an empty anchor text for
> that link and the image's alt text is not counted.
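
To make the expected behaviour concrete, here is a minimal Python sketch of what "alt text counts as anchor text" could look like. It is not Nutch code and does not reflect how Nutch's HTML parser works; the class and function names (AnchorTextParser, extract_anchor_texts) are hypothetical and only illustrate the idea using the standard library.

```python
from html.parser import HTMLParser


class AnchorTextParser(HTMLParser):
    """Collect (href, anchor text) pairs, counting <img alt> as anchor text.

    Hypothetical illustration only; not how Nutch extracts outlinks.
    """

    def __init__(self):
        super().__init__()
        self.links = []     # finished (href, anchor text) pairs
        self._href = None   # href of the <a> currently open, if any
        self._parts = []    # text fragments seen inside the open <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self._href = attrs["href"]
            self._parts = []
        elif tag == "img" and self._href is not None:
            # An image inside a link contributes its alt text, if it has one.
            alt = attrs.get("alt")
            if alt:
                self._parts.append(alt)

    def handle_data(self, data):
        # Ordinary text inside the link is anchor text as usual.
        if self._href is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Join the fragments and collapse runs of whitespace.
            text = " ".join(" ".join(self._parts).split())
            self.links.append((self._href, text))
            self._href = None


def extract_anchor_texts(html):
    """Return a list of (href, anchor text) pairs found in the HTML string."""
    parser = AnchorTextParser()
    parser.feed(html)
    parser.close()
    return parser.links
```

With this sketch, a link that contains only an image with alt text ends up with that alt text as its anchor text instead of an empty string.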

Are you crawling for images? Or are they being skipped by a rule like this one in the default URL filter template?

http://svn.apache.org/repos/asf/lucene/nutch/trunk/conf/crawl-urlfilter.txt.template

# skip image and other suffixes we can't yet parse
-\.(gif|GIF|jpg|JPG|png|PNG|ico|ICO|css|sit|eps|wmf|zip|ppt|mpg|xls|gz|rpm|tgz|mov|MOV|exe|jpeg|JPEG|bmp|BMP)$
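
As a rough illustration of what that rule does, here is a small Python sketch. The leading "-" marks the pattern as an exclude rule; the sketch assumes the rest is applied as an unanchored regex search against the full URL, which is my understanding of the filter but is not verified against the Nutch source. The helper name is_skipped is made up for this example.

```python
import re

# Pattern copied from the rule above, without the leading "-" (the exclude marker).
SKIP_SUFFIXES = re.compile(
    r"\.(gif|GIF|jpg|JPG|png|PNG|ico|ICO|css|sit|eps|wmf|zip|ppt|mpg|xls|gz|rpm"
    r"|tgz|mov|MOV|exe|jpeg|JPEG|bmp|BMP)$"
)


def is_skipped(url):
    """Return True if the URL ends in one of the excluded suffixes.

    Assumption: the filter searches anywhere in the URL, so the trailing "$"
    is what ties the match to the end of the string.
    """
    return SKIP_SUFFIXES.search(url) is not None
```

Under that assumption, a URL such as http://example.com/logo.png would be dropped, while http://example.com/index.html would pass this rule. Note that this only affects whether the image URL itself is fetched; it says nothing about how alt text on the linking page is handled.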

>
> Nutch Newbie wrote:
>>
>> On Wed, Jan 20, 2010 at 4:16 PM, axi <axi...@gmail.com> wrote:
>>>
>>> After several tests, I have noticed that Nutch ignores the alt text of
>>> images inside <a> tags.
>  So this feature isn't implemented yet, right?
>>
>> What exactly do you want Nutch to do with the alt text? Index it?
>> Tokenize it? Make it available as a query field, e.g. "img_alt:my alt
>> tags"? Or something else?
>>
>>
>>>
>>>
>>> thanks in advance,
>>> --
>>> View this message in context:
>>> http://old.nabble.com/Alt-text-of-images-as-anchor-text-tp27244358p27244358.html
>>> Sent from the Nutch - Dev mailing list archive at Nabble.com.
>>>
>>>
>>
>>
>
> --
> View this message in context: 
> http://old.nabble.com/Alt-text-of-images-as-anchor-text-tp27244358p27247820.html
> Sent from the Nutch - Dev mailing list archive at Nabble.com.
>
>
