Brian Whitman wrote:
> Hi all,
> I looked into this a bit more after it crashed for the third time in a row.
> 
> Every time it has segfaulted, it has had this URL as one of the past few 
> fetches:
> 
> fetching http://www.c bs.nu/cgi-bin/ac/adcycle.cgi?gid=4&layout=multi&id=125
> 
> Note the space in there. This URL is not in my initial fetchlist so it 
> was found somewhere. Not sure if the space is actually a space or an 
> encoding -> terminal issue, either way I think this has something to do 
> with it. Does anyone know what happens when java/nutch gets a hostname 
> that is obviously malformed?

I believe it should throw a malformed URL exception.

Dennis Kubes
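
For anyone who wants to check the URL from the log outside of Nutch, here is a minimal sketch. It is not Nutch's actual code path, and UrlCheck and isWellFormed are hypothetical names made up for this example. It parses the string with java.net.URI, which rejects a literal space with a URISyntaxException. Whether java.net.URL itself throws MalformedURLException for a space in the host may depend on the JDK, so the sketch deliberately does not rely on that.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UrlCheck {

    // Hypothetical helper (not part of Nutch): returns true only if the
    // string parses as a URI and the parser could extract a host from it.
    static boolean isWellFormed(String spec) {
        try {
            URI uri = new URI(spec);
            // getHost() is null when the authority is not a valid
            // server-based authority (or is missing entirely).
            return uri.getHost() != null;
        } catch (URISyntaxException e) {
            // Illegal characters, such as a raw space, end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        // The URL as it appeared in the fetch log, space included.
        String fromLog = "http://www.c bs.nu/cgi-bin/ac/adcycle.cgi?gid=4&layout=multi&id=125";
        // The same URL with the space removed, for comparison.
        String noSpace = "http://www.cbs.nu/cgi-bin/ac/adcycle.cgi?gid=4&layout=multi&id=125";

        System.out.println("with space:    " + isWellFormed(fromLog));
        System.out.println("without space: " + isWellFormed(noSpace));
    }
}
```

A check like this only says whether the string is syntactically valid. It performs no DNS lookup, so it cannot by itself confirm or rule out a crash inside the native resolver.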
> 
> -Brian
> 
> On May 6, 2007, at 11:00 AM, Andrzej Bialecki wrote:
> 
>> Brian Whitman wrote:
>>> Got this segfault + crash when fetching in the middle of a large 
>>> fetch. Seems to be in looking up a hostname?
>>
>> Is this by any chance a FreeBSD machine of 4.x or 5.x vintage? There 
>> was a bug in FreeBSD's getaddrinfo, which would manifest in a very 
>> similar way when running multithreaded apps linked to libc_r or 
>> libpthread.
>>
>> -- 
>> Best regards,
>> Andrzej Bialecki     <><
>>  ___. ___ ___ ___ _ _   __________________________________
>> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
>> ___|||__||  \|  ||  |  Embedded Unix, System Integration
>> http://www.sigram.com  Contact: info at sigram dot com
>>
> 

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
