After analyzing this a bit more, I think there is a bug here. The
indexer doesn't like hitting any 404 URLs in the file: protocol setup.
Here's what I have so far. This is the trace I had before:
> > #0 0xff036f3c in strlen () from /usr/lib/libc.so.1
> > (gdb) bt
> > #0 0xff036f3c in strlen () from /usr/lib/libc.so.1
> > #1 0xff081b68 in _doprnt () from /usr/lib/libc.so.1
> > #2 0xff083bdc in vsnprintf () from /usr/lib/libc.so.1
> > #3 0x23a58 in UdmLog (Agent=0xffbd5d98, level=4,
> > fmt=0x4cf80 "Status: %d %s, type: %s, Size: %d") at log.c:164
> > #4 0x18000 in UdmIndexURL (Indexer=0x1680538, Doc=0x16a1bd0,
> > index_flags=23608904) at indexer.c:879
> > #5 0x18ee4 in UdmIndexNextURL (Indexer=0x1680538, index_flags=4)
> > at indexer.c:1240
> > #6 0x152ec in thread_main (arg=0x1680538) at main.c:252
> > #7 0x15edc in main (argc=0, argv=0xffbefc04) at main.c:634
> > (gdb)
I examined the Doc object at frame 4 to check out the URL that was causing
problems. My first guess is that one of the arguments to this format
string is either empty or not set properly:
"Status: %d %s, type: %s, Size: %d"
I haven't been able to find the value of the second argument (the first
%s), but I verified that the third argument (the second %s, the content
type) was NULL. This is what is causing my seg fault and core dump. Also,
for some weird reason, the size field was not zero as expected; it was set
to 26. The only string I found that was anywhere close to 26 characters was
"HTTP/1.0 404 Not found" (22 characters), which would normally be followed
by \r\n, which brings the length to 24. With an extra \n it would be 25,
and the \0 would make it 26, I guess.
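To illustrate the kind of guard I have in mind, here is a minimal sketch. It is not the actual code from log.c or indexer.c: safe_str() and format_status() are hypothetical names I made up for this example, and the "(null)" placeholder is just an assumption about what we would want to log. The idea is only that every string argument is checked before it reaches a printf-style function, since passing NULL for %s is undefined behavior and on this box ends up in strlen().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return a printable placeholder instead of NULL,
 * because passing NULL for a %s conversion is undefined behavior. */
static const char *safe_str(const char *s)
{
    return s ? s : "(null)";
}

/* Hypothetical stand-in for the failing log call. It formats the same
 * message as the one in the backtrace, but guards both string arguments.
 * Returns whatever snprintf() returns. */
static int format_status(char *buf, size_t len, int status,
                         const char *reason, const char *content_type,
                         int size)
{
    assert(buf != NULL);
    return snprintf(buf, len, "Status: %d %s, type: %s, Size: %d",
                    status, safe_str(reason), safe_str(content_type), size);
}
```

With something like this, a call such as format_status(buf, sizeof buf, 404, "Not found", NULL, 26) should produce a normal log line with a placeholder for the missing content type instead of dumping core. Whether the real fix belongs in the logging call or in whatever leaves the content type unset for file: URLs is still an open question.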
I'll add a quick hack to fix this for my code, and see if I can write a
nice patch for this. If possible, can someone verify this problem as well?
Just add a random file:/ link to an html document and try to index it.
Thanks,
- Danish
--
On Fri, 11 Jan 2002, Danish Qadri wrote:
>
> Actually :),
>
>
> I wasn't sure if the "file:" protocol URLs needed to be encoded....
> Apparently they do!
>
>
> - Danish
>
> On Fri, 11 Jan 2002, Danish Qadri wrote:
>
> >
> >
> > Hi team,
> >
> >
> > I got a seg fault while doing my initial indexing of my MySQL DB. With our
> > setup, we store a filename in the database, along with some other
> > information. Here's the output from the logs and a core dump back trace.
> > It seems to be dumping core whenever it encounters the file. From the
> > looks of it, the UdmLog function didn't like the spaces in the filename
> > perhaps?
> >
> > I'll go try out a few things to see if I can isolate the problem.
> >
> > - Danish
> >
> > [1] stop: 'on'-''
> > [1] stop: 'new'-'en'
> > [1] stop: 'of'-''
> > [1] stop: 'on'-''
> > [1] stop: 'and'-''
> > [1] stop: 's'-'en'
> > [1] stop: 'last'-'en'
> > [1] stop: 'with'-''
> > [1] stop: 'with'-''
> > [1] stop: 'h'-'en'
> > [1] Link http://www.xlhtml.org/
> > [1] Allow by default
> > [1] URL: file:/h/intranet/files/10/Wise Track Manual Brief.doc
> > [1] Server ''
> > [1] Allow NoCase file:*
> > Segmentation Fault (core dumped)
> > root:shadow [21:23:05] $
> >
> > #0 0xff036f3c in strlen () from /usr/lib/libc.so.1
> > (gdb) bt
> > #0 0xff036f3c in strlen () from /usr/lib/libc.so.1
> > #1 0xff081b68 in _doprnt () from /usr/lib/libc.so.1
> > #2 0xff083bdc in vsnprintf () from /usr/lib/libc.so.1
> > #3 0x23a58 in UdmLog (Agent=0xffbd5d98, level=4,
> > fmt=0x4cf80 "Status: %d %s, type: %s, Size: %d") at log.c:164
> > #4 0x18000 in UdmIndexURL (Indexer=0x1680538, Doc=0x16a1bd0,
> > index_flags=23608904) at indexer.c:879
> > #5 0x18ee4 in UdmIndexNextURL (Indexer=0x1680538, index_flags=4)
> > at indexer.c:1240
> > #6 0x152ec in thread_main (arg=0x1680538) at main.c:252
> > #7 0x15edc in main (argc=0, argv=0xffbefc04) at main.c:634
> > (gdb)
> >
> >
>
>
--
Danish Qadri
Systems Programmer
Globix Corporation
[EMAIL PROTECTED]