On Wednesday 26 October 2005 10:04 pm, boblq wrote:
> On Wednesday 26 October 2005 12:10 pm, Stewart Stremler wrote:
> > begin  quoting boblq as of Wed, Oct 26, 2005 at 11:49:22AM -0700:
> > > On Tuesday 25 October 2005 08:59 am, Stewart Stremler wrote:
> > > > Writing your own XML parser that tries to put out meaningful error
> > > > messages is (a) seen as a waste of time as you're writing a redundant
> > > > parser and (b) is apt to be buggy and error-prone itself, making it
> > > > worse than what you have to deal with already.
> > >
> > > UH, duh. Isn't that one of the reasons why Open Source exists?
> >
> > Depends on who you are. Most folk want open-source because it results in
> > all software being (essentially) free-as-in-beer.
> >
> > > There are pretty decent parsers out there already, e.g.
> > >
> > > expat  http://expat.sourceforge.net/
> > >
> > > SAX  http://www.saxproject.org/
> > >
> > > You could contribute to these projects by improving the
> > > error reporting ...
> >
> > I haven't looked at these,
>
> > > Why would you need to write your own XML parser?
> >
> > Often, good error reporting isn't something that can be bolted on to a
> > system afterwards.
>
> How do you know about these when you have not looked at them ...
> often? Maybe just once you should look at  the code instead of
> blindly citing your prejudices.
>
> Too much to ask I guess.
>
> BobLQ

Ok, I downloaded expat. Did the usual 
./configure
make
make install 

That took about 10 minutes. 

The libs are in /usr/local/lib 
where one would expect them to be. 

I did not compile and run the examples but a glance
at the source code suggests they would likely work.
They certainly look easy enough to understand. 

Looking in the docs I find these functions, which look
like a reasonable set of hooks to me, on which you can
build whatever you want ...
------------------------------------------------------------
Parse position and error reporting functions

These are the functions you'll want to call when 
the parse functions return XML_STATUS_ERROR 
(a parse error has occurred), although the position 
reporting functions are useful outside of errors. The 
position reported is the byte position (in the original 
document or entity encoding) of the first of the 
sequence of characters that generated the current 
event (or the error that caused the parse functions 
to return XML_STATUS_ERROR.) The exceptions are 
callbacks triggered by declarations in the document 
prologue, in which case the exact position reported 
is somewhere in the relevant markup, but not necessarily 
as meaningful as for other events.


The position reporting functions are accurate only 
outside of the DTD. In other words, they usually return 
bogus information when called from within a DTD declaration 
handler.

enum XML_Error XMLCALL
XML_GetErrorCode(XML_Parser p);

Return what type of error has occurred.

const XML_LChar * XMLCALL
XML_ErrorString(enum XML_Error code);

Return a string describing the error corresponding to code.
The code should be one of the enums that can be returned
from XML_GetErrorCode.

long XMLCALL
XML_GetCurrentByteIndex(XML_Parser p);

Return the byte offset of the position. This always corresponds
to the values returned by XML_GetCurrentLineNumber and
XML_GetCurrentColumnNumber.

int XMLCALL
XML_GetCurrentLineNumber(XML_Parser p);

Return the line number of the position. The first line is reported as 1.

int XMLCALL
XML_GetCurrentColumnNumber(XML_Parser p);

Return the offset, from the beginning of the current line,
of the position.

int XMLCALL
XML_GetCurrentByteCount(XML_Parser p);

Return the number of bytes in the current event. Returns 0 if the
event is inside a reference to an internal entity and for the end-tag
event for empty element tags (the latter can be used to distinguish
empty-element tags from empty elements using separate start and
end tags).

const char * XMLCALL
XML_GetInputContext(XML_Parser p,
                    int *offset,
                    int *size);

Returns the parser's input buffer, sets the integer pointed at by
offset to the offset within this buffer of the current parse position,
and sets the integer pointed at by size to the size of the returned buffer.

This should only be called from within a handler during an active parse 
and the returned buffer should only be referred to from within the handler 
that made the call. This input buffer contains the untranslated bytes of 
the input.

Only a limited amount of context is kept, so if the event triggering a 
call spans over a very large amount of input, the actual parse position 
may be before the beginning of the buffer. 
----------------------------------------------------------------------------------

Just my take after say thirty minutes. 

I did use expat four or five years ago on a project and
it worked fine for me then. That was back when James Clark
first released it. I have no idea how much it has evolved 
since then, but certainly a lot of people seem to use it
without much complaint. 

So it goes,

BobLQ



-- 
[email protected]
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-lpsg