On Wednesday, December 19, 2012, thomasg wrote:

> On Wed, Dec 19, 2012 at 4:38 AM, Gustavo Sverzut Barbieri <
> barbi...@profusion.mobi> wrote:
>
> > Hi Thomas,
> >
> > The standard way is pretty fast and lean, but it is a SAX-like parser.
> > That means you only get tokens; for the tags you need to call yet another
> > function to split the tag and arguments.
> >
> > It is good enough to parse svg, as done by Esvg. Should be also enough to
> > parse config files and your chat.xml
> >
> > There is also a version that creates nodes from XML. It's useful for
> > debugging and for simple cases without performance worries. As you will
> > very likely store your parsed data in a custom structure rather than a
> > generic "DOM", I recommend using the SAX version.
> >
> > I didn't try the example with your XML, but seems to be okay. The example
> > could use eina_strbuf instead of array of strings, but that's marginal.
> > Also could use the size and avoid strncmp(), but also marginal for an
> > example.
> >
> > What is exactly failing?
> >
>
> As you can see, the tags are totally wrong.
> They are neither correctly aligned (a <foo> can be closed with </bar> and
> not just </foo>), nor do the items correspond with the tags.
> So if the input is not 100% like the parser expects it, say there's an
> additional level, the parser won't fail but just receive totally wrong
> data.
> If I want to make sure that I get the data from tag <baz>DATA</baz>, I have
> to manually compare the string and it seems that I might as well just parse
> it myself altogether.


That is always the case with SAX. It lets you handle errors yourself
(abort, auto-fix, etc.), which is what you need when parsing the bogus HTML
that is common on the Internet.

I don't recall how strict I was with the tree/node version. I guess that, to
make it usable by Evas textblock, you can close tags with </>, but I'm not
sure what it would do if you specify an incorrect close tag. Anyway, for a
final version I'd recommend avoiding the intermediate node tree and using SAX
directly; then you get more efficient data structures.

Also consider always using the size. The original buffer is not modified,
so the strings will not be null-terminated.
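
To illustrate the point about sizes, here is a minimal sketch in plain C
(no Eina calls). It assumes your callback is handed a pointer into the
original buffer plus a length; token_is() and tag_name_length() are
hypothetical helper names made up for this mail, not part of any library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true only if the token (tok, len) is exactly `name`.
 * Never reads past tok[len - 1], so tok need not be null-terminated. */
static bool
token_is(const char *tok, size_t len, const char *name)
{
   size_t name_len = strlen(name);

   return (len == name_len) && (memcmp(tok, name, len) == 0);
}

/* Hypothetical helper: length of the tag name at the start of an open-tag
 * token, i.e. everything up to the first whitespace or '/' (or the end). */
static size_t
tag_name_length(const char *tok, size_t len)
{
   size_t i = 0;

   while (i < len && tok[i] != ' ' && tok[i] != '\t' &&
          tok[i] != '\n' && tok[i] != '\r' && tok[i] != '/')
     i++;
   return i;
}
```

Comparing the lengths first means a token "ba" or "bazz" can never match
"baz", which is the usual trap with strncmp() on such buffers.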

Usually a SAX-based parser keeps a stack, and you can validate against
that. But only validate if the data is untrusted. The same goes for
attributes: you only pay the price if you expect them for a given tag. IOW
it can be very efficient.
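
A rough sketch of such a stack, again in plain C and with made-up names
(Tag_Stack, tag_stack_push(), tag_stack_pop(), and the fixed depth of 32
are all assumptions for illustration). You would push from the "open tag"
branch of your callback and pop from the "close tag" branch; the stack
stores pointers into the original buffer together with their lengths, so
nothing is copied.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TAG_STACK_MAX 32 /* arbitrary nesting limit for this sketch */

typedef struct
{
   const char *name[TAG_STACK_MAX]; /* points into the parsed buffer */
   size_t      len[TAG_STACK_MAX];  /* names are not null-terminated */
   size_t      depth;
} Tag_Stack;

/* Record an open tag. Fails if the document nests too deeply. */
static bool
tag_stack_push(Tag_Stack *s, const char *name, size_t len)
{
   if (s->depth >= TAG_STACK_MAX) return false;
   s->name[s->depth] = name;
   s->len[s->depth] = len;
   s->depth++;
   return true;
}

/* Check a close tag against the innermost open tag. Returns false, and
 * leaves the stack untouched, on a mismatch or a stray close tag. */
static bool
tag_stack_pop(Tag_Stack *s, const char *name, size_t len)
{
   size_t top;

   if (s->depth == 0) return false;
   top = s->depth - 1;
   if (s->len[top] != len || memcmp(s->name[top], name, len) != 0)
     return false;
   s->depth--;
   return true;
}
```

With this, a <foo> closed by </bar> makes tag_stack_pop() return false, and
the caller decides whether to abort or try to recover.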

The added benefit of using it over manual parsing is that it handles
whitespace and also does minimal tag boundary matching: if a > is missing,
etc., it will emit errors.




>
>
> >
> > On Wednesday, December 19, 2012, thomasg wrote:
> >
> > > Hi everyone,
> > >
> > > I was just looking at Eina Simple XML which, at first sight, seemed a
> > > nice tiny XML library.
> > > However, after looking closer, it seems that it is only useful to create
> > > basic XML files, but NOT to read/parse them.
> > >
> > > I used the eina_simple_xml_parse function and realized that this
> > > basically is it: every single step of parsing has to be done manually and
> > > it basically makes no difference if eina_simple_xml is used or not at all.
> > > I then took a look at the example parser in eina_simple_xml_parser_01.c
> > > and realized that, for the same reason, this is an extremely poor parser,
> > > basically worthless (no offense intended).
> > > Actually it is so poor, it is not even a simple XML parser because all
> > > it does is check if the input looks somewhat similar to XML.
> > >
> > > I realize that this is not meant to be a full-featured parser or even
> > > a basic parser, but seeing as it is hardly a parser at all, I can't see
> > > the point of having it (as an example).
> > >
> > > On the other hand, simple xml does have the concept of nodes using eina
> > > inlists and such, but they seem to be usable only for creating xml, not
> > > reading it.
> > >
> > > So my question is: Am I missing something here?
> > >
> > > Here's a modified/broken chat.xml file to be parsed by the example code
> > > to show how poorly it does: http://bpaste.net/show/65296/
> > > If there's no better way to do it, I'd suggest making this explicit in
> > > the docs/examples and/or removing the example.
> > >
> > > Regards
> > >
> > > --
> > > thomasg
> >
>
> ------------------------------------------------------------------------------
> LogMeIn Rescue: Anywhere, Anytime Remote support for IT. Free Trial
> Remotely access PCs and mobile devices and provide instant support
> Improve your efficiency, and focus on delivering more value-add services
> Discover what IT Professionals Know. Rescue delivers
> http://p.sf.net/sfu/logmein_12329d2d
> _______________________________________________
> enlightenment-devel mailing list
> enlightenment-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
>


-- 
Gustavo Sverzut Barbieri
http://profusion.mobi embedded systems
--------------------------------------
MSN: barbi...@gmail.com
Skype: gsbarbieri
Mobile: +55 (19) 9225-2202