On Sun, 27 Feb 2011 20:25:06 -0300 Gustavo Sverzut Barbieri
<barbi...@profusion.mobi> said:

well i'll respond to the original post here after having read this entire
thread.

i've actually looked at the parser code gustavo posted - i suspect only a few
people did and they just saw "xml" and "eina" and jumped :). let me be blunt -
i am no fan of xml. i despise it like being smeared with fish poop first thing
on a sunday morning. (some people might be into that... and some people are
also into xml... not sure if there's a link...)

anyway... overall there is something nice about the simplicity of this parser.
it avoids allocating anything itself - it leaves you to do so in the parse
callback. my first reaction was "oh no xml.. not in eina" but having had a read
and some time to think... i am thinking - we have to parse xml for efreet
already. we have it forced on us already. moving efreet to use this eina parser
should make life easier and help share a parser for more purposes, so i'd be up
for this going into eina on the condition that efreet uses this parser instead
of its own and any missing features it needs are moved into ssxmlp. but.. with
2 extra things (not having looked at efreet's needs)...

1. the ability to quit the current decode and report WHICH byte it quit at
to allow multi-pass parsing (not an all-or-nothing pass or fail). sure u can do
this outside via a global var etc. but it'd be nicer if it were a return from
eina_simple_xml_parse() that said what byte it got up to (if return == buflen
then it got all the way to the end with no quits).

2. i don't like the fact that it doesn't handle encoding. it should at least
auto-parse the <?xml ... encoding="xxxx"?> declaration for you - and by this i
mean still hand it to you to parse via the tag handling callback, BUT ALSO have
an internal handler that parses this, stores the encoding and then provides an
"encoding decoder" (if the encoding is not ascii) that always returns utf8 text
from whatever encoding the document has - and... well.. also handles the &amp;
etc. escapes too (optionally). that's really my only gripe: it should handle
this so a simple parser can just, in its Eina_Simple_XML_Cb callback, take
content and say "hey - decode this baby given the document's encoding to
standard utf8 for me - k.tnx.bi".
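
To make point 1 concrete, here is a rough sketch of a parse call that returns
the byte offset it stopped at. This is not the posted parser and not the eina
API: sketch_xml_parse(), sketch_cb, sketch_type and stop_at_tag() are
hypothetical names made up for illustration, and the tokenizer only knows
"<...>" tags and the data between them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; a real parser distinguishes more of them. */
typedef enum { SKETCH_TAG, SKETCH_DATA } sketch_type;

/* Return false from the callback to stop parsing at this token. */
typedef bool (*sketch_cb)(void *data, sketch_type type,
                          const char *content, size_t offset, size_t length);

/*
 * Tokenize buf[0..len) into "<...>" tags and the data between them.
 * Returns the number of bytes consumed. If the callback returns false,
 * the offset of the refused token is returned so a later pass can resume
 * from exactly that byte. A return value equal to len means the whole
 * buffer was parsed with no quits.
 */
static size_t
sketch_xml_parse(const char *buf, size_t len, sketch_cb cb, void *data)
{
   size_t pos = 0;

   assert(buf && cb);
   while (pos < len)
     {
        size_t start = pos, end;
        sketch_type type;

        if (buf[pos] == '<')
          {
             const char *gt = memchr(buf + pos, '>', len - pos);
             if (!gt) return start; /* unterminated tag: stop here */
             end = (size_t)(gt - buf) + 1;
             type = SKETCH_TAG;
          }
        else
          {
             const char *lt = memchr(buf + pos, '<', len - pos);
             end = lt ? (size_t)(lt - buf) : len;
             type = SKETCH_DATA;
          }

        if (!cb(data, type, buf + start, start, end - start))
          return start;
        pos = end;
     }
   return len;
}

/* Example callback: quit when the tag text passed in data is reached. */
static bool
stop_at_tag(void *data, sketch_type type, const char *content,
            size_t offset, size_t length)
{
   const char *wanted = data;

   (void)offset;
   if (type != SKETCH_TAG) return true;
   return !(length == strlen(wanted) && !memcmp(content, wanted, length));
}
```

A caller would compare the return value against the buffer length: if it is
smaller, the parse quit early and a second pass (possibly with a different
callback) can start again from buf + offset, with no global variable needed.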

so that's my "comment" on the RFC.
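
For the decoder asked for in point 2, a minimal sketch of what such a helper
might do, under loud assumptions: the document encoding is taken to be
ISO-8859-1 (a real helper would use whatever the xml declaration says,
probably via something like iconv), only the five entities predefined by XML
are handled, and sketch_decode_latin1() is a hypothetical name, not part of
the posted parser or of eina.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The five entities predefined by XML itself. */
static const struct { const char *name; char ch; } sketch_entities[] = {
   { "&amp;", '&' }, { "&lt;", '<' }, { "&gt;", '>' },
   { "&quot;", '"' }, { "&apos;", '\'' }
};

/*
 * Decode len bytes of ISO-8859-1 text from src into dst as NUL-terminated
 * UTF-8, replacing the predefined entities on the way. Unknown entities
 * are copied through untouched. Returns the number of bytes written
 * (excluding the NUL), or (size_t)-1 if dst is too small.
 */
static size_t
sketch_decode_latin1(const char *src, size_t len, char *dst, size_t dstsize)
{
   const size_t n = sizeof(sketch_entities) / sizeof(sketch_entities[0]);
   size_t in = 0, out = 0, i;

   assert(src && dst);
   while (in < len)
     {
        unsigned char c = (unsigned char)src[in];

        if (c == '&')
          {
             for (i = 0; i < n; i++)
               {
                  size_t elen = strlen(sketch_entities[i].name);
                  if (len - in >= elen &&
                      !memcmp(src + in, sketch_entities[i].name, elen))
                    {
                       c = (unsigned char)sketch_entities[i].ch;
                       in += elen - 1; /* last byte is skipped below */
                       break;
                    }
               }
          }
        in++;

        if (c < 0x80)
          {
             /* plain ascii: one byte, keep room for the NUL */
             if (out + 1 >= dstsize) return (size_t)-1;
             dst[out++] = (char)c;
          }
        else
          {
             /* latin-1 0x80..0xFF becomes a two-byte UTF-8 sequence */
             if (out + 2 >= dstsize) return (size_t)-1;
             dst[out++] = (char)(0xC0 | (c >> 6));
             dst[out++] = (char)(0x80 | (c & 0x3F));
          }
     }
   if (out >= dstsize) return (size_t)-1;
   dst[out] = '\0';
   return out;
}
```

The caller owns the output buffer, so the parser itself still allocates
nothing; the callback decides if and when to decode.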

comments? (really.. am i asking much? :) )

> Hi all,
> 
> Find attached a test file that implements a very simple and stupid XML
> parser for Eina. It's far from complete or standards compliant, yet it
> should do fine for traditional xml files we have in our systems,
> things such as FreeDesktop or various configuration files.
> 
> It ships with a SAX-like parser that does not allocate any memory and
> depends on nothing other than strings.h and ctype.h. This parser will
> tokenize the given buffer and hand you pointers/offset/size in
> the originating buffer, with some type hint to make your life easier
> (is it an open tag? close? processing? doctype? data? cdata? comment?
> error), and will also strip whitespace. Your function can then choose to
> abort the parsing at any moment by returning FALSE.
>    Tag attributes are not handled in this phase, you can use other
> functions to parse them if you wish. These also do not allocate memory
> (they do a bit on the stack with alloca for ease of use of the api).
> 
> Most people that want to parse configuration files or some definitions
> (ie: xkbd mappings) can use the SAX directly.
> 
> If you like, there is a basic node-tree builder that should be
> efficient (nodes allocated from mempool, inlists to avoid
> fragmentation, data nodes have inlined string contents, most strings
> are stringshared). This also accepts user-created nodes and can then
> be serialized into a buffer. Use this if you need to load-modify-save
> files.
> 
> Again, this is far from being as standards compliant as libxml2, but
> it's also far from the bloat these xml libs carry. I'd propose it to be
> integrated into Eina as it's very useful and small.
> 
> The idea came from quaker66 that is doing xkbd/language module and
> would like to avoid libxml2... he cited using efreet, but that would
> be nonsense and efreet does not expose its parser... actually efreet
> could be converted to use this new parser I'm proposing.
> 
> BR,
> 
> -- 
> Gustavo Sverzut Barbieri
> http://profusion.mobi embedded systems
> --------------------------------------
> MSN: barbi...@gmail.com
> Skype: gsbarbieri
> Mobile: +55 (19) 9225-2202


-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    ras...@rasterman.com


_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
