Andy Wardley wrote:
Robin Berjon wrote:
I just have yet to see someone point at one place where Perl 5 hinders XML processing in such a way that Perl 6 could help.
(...)
So instead of writing Perl programs to parse and manipulate XML, it should be possible to modify Perl itself so that it parses the XML directly
into some internal form suitable for programmatical manipulation.
(...)
How exactly this will manifest itself, I cannot tell. Nor can I say if this
is actually a sensible thing to do or not. But unless my understanding is
warped, support for parsing XML and other markup languages could be moved
down into the core of the parser internals for Perl 6.


For example, it might be possible to do something like this:

use Perl6::XML;

    <thingy>
        <blah>blah blah</blah>
    </thingy>

use Perl6;

print $thingy.blah;

What you point to in terms both of difficulties with the existing approaches and in terms of solutions makes a *lot* of sense. I'm afraid however that some form of cold is preventing you from smelling the sulfurous fumes emanating from dragons hiding right around the corner :)


I'll leave aside the excellent idea of allowing one to embed XML data into Perl source as you describe it (a nice replacement for __DATA__ for sure) to focus on the rest because if we can do that with external XML documents, the part about inlining XML becomes trivial.

The basic problem is that to produce a data structure you can either know something of the kind of XML you're to be using or you can do it in a generic manner.

The generic manner is simple; in fact it's called XML::Simple. It's great at what it does, but you get a data structure which you need to discover, and in many cases you probably want something where you have to pay less attention to whether something is a string or a hashref. Ask Nat[0] ;)

The vocabulary specific manner is more complex, because you need something external to the XML to describe how the mapping operates. In your example if I were to add a <blah> element, all of a sudden $thingy.blah might be an array with the two contents. Things get hairy fast without even using anything crufty, especially when you add attribute parsing, namespaces, in-document links...
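To make the cardinality problem concrete, here is a minimal sketch of a naive generic mapping. It is written in Python with only the standard library so that it can be run as-is; it is not XML::Simple, and the `generic` function is a hypothetical helper invented for this illustration. The point is only that the shape of the result depends on the document rather than on the vocabulary:

```python
import xml.etree.ElementTree as ET

def generic(elem):
    """Naive generic mapping (hypothetical helper, not XML::Simple):
    a leaf becomes a string, repeated children become a list."""
    children = list(elem)
    if not children:
        return elem.text or ""
    out = {}
    for child in children:
        value = generic(child)
        if child.tag in out:
            # Second occurrence: promote the existing value to a list.
            if not isinstance(out[child.tag], list):
                out[child.tag] = [out[child.tag]]
            out[child.tag].append(value)
        else:
            out[child.tag] = value
    return out

one = generic(ET.fromstring("<thingy><blah>blah blah</blah></thingy>"))
two = generic(ET.fromstring(
    "<thingy><blah>blah blah</blah><blah>more blah</blah></thingy>"))

print(type(one["blah"]).__name__)  # a plain string for a single <blah>
print(type(two["blah"]).__name__)  # a list once a second <blah> appears
```

Code that reads `one["blah"]` as a string breaks as soon as the document gains a second `<blah>`, which is exactly why something outside the XML has to say what the mapping is.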

The data binding folks have tried to address the problem using XML Schema, and the result is, hmmm, "unpleasant", to use a polite word. The SOAP and WSDL people have been at it, and I won't even describe the result because I couldn't possibly be polite about it.

Imho a grammar-based approach would likely be too low-level. I'm currently betting on something that would mix XBind[1] and Regular Fragmentations[2]. The first one defines simple mappings as described above, the second tells you how to parse data in XML documents that has structure not expressed in XML (eg <date>2003-03-26</date>) so that it is seen in a structured way, without the need for typing.
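The second half of that idea, exposing structure that lives inside text content, can be sketched roughly as follows. This is Python with the standard library only, and it does not use the actual Regular Fragmentations syntax: the `FRAGMENT_RULES` table and the `fragment` function are assumptions made up for the illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rule table: element name -> regex with named groups.
# This is NOT the Regular Fragmentations format, only the same idea.
FRAGMENT_RULES = {
    "date": re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"),
}

def fragment(elem):
    """Return the element's text split into named parts when a rule
    matches, otherwise the text unchanged."""
    text = (elem.text or "").strip()
    rule = FRAGMENT_RULES.get(elem.tag)
    if rule is None:
        return text
    match = rule.fullmatch(text)
    return match.groupdict() if match else text

parts = fragment(ET.fromstring("<date>2003-03-26</date>"))
print(parts["year"], parts["month"], parts["day"])
```

Note that the parts stay strings: the rule only makes the structure visible and assigns no types, which matches the "without the need for typing" point above.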

These approaches are elegant, and have the advantage of being truly cross-language so that we can let the Python people write the descriptions and use them directly :)

One very cool thing that could be done in Perl 6 would be to take an XBind+RegFrag document and generate a grammar derived from the P6 XML grammar that would 1) be specific to the vocabulary (and thus hopefully faster than a generic XML grammar, though I don't have /too/ much hope) and 2) directly produce the object representation you want and return it in the parse object.
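As a very rough stand-in for that idea, the sketch below builds a vocabulary-specific parser from a small description. It is Python rather than a Perl 6 grammar, it returns a closure instead of generating a grammar, and the description format (`THINGY_SPEC`, with "one"/"many" cardinalities) is a hypothetical one invented here, not XBind:

```python
import xml.etree.ElementTree as ET

# Hypothetical description: child element -> (field name, "one" | "many").
THINGY_SPEC = {"blah": ("blah", "many"), "title": ("title", "one")}

def make_builder(root_tag, spec):
    """Return a parser specialised for one vocabulary."""
    def build(xml_text):
        root = ET.fromstring(xml_text)
        if root.tag != root_tag:
            raise ValueError(f"expected <{root_tag}>, got <{root.tag}>")
        # "many" fields always start as lists, "one" fields as None,
        # so the result has the same shape for every document.
        obj = {name: ([] if card == "many" else None)
               for name, card in spec.values()}
        for child in root:
            if child.tag not in spec:
                continue  # ignore elements the description does not mention
            name, card = spec[child.tag]
            text = child.text or ""
            if card == "many":
                obj[name].append(text)
            else:
                obj[name] = text
        return obj
    return build

parse_thingy = make_builder("thingy", THINGY_SPEC)
single = parse_thingy("<thingy><blah>blah blah</blah></thingy>")
double = parse_thingy(
    "<thingy><blah>blah blah</blah><blah>more blah</blah></thingy>")
print(single["blah"], double["blah"])
```

Because the description declares `blah` as repeatable, it is a list whether the document holds one `<blah>` or several, so the caller never has to check what it got back.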

This is all speculation and hand-waving, of course.  But the point is that
Perl 6's extended parsing capabilities could well provide a much greater
level of integration between Perl, XML and various other programming and
markup languages.

Yes certainly, but again we could already go much farther than we are today using Perl 5 (and a lot of tuits).


My rant against the XML machine was really an aside.  Take everything I say
with a pinch of salt.  :-)

I might have overreacted slightly because I'm tired of the xmlHorribleKludges obscuring the coolness that nice and helpful people work hard on. I can't blame anyone for not seeing through the blazing storm of hypish PR...



[0] http://use.perl.org/~gnat/journal/11081
[1] http://www.prescod.net/xml/xbind/
[2] http://www.simonstl.com/projects/fragment/

--
Robin Berjon <[EMAIL PROTECTED]>
Research Engineer, Expway        http://expway.fr/
7FC0 6F5F D864 EFB8 08CE  8E74 58E6 D5DB 4889 2488


