I find myself frobbing trees a lot these days: read in some XML,
wander around in tree-land for a while, then output either more XML
or somesuch. And, quite frankly, it's a bit of a pain.

The issue, as I see it, is that Perl has no "power tools" for dealing
with trees. I will admit that I don't know what these should look
like, but if Perl has them, it's news to me. Here's an example:

Let's say that I've got a daemon which is running ps(1) on a regular
basis and logging the results. A brute force approach would be to
save the raw ASCII output, but these days I'm trying to use XML. So,
I write out the output as (informal) XML:

  <log>
    <ps time=123456789>
      <process>
        <pid>123</>
        <pcpu>4.6</>
        <stat>SN+</>
        ...
      </process>
    </ps>
    ...
  </log>

A bit bulky, but nicely tagged and serialized. Now, I want to do
something with it. OK, the first thing I do is read it in as a tree.
I use my own SAX handler, because I want a pure Perl way to load in
a tree, preserving order. It loads in something like this:

  [ 'log', {},
    [ 'ps', { time => 123456789 },
      [ 'process', {},
        [ 'pid',  {}, '123' ],
        [ 'pcpu', {}, '4.6' ],
        [ 'stat', {}, 'SN+' ],
        ...
      ],
    ],
    ...
  ]

The problem is that, although the data structure I've loaded in is a
tree, I generally want to use it as something else. For example, let's
say that I want to "boil down" these log files a bit. This means I
have to pick up the static values (e.g., pid), tally the distribution
of the flag values (e.g., stat), and average the numeric snapshots, as:

  foreach $time (sort(keys(%ps))) {
    $pid    = $ps{$time}{pid} unless defined($pid);
    $pcpu  += $ps{$time}{pcpu};
    $stat{$ps{$time}{stat}}++;
    ...
  }

My approach to this, currently, is to walk the tree, creating the data
structure I'd _like_ to have, before I try to do the actual work. This
isn't TOO painful, but it isn't the sort of DWIMitude I'd like to see.
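
For concreteness, a minimal sketch of that tree walk might look like the
following. It assumes the [ name, \%attrs, @children ] node shape shown
above and exactly one <process> per <ps> snapshot; the helpers kids()
and text() and the second snapshot's values are made up for illustration.

```perl
use strict;
use warnings;

# Sample tree in the [ name, \%attrs, @children ] shape shown earlier.
# Two snapshots of one process; the second one is invented sample data.
my $log = [ 'log', {},
    [ 'ps', { time => 123456789 },
        [ 'process', {},
            [ 'pid',  {}, '123' ],
            [ 'pcpu', {}, '4.6' ],
            [ 'stat', {}, 'SN+' ],
        ],
    ],
    [ 'ps', { time => 123456799 },
        [ 'process', {},
            [ 'pid',  {}, '123' ],
            [ 'pcpu', {}, '5.4' ],
            [ 'stat', {}, 'SN+' ],
        ],
    ],
];

# Hypothetical helper: child elements of a node (text children skipped).
sub kids {
    my ($node) = @_;
    my @children = @{$node}[ 2 .. $#$node ];
    return grep { ref $_ eq 'ARRAY' } @children;
}

# Hypothetical helper: concatenated text children of a node.
sub text {
    my ($node) = @_;
    my @children = @{$node}[ 2 .. $#$node ];
    return join '', grep { !ref $_ } @children;
}

# Flatten the tree into $ps{$time}{$field}, assuming one <process>
# per <ps> snapshot.
my %ps;
for my $snap (grep { $_->[0] eq 'ps' } kids($log)) {
    my $time = $snap->[1]{time};
    for my $proc (grep { $_->[0] eq 'process' } kids($snap)) {
        $ps{$time}{ $_->[0] } = text($_) for kids($proc);
    }
}

# The "boil down" loop can then run against %ps.
my ($pid, $pcpu, %stat);
for my $time (sort keys %ps) {
    $pid = $ps{$time}{pid} unless defined $pid;
    $pcpu += $ps{$time}{pcpu};
    $stat{ $ps{$time}{stat} }++;
}
printf "pid %s: mean pcpu %.1f over %d snapshots\n",
    $pid, $pcpu / scalar(keys %ps), scalar(keys %ps);
```

Note that roughly two thirds of this is plumbing to get from the tree to
the hash; the actual work is the last loop.
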
More to the point, let's say that I simply want to transform the data
into a different order. In a multiply subscripted array, this is just
a matter of swapping subscripts on the output loop(s). Turning the tree
above into something like:

  <process pid="123">
    <time>123456789,...</>
    <pcpu>4.6,...</>
    <stat>SN+,...</>
  </process>

is not something I want to try in XSLT. I can do it in Perl, of course,
but I end up writing a lot of code. Am I missing something? And, to
bring the posting back on topic, will Perl6 bring anything new to the
campfire?
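
To give a feel for the amount of code involved, here is a rough sketch
of the subscript-swapping version in Perl 5. It assumes the tree has
already been flattened into $ps{$time}{$pid}{$field} and that every
snapshot carries the same fields for a given pid; the %by_pid name and
the sample values are made up for illustration.

```perl
use strict;
use warnings;

# Assumed flattened input: $ps{$time}{$pid}{$field}.
# The values are invented sample data.
my %ps = (
    123456789 => { 123 => { pcpu => '4.6', stat => 'SN+' } },
    123456799 => { 123 => { pcpu => '5.4', stat => 'SN'  } },
);

# Swap the subscripts: build per-pid, per-field lists in time order.
my %by_pid;
for my $time (sort { $a <=> $b } keys %ps) {
    for my $pid (keys %{ $ps{$time} }) {
        push @{ $by_pid{$pid}{time} }, $time;
        push @{ $by_pid{$pid}{$_} }, $ps{$time}{$pid}{$_}
            for sort keys %{ $ps{$time}{$pid} };
    }
}

# Emit the same informal XML style used above, with time listed first.
my $out = '';
for my $pid (sort { $a <=> $b } keys %by_pid) {
    $out .= qq{<process pid="$pid">\n};
    my @fields = ('time', grep { $_ ne 'time' } sort keys %{ $by_pid{$pid} });
    for my $field (@fields) {
        $out .= "  <$field>" . join(',', @{ $by_pid{$pid}{$field} }) . "</>\n";
    }
    $out .= "</process>\n";
}
print $out;
```

Once the data is out of the tree, the reordering itself is short. Most
of the effort goes into getting it out and putting it back.
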
-r