On Sun, Nov 29, 2009 at 12:10, Simone Cortesi <[email protected]> wrote:
> On Sun, Nov 29, 2009 at 12:16, Maarten Deen <[email protected]> wrote:
>> I've tried a few things, but I'm not fluent in perl. My problem at the
>> moment is that splitting a line on the space character seems logical,
>> but you run into problems if a value has a space in it.
>
> Wouldn't it be wiser to use a DOM/XML parser, which is natively able to
> interpret XML?

Yes it would. Unfortunately some Perl programmers seem to be unaware
of the existence of CPAN and insist on solving non-trivial problems
like XML parsing over and over again with the wrong tools, namely
regular expressions.

If you want a Perl one-liner to get all <tag> values from an OSM file,
here's one on the house that isn't insane:

    perl -CI -MXML::Parser -E 'my $x = XML::Parser->new(Handlers => {
        Start => sub { my ($p, $e, %kv) = @_; return unless $e eq "tag";
                       say "$kv{k} = $kv{v}" } });
      $x->parse(*STDIN)' < File.osm

This could probably be done in an easier way using something higher level
than XML::Parser (which is just a raw interface to expat) but I'm not
that familiar with Perl XML parsing. If I were to acquaint myself with
it I'd be sure not to start by writing the millionth buggy tagsoup
parser using regexes though.
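
For what it's worth, here is a rough sketch of what the higher-level
route might look like. It assumes XML::LibXML from CPAN is installed (it
is not mentioned above and is not part of core Perl), and the sub name
osm_tag_pairs and the script name osm-tags.pl are made up for
illustration. It builds a DOM and pulls every <tag> element out with an
XPath query, so attribute values containing spaces need no special care:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use XML::LibXML;    # assumption: installed from CPAN, not core Perl

# Return a list of [key, value] pairs, one per <tag k="..." v="..."/>
# element, in document order. Accepts the same named arguments as
# XML::LibXML->load_xml, e.g. (location => $path) or (string => $xml).
sub osm_tag_pairs {
    my %source = @_;
    my $doc = XML::LibXML->load_xml(%source);
    return map { [ $_->getAttribute('k'), $_->getAttribute('v') ] }
           $doc->findnodes('//tag');
}

# Usage: perl osm-tags.pl File.osm   (script name is hypothetical)
if (@ARGV) {
    binmode STDOUT, ':encoding(UTF-8)';
    say "$_->[0] = $_->[1]" for osm_tag_pairs(location => $ARGV[0]);
}
```

Note that this loads the whole document into memory, which is fine for
a small extract but probably not for a very large OSM file; a streaming
parser such as the XML::Parser one-liner above does not have that
problem.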

_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev
