RE: Request for comments: RDF/XML descriptions for Plucker I/O with other datasources

Robert O'Connor Wed, 20 Feb 2002 12:42:29 -0800

Hi Jon,

> > > Amen. An entire tree in memory would bring this laptop of
> > > mine to its knees. While some of the DOM parsers certainly have some
> > > nice features, can't beat expat on the performance end.
>
> But, aren't we talking about describing how and what should be gathered and
> not trying to build a big XML document of the gathered HTML? For normal
> "desktop" use in Plucker Desktop this configuration/channels file can't be
> so big?


A channel or two would be no problem. Loading up a list of a few hundred to
look through may or may not be.

> Expat might be the logical choice as it is already in the Plucker Desktop.

Yes. I don't want to go down the Mozilla path of a multi-megabyte download
and a memory beast. Small, fast, and compact is the goal.
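
As a rough sketch of why a callback parser stays small: the example below
uses Python's xml.parsers.expat binding to pull channel names out of a list
without ever building a tree. The <channels>/<channel> elements, the "name"
attribute and the example.com URLs are made up for illustration; they are
not an agreed format.

```python
import xml.parsers.expat


def list_channel_names(xml_bytes):
    """Collect channel names via expat callbacks, without building a tree."""
    names = []

    def on_start(tag, attrs):
        # Only the one attribute we care about is kept; the rest is discarded
        # as soon as the callback returns.
        if tag == "channel" and "name" in attrs:
            names.append(attrs["name"])

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk
    return names


# Hypothetical channel list, for illustration only.
sample = b"""<channels>
  <channel name="news"><home url="http://example.com/news"/></channel>
  <channel name="weather"><home url="http://example.com/weather"/></channel>
</channels>"""

print(list_channel_names(sample))
```

For a list of a few hundred channels, the same parser could be fed the file
in chunks with Parse(chunk, False), so memory use depends on what the
callbacks keep rather than on the size of the document.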

> David might need another tool if he chooses to convert his database of
> Pluckable links to this channel format...
>
> >     We should be consistent across parser command-line elements, .ini
> > key/value pairs, ~/.pluckerrc keys, and the XML attributes. Let's flatten
> > this so we don't have to re-learn it for each file we deal with and each
> > platform it runs on. Eventually, standardizing on one file format, with
> > the ability to export to another, is ideal.

I agree, as mentioned in my last email, that eventually the same keys should
be used in every file format. But the problem is that some of our keys
aren't self-descriptive enough in their current state. For example, there is
a key called 'icon=', which is a boolean key, not a link to an image file or
URL. Even inside a namespace, or as an attribute inside an element, it is
still not self-describing that it is a boolean switch.

On the other hand, it is annoying dealing with names that keep flipping.
Examples of this are db_file changing to doc_file, and the zlib compression
key. Supporting deprecated keys makes coding a mess and isn't much fun. I'm
in no hurry to start writing until a strong self-describing format is agreed
upon.
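
To show the kind of bookkeeping that renamed keys force on the code, here is
a minimal sketch of an alias table. It assumes db_file is the older name and
doc_file the current one; the table, the function name and the file names
are hypothetical and are not taken from the Plucker source.

```python
# Hypothetical alias table: deprecated key -> current key.
# Only the db_file/doc_file pair comes from this thread.
DEPRECATED_KEYS = {"db_file": "doc_file"}


def normalize_keys(settings):
    """Return a copy of settings with deprecated keys mapped to current ones."""
    normalized = {}
    for key, value in settings.items():
        canonical = DEPRECATED_KEYS.get(key, key)
        # A value given under the current name wins over a deprecated alias,
        # whichever order the two appear in.
        if key != canonical and canonical in normalized:
            continue
        normalized[canonical] = value
    return normalized


print(normalize_keys({"db_file": "old.pdb", "bpp": "8"}))
```

Every rename adds another row to a table like this, plus a rule for what to
do when both spellings appear in the same file.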

> >     What is the "usual format"? We are defining our own.
> > Who cares what they think is "normal" or not.

> A standard is good, but the Plucker format should concentrate on being good
> for Plucker.

For the last 6 years I have been on the receiving end of web browser
manufacturers not agreeing on the naming and format of the exact same
functionality, and have had to work around it with multiple implementations
of the same webpage feature. I'd rather not be the one doing that to others
now that the situation is reversed.

> > > Hmmm. Never saw that before in any spec, but things are always changing
> > > and I may well have missed it completely in the texts. Do you have a ref
> > > handy by any chance?
>
> I believe '//' and '/' are allowed inside attribute values.
>
> <xsl:apply-templates select="//some-node"/>

If you come across a ref saying that // is illegal in XML 1.0 documents
anywhere outside attribute values, just drop a note. I'm interested in
making compliant documents only.
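
In the meantime, one quick way to check a case like this is to feed small
test documents to expat and see whether it accepts them. This is only a
sketch using Python's xml.parsers.expat binding, with made-up documents; it
checks well-formedness, not validity against any schema. The double hyphen
inside a comment is included as a contrast, since that is one sequence XML
1.0 does forbid.

```python
import xml.parsers.expat


def is_well_formed(xml_text):
    """Return True if expat parses xml_text without a well-formedness error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(xml_text, True)
        return True
    except xml.parsers.expat.ExpatError:
        return False


# Hypothetical test documents.
slashes_in_attribute = '<channel url="http://example.com//a//b"/>'
slashes_in_text = '<note>see //some-node and a/b</note>'
double_hyphen_in_comment = '<note><!-- a -- b --></note>'

for doc in (slashes_in_attribute, slashes_in_text, double_hyphen_in_comment):
    print(is_well_formed(doc), doc)
```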

> Selects all the <some-node/> nodes and applies XSLT templates on them in an
> XSLT transformation.
>
> >     XML is meant to be human-readable, but doesn't have to fit in an
> > 80-column width, and as you know, you could use the following:
> >
> >     <images bpp="8"
> >             alt_maxwidth="200"
> >             alt_maxheight="300"
> >             maxwidth="100"
> >             maxheight="100"
> >             other_opt="foo"
> >             and_this="bar"
> >     >
> >     </images>

It is the same with source code: it doesn't need to fit in 80 columns
either. You could fit an entire program on one line, without a linebreak or
any of the usual whitespace. It's just a matter of what is clear and, here,
not just the clarity of an isolated element but of a giant tree of nodes.

> Using namespaces is a good thing if we want compatibility between different
> formats and easy conversions from one format to another. Using nice aliases
> is good for readability as long as everybody remembers that it is just a
> shorthand for the namespace...
>
> >     It's useful for self-closing elements, like <br> and so on, but for
> > things like multi-line constructs, as above, it makes sense to repeat the
> > closing tag in full, since it could be 10 lines away, or nested inside
> > another tag. It's easier to figure out which tags are where
> > when they each have a full close tag, and not a self-closing one.
>
> <some-node></some-node> and <some-node/> are exactly the same thing and
> are not necessarily preserved in the form they were read in when the
> document is saved.

That is a good point: you would want to be able to see an end tag once
lines become so long that you are scrolling up and down to piece the order
of elements back together. But empty elements with a separate close tag
don't seem user-evident. I think people are going to type stuff between the
tags, then write in to plucker-list wondering why it doesn't work, or else
spend even more time with the manual trying to get up to speed.
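
On the point that the two spellings are the same thing to a parser, here is
a small sketch that records the events expat reports for each form. It uses
Python's xml.parsers.expat binding, and the <images> element is just the
example from earlier in this thread.

```python
import xml.parsers.expat


def element_events(xml_text):
    """Return the start/end element events expat reports for xml_text."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = (
        lambda tag, attrs: events.append(("start", tag, dict(attrs))))
    parser.EndElementHandler = lambda tag: events.append(("end", tag))
    parser.Parse(xml_text, True)
    return events


self_closing = element_events('<images bpp="8"/>')
open_and_close = element_events('<images bpp="8"></images>')

# Both spellings produce one start event and one end event.
print(self_closing == open_and_close)
```

Since the application sees the same events either way, it has nothing to
tell it which form was in the file, which is why a saved document may come
back in the other spelling.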

> Would it be possible for the parser to use UTF-16 internally when handling
> the document? This would probably make it easier to support different
> codepages (Japanese etc.). Europeans might want to use ISO-8859-1 and
> Americans might get away without specifying an encoding at all.

Yes. And if the Plucker Desktop is compiled with multibyte support turned
on, on an OS that supports multibyte sets, it will support it as desired.
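
As an illustration of the encoding declaration doing its job, the sketch
below parses the same made-up <channel> element once as ISO-8859-1 with a
declaration and once as UTF-8 without one. This is Python's
xml.parsers.expat binding, which hands back Unicode strings; it is not the
Plucker Desktop code, and the element and attribute names are hypothetical.

```python
import xml.parsers.expat


def read_channel_name(xml_bytes):
    """Return the name attribute of the first <channel> element, if any."""
    found = []

    def on_start(tag, attrs):
        if tag == "channel":
            found.append(attrs.get("name"))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_bytes, True)
    return found[0] if found else None


# Same document in two encodings; only the Latin-1 one needs a declaration,
# because XML defaults to UTF-8 when none is given.
latin1_doc = ('<?xml version="1.0" encoding="ISO-8859-1"?>'
              '<channel name="caf\u00e9"/>').encode("iso-8859-1")
utf8_doc = '<channel name="caf\u00e9"/>'.encode("utf-8")

print(read_channel_name(latin1_doc) == read_channel_name(utf8_doc))
```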

> I hope nobody objects to me taking part in the discussion!

Not at all! The more the merrier! Thanks for your insights. I look forward
to more.

Best wishes,
Robert
