On 7/7/06, Jim Fulton <[EMAIL PROTECTED]> wrote:
> > +1 on static pages.  I don't, however, see a reason to require
> > valid XML.  Or rather, I don't expect to implement XML parsing in
> > easy_install; if the spec is too complex to implement with regular
> > expression matching, it's probably too complex for people to throw
> > together an index with what's at hand.  In particular, I'd like it
> > to be practical to put together a simple index just using Apache's
> > built-in directory indexes, as long as they use the right URL
> > hierarchy.  That means that class or rel attributes should only be
> > required for links that are requesting non-index pages to be spidered.
>
> I would find parsing much easier with an XML parser than with
> regular expressions.  I think it would be much more robust too.

XHTML would be best, though I agree we shouldn't care about validity
so much as just well-formedness (which is required).  I think it
should be possible to do it with valid XHTML, though, since whether
that's desired or not is a python.org policy concern.  (Not that I
suspect we'll ever really care about that.)

Of course, it should be possible to parse with htmllib and HTMLParser as well.
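
As a rough illustration of that last point, here is a minimal sketch of
pulling links out of an index page with the standard library's tolerant
HTML parser (spelled html.parser in current Python; it was the HTMLParser
module at the time).  The names LinkCollector and collect_links, and the
rel value "homepage" in the usage note, are hypothetical and not part of
any agreed spec.  It only assumes that index pages carry ordinary <a href>
links, optionally with a rel attribute.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, rel_tokens) pairs from <a> tags on an index page.

    Hypothetical helper for illustration only; not part of easy_install.
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            # Named anchors without an href are not links.
            return
        # rel may hold several space-separated tokens, or be absent.
        rel_tokens = (attr_map.get("rel") or "").split()
        self.links.append((href, rel_tokens))


def collect_links(page_text):
    """Return every (href, rel_tokens) pair found in page_text."""
    parser = LinkCollector()
    parser.feed(page_text)
    parser.close()
    return parser.links
```

Because this parser does not insist on well-formed input, the same code
should cope with both XHTML and an Apache-style directory listing.  Links
with no rel attribute come back with an empty token list, so a spider
could treat those as plain index links and follow off-site links only
when a token such as the hypothetical "homepage" is present.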


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"Every sin is the result of a collaboration." --Lucius Annaeus Seneca
_______________________________________________
Catalog-sig mailing list
Catalog-sig@python.org
http://mail.python.org/mailman/listinfo/catalog-sig
