On Fri, Aug 21, 2009 at 4:35 PM, Fred Drake<[email protected]> wrote:
> On Fri, Aug 21, 2009 at 10:33 AM, "Martin v. Löwis"<[email protected]> wrote:
>> Which way should PyPI go: escape all markup if ReST rendering fails?
>> Or else allow arbitrary HTML to be embedded? I'm worried that somebody
>> would create a cross-site attack out of that...
>
> Same here; the text in the <pre> should be properly escaped.

FWIW, lxml.html makes it pretty convenient to remove any dangerous tag: it's a
one-liner that gets rid of any <form>, <script>, <embed>, etc.

But in any case, I find the current situation fuzzy: the reStructuredText
format is an implicit rule from PyPI, and running an rst2html process on the
server side, no matter what long_description contains, seems like bad
practice to me.

I'd like to see the nature of long_description explicitly declared in the
metadata. For example, we could have a "long_description_format" field that
would be 'text', 'html' or 'restructuredtext'.

If present, PyPI could use this info to decide what it should do with
long_description (although this does not remove the need to clean it up on
the server side for security reasons, of course).

Last, notice that there's a new command in distutils called "check" that can
be used to check whether the long_description field content compiles well in
reStructuredText. This client-side process is convenient for avoiding any
error or warning on the PyPI page. (It's available only if docutils is
installed, of course.)

> -Fred
>
> --
> Fred L. Drake, Jr. <fdrake at gmail.com>
> "Chaos is the score upon which reality is written." --Henry Miller

--
Tarek Ziadé | http://ziade.org
_______________________________________________
Catalog-SIG mailing list
[email protected]
http://mail.python.org/mailman/listinfo/catalog-sig
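
P.S. To make the "long_description_format" idea a bit more concrete, here is a
rough sketch of how a server could dispatch on such a field. Everything in it
is an assumption for illustration: the field does not exist in the metadata
today, the function names are made up, and this is not how PyPI currently
works. It only relies on the standard library plus docutils when it happens to
be installed. The 'html' branch is deliberately conservative and escapes
everything, since a real sanitizer (lxml.html or similar) is out of scope for
the sketch.

```python
import html


def escaped_pre(text):
    """Return the text as an HTML-escaped <pre> block (the safe fallback)."""
    return "<pre>" + html.escape(text) + "</pre>"


def render_rest(text):
    """Render reST to an HTML fragment; return None if that is not possible."""
    try:
        from docutils.core import publish_parts
    except ImportError:
        return None  # docutils is optional
    settings = {
        "halt_level": 2,                  # treat warnings as failures
        "report_level": 5,                # keep docutils quiet on stderr
        "traceback": True,                # raise instead of exiting
        "raw_enabled": False,             # no raw HTML passthrough
        "file_insertion_enabled": False,  # no include directives
    }
    try:
        parts = publish_parts(
            source=text, writer_name="html", settings_overrides=settings
        )
    except (Exception, SystemExit):
        return None
    return parts["html_body"]


def render_long_description(text, fmt="text"):
    """Hypothetical dispatcher on the proposed long_description_format value."""
    if fmt == "restructuredtext":
        rendered = render_rest(text)
        if rendered is not None:
            return rendered
    # 'text', 'html' (no sanitizer in this sketch), unknown values and
    # failed reST rendering all end up as an escaped <pre> block.
    return escaped_pre(text)
```

With this shape, a description that fails to compile as reStructuredText
degrades to escaped text instead of leaking markup, which is the behaviour
Fred and Martin asked for above.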
