On Saturday 23 June 2007, Kenney Westerhof wrote:
> Hervé BOUTEMY wrote:
> > On Saturday 23 June 2007, Brett Porter wrote:
> >> We shouldn't permit relative entities in the POM - it would cause
> >> grief when deployed to the repository.
> >>
> >> - Brett
> >
> > ok, then the target API is read(InputStream)
> > I won't add read(URL).
>
> I think read(URL) is actually the only good API to add.

I agree that read(URL) is the most powerful: it permits both encoding
resolution and relative entities. I don't think it is the only API that
should remain, though, since not every stream has a URL.
But the question is: should relative entities be permitted? I personally
don't have a strong opinion; I'm interested in encoding, for the French
accents in my first name, for example :)

> I'm not sure how rome does it - pushbackstream or whatever, but an URL has
> 2 advantages:
> - you can re-open the url to start reading again
> - you have the location present to do relative resolution if necessary
>   (outside of modello).

FYI, Rome's XmlStream internally uses a pushback stream to detect the encoding.

Hervé

> We could also just use the xml api's for this, since they're designed for
> it: javax.xml.transform.Source and implementations (StreamSource comes to
> mind).
>
> -- Kenney
>
> > regards,
> >
> > Hervé
> >
> >> On 23/06/2007, at 5:29 AM, Hervé BOUTEMY wrote:
> >>> On Friday 22 June 2007, Arnaud HERITIER wrote:
> >>>> Be careful, because when you read an xml file with a reader (or an
> >>>> inputstream) instead of a path (or an url) you can't use relative
> >>>> entities in xml (because the parser can't know where the main doc is).
> >>>
> >>> yes, read(InputStream) is better than read(Reader) for encoding,
> >>> but does not help for relative entities.
> >>> Do you want me to add read(URL) at the same time, and document it
> >>> as the preferred way of getting the model read? At first, for
> >>> compatibility reasons, it would call read(InputStream) then
> >>> read(Reader), but in the long term, it could be coded to permit
> >>> relative entities.
> >>> WDYT?
> >>>
> >>> Hervé
> >>>
> >>>> This is not a problem because we discourage the usage of xml
> >>>> entities in our xml documents, but we have to continue to say it !!!
> >>>>
> >>>> Arnaud
> >>>>
> >>>> On 22/06/07, Hervé BOUTEMY <[EMAIL PROTECTED]> wrote:
> >>>>> On Friday 22 June 2007, Kenney Westerhof wrote:
> >>>>>> Hi,
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>>> indeed, it's a case of doing new XXXInputstream( something,
> >>>>>> "encoding" ), or a reader. Some work has been done on this, IIRC.
> >>>>>>
> >>>>>> The problem is that you need to prescan the xml declaration, so you
> >>>>>> start parsing until you get the first xml language element that is
> >>>>>> not a comment,
> >>>>>> (an xml element, in which case encoding is utf8, or
> >>>>>> a doctype declaration, encoding is utf8, or
> >>>>>> a processing instruction, and if it's the xml processing
> >>>>>> instruction parse the encoding attribute and use that, otherwise
> >>>>>> it's utf8).
> >>>>>>
> >>>>>> This isn't too hard to do, except you need to restart reading
> >>>>>> the xml file from start, if the encoding is not utf-8. The real
> >>>>>> problem is in the API's;
> >>>>>> you cannot take a reader and restart that, since you cannot
> >>>>>> change the encoding on an instantiated reader, and you certainly
> >>>>>> don't want to wrap it. You'd need access to a raw inputstream that
> >>>>>> doesn't apply encoding transformations to the bytes, and wrap that
> >>>>>> in a Pushback something and then rewrap it if you found the encoding.
> >>>>>
> >>>>> exactly, this is the job done by XmlReader in Rome:
> >>>>>
> >>>>> https://rome.dev.java.net/apidocs/0_5/com/sun/syndication/io/XmlReader.html
> >>>>>
> >>>>> I have the class, well written and tested by Rome developers. My first
> >>>>> question is then: where to put it, to be able to use it in a lot of
> >>>>> places where there are Readers instantiated for XML streams?
> >>>>> plexus-utils? or make a dependency on Rome? or another place?
> >>>>>
> >>>>>> I'm a bit fuzzy on all the java.io api's, so we'll have to find the
> >>>>>> proper class to use in the API so we can do this; a File would work.
> >>>>>>
> >>>>>> Anyway, I once tried to fix this issue but the api had to be changed
> >>>>>> and there were just too many changes across plexus and maven at the
> >>>>>> time to push this through.
> >>>>>
> >>>>> With this class available, the change to Maven model can be backward
> >>>>> compatible:
> >>>>> - the old read(Reader) API remains for compatibility, but is
> >>>>>   deprecated
> >>>>> - a new read(InputStream) API is added, which calls
> >>>>>   read(new XmlReader(in))
> >>>>> The whole Maven code can then slowly migrate from the deprecated
> >>>>> Reader API to the new InputStream one, or use XmlReader if it is too
> >>>>> hard to switch to InputStream.
> >>>>> The only change is that there is a new dependency on this XmlReader
> >>>>> class: I don't know if it is a real problem or not.
> >>>>>
> >>>>> I searched a little bit: this new API addition could be done
> >>>>> individually in each .mdo file. But of course integrating it into the
> >>>>> code generation mechanism of Modello would be a lot better: Jason, if
> >>>>> your proposal to have access to Modello is still valid, I'm interested.
> >>>>>
> >>>>> Regards
> >>>>>
> >>>>> Hervé
> >>>>>
> >>>>>> -- Kenney
> >>>>>>
> >>>>>> Hervé BOUTEMY wrote:
> >>>>>>> On Thursday 21 June 2007, Jason van Zyl wrote:
> >>>>>>>> It seems like there are many problems with encoding that could be
> >>>>>>>> easily solved with a couple tweaks to modello, specifically the
> >>>>>>>> reader and writer, so I've scheduled these for 2.0.8. There are
> >>>>>>>> some patches for these and hopefully Herve will work his magic
> >>>>>>>> with his suggested fix. I like the idea of borrowing the idea from
> >>>>>>>> the Rome IO utils to find the right encoding by default. That could
> >>>>>>>> easily be integrated into modello. Herve if you need access to
> >>>>>>>> Modello we can set you up.
> >>>>>>>
> >>>>>>> I'm interested in working on that. Do I need Modello access, or
> >>>>>>> other components? I don't really know, these Modello things are the
> >>>>>>> parts I didn't really dive into for the moment.
> >>>>>>> The magic of the idea is that the encoding handling is not done by
> >>>>>>> the parser, but by the reader. Then, the code that has to change is
> >>>>>>> the code creating the Reader from a File: it must be changed from
> >>>>>>> "new FileReader(file)" to "new XmlReader(file)".
> >>>>>>>
> >>>>>>> We need to:
> >>>>>>> 1. choose where we put the XmlReader so that any code can use it
> >>>>>>>    when necessary. Or have a dependency on Rome: but all Rome for
> >>>>>>>    only 1 class (even if this class is really great)...
> >>>>>>> 2. change every code that creates a Reader for XML parsing
> >>>>>>>
> >>>>>>> WDYT?
> >>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>> Jason
> >>>>>>>>
> >>>>>>>> ----------------------------------------------------------
> >>>>>>>> Jason van Zyl
> >>>>>>>> Founder and PMC Chair, Apache Maven
> >>>>>>>> jason at sonatype dot com
> >>>>>>>> ----------------------------------------------------------
> >>>>>>>>
> >>>>>>>> ---------------------------------------------------------------------
> >>>>>>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
> >>>>>>>> For additional commands, e-mail: [EMAIL PROTECTED]
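
For reference, the approach discussed in this thread (prescan the XML declaration on the raw bytes, push them back, then build the Reader with the declared encoding, and let a new read(InputStream) delegate to the old read(Reader)) could be sketched roughly as follows. This is only a minimal illustration, not Rome's XmlReader and not Modello-generated code: the names XmlEncodingSniffer, ModelReaderSketch and PROLOG_MAX are hypothetical, the 1024-byte prescan limit is an assumption, and the sketch ignores byte order marks, UTF-16/EBCDIC documents and leading whitespace, falling back to UTF-8 in all those cases.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not Rome's XmlReader): sniffs the encoding from the XML declaration. */
final class XmlEncodingSniffer {
    private static final int PROLOG_MAX = 1024; // assumed upper bound on the prescan
    // Matches encoding="..." inside a declaration that starts at the very first byte.
    private static final Pattern ENCODING = Pattern.compile(
        "^<\\?xml[^>]*?\\sencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** Returns a Reader decoding the stream with the declared encoding, or UTF-8. */
    static Reader open(InputStream raw) throws IOException {
        PushbackInputStream in = new PushbackInputStream(raw, PROLOG_MAX);
        byte[] head = new byte[PROLOG_MAX];
        int n = 0;
        // read() may return fewer bytes than requested, so loop until full or EOF.
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) break;
            n += r;
        }
        // Push the bytes back so the parser still sees the document from its first byte.
        in.unread(head, 0, n);
        return new InputStreamReader(in, detect(head, n));
    }

    static Charset detect(byte[] head, int length) {
        // ISO-8859-1 maps each byte to one char, so this lookup decoding cannot fail.
        String prolog = new String(head, 0, length, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING.matcher(prolog);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }
        return StandardCharsets.UTF_8; // no declaration, no encoding attribute, or unknown name
    }

    public static void main(String[] args) throws IOException {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>Herv\u00e9</name>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        // Prints "true" when the accented character survives the round trip.
        System.out.println(ModelReaderSketch.read(new ByteArrayInputStream(bytes)).equals(xml));
    }
}

/** Hypothetical stand-in for a generated model reader; only the overload pattern matters. */
final class ModelReaderSketch {
    /** Old entry point: the caller has already chosen an encoding. Returns the raw text here. */
    @Deprecated
    static String read(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[256];
        for (int r; (r = reader.read(buf)) != -1; ) {
            sb.append(buf, 0, r);
        }
        return sb.toString();
    }

    /** New entry point: the encoding is resolved from the bytes, then it delegates. */
    static String read(InputStream in) throws IOException {
        return read(XmlEncodingSniffer.open(in));
    }
}
```

The point of the overload pair is that existing callers of read(Reader) keep working unchanged, while callers that switch to read(InputStream) no longer have to guess the encoding themselves.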