On Wed, 2006-05-31 at 09:45 +0300, Sjur Moshagen wrote:
> On 31 May 2006 at 04:10, David Crossley wrote:
>
> > Sjur N. Moshagen (JIRA) wrote:
> >>
> >> Sjur N. Moshagen commented on FOR-435:
> >> --------------------------------------
> >>
> >> Earlier investigation in our project has shown that the Chaperon
> >> grammar is using the default Java file encoding when reading
> >> files, and that the default Java encoding is given by the OS, in
> >> our case MacOS X, which has MacRoman as default. Reading UTF-8
> >> encoded files as MacRoman will of course garble non-ASCII characters.
> >
> > Perhaps this issue also affects the core of Forrest.
> > IIUC from our sitemaps, we use Chaperon to extract
> > links from CSS files.
>
> It sounds at least like a potential source of problems.
>
I am not 100% sure, but this observation could also explain FOR-492.

> >> Today I put some effort into finding a work-around based on this
> >> insight, and the result is the following command line argument:
> >>
> >> forrest run -Dforrest.jvmargs="-Dfile.encoding=utf-8"
> >
> > Perhaps this should be an available forrest property.
>
> That would be very nice, although it should be made clear in the
> documentation that it can affect more than Chaperon. The parameter
> overrides the OS-provided default file encoding, and sets the
> specified file encoding as default for the Java VM. Thus, all file
> readers not specifying the encoding will use it.

We should run a test that generates our site on Windows and Linux with
-Dforrest.jvmargs="-Dfile.encoding=utf-8" set. Maybe for the time being
we can add this to the forrest.properties of site-author and see
whether that avoids FOR-492.

> >> It doesn't really solve the underlying problem of configuring
> >> Chaperon from within Forrest (or Cocoon), but it does solve our
> >> actual problem through a work-around.
> >
> > Does the Cocoon chaperon block need some configurability
> > added?
>
> AFAIR (it is a long time since I tried this), the Chaperon
> documentation claims the file reading encoding to be configurable,
> but I could not get it to work. Whether that was my mistake or a bug
> in Chaperon is beyond me :-)
>
> > Also does our Chaperon jar need updating?
> >
> > You mentioned an important mail thread below, but could
> > not provide the link at the time.
>
> The link is provided in the first comment in the issue, just below
> the "empty link" text.
>
> > Thanks very much for your investigation and other effort.
>
> Thank you (and all the others) for your work with Forrest!
>
> > -David
>
> Sjur

Thanks very much, these findings may be the solution to FOR-492.
:)

Regards

> >>> Wiki input files (*.jspwiki) is not correctly read when in UTF-8
> >>> ----------------------------------------------------------------
> >>>
> >>> Key: FOR-435
> >>> URL: http://issues.apache.org/jira/browse/FOR-435
> >>> Project: Forrest
> >>> Type: Bug
> >>> Components: Plugin: input.wiki
> >>> Versions: 0.8-dev, 0.7
> >>> Environment: MacOS X, 10.3.8, Java 1.4.2
> >>> Reporter: Sjur N. Moshagen
> >>>
> >>> According to the documentation at:
> >>> http://chaperon.sourceforge.net/using-cocoon.html
> >>> it should be possible to configure the Wiki plugin (or any plugin
> >>> based on Chaperon) for different encodings of the input file, in
> >>> my case UTF-8.
> >>> But this does not work. I have:
> >>> <map:transformer name="lexer"
> >>>     src="org.apache.cocoon.transformation.LexicalTransformer"
> >>>     logger="sitemap.transformer.lexer">
> >>>   <map:parameter name="localizable" value="true"/>
> >>>   <map:parameter name="encoding" value="UTF-8"/>
> >>> </map:transformer>
> >>> in the input.xmap file in $FORREST_HOME/plugins/wiki, and I have
> >>> run "ant local-deploy", but to no avail: multibyte UTF-8
> >>> sequences come out as the Latin-1 counterpart of each byte in the
> >>> sequence.
> >>> A discussion about this bug can be found at:
> >>> [mail archive not yet updated, will add link here later]

--
Thorsten Scherler
COO Spain
Wyona Inc. - Open Source Content Management - Apache Lenya
http://www.wyona.com http://lenya.apache.org
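
The symptom reported in FOR-435 (each byte of a multibyte UTF-8 sequence showing up as its Latin-1 counterpart) can be reproduced outside Forrest. What follows is only a minimal sketch in plain Java, not the Chaperon or Cocoon code: the class EncodingDemo and its two methods are hypothetical names made up for illustration, and it relies on Java 7+ APIs (StandardCharsets, java.nio.file) that are not available on the Java 1.4.2 mentioned in the issue. It decodes the same UTF-8 bytes once correctly and once as ISO-8859-1, and shows a reader that names its charset explicitly, which is the kind of reader that does not depend on -Dfile.encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of Forrest, Cocoon or Chaperon.
public class EncodingDemo {

    // Decode raw bytes with an explicitly chosen charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Read the first line of a file with an explicit charset,
    // so the result does not depend on the JVM default encoding.
    static String readFirstLine(Path file, Charset charset) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), charset))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Unicode escapes keep this source file pure ASCII.
        String original = "bl\u00e5b\u00e6r";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // The default is what readers fall back to when no charset is given.
        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("decoded as UTF-8:    " + decode(utf8, StandardCharsets.UTF_8));
        // Wrong charset: every UTF-8 byte becomes one Latin-1 character.
        System.out.println("decoded as Latin-1:  " + decode(utf8, StandardCharsets.ISO_8859_1));

        Path tmp = Files.createTempFile("encoding-demo", ".txt");
        try {
            Files.write(tmp, utf8);
            System.out.println("file read as UTF-8:  "
                    + readFirstLine(tmp, StandardCharsets.UTF_8));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

The six-character test word becomes eight characters when decoded as Latin-1, because each of the two non-ASCII letters is stored as two bytes in UTF-8. Running the class with and without -Dfile.encoding=utf-8 should only change the first output line on JVMs that honour that property at startup; the explicit-charset calls are unaffected either way.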
