Re: resource encoding troubles

2014-09-20 Thread Garret Wilson

I'm finally able to trace the code, and this is getting very odd.

I use a hex editor, and the bytes in the properties file are ... 3D A9 
... (=©), just as I expect.


But when I trace through the Wicket code, the 
IsoPropertiesFilePropertiesLoader is using a UrlResourceStream which 
uses a URLConnection, which under the hood uses a BufferedInputStream to 
a FileInputStream. This in turn is wrapped in another 
BufferedInputStream. When the Properties class (from 
IsoPropertiesFilePropertiesLoader) parses the file, the internal 
Properties.LineReader reads into its inByteBuf variable the sequence ... 
3D EF BF BD ...! As mentioned below, EF BF BD is the UTF-8 sequence for 
U+FFFD, which is the Unicode replacement character.


So it appears that the UrlResourceStream/URLConnection for the 
properties file is somewhere trying to open the stream as UTF-8. 
Therefore the A9 © character gets converted into the EF BF BD sequence 
before it even gets to the parser in 
IsoPropertiesFilePropertiesLoader/Properties!


But what would be causing the UrlResourceStream/URLConnection to default 
to UTF-8 when opening my properties file? This seems to be the question 
that lies at the heart of this problem. Is there some Wicket or Java 
setting that is defaulting a URLConnection to use UTF-8 encoding? (As I 
mentioned above, the underlying input stream seems to be a 
FileInputStream wrapped in two layers of BufferedInputStream.)
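
One thing worth checking at this point is whether the stream layers could change the bytes at all. The sketch below is only an illustration: a ByteArrayInputStream stands in for the FileInputStream, and the class and method names are made up for the example. It reads the bytes 3D A9 through two BufferedInputStream layers and shows that they come out unchanged, since plain input streams do no charset decoding.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class StreamPassThrough {

    /** Reads all bytes through two BufferedInputStream layers, as in the traced call chain. */
    static byte[] readThroughBuffers(byte[] fileBytes) {
        // Assumption: ByteArrayInputStream stands in for the FileInputStream.
        try (InputStream in = new BufferedInputStream(
                new BufferedInputStream(new ByteArrayInputStream(fileBytes)))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1) {
                out.write(b);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Formats bytes the way a hex editor shows them, e.g. "3D A9". */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] original = {0x3D, (byte) 0xA9};
        System.out.println(hex(readThroughBuffers(original))); // prints "3D A9"
    }
}
```

If the bytes arriving in Properties.LineReader differ from what the hex editor shows, the likelier explanation is that the stream is reading a different file than the one inspected.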


Garret


Re: resource encoding troubles

2014-09-20 Thread Garret Wilson

Hahahaha! I found the problem!

When I looked at the HomePage.properties file in a hex editor, I was 
looking at the HomePage.properties file in my source tree. But remember 
that this file isn't the one that Wicket loads! After a Maven build, 
Wicket will load the HomePage.properties file that Maven copies to the 
target directory!! (I should have paid closer attention to the URL used 
by URLConnection.) And sure enough, when I open that copied version of 
HomePage.properties, it contains the sequence EF BF BD! In other words, 
when Maven copied the HomePage.properties file from the source tree to 
the target directory, it must have opened it up as UTF-8, converting the 
lone A9 byte for © (not valid UTF-8 on its own) into EF BF BD, the UTF-8 
sequence for U+FFFD, the Unicode replacement character. Thus when Wicket 
came along to read the file from the target directory, it (correctly) 
loaded it as ISO-8859-1, interpreting EF BF BD as three characters, ï¿½.
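
The difference between the two copies of the file can be reproduced in isolation. This is a minimal sketch, not Wicket's loader: it feeds Properties.load(InputStream) a one-line file whose value is either the single byte A9 (source tree) or the three bytes EF BF BD (target directory). The class and helper names are hypothetical, and results are printed as code points so that the console encoding cannot distort them.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Properties;

final class CorruptedPropertiesDemo {

    /** Builds the bytes of a one-line file: "copyright=" plus the given raw value bytes. */
    static byte[] propertyLine(int... valueBytes) {
        byte[] key = "copyright=".getBytes(StandardCharsets.US_ASCII);
        byte[] line = Arrays.copyOf(key, key.length + valueBytes.length);
        for (int i = 0; i < valueBytes.length; i++) {
            line[key.length + i] = (byte) valueBytes[i];
        }
        return line;
    }

    /** Loads the bytes with Properties.load(InputStream), which decodes them as ISO-8859-1. */
    static String loadCopyright(byte[] fileBytes) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("copyright");
    }

    /** Renders a string as code points, e.g. "U+00EF U+00BF U+00BD". */
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Source tree: the value is the single byte A9.
        System.out.println(codePoints(loadCopyright(propertyLine(0xA9))));
        // prints "U+00A9"

        // Target directory: the value is the three bytes EF BF BD.
        System.out.println(codePoints(loadCopyright(propertyLine(0xEF, 0xBF, 0xBD))));
        // prints "U+00EF U+00BF U+00BD"
    }
}
```

The second result is the three-character sequence ï¿½, so the loader behaves correctly in both cases and the damage must already be in the file it is given.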


But why did Maven use UTF-8 when it copied my HomePage.properties source 
file to the target directory? Ummm... because I told it to, sort of:


  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <build>
    <resources>
      <resource>
        <directory>src/main/resources</directory>
        <filtering>true</filtering>
        <includes>
          <include>**/*.properties</include>
        </includes>

Apparently when Maven copies resources using filtering, it opens and 
parses them using the ${project.build.sourceEncoding} setting, which of 
course I had set to UTF-8. I probably need to set the encoding 
parameter of the maven-resources-plugin 
http://maven.apache.org/plugins/maven-resources-plugin/copy-resources-mojo.html#encoding.
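
To see why the encoding used by a filtering copy matters, here is a rough sketch of the decode, substitute, re-encode cycle that such a copy implies. It is not Maven's actual implementation; the class name, the `filter` helper, and the `${app.name}` placeholder are all invented for illustration. The same two source bytes are run through the cycle once with UTF-8 and once with ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class FilteringCopySketch {

    /** Decodes the source, substitutes one placeholder, and re-encodes with the same charset. */
    static byte[] filter(byte[] source, Charset encoding, String placeholder, String value) {
        // new String(...) replaces malformed input with the charset's replacement character.
        String text = new String(source, encoding);
        String filtered = text.replace(placeholder, value);
        return filtered.getBytes(encoding);
    }

    /** Formats bytes the way a hex editor shows them. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] source = {0x3D, (byte) 0xA9}; // "=" followed by the ISO-8859-1 byte for the copyright sign

        // Treating the file as UTF-8 turns the lone A9 into U+FFFD, written back as EF BF BD.
        System.out.println(hex(filter(source, StandardCharsets.UTF_8, "${app.name}", "Example")));
        // prints "3D EF BF BD"

        // Treating the file as ISO-8859-1 leaves the byte intact.
        System.out.println(hex(filter(source, StandardCharsets.ISO_8859_1, "${app.name}", "Example")));
        // prints "3D A9"
    }
}
```

The corruption happens even though no placeholder is present in the file: decoding with the wrong charset is enough.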


Argg!! So much pain and agony for such a tiny mistake! But I'm glad I 
found it. I'll fix it... another day. Right now I'm going to grab some 
tequila and celebrate!!


Have a great rest of the weekend, everybody!

Garret


Re: resource encoding troubles

2014-09-20 Thread Sven Meier

Hi Garret,

I'm glad you found the culprit. Thanks for keeping us updated, we all 
learn something new each day.


Have fun
Sven



Re: resource encoding troubles

2014-08-29 Thread Sven Meier

Thanks Andrew!

Sven

On 08/29/2014 05:22 AM, Andrew Geery wrote:

I created a Wicket quickstart (from
http://wicket.apache.org/start/quickstart.html) [this is Wicket 6.16.0] and
made two simple changes:

1) I created a HomePage.properties file, encoded as ISO-8859-1, with a
single line as per the example above: copyright=© 2014 Example, Inc.

2) I added a line to the HomePage.html file as per the example
above: <p><small><wicket:message key="copyright">©
Example</wicket:message></small></p>

The content is served as UTF-8 and the copyright symbol is rendered
correctly on the page.

It doesn't look like the problem is in Wicket (at least not in 6.16).  I
guess your next steps would be to verify that you get the same results and,
assuming that you do, start removing things from your page that has the
problem until you find an element that is causing the problem.

Thanks
Andrew


On Thu, Aug 28, 2014 at 5:38 PM, Garret Wilson gar...@globalmentor.com
wrote:


On 8/28/2014 12:08 PM, Sven Meier wrote:


...



My configuration, as far as I can tell, is correct.

 From what you've written, I'd agree.

You should create a quickstart. This will easily allow us to find a
possible bug.


Better than that, I'd like to trace down the bug, fix it, and file a
patch. But currently I'm blocked from working with Wicket on Eclipse 
https://issues.apache.org/jira/browse/WICKET-5649.

Garret




-
To unsubscribe, e-mail: users-unsubscr...@wicket.apache.org
For additional commands, e-mail: users-h...@wicket.apache.org



Re: resource encoding troubles

2014-08-29 Thread Garret Wilson
Hi, all. Thanks Andrew for that attempt to reproduce this. I have 
verified this on Wicket 6.16.0 and 7.0.0-M2.


I have checked out the latest code from 
https://git-wip-us.apache.org/repos/asf/wicket.git . I was going to 
trace this down in the code, but then I was stopped in my tracks with an 
Eclipse m2e bug https://bugs.eclipse.org/bugs/show_bug.cgi?id=371618 
that won't even let me clean/compile the project. Argg!! Always 
something, huh?


But I did start looking in the code. IsoPropertiesFilePropertiesLoader 
looks completely OK; it uses Properties.load(InputStream), and the file 
even indicates that the input encoding must be ISO-8859-1. Not much could 
go wrong there. I back-referenced the calls up the chain to 
WicketMessageTagHandler.onComponentTag(Component, ComponentTag), and it 
looks straightforward there---but that's for message tags, not message body.


I investigated downwards from WicketMessageResolver.resolve(...) (which 
I presume is what is at play here), which has this code:


   MessageContainer label = new MessageContainer(id, messageKey);

The MessageContainer.onComponentTagBody(...) simply looks up the value 
and calls renderMessage(), which in turn does some complicated ${var} 
replacement using MapVariableInterpolator and then writes out the result 
using getResponse().write(text). Unless MapVariableInterpolator messes 
up the value during variable replacement (but there are no variables to 
replace in this situation), then on the surface everything looks OK.


So I decided to do an experiment; I changed the HTML to this:

   <p>This a © copyright. <small><wicket:message key="copyright">dummy
   text</wicket:message></small></p>

And I changed the properties to this:

   copyright=This a © copyright.


Here is what was produced:

   This a © copyright. This a ï¿½ copyright.


So something is going on here in the generation of the included message, 
because as you can see the content from XML gets produced correctly. It 
turns out http://stackoverflow.com/a/6367675/421049 that ï¿½ is what the 
UTF-8 sequence for U+FFFD (bytes EF BF BD) looks like when displayed as 
ISO-8859-1; U+FFFD is the Unicode replacement character substituted when 
an invalid UTF-8 sequence is encountered. And of course, the single byte 
A9 that encodes the copyright symbol U+00A9 in ISO-8859-1 is not valid 
UTF-8 on its own, even though it is fine as part of ISO-8859-1.
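
The claim about the A9 byte can be checked directly with the JDK's charset decoders. The following is a small sketch with hypothetical class and method names: a strict decoder reports the lone A9 as malformed, while the lenient decoding done by new String(bytes, UTF_8) silently substitutes U+FFFD.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8Validity {

    /** Returns true only if the bytes form well-formed UTF-8. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Lenient decoding: malformed input is replaced rather than reported. */
    static int firstCodePointLenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8).codePointAt(0);
    }

    public static void main(String[] args) {
        byte[] latin1Copyright = {(byte) 0xA9};            // the copyright sign in ISO-8859-1
        byte[] utf8Copyright = {(byte) 0xC2, (byte) 0xA9}; // the copyright sign in UTF-8

        System.out.println(isValidUtf8(latin1Copyright)); // prints "false"
        System.out.println(isValidUtf8(utf8Copyright));   // prints "true"
        System.out.println(Integer.toHexString(firstCodePointLenient(latin1Copyright))); // prints "fffd"
        System.out.println(Integer.toHexString(firstCodePointLenient(utf8Copyright)));   // prints "a9"
    }
}
```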


So here is the problem: something is taking the string generated by the 
message (which was parsed correctly from the properties file) and 
writing it to the output stream, not in UTF-8 as it should, but in some 
other encoding. If I were to guess here, I would say that the embedded 
message is writing out in Windows cp1252 (more or less ISO-8859-1), 
which is my default encoding (which would explain why Andrew didn't see 
this, if his system is Linux and the default encoding happens to be 
UTF-8 for example). This seems incorrect to me; the embedded message 
should know that it is writing into a UTF-8 output stream and should use 
that instead of the system encoding.


Remember that I can't even compile the code because of an m2e bug, so 
all of this is highly conjectural, just from visually inspecting the 
code and doing a few experiments. But I have a hunch that if you switch 
to a machine that has a default system encoding that isn't UTF-8, you'll 
reproduce this issue. And I further predict that if you trace through 
the code, the embedded wicket:message tag is incorrectly injecting its 
contents using the system encoding rather than the entire output stream 
encoding (however that is configured in Wicket). Put another way, 
whatever is producing the bytes from the main HTML page is using UTF-8 
(as it should), but whatever is taking the message tag output is 
spitting out its bytes using cp1252 or something similar.


As soon as I can get Eclipse to be happier with the Wicket build, I'll 
give you some more exact details. But I'll have to take a break and get 
back to my main work for a while---we're nearing a big deadline and I 
have some actual functionality to implement! :)


Thanks again for investigating, Andrew.

Garret


Re: resource encoding troubles

2014-08-29 Thread Garret Wilson

On 8/29/2014 9:15 AM, Garret Wilson wrote:

...
So here is the problem: something is taking the string generated by 
the message (which was parsed correctly from the properties file) and 
writing it to the output stream, not in UTF-8 as it should, but in 
some other encoding.


Hmmm... the sequence of events would have to be a little more 
complicated than that. If somehow the properties file were being read as 
UTF-8 (it shouldn't be), then the lone A9 byte for U+00A9 would be mapped 
to the replacement character U+FFFD. Then if /that/ character, written 
out as UTF-8, were in turn interpreted as cp1252/ISO-8859-1, it would 
produce the sequence ï¿½, which I'm seeing. But that would require two 
levels of errors, it would seem. And the code looks like the properties 
file is being read correctly in IsoPropertiesFilePropertiesLoader.


(Maybe something is being cached in the system encoding, and then being 
read from the cache using UTF-8.)


So I can sense the problem here, but I don't yet see where it's 
happening in the code. As soon as I'm able to trace the code, I would 
imagine I could find it pretty quickly.


Garret


resource encoding troubles

2014-08-28 Thread Garret Wilson
I have Wicket 7.0.0-M2 running on embedded Jetty, which is correctly 
returning a content type with a UTF-8 charset for my Wicket page:


   Date: Thu, 28 Aug 2014 15:37:52 GMT
   Expires: Thu, 01 Jan 1970 00:00:00 GMT
   Pragma: no-cache
   Cache-Control: no-cache, no-store
   Content-Type: text/html; charset=UTF-8
   Transfer-Encoding: chunked
   Server: Jetty(9.1.0.v20131115)


I have a properties file FooterPanel.properties that contains the 
following line (encoded in ISO-8859-1, as properties files unfortunately 
require):


   copyright=© 2014 Example, Inc.


FooterPanel.html is encoded in UTF-8, has the appropriate XML prolog, 
and contains the following reference to the property resource:


   <?xml version="1.0" encoding="utf-8"?>
   ...
   <p><small><wicket:message key="copyright">©
   Example</wicket:message></small></p>


When this all is rendered, here is what I see in Firefox 31 and Chrome 37:

   ï¿½ 2014 Example, Inc.


I thought I had all the correct encoding indicators at each stage in the 
pipeline. But somebody blinked. Where is the problem?


Garret


Re: resource encoding troubles

2014-08-28 Thread Francois Meillet
Look at 
http://apache-wicket.1842946.n4.nabble.com/How-to-localize-options-in-drop-down-tt4661751.html#a4661768


François Meillet
Formation Wicket - Développement Wicket








Re: resource encoding troubles

2014-08-28 Thread Garret Wilson
Please explain explicitly what you are trying to say. I don't see how 
that link is relevant.


* I am using FooterPanel.properties.
* Java properties files, as per the specification 
http://docs.oracle.com/javase/8/docs/api/java/util/Properties.html, 
are (and always have been) encoded in ISO-8859-1. (I don't like this 
either, but that's how it is.)
* My FooterPanel.properties file is encoded in ISO-8859-1, and there is 
nothing to indicate otherwise. There is no BOM present. There is no 
"utf" in the filename.
* The character © is U+00A9, which takes up exactly one byte in 
ISO-8859-1. It is correctly encoded in FooterPanel.properties.
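
The byte counts in the last point, and the Unicode escape syntax that the Properties specification also allows, can be illustrated with a short sketch. The class and helper names are hypothetical. The escaped form of the property is pure ASCII, so it reads the same no matter which of the two encodings a tool assumes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class PropertiesEscapeDemo {

    /** Loads raw file bytes with Properties.load(InputStream) and returns one value. */
    static String load(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String copyrightSign = "\u00A9";

        // One byte in ISO-8859-1, two bytes in UTF-8.
        System.out.println(copyrightSign.getBytes(StandardCharsets.ISO_8859_1).length); // prints "1"
        System.out.println(copyrightSign.getBytes(StandardCharsets.UTF_8).length);      // prints "2"

        // The file content here is the ASCII text: copyright= backslash u00A9 2014 Example, Inc.
        byte[] escapedFile = "copyright=\\u00A9 2014 Example, Inc."
                .getBytes(StandardCharsets.US_ASCII);
        String value = load(escapedFile, "copyright");
        System.out.println(value.equals("\u00A9 2014 Example, Inc.")); // prints "true"
    }
}
```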



So what specifically are you implying by the link? Are you implying that 
Wicket does not support the Java properties specification? Are you 
implying I did something incorrectly in my properties file? Please 
elaborate.


Garret







Re: resource encoding troubles

2014-08-28 Thread Francois Meillet
use *.utf8.properties

François Meillet
Formation Wicket - Développement Wicket








Re: resource encoding troubles

2014-08-28 Thread Garret Wilson
So are you saying that Wicket does not support ISO-8859-1 properties 
files that adhere to the Java standard? Or are you saying, "I don't know 
what the problem is, I'm just giving you a workaround"? If so, I 
appreciate the workaround tip, but that still doesn't explain what the 
problem is.


I'm the sort of person who doesn't like to "wave my hands," as we say. I 
like to find the source of the problem. My configuration, as far as I 
can tell, is correct. Moreover, it is technically more correct than 
the *.utf8.properties approach, as my approach follows the standard. 
In fact my approach should be the default. So does anyone know why my 
configuration does not work? What am I doing wrong?
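
For comparison, reading a properties file as UTF-8 in plain Java means going through Properties.load(Reader) with an explicit charset instead of Properties.load(InputStream). The sketch below shows only that general JDK technique, under the assumption that a UTF-8 variant of the file exists; it is not taken from Wicket's code, and the class and method names are invented.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class Utf8PropertiesSketch {

    /** Loads properties from raw bytes that are known to be UTF-8 encoded. */
    static Properties loadUtf8(byte[] fileBytes) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8)) {
            // load(Reader) takes already-decoded characters, so the charset is our choice.
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // A UTF-8 file stores the copyright sign as the two bytes C2 A9.
        byte[] utf8File = "copyright=\u00A9 2014 Example, Inc."
                .getBytes(StandardCharsets.UTF_8);
        String value = loadUtf8(utf8File).getProperty("copyright");
        System.out.println(value.equals("\u00A9 2014 Example, Inc.")); // prints "true"
    }
}
```

Either convention works as long as the file's actual bytes match the charset the loader assumes.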


Sincerely,

Garret




Re: resource encoding troubles

2014-08-28 Thread Francois Meillet
http://wicket.apache.org/guide/guide/

François Meillet
Formation Wicket - Développement Wicket





Le 28 août 2014 à 19:24, Garret Wilson gar...@globalmentor.com a écrit :

 So are you saying that Wicket does not support ISO-8859-1 properties files 
 that adhere to the Java standard? Or are you saying, "I don't know what the 
 problem is, I'm just giving you a workaround"? If so, I appreciate the 
 workaround tip, but that still doesn't explain what the problem is.
 
 I'm the sort of person who doesn't like to wave my hands, as we say. I like 
 to find the source of the problem. My configuration, as far as I can tell, is 
 correct. Moreover, it is technically more correct than the 
 *.utf8.properties approach, as my approach follows the standard. In fact my 
 approach should be the default. So does anyone know why my configuration does 
 not work? What am I doing wrong?
 
 Sincerely,
 
 Garret
 
 On 8/28/2014 10:18 AM, Francois Meillet wrote:
 use *.utf8.properties
 
 François Meillet
 Formation Wicket - Développement Wicket
 
 
 
 
 
 



Re: resource encoding troubles

2014-08-28 Thread Garret Wilson
Exactly! Quoting from the page you provided: "Java uses the standard 
character set ISO 8859-11 to encode text files like properties files. 
..." (Note that this is a typo above---the author meant to say ISO 
8859-1, not ISO 8859-11. The link to 
http://en.wikipedia.org/wiki/ISO/IEC_8859-1 in the text is correct, 
however.)


So according to that description, my FooterPanel.properties file is 
expected to be encoded in ISO-8859-1. And indeed it is, as I have 
repeatedly explained.


So I ask again: what is wrong with my current configuration?

Garret

On 8/28/2014 10:25 AM, Francois Meillet wrote:

http://wicket.apache.org/guide/guide/

François Meillet
Formation Wicket - Développement Wicket








Re: resource encoding troubles

2014-08-28 Thread Andrea Del Bene

Have you tried using the Unicode escape directly? i.e.:

copyright=\u00A9 2014 Example, Inc.

If you don't want to use Unicode escapes, you should use an XML file 
as the bundle file.
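
For illustration, here is a minimal Java sketch of the two workarounds just mentioned, using only java.util.Properties. It does not show how Wicket itself loads bundles; the class name BundleFormatsDemo and the in-memory file contents are assumptions made for the example. The XML variant uses the standard Java XML properties format read by Properties.loadFromXML.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class; not part of Wicket.
final class BundleFormatsDemo {

    // A .properties file that is pure ASCII: the copyright sign is written
    // as the escape \u00A9, which Properties.load() converts to U+00A9.
    static String fromEscapedProperties() {
        String fileText = "copyright=\\u00A9 2014 Example, Inc.\n";
        byte[] fileBytes = fileText.getBytes(StandardCharsets.US_ASCII);
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("copyright");
    }

    // An XML properties file: the encoding is declared in the XML prolog,
    // so the literal copyright sign can be stored as UTF-8 bytes.
    static String fromXmlBundle() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n"
            + "<properties>\n"
            + "  <entry key=\"copyright\">\u00A9 2014 Example, Inc.</entry>\n"
            + "</properties>\n";
        byte[] fileBytes = xml.getBytes(StandardCharsets.UTF_8);
        Properties props = new Properties();
        try {
            props.loadFromXML(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("copyright");
    }

    public static void main(String[] args) {
        // Both calls yield the same in-memory string.
        System.out.println(fromEscapedProperties().equals(fromXmlBundle()));
    }
}
```

Either way the parser ends up with the same character U+00A9 in memory; only the on-disk representation differs.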



Re: resource encoding troubles

2014-08-28 Thread Stefan Renz
Hi Garret,

Garret Wilson wrote:
 Exactly! Quoting from the page you provided: "Java uses the standard
 character set ISO 8859-11 to encode text files like properties files.
 ..." (Note that this is a typo above---the author meant to say ISO
 8859-1, not ISO 8859-11. The link to
 http://en.wikipedia.org/wiki/ISO/IEC_8859-1 in the text is correct,
 however.)
 
 So according to that description, my FooterPanel.properties file is
 expected to be encoded in ISO-8859-1. And indeed it is, as I have
 repeatedly explained.
 
 So I ask again: what is wrong with my current configuration?

if I read your original post correctly, you have not used ISO-8859-1
encoding in your property file, as I clearly see a (C) symbol. As others
already pointed out, you should use the \u... coded (c)-symbol.

However, as Francois stated, you can use a property file called
FooterPanel.utf8.properties, in which you can use UTF8-encoding and
nicely write all your labels and/or translations.

I don't know about Wicket 7, but in Wicket 6 this works like a charm,
wherever the .utf8.properties-stuff comes from.

Cheers,
Stefan

 
 

-- 
on behalf of eFonds Solutions AG, +49-89-579494-3417





Re: resource encoding troubles

2014-08-28 Thread Garret Wilson
I appreciate all the workarounds suggested. But no one has addressed the 
core issue: Is this a Wicket bug, or am I using standard property files 
incorrectly?


Garret

On 8/28/2014 10:42 AM, Andrea Del Bene wrote:

Have you tried using the Unicode escape directly? i.e.:

copyright=\u00A9 2014 Example, Inc.

If you don't want to use Unicode escapes, you should use an XML file 
as the bundle file.
For additional commands, e-mail: users-h...@wicket.apache.org



Re: resource encoding troubles

2014-08-28 Thread Garret Wilson

On 8/28/2014 10:53 AM, Stefan Renz wrote:

...
if I read your original post correctly, you have not used ISO-8859-1
encoding in your property file, as I clearly see a (C) symbol.


Since when is © (U+00A9) not part of ISO-8859-1?

http://en.wikipedia.org/wiki/ISO/IEC_8859-1
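
For illustration, a small Java sketch that checks this claim with standard JDK calls only. The class name CopyrightBytesDemo and the helper hex are hypothetical. It prints the bytes of U+00A9 in both encodings, and also shows what results if the single ISO-8859-1 byte is wrongly decoded as UTF-8, which would be one way to end up with the replacement character seen in the rendered page.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for inspecting how U+00A9 is encoded.
final class CopyrightBytesDemo {

    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Decodes the lone ISO-8859-1 byte A9 as if it were UTF-8.
    // A9 is not a valid UTF-8 sequence on its own, so the decoder
    // substitutes the replacement character U+FFFD.
    static String misreadLatin1AsUtf8() {
        byte[] latin1 = "\u00A9".getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String copyright = "\u00A9";
        // ISO-8859-1 has the copyright sign at code point A9.
        System.out.println(hex(copyright.getBytes(StandardCharsets.ISO_8859_1))); // A9
        // UTF-8 needs two bytes for the same character.
        System.out.println(hex(copyright.getBytes(StandardCharsets.UTF_8)));      // C2 A9
        // The misread character, re-encoded as UTF-8.
        System.out.println(hex(misreadLatin1AsUtf8().getBytes(StandardCharsets.UTF_8))); // EF BF BD
    }
}
```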

Garret




Re: resource encoding troubles

2014-08-28 Thread Andrea Del Bene
It's just an encoding conflict: your properties file uses ISO-8859-1, your 
page UTF-8. The result is bad rendering, as you can see. When Java's 
designers decided to adopt ISO-8859-1 they didn't consider most of the 
Asian languages...

PS: just a piece of personal advice, try to be less rude in your answers ;)



Re: resource encoding troubles

2014-08-28 Thread Sven Meier

Hi Garret,

 I like to find the source of the problem.

Me too :).

 My configuration, as far as I can tell, is correct.

From what you've written, I'd agree.

You should create a quickstart. This will easily allow us to find a 
possible bug.


Regards
Sven





Re: resource encoding troubles

2014-08-28 Thread Garret Wilson

On 8/28/2014 11:14 AM, Andrea Del Bene wrote:
It's just an encoding conflict: your properties file uses ISO-8859-1, your 
page UTF-8. The result is bad rendering, as you can see. When Java's 
designers decided to adopt ISO-8859-1 they didn't consider most of the 
Asian languages...

PS: just a piece of personal advice, try to be less rude in your answers ;)


Andrea, I'm sorry, I'll really try. My answers were probably terse 
(short and to the point), and you probably sense a frustration on my 
part with the lack of basic understanding in the software development 
world on the fundamentals of software encoding.


For example, your answer seems to assume that some function simply loads 
two sets of bytes and merges them together. That's not what happens at 
all. (Or at least I hope that's not what happens---it would indicate 
that the coder had no idea how to approach the task.) In fact there are 
two layers to the encoding stack: the byte-level processing, and the 
character-level processing. The Java Properties class should correctly 
take the bytes in the properties file and do the ISO 8859-1 decoding, 
producing a character stream to be parsed. This is already implemented 
in Java, and has been for well over a decade, I believe.


Similarly, an XML processor will take the bytes in an XML file and 
transform them based upon the encoding (in this case, UTF-8) and produce 
a stream of characters. All XML processors are required to be able to 
perform this transformation, and have been for well over a decade.


Now that both input sources produce data at the character level, the 
original byte-level encoding is irrelevant. At the character level, 
there is no encoding conflict, because there is no encoding. (There 
exists the in-memory encoding used by the JVM, but that's irrelevant to 
the discussion and will certainly be the same for all strings used.) 
Thus the two input streams can be mixed together without worry of 
encoding. If this is not what happens within Wicket, there is a software 
bug---but not an encoding conflict.
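
For illustration, the two-layer argument above can be sketched in a few lines of Java. This is not Wicket's code; the class TwoLayerDecodingDemo and its method names are assumptions made for the example. Each byte source is decoded with its own charset first, the resulting strings are combined at the character level, and a single output encoding is applied only when the response is written.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: decode each source separately, merge as characters.
final class TwoLayerDecodingDemo {

    // Byte level to character level: each input uses its own declared charset.
    static String decodeAndMerge(byte[] latin1Property, byte[] utf8Markup) {
        String property = new String(latin1Property, StandardCharsets.ISO_8859_1);
        String markup = new String(utf8Markup, StandardCharsets.UTF_8);
        // Character level: plain string concatenation, no encoding involved.
        return markup + property;
    }

    // Character level back to byte level: one charset for the whole response.
    static byte[] encodeResponse(String page) {
        return page.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The property value as stored on disk in ISO-8859-1 (copyright sign is byte A9).
        byte[] latin1Property = "\u00A9 2014 Example, Inc.".getBytes(StandardCharsets.ISO_8859_1);
        // A fragment of markup as stored on disk in UTF-8 (copyright sign is bytes C2 A9).
        byte[] utf8Markup = "\u00A9 Example / ".getBytes(StandardCharsets.UTF_8);

        String page = decodeAndMerge(latin1Property, utf8Markup);
        byte[] response = encodeResponse(page);

        // Both copyright signs survive the round trip as the same character.
        System.out.println(new String(response, StandardCharsets.UTF_8).equals(page));
    }
}
```

A replacement character can only appear in this pipeline if one of the decode steps is given the wrong charset for its bytes.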


I recommend you start by reading 
http://www.joelonsoftware.com/articles/Unicode.html . If you have any 
specific questions, I'll be happy to answer them.


I apologize again for being brusque, but I'll do my best to explain things 
if others honestly have questions.


Garret




Re: resource encoding troubles

2014-08-28 Thread Garret Wilson

On 8/28/2014 12:08 PM, Sven Meier wrote:

...

 My configuration, as far as I can tell, is correct.

From what you've written, I'd agree.

You should create a quickstart. This will easily allow us to find a 
possible bug.


Better than that, I'd like to trace down the bug, fix it, and file a 
patch. But currently I'm blocked from working with Wicket on Eclipse 
https://issues.apache.org/jira/browse/WICKET-5649.


Garret


Re: resource encoding troubles

2014-08-28 Thread Andrew Geery
I created a Wicket quickstart (from
http://wicket.apache.org/start/quickstart.html) [this is Wicket 6.16.0] and
made two simple changes:

1) I created a HomePage.properties file, encoded as ISO-8859-1, with a
single line as per the example above: copyright=© 2014 Example, Inc.

2) I added a line to the HomePage.html file as per the example
above: <p><small><wicket:message key="copyright">©
Example</wicket:message></small></p>

The content is served as UTF-8 and the copyright symbol is rendered
correctly on the page.

It doesn't look like the problem is in Wicket (at least not in 6.16).  I
guess your next steps would be to verify that you get the same results and,
assuming that you do, start removing things from your page that has the
problem until you find an element that is causing the problem.

Thanks
Andrew

