Re: Character Encoding problems

2001-12-07 Thread Nikola Milutinovic

Ivo Panacek wrote:

>> Hmmm, since I'm in a JSP page and not Servlet, this should be done 
>> automagically. When I run Jasper on the "test.jsp" file, this is what I 
> 
> 
> Aha. I use JSP only too and it works for me.
> My configuration is:
> 
> RedHat 7.1 + Apache + jdk 1.3 (blackdown) + Tomcat 4.0.1 + webapp module.
> 
> This was my testing page:
> 
> <%--
> <%@ page contentType="text/html;charset=ISO-8859-2" %>
> <%@ page pageEncoding="ISO-8859-2" %>
> --%>
> <%@ page contentType="text/html;charset=UTF-8" %>
> <%@ page pageEncoding="UTF-8" %>
> 
> <%@ page import="HelloBean3" %>
> 
> 
> 
> 
> 
> mojeAhojFazole3
> 
> 
> 
> Gratuluji, tohle opravdu je funkční JSP aplikace,
> která dokonce něco dělá.
> 
> 
> ... form testing


Similar setup here. I just put all the "page" attributes into one directive, and it 
sets everything it should in the resulting Java file.

The trouble is, I've tested a Servlet, too. Again, a "failure to communicate" 
occurs - I get "?". The same ones I get when I retrieve data via JDBC and 
display it with default JVM encoding.


> Both UTF-8 and ISO-8859-2 versions worked well,
> tested with Mozilla (0.9.x) on Linux and IE on Windows.


They should.


> ISO-8859-2 characters were in html text, but relevant
> part of resulting java is:
> 
> out.write("\r\n\r\n\r\nmojeAhojFazole3\r\n bgcolor=\"white\">\r\n\r\n\r\nGratuluji, tohle opravdu je funkÄnA(­ 
> JSP aplikace,\r\nkterA(A; dokonce nÄ?co 
> dÄ?lA(A;.\r\n\r\n\r\n\r\n");
> 
> ... so simple out.write, text is in default UTF-8.


So, Jasper transformed ISO-8859-2 to UTF-8? It never did that for me...


> Problem was only with input from forms. Testing showed that the
> browser does not send the encoding in the MIME headers, but it uses
> the same encoding as the original page. So I use a simple
> filter (found via this mailing list; the name was
> SetCharacterEncodingFilter.java, and I can send it)
> and now I have no problems.


I still haven't come that far :-<


>> %>
>> <%!
>> String testText;
>> %>
>> <%
>> testText = "\uC5A0 \uC5A1 \uC486 \uC487 \uC48C \uC48D \uC490 \uC491 \uC5BD \uC5BE";
>> %>
> 
> Just one idea:
> what are those unicode characters?


Those are Unicode characters for: S-caron, s-caron, C-acute, c-acute, C-caron, 
c-caron, D-slash, d-slash, Z-caron and z-caron.
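
For what it is worth, the escapes in the test string look like the UTF-8 byte pairs of those letters (C5 A0 for S-caron, and so on) written as single Unicode escapes, rather than their code points (U+0160, U+0161, ...). If so, the string really holds ten unrelated characters from the Hangul block, which have no ISO-8859-2 mapping, and that alone would produce "?". The sketch below is only a way to check this assumption: the class and method names are made up, and it uses java.nio.charset, which needs a newer JDK than the 1.3 mentioned in this thread.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not from the thread: compares the posted escapes
// with the real code points of the same ten letters.
final class CaronEscapes {
    // As posted: each escape is the letter's UTF-8 byte pair read as one code unit.
    static final String AS_POSTED =
        "\uC5A0 \uC5A1 \uC486 \uC487 \uC48C \uC48D \uC490 \uC491 \uC5BD \uC5BE";

    // Assumed intent: the code points of S/s-caron, C/c-acute,
    // C/c-caron, D/d-stroke and Z/z-caron.
    static final String CODE_POINTS =
        "\u0160 \u0161 \u0106 \u0107 \u010C \u010D \u0110 \u0111 \u017D \u017E";

    // Encodes text in the named charset and renders the bytes as hex pairs.
    static String encodeHex(String text, String charsetName) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(Charset.forName(charsetName))) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeHex(AS_POSTED, "ISO-8859-2"));
        System.out.println(encodeHex(CODE_POINTS, "ISO-8859-2"));
        System.out.println(encodeHex(CODE_POINTS, "UTF-8"));
    }
}
```

The first line should contain nothing but 3F ("?") and 20 (space), the second one distinct Latin-2 byte per letter, and the third the very byte pairs that appear in the posted escapes.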


> I've just tested this code:
> 
>public static void main( String args[] ) {
>String text1 = "\uC5A0 \uC5A1 \uC486 \uC487 \uC48C \uC48D \uC490 \uC491 \uC5BD \uC5BE";
>String text2 = "ěščřžýáíé";
> 
>try {
>PrintWriter wr = new PrintWriter(
>  new OutputStreamWriter( System.out, "8859_2" ) );
>wr.println("\""+text1+"\"");
>wr.println("\""+text2+"\"");
>wr.close();
>}
>catch( Exception e ) {
>System.err.println("Error: "+e);
>}
>}
> }
> 
> 
> And result is:
> 
> "? ? ? ? ? ? ? ? ? ?"
> "ěščřžýáíé"


The truth is, a JDBC driver expert from PostgreSQL said the only way to produce 
correct output with my test string was to use:

wr.write( text1.getBytes( "ISO-8859-2" ) );

I'm quite sure that I'm not messing anything up. Or if I am, then so is 
PostgreSQL. I entered Win-1250 data into a database which had an 
ISO-8859-2 internal encoding. I was using the correct client-to-server encoding and 
the data ended up as ISO-8859-2 formatted strings. Then I used JDBC to access the 
data and I really got UTF-8 with our alphabet-specific characters. Only 
converting that string to a byte array with a specific encoding would do the trick.

I'll try UTF-8 next time.

Nix.





--
To unsubscribe:   
For additional commands: 
Troubles with the list: 




Re: Character Encoding problems

2001-12-07 Thread Ivo Panacek

Nikola Milutinovic wrote:

> Ivo Panacek wrote:
> 
>>> response.setContentType("text/html; charset=ISO-8859-2");
>>>
>>> So, no trouble there. How do I get a (Unicode) string to convert to a 
>>> ISO-8859-2 encoded byte stream? Because, eventually, that is what the 
>>> browser should get. I cannot use the method from above, since 
>>> JspWriter doesn't accept byte[] as an argument.
>>
>>
>>
>> Java does it for you. If you retrieve output stream AFTER setting 
>> content type
>> with out = pageContext.getOut(); you write to out in UNICODE and output
>> is in ISO-8859-2.
> 
> 
> Hmmm, since I'm in a JSP page and not Servlet, this should be done 
> automagically. When I run Jasper on the "test.jsp" file, this is what I 

Aha. I use JSP only too and it works for me.
My configuration is:

RedHat 7.1 + Apache + jdk 1.3 (blackdown) + Tomcat 4.0.1 + webapp module.

This was my testing page:

<%--
<%@ page contentType="text/html;charset=ISO-8859-2" %>
<%@ page pageEncoding="ISO-8859-2" %>
--%>
<%@ page contentType="text/html;charset=UTF-8" %>
<%@ page pageEncoding="UTF-8" %>

<%@ page import="HelloBean3" %>





mojeAhojFazole3



Gratuluji, tohle opravdu je funkční JSP aplikace,
která dokonce něco dělá.


... form testing

Both UTF-8 and ISO-8859-2 versions worked well,
tested with Mozilla (0.9.x) on Linux and IE on Windows.

ISO-8859-2 characters were in html text, but relevant
part of resulting java is:

out.write("\r\n\r\n\r\nmojeAhojFazole3\r\n\r\n\r\n\r\nGratuluji, tohle opravdu je funkÄnA(­ JSP 
aplikace,\r\nkterA(A; dokonce nÄ?co dÄ?lA(A;.\r\n\r\n\r\n\r\n");

... so simple out.write, text is in default UTF-8.

Problem was only with input from forms. Testing showed that the
browser does not send the encoding in the MIME headers, but it uses
the same encoding as the original page. So I use a simple
filter (found via this mailing list; the name was
SetCharacterEncodingFilter.java, and I can send it)
and now I have no problems.
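
The form problem can be reproduced outside a container. The sketch below is not the filter mentioned above (its source is not shown in this thread); it only imitates, with java.net.URLEncoder and URLDecoder, what happens when a value submitted from an ISO-8859-2 page is decoded with the right charset and with ISO-8859-1. A filter of that kind would presumably call request.setCharacterEncoding(...) before any parameter is read, so that the container picks the first variant. Class and method names here are invented for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical illustration of form submission and decoding.
final class FormEncodingSketch {
    // Roughly what a browser sends for a field typed on a page in pageCharset.
    static String submit(String value, String pageCharset) {
        try {
            return URLEncoder.encode(value, pageCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // What the server gets back if it assumes assumedCharset for the request.
    static String receive(String wire, String assumedCharset) {
        try {
            return URLDecoder.decode(wire, assumedCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String typed = "funk\u010Dn\u00ED"; // the word from the test page
        String wire = submit(typed, "ISO-8859-2");
        System.out.println(wire);
        System.out.println(typed.equals(receive(wire, "ISO-8859-2")));
        System.out.println(typed.equals(receive(wire, "ISO-8859-1")));
    }
}
```

Only the decode that uses the page's own charset gives back the typed word; the ISO-8859-1 decode silently turns the c-caron into a different letter.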


> %>
> <%!
> String testText;
> %>
> <%
> testText = "\uC5A0 \uC5A1 \uC486 \uC487 \uC48C \uC48D \uC490 \uC491 \uC5BD \uC5BE";
> %>



Just one idea:
What are those Unicode characters?
I've just tested this code:

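import java.io.OutputStreamWriter;
import java.io.PrintWriter;

// The class declaration was lost from the post; the name "Test" is assumed.
public class Test {
    public static void main( String args[] ) {
        String text1 = "\uC5A0 \uC5A1 \uC486 \uC487 \uC48C \uC48D \uC490 \uC491 \uC5BD \uC5BE";
        String text2 = "ěščřžýáíé";

        try {
            // Write to the console as ISO-8859-2 ("8859_2" is an alias for it)
            PrintWriter wr = new PrintWriter(
                new OutputStreamWriter( System.out, "8859_2" ) );
            wr.println("\""+text1+"\"");
            wr.println("\""+text2+"\"");
            wr.close();
        }
        catch( Exception e ) {
            System.err.println("Error: "+e);
        }
    }
}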


And result is:

"? ? ? ? ? ? ? ? ? ?"
"ěščřžýáíé"

ivo
-- 
E-mail: [EMAIL PROTECTED], [EMAIL PROTECTED]
WWW:http://ivop.regionet.cz
Mobile: +420 602 337776






Re: Character Encoding problems 2

2001-12-06 Thread Gregor Kovač

Hi!

Nikola Milutinovic wrote:

I have had similar problem with Cp1250 encoding(Tomcat and MySQL). You 
have to have in mind this was not done on Tomcat 4.x, but 3.x.
This is what I have done:
- <%@page contentType="text/html; charset=windows-1250"%> on top of 
every JSP file


>>>I don't think that is a correct character encoding as far as Java is concerned. I 
>think Java supports only ISO-8859-* and UTF-*. Please correct me if I'm wrong
>>>
>>>
>>
>>I'm sorry, but you are wrong. You can convert between numerous 
>>encodings, but you have to have i18n.jar in your classpath.
>>
> 
> Hmm, I thought that Java community loathed anything but ISO, where can I find 
>i18n.jar? I'll look for it on Sun's site, but if it is not there, drop me a line.
> 

You can find it in the jre/lib directory of your JDK installation.


> 
>>>What I'm looking for is a "politically correct" solution. I have so far:
>>>
>>>- PostgreSQL with one Unicode and one ISO-8859-2 databases, both with the same data 
>in correct form.
>>>- JDBC driver which is acting OK.
>>>- JSP pages with correctly set pageEncoding
>>>- Java Servlet with correctly set contentType/encoding
>>>
>>>Still, Tomcat goes for default charset encoding and screws up Latin-2 characters.
>>>
>>>
>>Have you tried putting %@page contentType="text/html; 
>>charset=iso8859-2"%> on top of your JSP's ?
>>
> 
> Always. And that is what is driving me crazy. I have even tested what is the 
>character encoding of the ServletResponse object - it was OK, ISO-8859-2. The truth 
>is I'm running 4.0.1 and I have been looking at sources for 4.0. I'll test 4.0 and if 
>it displays characters correctly, there's gonna be a bug report.
> 
> Nix.
> 

Best regards,
Kovi






Re: Character Encoding problems 2

2001-12-05 Thread Nikola Milutinovic

> >>I have had similar problem with Cp1250 encoding(Tomcat and MySQL). You 
> >>have to have in mind this was not done on Tomcat 4.x, but 3.x.
> >>This is what I have done:
> >>- <%@page contentType="text/html; charset=windows-1250"%> on top of 
> >>every JSP file
> >>
> > 
> > I don't think that is a correct character encoding as far as Java is concerned. I 
>think Java supports only ISO-8859-* and UTF-*. Please correct me if I'm wrong
> > 
> 
> 
> I'm sorry, but you are wrong. You can convert between numerous 
> encodings, but you have to have i18n.jar in your classpath.

Hmm, I thought that the Java community loathed anything but ISO. Where can I find 
i18n.jar? I'll look for it on Sun's site, but if it is not there, drop me a line.

> > What I'm looking for is a "politically correct" solution. I have so far:
> > 
> > - PostgreSQL with one Unicode and one ISO-8859-2 databases, both with the same 
>data in correct form.
> > - JDBC driver which is acting OK.
> > - JSP pages with correctly set pageEncoding
> > - Java Servlet with correctly set contentType/encoding
> > 
> > Still, Tomcat goes for default charset encoding and screws up Latin-2 characters.
> > 
> 
> Have you tried putting %@page contentType="text/html; 
> charset=iso8859-2"%> on top of your JSP's ?

Always. And that is what is driving me crazy. I have even checked the character 
encoding of the ServletResponse object, and it was OK: ISO-8859-2. The truth is I'm 
running 4.0.1, but I have been looking at the sources for 4.0. I'll test 4.0 and if it 
displays the characters correctly, there's going to be a bug report.

Nix.



Re: Character Encoding problems 2

2001-12-05 Thread Gregor Kovač

Hi!

Nikola Milutinovic wrote:

>>I have had similar problem with Cp1250 encoding(Tomcat and MySQL). You 
>>have to have in mind this was not done on Tomcat 4.x, but 3.x.
>>This is what I have done:
>>- <%@page contentType="text/html; charset=windows-1250"%> on top of 
>>every JSP file
>>
> 
> I don't think that is a correct character encoding as far as Java is concerned. I 
>think Java supports only ISO-8859-* and UTF-*. Please correct me if I'm wrong
> 


I'm sorry, but you are wrong. You can convert between numerous 
encodings, but you need to have i18n.jar in your classpath.


> 
>>- default_character_set=latin2 in my.cnf
>>
> 
> Is there a way to set default character encoding for Tomcat? Setting LOCALE on Unix?
> 


Hmm, I wouldn't know. Sorry.


> 
>>- created new database so it gets created in latin2 character set
>>
> 
> Done that with PostgreSQL.
> 
> 
>>- when I connected to MySQL I was using mm.mysql driver and the database 
>>URL was 
>>jdbc:mysql://hostname:port/database?characterEncoding=Cp1250&useUnicode=true
>>
> 
> I've never used MySQL, just PostgreSQL. So, the database is ISO-8859-2 and this 
>converts it to CP-1250, which goes by as Latin-1, as far as Tomcat is concerned.
> 
> I have had a similar "success" with my setup: the database was Latin-1, the data in 
>it was win-1250 and when I forced JDBC connection to Latin-1 charset, it would pass 
>through JSP. But that is such a hack...
> 
> 
>>Then all characters were correctly displayed on JSP pages.
>>
> 
> What I'm looking for is a "politically correct" solution. I have so far:
> 
> - PostgreSQL with one Unicode and one ISO-8859-2 databases, both with the same data 
>in correct form.
> - JDBC driver which is acting OK.
> - JSP pages with correctly set pageEncoding
> - Java Servlet with correctly set contentType/encoding
> 
> Still, Tomcat goes for default charset encoding and screws up Latin-2 characters.
> 

Have you tried putting %@page contentType="text/html; 
charset=iso8859-2"%> on top of your JSP's ?


> Any help?
> 
> Nix.
> 


Best regards,
Kovi









Re: Character Encoding problems

2001-12-05 Thread dimiter

I was in the same situation. But look at the attached file and you'll have a
way to set any kind of character set.

Dimiter



NS.zip
Description: application/compressed



Re: Character Encoding problems 2

2001-12-05 Thread Martin Fekete

I had some problems with encoding too, but mine showed up when I
submitted data from forms (the page was in cp1250, the submitted data were
iso-8859-?). When I submitted and wrote to the DB, the characters were wrong.
The solution was to add a filter which sets the encoding of each request.

More here:
http://marc.theaimsgroup.com/?l=tomcat-user&m=100679292919360&w=2

feky

- Original Message -
From: "Nikola Milutinovic" <[EMAIL PROTECTED]>
To: "Tomcat Users List" <[EMAIL PROTECTED]>
Sent: Wednesday, December 05, 2001 2:19 PM
Subject: Re: Character Encoding problems 2


> > I have had similar problem with Cp1250 encoding(Tomcat and MySQL). You
> > have to have in mind this was not done on Tomcat 4.x, but 3.x.
> > This is what I have done:
> > - <%@page contentType="text/html; charset=windows-1250"%> on top of
> > every JSP file
>
> I don't think that is a correct character encoding as far as Java is
concerned. I think Java supports only ISO-8859-* and UTF-*. Please correct
me if I'm wrong
>
> > - default_character_set=latin2 in my.cnf
>
> Is there a way to set default character encoding for Tomcat? Setting
LOCALE on Unix?
>
> > - created new database so it gets created in latin2 character set
>
> Done that with PostgreSQL.
>
> > - when I connected to MySQL I was using mm.mysql driver and the database
> > URL was
> >
jdbc:mysql://hostname:port/database?characterEncoding=Cp1250&useUnicode=true
>
> I've never used MySQL, just PostgreSQL. So, the database is ISO-8859-2 and
this converts it to CP-1250, which goes by as Latin-1, as far as Tomcat is
concerned.
>
> I have had a similar "success" with my setup: the database was Latin-1,
the data in it was win-1250 and when I forced JDBC connection to Latin-1
charset, it would pass through JSP. But that is such a hack...
>
> > Then all characters were correctly displayed on JSP pages.
>
> What I'm looking for is a "politically correct" solution. I have so far:
>
> - PostgreSQL with one Unicode and one ISO-8859-2 databases, both with the
same data in correct form.
> - JDBC driver which is acting OK.
> - JSP pages with correctly set pageEncoding
> - Java Servlet with correctly set contentType/encoding
>
> Still, Tomcat goes for default charset encoding and screws up Latin-2
characters.
>
> Any help?
>
> Nix.
>







Re: Character Encoding problems 2

2001-12-05 Thread Nikola Milutinovic

> I have had similar problem with Cp1250 encoding(Tomcat and MySQL). You 
> have to have in mind this was not done on Tomcat 4.x, but 3.x.
> This is what I have done:
> - <%@page contentType="text/html; charset=windows-1250"%> on top of 
> every JSP file

I don't think that is a correct character encoding as far as Java is concerned. I 
think Java supports only ISO-8859-* and UTF-*. Please correct me if I'm wrong

> - default_character_set=latin2 in my.cnf

Is there a way to set the default character encoding for Tomcat? By setting the locale on Unix?

> - created new database so it gets created in latin2 character set

Done that with PostgreSQL.

> - when I connected to MySQL I was using mm.mysql driver and the database 
> URL was 
> jdbc:mysql://hostname:port/database?characterEncoding=Cp1250&useUnicode=true

I've never used MySQL, just PostgreSQL. So, the database is ISO-8859-2 and this 
converts it to CP-1250, which goes by as Latin-1, as far as Tomcat is concerned.

I have had a similar "success" with my setup: the database was Latin-1, the data in it 
was win-1250 and when I forced JDBC connection to Latin-1 charset, it would pass 
through JSP. But that is such a hack...
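
Why that hack appears to work can be shown in a few lines. ISO-8859-1 maps each of the 256 byte values to exactly one character and back, so bytes that are really Win-1250 or Latin-2 survive a decode and re-encode untouched, even though the Java string in between holds the wrong characters. The sketch below assumes a JDK with java.nio.charset.StandardCharsets (Java 7 or later), and its class and method names are invented for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of the "label everything Latin-1" hack.
final class Latin1PassThrough {
    // What the application sees when the bytes are declared to be Latin-1.
    static String asLatin1(byte[] stored) {
        return new String(stored, StandardCharsets.ISO_8859_1);
    }

    // Decode as Latin-1, then encode as Latin-1 again on the way out.
    static byte[] passThrough(byte[] stored) {
        return asLatin1(stored).getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // S-caron and s-caron as ISO-8859-2 bytes.
        byte[] stored = {(byte) 0xA9, (byte) 0xB9};
        String inJava = asLatin1(stored);
        // In memory these are the copyright sign and superscript one.
        System.out.println(Integer.toHexString(inJava.charAt(0)));
        System.out.println(Integer.toHexString(inJava.charAt(1)));
        // The bytes that leave the application are unchanged.
        System.out.println(Arrays.equals(stored, passThrough(stored)));
    }
}
```

The bytes come out intact, but anything done with the string inside Java (sorting, case conversion, comparison) operates on the wrong characters, which is why it is fair to call this a hack.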

> Then all characters were correctly displayed on JSP pages.

What I'm looking for is a "politically correct" solution. I have so far:

- PostgreSQL with one Unicode and one ISO-8859-2 databases, both with the same data in 
correct form.
- JDBC driver which is acting OK.
- JSP pages with correctly set pageEncoding
- Java Servlet with correctly set contentType/encoding

Still, Tomcat falls back to the default charset encoding and screws up Latin-2 characters.

Any help?

Nix.



Re: Character Encoding problems 2

2001-12-05 Thread Gregor Kovač

Hi!

I have had a similar problem with Cp1250 encoding (Tomcat and MySQL). Keep 
in mind that this was done on Tomcat 3.x, not 4.x.
This is what I have done:
- <%@page contentType="text/html; charset=windows-1250"%> on top of 
every JSP file
- default_character_set=latin2 in my.cnf
- created new database so it gets created in latin2 character set
- when I connected to MySQL I was using mm.mysql driver and the database 
URL was 
jdbc:mysql://hostname:port/database?characterEncoding=Cp1250&useUnicode=true

Then all characters were correctly displayed on JSP pages.

I hope this helps.

Best regards,
Kovi

Nikola Milutinovic wrote:

> Hi all.
> 
> It's me again and troubles are not resolved. I've created a simple test servlet:
> 
> 
> import javax.servlet.*;
> import java.io.*;
> 
> public class TestServlet extends GenericServlet {
>   private static final String testText = "\uC5A0 \uC5A1 \uC486 \uC487 \uC48C \uC48D 
>\uC490 \uC491 \uC5BD \uC5BE";
>   PrintWriter out;
> 
>   public void service( ServletRequest req, ServletResponse res )
> throws javax.servlet.ServletException, java.io.IOException
> {
> res.setContentType("text/html; charset=ISO-8859-2");
> out = res.getWriter();
> out.print( "\r\nTest servlet\r\n" );
> out.print( "charset=iso-8859-2\">\r\n\r\n" );
> out.print( "\r\nTest\r\nLet us see how this gets 
>out\r\n\r\n" );
> out.print( testText );
> out.print( "\r\n\r\n" );
>   }
> }
> 
> 
> This prints "?" instead of characters. The string in question prints desired 
>characters in an ordinary Java application.
> 
> QUESTION 1
> ---
> How can I get Tomcat to honour "charset=ISO-8859-2"?
> 
> QUESTION 2
> ---
> What about static HTML? Suppose I should enter a part of static HTML data in Latin-2 
>encoding. That translates to a string. A string is supposed to be Unicode. Do those 
>strings get translated from "pageEncoding" to Unicode?
> 
> Nix.
> 







Character Encoding problems 2

2001-12-05 Thread Nikola Milutinovic

Hi all.

It's me again and troubles are not resolved. I've created a simple test servlet:


import javax.servlet.*;
import java.io.*;

public class TestServlet extends GenericServlet {
  private static final String testText = "\uC5A0 \uC5A1 \uC486 \uC487 \uC48C \uC48D \uC490 \uC491 \uC5BD \uC5BE";
  PrintWriter out;

  public void service( ServletRequest req, ServletResponse res )
    throws javax.servlet.ServletException, java.io.IOException
  {
    res.setContentType("text/html; charset=ISO-8859-2");
    out = res.getWriter();
    out.print( "\r\nTest servlet\r\n" );
    out.print( "\r\n\r\n" );
    out.print( "\r\nTest\r\nLet us see how this gets out\r\n\r\n" );
    out.print( testText );
    out.print( "\r\n\r\n" );
  }
}


This prints "?" instead of characters. The string in question prints desired 
characters in an ordinary Java application.

QUESTION 1
---
How can I get Tomcat to honour "charset=ISO-8859-2"?

QUESTION 2
---
What about static HTML? Suppose I should enter a part of static HTML data in Latin-2 
encoding. That translates to a string. A string is supposed to be Unicode. Do those 
strings get translated from "pageEncoding" to Unicode?
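
On question 2: as far as the JSP specification goes, pageEncoding names the charset in which the container reads the page source, so static text should become ordinary Unicode strings in the generated Java; whether Tomcat 4.0.1 actually does this is not established in this thread. The decoding step itself amounts to the sketch below, where the class and method names are hypothetical and the byte values stand in for page text saved as Latin-2.

```java
import java.nio.charset.Charset;

// Hypothetical illustration: raw page bytes become a Unicode String
// only by way of a charset, and the result depends on which one is used.
final class PageEncodingSketch {
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    // Renders each char as a four-digit hex code unit, for unambiguous output.
    static String codeUnits(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) text.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "S-caron, space, s-caron" as saved by a Latin-2 editor.
        byte[] pageBytes = {(byte) 0xA9, 0x20, (byte) 0xB9};
        System.out.println(codeUnits(decode(pageBytes, "ISO-8859-2")));
        System.out.println(codeUnits(decode(pageBytes, "ISO-8859-1")));
    }
}
```

Read as ISO-8859-2, the bytes give the code points 0160 and 0161; read as ISO-8859-1 they give 00A9 and 00B9, which is the kind of silent substitution to expect if the page encoding is ignored.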

Nix.



Re: Character Encoding problems

2001-12-04 Thread Nikola Milutinovic

Marcin Jaskula wrote:

>>So, no trouble there. How do I get a (Unicode) string to convert
>>to a ISO-8859-2 encoded byte stream? Because, eventually, that is
>>what the browser should get. I cannot use the method from above,
>>since JspWriter doesn't accept byte[] as an argument.


> I managed it by:
> 1. The page with the input form MUST have encoding=iso-8859-2
> 
> 2. The DB encoding LATIN-2
> 3. In the source of the jsp page:
> <%@ page
> contentType="text/html; charset=iso-8859-2"
> 
> %>
> request.setCharacterEncoding("iso-8859-2"); // just after <%@page !! before you
>                                             // read any argument from request
> 
> and in the http headers:
> 
> // ^ it looks unnecessary but otherwise it doesn't work ???

Still no go. I'm not even trying to POST anything yet - just GET the 
data from a database. I know for sure that the data is OK and that JDBC 
is giving a correct string. The problem is in Tomcat using the (system?) 
default character encoding ISO-8859-1.

Would changing the locale settings of the OS make a difference? Is this a 
JVM or a Tomcat issue? I think Tomcat should convert the characters.

Nix.






Re: Character Encoding problems

2001-12-04 Thread Nikola Milutinovic

Marcin Jaskula wrote:

> 
> I used to have similar problem.
> I'm using Tomcat 4.0 with PostgreSQL, DB encoding LATIN-2.
> I couldn't get proper characters from the page source, an input form (post
> method) and from the DB.
> I managed it by:
> 1. The page with the input form MUST have encoding=iso-8859-2
> 
> 2. The DB encoding LATIN-2
> 3. In the source of the jsp page:
> <%@ page
> contentType="text/html; charset=iso-8859-2"
> 
> %>
> request.setCharacterEncoding("iso-8859-2"); // just after <%@page !! before you
>                                             // read any argument from request
> 
> and in the http headers:
> 
> // ^ it looks unnecessary but otherwise it doesn't work ???

Thanks.

Tomcat developers... any comment?

Nix.






Re: Character Encoding problems

2001-12-04 Thread Nikola Milutinovic

Ivo Panacek wrote:

>> response.setContentType("text/html; charset=ISO-8859-2");
>>
>> So, no trouble there. How do I get a (Unicode) string to convert to a 
>> ISO-8859-2 encoded byte stream? Because, eventually, that is what the 
>> browser should get. I cannot use the method from above, since 
>> JspWriter doesn't accept byte[] as an argument.
> 
> 
> Java does it for you. If you retrieve output stream AFTER setting 
> content type
> with out = pageContext.getOut(); you write to out in UNICODE and output
> is in ISO-8859-2.

Hmmm, since I'm in a JSP page and not Servlet, this should be done 
automagically. When I run Jasper on the "test.jsp" file, this is what I get:

TEST.JSP

<%@ page
   info="A test page"
   contentType="text/html; charset=ISO-8859-2"
%>
<%!
String testText;
%>
<%
testText = "\uC5A0 \uC5A1 \uC486 \uC487 \uC48C \uC48D \uC490 \uC491 \uC5BD \uC5BE";
%>


Test page




Test page for encoding issues
And this is generated: <%= testText %>
And this is from a scriptlet: <% out.write( testText.getBytes("LATIN2") ); %>
Nix.



TEST.JAVA
-

[SNIP]
response.setContentType("text/html; charset=ISO-8859-2");
pageContext = _jspxFactory.getPageContext(this, request, response,
 "", true, 8192, true);

application = pageContext.getServletContext();
config = pageContext.getServletConfig();
session = pageContext.getSession();
out = pageContext.getOut();
[SNIP]

---

So, the "out" object IS obtained AFTER the content type has been set. Still, 
both <%= testText %> and the scriptlet produce the same result - "?" instead of 
Latin-2 characters.

It might be a JVM issue, and I'll look into the possible options for setting 
the default output encoding tomorrow.

BTW, I'd say you're using LATIN-2, so how do browsers handle it? I have tried 
to create a static HTML page with what I believe is Latin-2 text (I have the 
right keyboard mapping). The result is frustrating: all characters are there 
except small s-caron and small z-caron (capitals are OK). Any idea?

Nix.






Re: Character Encoding problems

2001-12-04 Thread Ivo Panacek

> response.setContentType("text/html; charset=ISO-8859-2");
> 
> So, no trouble there. How do I get a (Unicode) string to convert to a ISO-8859-2 
>encoded byte stream? Because, eventually, that is what the browser should get. I 
>cannot use the method from above, since JspWriter doesn't accept byte[] as an 
>argument.

Java does it for you. If you retrieve the output stream AFTER setting the content type
(with out = pageContext.getOut();), you write to out in Unicode and the output
is in ISO-8859-2.
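
The same mechanism can be tried without a container. In the sketch below an OutputStreamWriter over a byte buffer stands in for the response writer; that substitution is an assumption, and nothing here is Tomcat code. The names are invented, and the Charset-based constructor needs JDK 1.4 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical stand-in for a response writer bound to a charset.
final class WriterEncodingSketch {
    // Writes Unicode text through a Writer and returns the bytes it emitted.
    static byte[] writeWith(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(sink, Charset.forName(charsetName))) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String sCaron = "\u0160";
        // With a Latin-2 writer the character maps to a single byte.
        System.out.println(Integer.toHexString(writeWith(sCaron, "ISO-8859-2")[0] & 0xFF));
        // With a Latin-1 writer it cannot be represented and is replaced.
        System.out.println(Integer.toHexString(writeWith(sCaron, "ISO-8859-1")[0] & 0xFF));
    }
}
```

If the writer is created with ISO-8859-1 (for example because it was obtained before the charset was known), a correct Unicode string still comes out as "?", which matches the symptom reported in this thread.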

ivo
-- 
E-mail: [EMAIL PROTECTED], [EMAIL PROTECTED]
WWW:http://ivop.regionet.cz
Mobile: +420 602 337776






RE: Character Encoding problems

2001-12-04 Thread Marcin Jaskula

> I'm developing a web application that uses textual data for
> Central&Eastern Europe. The text is in a database which is
> internally UNICODE. I also have another DB instance which is
> ISO-8859-2 encoded, so all options are in the play.
>
> I thought I should set contentType="text/html;
> charset=ISO-8859-2", declare a page to the same (just in case)
> and "sit back and enjoy myself". Unfortunately, I was wrong.
>
> Not only is the Latin-2 support in both IE and Netscape buggy
> (they wouldn't display "s-caron" and "z-caron", but would display
> Caps versions of those characters), but Java is bugging me, too.
> Instead of letters specific to our alphabet, I'm getting "?".
>
> With the help of a dedicated PostgreSQL JDBC developer, I have
> tracked this problem down to JVM, which has a default encoding of
> "ISO-8859-1". In a standalone Java application I can do explicit
> encoding, like this:
>
> System.out.write( testString.getBytes( "ISO-8859-2" ) );
>
> and it will print the characters I expect, instead of "?".
>
> What do I do in Tomcat?
>
> I have set contentType to "text/html; charset=ISO-8859-2" and in
> a generated Servlet code it really has:
>
> response.setContentType("text/html; charset=ISO-8859-2");
>
> So, no trouble there. How do I get a (Unicode) string to convert
> to a ISO-8859-2 encoded byte stream? Because, eventually, that is
> what the browser should get. I cannot use the method from above,
> since JspWriter doesn't accept byte[] as an argument.

Hi

I used to have a similar problem.
I'm using Tomcat 4.0 with PostgreSQL, DB encoding LATIN-2.
I couldn't get proper characters from the page source, from an input form (post
method) or from the DB.
I solved it by:
1. The page with the input form MUST have encoding=iso-8859-2

2. The DB encoding LATIN-2
3. In the source of the jsp page:
<%@ page
contentType="text/html; charset=iso-8859-2"

%>
request.setCharacterEncoding("iso-8859-2"); // just after <%@page !! before you
                                            // read any argument from request

and in the http headers:

// ^ it looks unnecessary but otherwise it doesn't work ???

All the best
Marcin






Character Encoding problems

2001-12-04 Thread Nikola Milutinovic

Hi all.

I'm developing a web application that uses textual data for Central & Eastern Europe. 
The text is in a database which is internally UNICODE. I also have another DB instance 
which is ISO-8859-2 encoded, so all options are in play.

I thought I should set contentType="text/html; charset=ISO-8859-2", declare the page 
encoding the same way (just in case) and "sit back and enjoy myself". Unfortunately, I was wrong.

Not only is the Latin-2 support in both IE and Netscape buggy (they wouldn't display 
"s-caron" and "z-caron", but would display Caps versions of those characters), but 
Java is bugging me, too. Instead of letters specific to our alphabet, I'm getting "?".

With the help of a dedicated PostgreSQL JDBC developer, I have tracked this problem 
down to the JVM, which has a default encoding of "ISO-8859-1". In a standalone Java 
application I can do explicit encoding, like this:

System.out.write( testString.getBytes( "ISO-8859-2" ) );

and it will print the characters I expect, instead of "?".
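
To see which default a given JVM is using, and how it differs from an explicit encoding, something along these lines may help. It is a sketch under assumptions: Charset.defaultCharset() only exists from Java 5 onward (on JDK 1.3 the file.encoding system property is the thing to look at), and the class and method names are invented.

```java
import java.nio.charset.Charset;

// Hypothetical check of default versus explicit encoding.
final class DefaultEncodingSketch {
    // Uses whatever charset the JVM picked up from its environment.
    static byte[] encodeWithDefault(String text) {
        return text.getBytes();
    }

    // Uses the charset named by the caller, independent of the environment.
    static byte[] encodeExplicit(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        System.out.println("file.encoding = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        String sCaron = "\u0161"; // small s-caron
        byte[] byDefault = encodeWithDefault(sCaron);
        byte[] explicit = encodeExplicit(sCaron, "ISO-8859-2");
        System.out.println("default encoding gives " + byDefault.length + " byte(s)");
        System.out.println("ISO-8859-2 gives 0x" + Integer.toHexString(explicit[0] & 0xFF));
    }
}
```

The explicit call always yields the single Latin-2 byte 0xB9 for small s-caron; what the default call yields depends entirely on the environment the JVM was started in.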

What do I do in Tomcat?

I have set contentType to "text/html; charset=ISO-8859-2" and in a generated Servlet 
code it really has:

response.setContentType("text/html; charset=ISO-8859-2");

So, no trouble there. How do I get a (Unicode) string to convert to an ISO-8859-2 
encoded byte stream? Because, eventually, that is what the browser should get. I 
cannot use the method from above, since JspWriter doesn't accept byte[] as an 
argument.

Nix.