Re: encoding problems with Jakarta-site

2001-03-17 Thread Robert Burrell Donkin

here's a fix which works for me

Jon Stevens wrote:

 Here is the little test program:

 import java.io.*;
 import java.lang.*;
 import java.util.*;
 import org.jdom.*;
 import org.jdom.input.*;
 import org.jdom.output.*;

 public class Test
 {
     public static void main (String[] args)
     {
         try
         {
             Document d = new SAXBuilder().build(args[0]);
             XMLOutputter outp = new XMLOutputter("", false);
             outp.setEncoding("ISO-8859-1");
             FileWriter fw = new FileWriter("test.html");

FileWriter uses the system default encoding, which on your machine is MacRoman
rather than ISO-8859-1 (Latin-1). Roll your own "ISO-8859-1" FileWriter:

OutputStreamWriter fw = new OutputStreamWriter(
    new FileOutputStream("test.html"), "ISO-8859-1");

             outp.output(d, fw);
             fw.close();
         }
         catch (Exception e)
         {
         }
     }
 }
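
To check the fix without JDOM in the picture, here is a minimal standalone sketch. The class name Latin1WriterDemo, the helper writeLatin1 and the file name latin1-demo.txt are made up for illustration and are not part of the test program above. It writes the name through an OutputStreamWriter with an explicit "ISO-8859-1" charset and then reads the raw bytes back.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Paths;

public class Latin1WriterDemo {
    // Hypothetical helper: writes text as ISO-8859-1 regardless of the
    // platform default encoding.
    static void writeLatin1(String fileName, String text) throws IOException {
        try (Writer w = new OutputStreamWriter(
                new FileOutputStream(fileName), "ISO-8859-1")) {
            w.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // \u00fc is u-umlaut; the escape keeps this source file pure ASCII.
        String name = "Ceki G\u00fclc\u00fc";
        writeLatin1("latin1-demo.txt", name);

        // Read the raw bytes back and show the byte written for the first u-umlaut.
        byte[] bytes = Files.readAllBytes(Paths.get("latin1-demo.txt"));
        System.out.println(bytes.length + " bytes; byte 6 = 0x"
                + Integer.toHexString(bytes[6] & 0xff));
        // prints "10 bytes; byte 6 = 0xfc"
    }
}
```

With a plain FileWriter the byte at that position depends on the platform default, which is the behaviour described above.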

- robert



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




Re: encoding problems with Jakarta-site

2001-03-17 Thread Jon Stevens

on 3/17/01 10:02 AM, "Robert Burrell Donkin" [EMAIL PROTECTED]
wrote:

 FileWriter uses the system default encoding, which on your machine is MacRoman
 rather than ISO-8859-1 (Latin-1). Roll your own "ISO-8859-1" FileWriter:
 
 OutputStreamWriter fw = new OutputStreamWriter(
     new FileOutputStream("test.html"), "ISO-8859-1");

Ahh...ok...I will try that...

-jon

-- 
If you come from a Perl or PHP background, JSP is a way to take
your pain to new levels. --Anonymous
http://jakarta.apache.org/velocity/ymtd/ymtd.html






Re: encoding problems with Jakarta-site

2001-03-15 Thread Ceki Gülcü

At 22:12 14.03.2001 -0800, Jon Stevens wrote:
Ok,

I think this encoding stuff with Ceki's name is a bug in the OSX JVM that
I'm using, so we may have to revert back to using "u" for a bit until the
OSX GM is out and I can test/use that. :-( The weird thing is that I'm not
convinced that it is the OSX JVM though...here is why:

Here is the input file:

<p>
<b>Ceki G&#252;lc&#252;</b> (ceki at apache.org)
<br/>
Ceki is the founder of the log4j project. Time permitting, he also does
custom development for clients. See <a
href="http://www.qos.ch">www.qos.ch</a> for more info.
</p>

Here is the little test program:

import java.io.*;
import java.lang.*;
import java.util.*;
import org.jdom.*;
import org.jdom.input.*;
import org.jdom.output.*;

public class Test
{
    public static void main (String[] args)
    {
        try
        {
            Document d = new SAXBuilder().build(args[0]);
            XMLOutputter outp = new XMLOutputter("", false);
            outp.setEncoding("ISO-8859-1");
            FileWriter fw = new FileWriter("test.html");
            outp.output(d, fw);
            fw.close();
        }
        catch (Exception e)
        {
        }
    }
}

java Test input.txt

produces:

<?xml version="1.0" encoding="ISO-8859-1"?>
<p>
<b>Ceki Gülcü</b> (ceki at apache.org)
<br />
Ceki is the founder of the
log4j project. Time permitting, he also does
custom development for clients.
See <a href="http://www.qos.ch">www.qos.ch</a> for more info.
</p>

As you can see, Ceki's name is correctly shown. The weird thing is that if I
take the file and load it in my browser, the weird characters show up for
the "u"'s, not the correct ones...

Can you send the file produced by "java Test input.txt"?







Re: encoding problems with Jakarta-site

2001-03-15 Thread Ceki Gülcü


Sorry, my previous note was incomplete. I've been bitten by Eudora's CTRL-E syndrome.

At 22:12 14.03.2001 -0800, Jon Stevens wrote:

java Test input.txt

produces:

<?xml version="1.0" encoding="ISO-8859-1"?>
<p>
<b>Ceki Gülcü</b> (ceki at apache.org)
<br />
Ceki is the founder of the
log4j project. Time permitting, he also does
custom development for clients.
See <a href="http://www.qos.ch">www.qos.ch</a> for more info.
</p>

As you can see, Ceki's name is correctly shown. The weird thing is that if I
take the file and load it in my browser, the weird characters show up for
the "u"'s, not the correct ones...

Could we look at the file produced by "java Test input.txt" as a binary? Could it be 
that your OS (not the JVM) is screwing things up? Cheers, Ceki
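
For looking at the file as a binary, a small hex dump is enough. This is only a sketch; the class name HexDump and the helper toHex are invented here, and the default file name test.html is taken from the test program quoted above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HexDump {
    // Formats bytes as two-digit lowercase hex, 16 per line, space separated.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                sb.append(i % 16 == 0 ? '\n' : ' ');
            }
            sb.append(String.format("%02x", data[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Dump the file named on the command line, or test.html by default.
        Path file = Paths.get(args.length > 0 ? args[0] : "test.html");
        if (!Files.exists(file)) {
            System.out.println("no such file: " + file);
            return;
        }
        System.out.println(toHex(Files.readAllBytes(file)));
    }
}
```

What to look for: in ISO-8859-1 the "ü" is the single byte fc, in MacRoman it is 9f, and in UTF-8 it is the two bytes c3 bc. If the dump shows 9f while the XML declaration says ISO-8859-1, the bytes and the declared encoding disagree, which would explain what the browser shows.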







Re: encoding problems with Jakarta-site

2001-03-14 Thread Sam Ruby

Jon Stevens wrote:

 I think this encoding stuff with Ceki's name is a bug in the
 OSX JVM that I'm using, so we may have to revert back to
 using "u" for a bit until the OSX GM is out and I can test/use
 that. :-( The weird thing is that I'm not convinced that it
 is the OSX JVM though...here is why:

;-)

I hereby propose that at the same time, we revert back to adding the "h"
back into John's name too.

;-)

Jon, have you ever considered doing your builds on dev.apache.org?

;-)

- Sam Ruby

P.S.  I personally think the right fix is to not use the default mechanisms
to serialize this to XML, but to write one that replaces characters outside
of the 0-127 range with HTML entities, e.g. &uuml;.
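
A rough sketch of that idea follows. One swap to note: it emits numeric character references (&#252;) rather than named entities like &uuml;, since named entities would need a lookup table. The class name EntityEscaper and the method escapeNonAscii are hypothetical, and this is not how the JDOM XMLOutputter works; it is a standalone illustration only.

```java
public class EntityEscaper {
    // Replaces every character outside the 0-127 range with a decimal
    // numeric character reference, so the result is pure ASCII.
    static String escapeNonAscii(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp > 127) {
                sb.append("&#").append(cp).append(';');
            } else {
                sb.appendCodePoint(cp);
            }
            // Step by code point so surrogate pairs become one reference.
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u00fc is u-umlaut; the escape keeps this source file pure ASCII.
        System.out.println(escapeNonAscii("Ceki G\u00fclc\u00fc"));
        // prints "Ceki G&#252;lc&#252;"
    }
}
```

Because the output is plain ASCII, it reads the same whether the file is later interpreted as ISO-8859-1, MacRoman or UTF-8, so the writer's encoding stops mattering for these characters.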

