Here's a patch that I think will work (I did some minimal testing and it worked out OK):

RCS file: /cvsroot/inline-java/Inline-Java/Java/Protocol.pm,v
retrieving revision 1.33
diff -r1.33 Protocol.pm
413c413
<       return join(".", unpack("C*", $s)) ;
---
>       return join(".", unpack("U*", $s)) ;
420c420
<       return pack("C*", split(/\./, $s)) ;
---
>       return pack("U*", split(/\./, $s)) ;

and

RCS file: /cvsroot/inline-java/Inline-Java/Java/sources/InlineJavaProtocol.java,v
retrieving revision 1.2
diff -r1.2 InlineJavaProtocol.java
614,615c614,615
<       byte b[] = {(byte)Integer.parseInt(ss)} ;
<       sb.append(new String(b)) ;
---
>       char c = (char)Integer.parseInt(ss) ;
>       sb.append(new String(new char [] {c})) ;
623c623,624
<       byte b[] = s.getBytes() ;
---
>       char c[] = new char[s.length()] ;
>       s.getChars(0, c.length, c, 0) ;
625c626
<       for (int i = 0 ; i < b.length ; i++){
---
>       for (int i = 0 ; i < c.length ; i++){
629c630,631
<       sb.append(String.valueOf(b[i])) ;
---
>       sb.append((int)c[i]) ;

Let me know how it turns out.

Patrick
---------------------
Patrick LeBoutillier
Laval, Quebec, Canada

----- Original Message -----
From: "Patrick LeBoutillier" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Thursday, June 05, 2003 8:28 AM
Subject: Re: Inline::Java and utf8

> Dave,
>
> It's possible that there is a problem here. Inline::Java uses a very simple
> (and somewhat inefficient) encoding to pass the data between Perl and Java.
>
> Here is the corresponding code:
>
> sub encode {
>     my $s = shift ;
>
>     return join(".", unpack("C*", $s)) ;
> }
>
> and
>
> String Decode(String s){
>     StringTokenizer st = new StringTokenizer(s, ".") ;
>     StringBuffer sb = new StringBuffer() ;
>     while (st.hasMoreTokens()){
>         String ss = st.nextToken() ;
>         byte b[] = {(byte)Integer.parseInt(ss)} ;
>         sb.append(new String(b)) ;
>     }
>
>
> It breaks up the string byte by byte and reconstructs it on the other side.
> It's probable that this doesn't work
> with multibyte characters since it's probably creating a character for each
> byte.
>
> If you have time to check this out and send me a patch that would be great,
> but I don't have the time currently to investigate this. I have no problem
> reviewing the encoding completely, I did like this to make sure I could
> implement the protocol line by line. Maybe only escaping the \n's would have
> been sufficient.
>
> Anyways comments/suggestions are welcome.
>
>
> ---------------------
> Patrick LeBoutillier
> Laval, Quebec, Canada
> ----- Original Message -----
> From: "Dave LaMacchia" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Wednesday, June 04, 2003 9:19 PM
> Subject: Inline::Java and utf8
>
>
> >
> > I'm working on some code that uses Inline::Java to parse user input in
> > order to make calls to a corba interface in front of an oracle
> > database.
> >
> > I found when I fetch utf8 data from the database, all is well
> > (assuming I've set my locale -- this is on Solaris 2.8 -- to
> > en_US.UTF-8). When I go the other way, however, passing data from
> > perl to Java via Inline, I get data corruption in the non-ASCII
> > characters.
> >
> > I thought that I might have to convert the strings to UCS2, since
> > that's what Java uses internally, but this results in java errors due
> > to embedded null characters.
> >
> > Has anyone run into this problem before? Any suggestions how to get
> > around it? I'm using perl 5.8 so I shouldn't have to insert a use
> > utf8 pragma. Note also that I've confirmed the data is correct in the
> > perl code before the embedded Java is called.
> >
> > Thanks!
> >
> > --dave
> >
>
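
For anyone following the thread, here is a minimal standalone sketch (not the Inline::Java source) of why the byte-by-byte decoding quoted above can mangle multibyte characters, and why sending character values instead round-trips. Everything named here is hypothetical: the class DotCodecSketch and its helper methods are made up for illustration. Two assumptions are worth stating. First, encodeUtf8Bytes only simulates what the Perl side would send if the string is held as UTF-8 bytes. Second, the byte-wise decoder pins ISO-8859-1 so the result is reproducible, whereas the original code relies on the platform default charset. The sketch also uses appendCodePoint rather than the (char) cast from the patch, so that code points above U+FFFF survive; that is a variation, not what the patch does.

```java
import java.nio.charset.StandardCharsets;
import java.util.StringTokenizer;

// Hypothetical illustration only; none of these names exist in Inline::Java.
final class DotCodecSketch {

    // Simulates a byte-oriented sender: one decimal number per UTF-8 byte.
    static String encodeUtf8Bytes(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) sb.append('.');
            sb.append(b & 0xFF); // unsigned byte value
        }
        return sb.toString();
    }

    // Byte-wise decode in the style of the pre-patch code: each number
    // becomes its own one-byte String, so multibyte sequences are split.
    // Assumption: ISO-8859-1 is pinned here for a deterministic result.
    static String decodeBytewise(String s) {
        StringTokenizer st = new StringTokenizer(s, ".");
        StringBuilder sb = new StringBuilder();
        while (st.hasMoreTokens()) {
            byte[] b = {(byte) Integer.parseInt(st.nextToken())};
            sb.append(new String(b, StandardCharsets.ISO_8859_1));
        }
        return sb.toString();
    }

    // Character-oriented encode: one decimal number per Unicode code point.
    static String encode(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) sb.append('.');
            sb.append(cp);
            i += Character.charCount(cp); // skip both halves of a surrogate pair
        }
        return sb.toString();
    }

    // Character-oriented decode: each number is appended as a code point.
    static String decode(String s) {
        StringTokenizer st = new StringTokenizer(s, ".");
        StringBuilder sb = new StringBuilder();
        while (st.hasMoreTokens()) {
            sb.appendCodePoint(Integer.parseInt(st.nextToken()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with e-acute (U+00E9)

        String byteWire = encodeUtf8Bytes(original);
        String mangled = decodeBytewise(byteWire);
        System.out.println(byteWire + " -> length " + mangled.length());

        String charWire = encode(original);
        String restored = decode(charWire);
        System.out.println(charWire + " -> equal: " + restored.equals(original));
    }
}
```

In the byte-wise path the two UTF-8 bytes of the accented character come back as two separate characters, which matches the corruption Dave describes for non-ASCII data. In the character-wise path the string comes back unchanged.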