Patrick,

Thanks for the patch -- I'd actually tried a similar change. When
tracing through the code, though, I found that the encode/decode
routines weren't even getting called, so I think the problem is also in
how I'm passing data between perl and java (currently all arguments are
encapsulated in an array that contains a hash table).

I'll let you know what happens when I get it fixed!

Thanks again,

dave

>>>>> "Patrick" == Patrick LeBoutillier <[EMAIL PROTECTED]> writes:

    Patrick> Here's a patch I think will work (I did some minimal
    Patrick> testing and it worked out ok) :

    Patrick> RCS file: /cvsroot/inline-java/Inline-Java/Java/Protocol.pm,v
    Patrick> retrieving revision 1.33
    Patrick> diff -r1.33 Protocol.pm
    Patrick> 413c413
    Patrick> < return join(".", unpack("C*", $s)) ;
    Patrick> ---
    Patrick> > return join(".", unpack("U*", $s)) ;
    Patrick> 420c420
    Patrick> < return pack("C*", split(/\./, $s)) ;
    Patrick> ---
    Patrick> > return pack("U*", split(/\./, $s)) ;


    Patrick> and


    Patrick> RCS file: /cvsroot/inline-java/Inline-Java/Java/sources/InlineJavaProtocol.java,v
    Patrick> retrieving revision 1.2
    Patrick> diff -r1.2 InlineJavaProtocol.java
    Patrick> 614,615c614,615
    Patrick> < byte b[] = {(byte)Integer.parseInt(ss)} ;
    Patrick> < sb.append(new String(b)) ;
    Patrick> ---
    Patrick> > char c = (char)Integer.parseInt(ss) ;
    Patrick> > sb.append(new String(new char [] {c})) ;
    Patrick> 623c623,624
    Patrick> < byte b[] = s.getBytes() ;
    Patrick> ---
    Patrick> > char c[] = new char[s.length()] ;
    Patrick> > s.getChars(0, c.length, c, 0) ;
    Patrick> 625c626
    Patrick> < for (int i = 0 ; i < b.length ; i++){
    Patrick> ---
    Patrick> > for (int i = 0 ; i < c.length ; i++){
    Patrick> 629c630,631
    Patrick> < sb.append(String.valueOf(b[i])) ;
    Patrick> ---
    Patrick> > sb.append((int)c[i]) ;
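
For reference, here is a minimal standalone sketch of what the Java side
of the patch above amounts to: each char is written as a decimal number,
numbers are joined with dots, and decoding reverses that. The class name
DottedCodec and the method names are made up for illustration; they are
not the names used in InlineJavaProtocol.java, and this is not the
actual file.

```java
import java.util.StringTokenizer;

// Hypothetical class; only the char-based logic follows the patch above.
class DottedCodec {
    // Write each UTF-16 char of s as a decimal number, joined by dots.
    static String encode(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append('.');
            }
            sb.append((int) s.charAt(i));
        }
        return sb.toString();
    }

    // Turn each dot-separated number back into exactly one char.
    static String decode(String s) {
        StringTokenizer st = new StringTokenizer(s, ".");
        StringBuilder sb = new StringBuilder();
        while (st.hasMoreTokens()) {
            sb.append((char) Integer.parseInt(st.nextToken()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "h\u00e9llo";   // "hello" with an e-acute
        String wire = encode(original);
        System.out.println(wire);
        System.out.println(decode(wire).equals(original));
    }
}
```

One caveat with this sketch: the (char) cast only holds values up to
65535, so a number above that coming from the Perl side would be
truncated. StringBuilder.appendCodePoint would be one way around that.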


    Patrick> Let me know how it turns out.

    Patrick> Patrick

    Patrick> ---------------------
    Patrick> Patrick LeBoutillier
    Patrick> Laval, Quebec, Canada
    Patrick> ----- Original Message -----
    Patrick> From: "Patrick LeBoutillier" <[EMAIL PROTECTED]>
    Patrick> To: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
    Patrick> Sent: Thursday, June 05, 2003 8:28 AM
    Patrick> Subject: Re: Inline::Java and utf8


    >> Dave,
    >> 
    >> It's possible that there is a problem here. Inline::Java uses a
    >> very simple (and somewhat inefficient) encoding to pass the data
    >> between Perl and Java. Here is the corresponding code:
    >> 
    >> sub encode {
    >>     my $s = shift ;
    >>
    >>     return join(".", unpack("C*", $s)) ;
    >> }
    >>
    >> and
    >>
    >> String Decode(String s){
    >>     StringTokenizer st = new StringTokenizer(s, ".") ;
    >>     StringBuffer sb = new StringBuffer() ;
    >>     while (st.hasMoreTokens()){
    >>         String ss = st.nextToken() ;
    >>         byte b[] = {(byte)Integer.parseInt(ss)} ;
    >>         sb.append(new String(b)) ;
    >>     }
    >> 
    >> 
    >> It breaks up the string byte by byte and reconstructs it on the
    >> other side. It's probable that this doesn't work with multibyte
    >> characters since it's probably creating a character for each
    >> byte.
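
A small sketch of the effect described here, assuming the string reaches
Java as its UTF-8 bytes: decoding one byte at a time turns a two-byte
character into two separate characters. The class and method names are
hypothetical, and ISO-8859-1 is used in place of the platform default
charset (which the original new String(b) would use) only so the result
is the same on every machine.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Inline::Java.
class BytePerCharDemo {
    // Mimic the byte-wise loop: every byte becomes its own one-char String.
    // Assumption: ISO-8859-1 stands in for the platform default charset.
    static String decodeBytewise(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(new String(new byte[] {b}, StandardCharsets.ISO_8859_1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "\u00e9";   // e-acute, a single character
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        String rebuilt = decodeBytewise(utf8);
        System.out.println(utf8.length);        // number of UTF-8 bytes
        System.out.println(rebuilt.length());   // number of chars after decoding
        System.out.println(rebuilt.equals(original));
    }
}
```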
    >> 
    >> If you have time to check this out and send me a patch that
    >> would be great, but I don't have the time currently to
    >> investigate this. I have no problem reviewing the encoding
    >> completely; I did it like this to make sure I could implement
    >> the protocol line by line. Maybe only escaping the \n's would
    >> have been sufficient.
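
As a rough illustration of the "only escape the \n's" idea, here is one
way it could look. This is a sketch under my own assumptions, not
anything from Inline::Java: the class name LineEscaper is hypothetical,
and it escapes backslashes too so that the result can be reversed
without ambiguity. It only deals with \n; a real line protocol might
need to treat \r the same way.

```java
// Hypothetical helper, not part of Inline::Java.
class LineEscaper {
    // Make s safe to send as a single line: escape backslashes first,
    // then replace each newline with the two characters '\' and 'n'.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\n", "\\n");
    }

    // Reverse escape(): a backslash followed by 'n' becomes a newline,
    // a backslash followed by anything else yields that character.
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                char next = s.charAt(++i);
                sb.append(next == 'n' ? '\n' : next);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "first line\nsecond \\ line";
        String wire = escape(original);
        System.out.println(wire.indexOf('\n'));            // -1: fits on one line
        System.out.println(unescape(wire).equals(original));
    }
}
```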
    >> 
    >> Anyways comments/suggestions are welcome.
    >> 
    >> 
    >> ---------------------
    >> Patrick LeBoutillier
    >> Laval, Quebec, Canada
    >> ----- Original Message -----
    >> From: "Dave LaMacchia" <[EMAIL PROTECTED]>
    >> To: <[EMAIL PROTECTED]>
    >> Sent: Wednesday, June 04, 2003 9:19 PM
    >> Subject: Inline::Java and utf8
    >> 
    >> 
    >> >
    >> > I'm working on some code that uses Inline::Java to parse user
    >> > input in order to make calls to a corba interface in front of
    >> > an oracle database.
    >> >
    >> > I found when I fetch utf8 data from the database, all is well
    >> > (assuming I've set my locale -- this is on Solaris 2.8 -- to
    >> > en_US.UTF-8).  When I go the other way, however, passing data
    >> > from perl to Java via Inline, I get data corruption in the
    >> > non-ASCII characters.
    >> >
    >> > I thought that I might have to convert the strings to UCS2,
    >> > since that's what Java uses internally, but this results in
    >> > java errors due to embedded null characters.
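
To show where the embedded nulls come from, here is a short sketch that
counts the zero bytes in a 16-bit encoding of an ASCII string. It uses
UTF-16BE as a stand-in for UCS2 (the two agree for characters in the
basic plane), and the class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class.
class Ucs2NullDemo {
    // Count the zero bytes in the big-endian 16-bit encoding of s.
    static int countZeroBytes(String s) {
        int zeros = 0;
        for (byte b : s.getBytes(StandardCharsets.UTF_16BE)) {
            if (b == 0) {
                zeros++;
            }
        }
        return zeros;
    }

    public static void main(String[] args) {
        // Each ASCII character takes two bytes, and the high byte is zero.
        System.out.println(countZeroBytes("abc"));
    }
}
```

Any code that treats such data as a null-terminated or line-oriented
byte string is likely to trip over those zero bytes.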
    >> >
    >> > Has anyone run into this problem before?  Any suggestions how
    >> > to get around it?  I'm using perl 5.8 so I shouldn't have to
    >> > insert a use utf8 pragma.  Note also that I've confirmed the
    >> > data is correct in the perl code before the embedded Java is
    >> > called.
    >> >
    >> > Thanks!
    >> >
    >> > --dave
    >> >
    >> 


-- 

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Dave LaMacchia                            http://www.sleepwalk.org/      
[EMAIL PROTECTED]                  
