Re: Encoding issue

Romain Manni-Bucau Mon, 29 Jun 2015 10:12:58 -0700

Well, as long as the HTTP spec (not any Java spec) says UTF-8 can't be the
default, I guess we'll still be there...



Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Tomitriber
<http://www.tomitribe.com>

2015-06-29 10:09 GMT-07:00 Jean-Louis Monteiro <[email protected]>:

> I've been writing this kind of workaround for years; I can't believe we are
> still here today.
> Not a useful comment but this thread made me react again.
>
> --
> Jean-Louis Monteiro
> http://twitter.com/jlouismonteiro
> http://www.tomitribe.com
>
> On Sat, Jun 27, 2015 at 4:35 PM, Romain Manni-Bucau <[email protected]> wrote:
>
> > Hi
> >
> > That sounds normal (see http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q1).
> > Maybe add a filter that sets the encoding: request.setCharacterEncoding("UTF-8");
> >
> > On 27 June 2015 12:17, "using namespace" <[email protected]> wrote:
> >
> > > I have deployed a web service on TomEE 1.7.1 and am currently having an
> > > encoding problem when I work with request XML data. The web service
> > > implements one method, which receives XML data inside a SOAP message like
> > > the following:
> > >
> > > <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
> > > xmlns:soap="http://tempuri.org/soaprequest">
> > >    <soapenv:Header/>
> > >    <soapenv:Body>
> > >       <soap:soaprequest>
> > >          <soap:streams>
> > >             <soap:soapin contentType="?">
> > >                <soap:Value>
> > >
> > >
> > >                      <tag_a>cyrillic text here...</tag_a>
> > >
> > >
> > >                </soap:Value>
> > >             </soap:soapin>
> > >          </soap:streams>
> > >       </soap:soaprequest>
> > >    </soapenv:Body>
> > > </soapenv:Envelope>
> > >
> > > Inside the web-service implementation class I retrieve everything from
> > >  tag and cast it to String:
> > >
> > >     Element soapinElement = (Element) streams.getSoapin().getValue().getAny();
> > >     Node node = (Node) soapinElement;
> > >     Document document = node.getOwnerDocument();
> > >     DOMImplementationLS domImplLS = (DOMImplementationLS) document.getImplementation();
> > >     LSSerializer serializer = domImplLS.createLSSerializer();
> > >     LSOutput output = domImplLS.createLSOutput();
> > >     output.setEncoding("UTF-8");
> > >     Writer stringWriter = new StringWriter();
> > >     output.setCharacterStream(stringWriter);
> > >     serializer.write(document, output);
> > >     String soapinString = stringWriter.toString();
> > >
> > > And then I put soapinString into an Oracle database CLOB column.
> > >
> > > Everything is fine when the SOAP message is encoded in UTF-8, but I get
> > > unreadable characters when the SOAP message has a different encoding, such
> > > as CP1251. What I see in Oracle as a result is:
> > >
> > >
> > >
> > >                      <tag_a>РћР’Р” Р’РћР</tag_a>
> > >
> > >
> > >
> > > I tried encoding conversion like this:
> > >
> > >     Element soapinElement = (Element) streams.getSoapin().getValue().getAny();
> > >     Node node = (Node) soapinElement;
> > >     Document document = node.getOwnerDocument();
> > >     DOMImplementationLS domImplLS = (DOMImplementationLS) document.getImplementation();
> > >     LSSerializer serializer = domImplLS.createLSSerializer();
> > >     LSOutput output = domImplLS.createLSOutput();
> > >     ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
> > >     output.setByteStream(byteArrayOutputStream);
> > >     byte[] result = byteArrayOutputStream.toByteArray();
> > >     InputStream is = new ByteArrayInputStream(result);
> > >     Reader reader = new InputStreamReader(is, "windows-1251");
> > >     OutputStream out = new ByteArrayOutputStream();
> > >     Writer writer = new OutputStreamWriter(out, "UTF-8");
> > >     writer.write("\uFEFF");
> > >     char[] buffer = new char[10];
> > >     int read;
> > >     while ((read = reader.read(buffer)) != -1) {
> > >         writer.write(buffer, 0, read);
> > >     }
> > >     reader.close();
> > >     writer.close();
> > >     serializer.write((Node) out, output);
> > >     String soapinString = output.toString();
> > >
> > > But it produces something that looks like byte code.
> > > I would like to ask for some suggestions on possible ways to resolve
> > > encoding conversion to UTF-8.
> > >
> > >
> > >
> > > --
> > > View this message in context:
> > > http://tomee-openejb.979440.n4.nabble.com/Encoding-issue-tp4675408.html
> > > Sent from the TomEE Users mailing list archive at Nabble.com.
> > >
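
The garbled sample in the original question is consistent with UTF-8 bytes
being decoded as windows-1251 somewhere between the request and the database
client. The sketch below reproduces that symptom and reverses it. The class
and method names are made up for illustration, and the sample word is only
inferred from the first three garbled characters quoted in the thread.
Reversing the damage like this is a diagnostic aid, not a fix for whichever
layer performs the wrong decode.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of TomEE or the original code.
final class MojibakeDemo {
    // Assumption: the JVM provides the windows-1251 charset (standard JDK builds do).
    static final Charset CP1251 = Charset.forName("windows-1251");

    // Encode as UTF-8, then decode with the wrong charset, as a misconfigured layer would.
    static String garble(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), CP1251);
    }

    // Undo the wrong decode: recover the original bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(CP1251), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u041E\u0412\u0414"; // three Cyrillic capital letters
        String garbled = garble(original);
        System.out.println(garbled);
        System.out.println(repair(garbled).equals(original));
    }
}
```

If repair() turns the stored text back into readable Cyrillic, the data is
probably valid UTF-8 and the wrong charset is being applied when it is read
or displayed.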
> >
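
To illustrate the point about defaults: a container has to choose a charset
for the request body when the Content-Type header does not name one, and the
Tomcat FAQ linked above describes ISO-8859-1 as that default. The sketch
below shows the decision in plain Java, with no servlet API involved. The
class and method names are hypothetical, and the header parsing is
deliberately simplified. A filter calling request.setCharacterEncoding("UTF-8")
effectively changes the fallback.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, only meant to show the fallback logic.
final class RequestCharset {
    // Return the charset named in a Content-Type value, or the fallback if none is usable.
    static Charset resolve(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset fallback = StandardCharsets.ISO_8859_1;
        System.out.println(resolve("text/xml; charset=windows-1251", fallback));
        System.out.println(resolve("text/xml", fallback));
    }
}
```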
>
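
On the conversion attempts quoted above: once the XML has been parsed, the
DOM holds Unicode characters, so serializing through a character stream
should not need any byte-level re-encoding. In the second attempt, as far as
the quoted code shows, the byte stream is read before serializer.write is
called, so it is still empty at that point. The sketch below is one possible
approach, not necessarily what the original service needs. The class name,
method names and sample document are assumptions, and it relies on the JDK's
built-in parser accepting a windows-1251 XML declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;

// Hypothetical helper class for illustration.
final class DomToString {
    // Serialize a node to a String through a character stream.
    static String serialize(Node node) {
        Document doc = node.getNodeType() == Node.DOCUMENT_NODE
                ? (Document) node : node.getOwnerDocument();
        DOMImplementationLS ls = (DOMImplementationLS) doc.getImplementation();
        LSSerializer serializer = ls.createLSSerializer();
        // Skip the XML declaration so only the element markup is returned.
        serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
        LSOutput output = ls.createLSOutput();
        // Declare UTF-8 so every character counts as representable.
        output.setEncoding("UTF-8");
        StringWriter writer = new StringWriter();
        output.setCharacterStream(writer);
        serializer.write(node, output);
        return writer.toString();
    }

    // Parse raw bytes; the parser reads the charset from the XML declaration.
    static Document parse(byte[] xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"windows-1251\"?>"
                + "<tag_a>\u041E\u0412\u0414</tag_a>";
        byte[] cp1251Bytes = xml.getBytes(Charset.forName("windows-1251"));
        System.out.println(serialize(parse(cp1251Bytes).getDocumentElement()));
    }
}
```

The resulting String can then be bound to the CLOB column as character data.
If the text is still garbled at that stage, the wrong decode most likely
happened earlier, while the request was being read.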
