Well, as long as the HTTP spec (not any Java spec) considers that UTF-8 can't be the default, I guess we'll still be there...

Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> | Blog <http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> | LinkedIn <https://www.linkedin.com/in/rmannibucau> | Tomitriber <http://www.tomitribe.com>

2015-06-29 10:09 GMT-07:00 Jean-Louis Monteiro <[email protected]>:

> I've been creating such a trick for years, can't believe we are still here
> today.
> Not a useful comment but this thread made me react again.
>
> --
> Jean-Louis Monteiro
> http://twitter.com/jlouismonteiro
> http://www.tomitribe.com
>
> On Sat, Jun 27, 2015 at 4:35 PM, Romain Manni-Bucau <[email protected]>
> wrote:
>
> > Hi
> >
> > sounds normal (http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q1),
> > maybe add a filter setting the encoding
> > (request.setCharacterEncoding("UTF-8");)
> >
> > On 27 June 2015 12:17, "using namespace" <[email protected]> wrote:
> >
> > > I have deployed a web service on TomEE 1.7.1 and am currently having an
> > > encoding problem when I work with request XML data.
> > > The web service implements one method, which receives XML data inside
> > > a SOAP message like the following:
> > >
> > > <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
> > >                   xmlns:soap="http://tempuri.org/soaprequest">
> > >    <soapenv:Header/>
> > >    <soapenv:Body>
> > >       <soap:soaprequest>
> > >          <soap:streams>
> > >             <soap:soapin contentType="?">
> > >                <soap:Value>
> > >
> > >                   <tag_a>cyrillic text here...</tag_a>
> > >
> > >                </soap:Value>
> > >             </soap:soapin>
> > >          </soap:streams>
> > >       </soap:soaprequest>
> > >    </soapenv:Body>
> > > </soapenv:Envelope>
> > >
> > > Inside the web service implementation class I retrieve everything from
> > > the tag and convert it to a String:
> > >
> > >     Element soapinElement = (Element) streams.getSoapin().getValue().getAny();
> > >     Node node = (Node) soapinElement;
> > >     Document document = node.getOwnerDocument();
> > >     DOMImplementationLS domImplLS =
> > >             (DOMImplementationLS) document.getImplementation();
> > >     LSSerializer serializer = domImplLS.createLSSerializer();
> > >     LSOutput output = domImplLS.createLSOutput();
> > >     output.setEncoding("UTF-8");
> > >     Writer stringWriter = new StringWriter();
> > >     output.setCharacterStream(stringWriter);
> > >     serializer.write(document, output);
> > >     String soapinString = stringWriter.toString();
> > >
> > > And then I put soapinString into an Oracle database CLOB column.
> > >
> > > Everything is great when the SOAP message is encoded in UTF-8, but I get
> > > unreadable characters when the SOAP message has a different encoding, like
> > > CP1251, and what I see in Oracle as a result is:
> > >
> > >     <tag_a>РћР’Р” Р’РћР</tag_a>
> > >
> > > I tried encoding conversion like this:
> > >
> > >     Element soapinElement = (Element) streams.getSoapin().getValue().getAny();
> > >     Node node = (Node) soapinElement;
> > >     Document document = node.getOwnerDocument();
> > >     DOMImplementationLS domImplLS =
> > >             (DOMImplementationLS) document.getImplementation();
> > >     LSSerializer serializer = domImplLS.createLSSerializer();
> > >     LSOutput output = domImplLS.createLSOutput();
> > >     ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
> > >     output.setByteStream(byteArrayOutputStream);
> > >     byte[] result = byteArrayOutputStream.toByteArray();
> > >     InputStream is = new ByteArrayInputStream(result);
> > >     Reader reader = new InputStreamReader(is, "windows-1251");
> > >     OutputStream out = new ByteArrayOutputStream();
> > >     Writer writer = new OutputStreamWriter(out, "UTF-8");
> > >     writer.write("\uFEFF");
> > >     char[] buffer = new char[10];
> > >     int read;
> > >     while ((read = reader.read(buffer)) != -1) {
> > >         writer.write(buffer, 0, read);
> > >     }
> > >     reader.close();
> > >     writer.close();
> > >     serializer.write((Node) out, output);
> > >     String soapinString = output.toString();
> > >
> > > But it produces something that looks like byte code.
> > > I would like to ask for some suggestions on possible ways to resolve
> > > encoding conversion to UTF-8.
> > >
> > > --
> > > View this message in context:
> > > http://tomee-openejb.979440.n4.nabble.com/Encoding-issue-tp4675408.html
> > > Sent from the TomEE Users mailing list archive at Nabble.com.
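
For what it's worth, the garbled sample in the original message has the typical shape of UTF-8 bytes that were read back as windows-1251 somewhere along the way. Below is a minimal, JDK-only sketch of the general rule: a Java String has no encoding of its own, so conversion means decoding the bytes with the charset they were really written in, then encoding to UTF-8 only when bytes are needed again. The class and method names (EncodingSketch, decode, toUtf8, misreadUtf8AsCp1251) are hypothetical, and the sample text is only an illustration; this is not TomEE or CXF API, and it does not say where in the stack the wrong charset is applied.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of TomEE or CXF.
final class EncodingSketch {
    static final Charset CP1251 = Charset.forName("windows-1251");

    // Decode raw bytes with the charset they were really written in.
    static String decode(byte[] raw, Charset source) {
        return new String(raw, source);
    }

    // Transcode: decode with the source charset, then re-encode as UTF-8.
    static byte[] toUtf8(byte[] raw, Charset source) {
        return decode(raw, source).getBytes(StandardCharsets.UTF_8);
    }

    // Reproduce the failure mode: UTF-8 bytes wrongly read as windows-1251.
    static String misreadUtf8AsCp1251(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), CP1251);
    }

    public static void main(String[] args) {
        // Three Cyrillic letters, written as escapes; illustrative sample only.
        String sample = "\u041E\u0412\u0414";
        // What a CP1251 client would put on the wire: one byte per letter.
        byte[] fromClient = sample.getBytes(CP1251);

        String text = decode(fromClient, CP1251);
        System.out.println(text.equals(sample));                  // true
        System.out.println(toUtf8(fromClient, CP1251).length);    // 6
        System.out.println(misreadUtf8AsCp1251(sample).length()); // 6
    }
}
```

In the second attempt quoted above, the byte stream is read before the serializer has written anything into it, so there is nothing to transcode; the sketch only shows the decode-then-encode order that any such conversion would have to follow.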
