Hi, this sounds normal (see http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q1). Maybe add a filter that sets the encoding before the request body is read (request.setCharacterEncoding("UTF-8");).
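
To illustrate why the charset has to be right at the moment the bytes are turned into characters, here is a minimal standalone sketch. It is not the TomEE code path; the class and method names (EncodingSketch, decode, repair) are made up for illustration. It only shows that decoding UTF-8 bytes as windows-1251 gives text of the same shape as the garbled sample in the quoted message, and that the damage can be reversed only while every byte still maps to a character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class EncodingSketch {

    static final Charset CP1251 = Charset.forName("windows-1251");

    // Turn raw bytes into a String using the charset they were written in.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    // Undo a wrong decode: re-encode with the charset that was wrongly
    // applied, then decode again with the charset the bytes really used.
    // This only round-trips if no byte was replaced by U+FFFD on the way.
    static String repair(String garbled, Charset wronglyApplied, Charset actual) {
        return new String(garbled.getBytes(wronglyApplied), actual);
    }

    public static void main(String[] args) {
        // Three Cyrillic letters, written as escapes so the source file
        // encoding does not matter.
        String original = "\u041E\u0412\u0414";
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        // Wrong charset at decode time produces mojibake.
        String garbled = decode(utf8Bytes, CP1251);
        System.out.println(garbled);

        // Right charset at decode time needs no repair at all.
        System.out.println(decode(utf8Bytes, StandardCharsets.UTF_8));

        // Best-effort repair after the fact.
        System.out.println(repair(garbled, CP1251, StandardCharsets.UTF_8));
    }
}
```

Once the text is a correct Java String, writing it through a StringWriter involves no byte encoding at all, so the fix belongs where the request bytes are first decoded rather than in the serialization step.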

On 27 June 2015 at 12:17, "using namespace" <[email protected]> wrote:

> I have deployed a web-service on TomEE 1.7.1 and currently having encoding
> problem when I work with request xml data. The web-service implements one
> method, which receives an xml data inside a SOAP message like following:
>
> <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
>                   xmlns:soap="http://tempuri.org/soaprequest">
>    <soapenv:Header/>
>    <soapenv:Body>
>       <soap:soaprequest>
>          <soap:streams>
>             <soap:soapin contentType="?">
>                <soap:Value>
>
>                   <tag_a>cyrillic text here...</tag_a>
>
>                </soap:Value>
>             </soap:soapin>
>          </soap:streams>
>       </soap:soaprequest>
>    </soapenv:Body>
> </soapenv:Envelope>
>
> Inside the web-service implementation class I retrieve everything from
> tag and cast it to String:
>
>     Element soapinElement = (Element) streams.getSoapin().getValue().getAny();
>     Node node = (Node) soapinElement;
>     Document document = node.getOwnerDocument();
>     DOMImplementationLS domImplLS = (DOMImplementationLS) document.getImplementation();
>     LSSerializer serializer = domImplLS.createLSSerializer();
>     LSOutput output = domImplLS.createLSOutput();
>     output.setEncoding("UTF-8");
>     Writer stringWriter = new StringWriter();
>     output.setCharacterStream(stringWriter);
>     serializer.write(document, output);
>     String soapinString = stringWriter.toString();
>
> And then I put soapinString into Oracle database CLOB column.
>
> Everything is great when SOAP message is encoded in UTF-8, but I get
> unreadable characters when SOAP message has different encoding, like CP1251
> and what I see in Oracle as a result is:
>
>     <tag_a>РћР’Р” Р’РћР</tag_a>
>
> I tried encoding conversion like this:
>
>     Element soapinElement = (Element) streams.getSoapin().getValue().getAny();
>     Node node = (Node) soapinElement;
>     Document document = node.getOwnerDocument();
>     DOMImplementationLS domImplLS = (DOMImplementationLS) document.getImplementation();
>     LSSerializer serializer = domImplLS.createLSSerializer();
>     LSOutput output = domImplLS.createLSOutput();
>     ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
>     output.setByteStream(byteArrayOutputStream);
>     byte[] result = byteArrayOutputStream.toByteArray();
>     InputStream is = new ByteArrayInputStream(result);
>     Reader reader = new InputStreamReader(is, "windows-1251");
>     OutputStream out = new ByteArrayOutputStream();
>     Writer writer = new OutputStreamWriter(out, "UTF-8");
>     writer.write("\uFEFF");
>     char[] buffer = new char[10];
>     int read;
>     while ((read = reader.read(buffer)) != -1) {
>         writer.write(buffer, 0, read);
>     }
>     reader.close();
>     writer.close();
>     serializer.write((Node) out, output);
>     String soapinString = output.toString();
>
> But it produces something that looks like byte code.
> I would like to ask for some suggestions on possible ways to resolve
> encoding conversion to UTF-8.
>
> --
> View this message in context:
> http://tomee-openejb.979440.n4.nabble.com/Encoding-issue-tp4675408.html
> Sent from the TomEE Users mailing list archive at Nabble.com.
