[ https://issues.apache.org/jira/browse/XERCESJ-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Costanzo updated XERCESJ-1574:
------------------------------------
    Fix Version/s: 2.12.0

> Problem with detected encoding for UTF-16 encoded as Unicode Little
> -------------------------------------------------------------------
>
>                 Key: XERCESJ-1574
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1574
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DOM (Level 3 Core)
>    Affects Versions: 2.11.0
>            Reporter: Radu Coravu
>            Assignee: Michael Glavassevich
>             Fix For: 2.12.0
>
>         Attachments: patch.txt
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> I have the following test case:
>     ByteArrayInputStream bis = new ByteArrayInputStream(
>         "<?xml version=\"1.0\" encoding=\"UTF-16\"?><a/>".getBytes("UnicodeLittle"));
>     InputSource is = new InputSource(bis);
>     DOMParser dp = new DOMParser();
>     dp.parse(is);
>     assertEquals("UTF-16LE", dp.getDocument().getInputEncoding());
> The input stream is encoded as "UnicodeLittle" and
> "dp.getDocument().getInputEncoding()" should return "UTF-16LE" (at least it
> did so in the previous Xerces version). Right now it returns "UTF-16"
> regardless of the byte order mark in the input stream.
> So a developer relying on the value returned by
> "dp.getDocument().getInputEncoding()" does not know how to save
> the document in order to preserve the same BOM.
> This problem is related to the modifications which were made in the
> XMLEntityManager related to encoding detection.
> As a proposed modification, in the method:
>     org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(String,
>     XMLInputSource, boolean, boolean)
> before the code:
>     fCurrentEntity = new ScannedEntity(name,....
> we could add the following code:
>     if("UTF-16".equals(encoding)) {
>         if(isBigEndian != null) {
>             if(isBigEndian) {
>                 encoding = "UTF-16BE";
>             } else {
>                 encoding = "UTF-16LE";
>             }
>         }
>     }
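
For illustration only, the proposed refinement can be tried outside Xerces with a small standalone sketch. The class Utf16EncodingSketch and its methods detectBigEndian and refine are hypothetical names invented for this example and are not part of Xerces. The sketch assumes the byte order is taken solely from a two-byte BOM at the start of the stream, which may differ from how XMLEntityManager actually detects it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not Xerces code: mirrors the refinement proposed in the issue.
final class Utf16EncodingSketch {

    // Returns TRUE for a big-endian BOM (FE FF), FALSE for a little-endian BOM (FF FE),
    // and null when no UTF-16 BOM is present. Assumption: only the BOM is inspected;
    // the UTF-32LE BOM (FF FE 00 00) is not distinguished here.
    static Boolean detectBigEndian(byte[] data) {
        if (data.length < 2) {
            return null;
        }
        int b0 = data[0] & 0xFF;
        int b1 = data[1] & 0xFF;
        if (b0 == 0xFE && b1 == 0xFF) {
            return Boolean.TRUE;
        }
        if (b0 == 0xFF && b1 == 0xFE) {
            return Boolean.FALSE;
        }
        return null;
    }

    // Narrows the generic "UTF-16" label to "UTF-16BE" or "UTF-16LE" when the
    // byte order is known; any other encoding name is returned unchanged.
    static String refine(String encoding, Boolean isBigEndian) {
        if ("UTF-16".equals(encoding) && isBigEndian != null) {
            return isBigEndian ? "UTF-16BE" : "UTF-16LE";
        }
        return encoding;
    }

    public static void main(String[] args) {
        // Build a little-endian document with an explicit BOM, similar in spirit
        // to the "UnicodeLittle" bytes used in the test case from the issue.
        byte[] body = "<?xml version=\"1.0\" encoding=\"UTF-16\"?><a/>"
                .getBytes(StandardCharsets.UTF_16LE);
        byte[] data = new byte[body.length + 2];
        data[0] = (byte) 0xFF;
        data[1] = (byte) 0xFE;
        System.arraycopy(body, 0, data, 2, body.length);

        System.out.println(refine("UTF-16", detectBigEndian(data)));
    }
}
```

Leaving the encoding untouched when isBigEndian is null keeps the current behaviour for streams whose byte order is unknown, which matches the null check in the proposed patch.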