[jira] [Updated] (XALANJ-2436) Xalan must not expose bundled classes (bcel, regexp)
[ https://issues.apache.org/jira/browse/XALANJ-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jérôme Leroux updated XALANJ-2436:
    Attachment: XALAN-2436.patch

Attached patch:
- Fix for XALAN-2436: use the {{jarjar}} task in {{build.xml}} instead of the {{jar}} task.

To make it work, download the _Jar Jar Links_ jar into the {{tools/}} directory. It can be downloaded here:
https://code.google.com/p/jarjar/downloads/detail?name=jarjar-src-1.4.zip&can=2&q=

Some links:
_Jar Jar Links_ project: https://code.google.com/p/jarjar/
_Jar Jar Links_ license: http://www.apache.org/licenses/LICENSE-2.0

Xalan must not expose bundled classes (bcel, regexp)
    Key: XALANJ-2436
    URL: https://issues.apache.org/jira/browse/XALANJ-2436
    Project: XalanJ2
    Issue Type: Bug
    Components: Xalan
    Affects Versions: 2.7.1
    Environment: any
    Reporter: Holger Hoffstätte
    Priority: Critical
    Attachments: XALAN-2436.patch, rewrite-packages.rules

I just spent the better part of half a day figuring out what caused the problem outlined in https://sourceforge.net/tracker/?func=detail&atid=614693&aid=1902137&group_id=96405.

Xalan bundles regexp and bcel. However, since one of the recommended ways of installing xalan is via the endorsed mechanism, this will wreak serious havoc on any other apps that use bcel. That would be less of a problem if xalan's version were up to date, but as of 2.7.1 it still includes a version from the early stone age (see XALANJ-2423).

The solution is easy: when building the aggregate jar, add an ant task to rewrite the bundled packages via jarjar (http://code.google.com/p/jarjar/). This can be trivially added to the build and creates a completely self-contained xalan jar that will not blow up the world when endorsed. I will attach a trivial rule file for jarjar that rewrites the embedded packages, which should immediately fix any collision problems.
For more information about how to use jarjar, see http://code.google.com/p/jarjar/wiki/GettingStarted -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Created] (XALANC-761) xalan segfaults on xsl:include statement
Axel Söding-Freiherr von Blomberg created XALANC-761:

    Summary: xalan segfaults on xsl:include statement
    Key: XALANC-761
    URL: https://issues.apache.org/jira/browse/XALANC-761
    Project: XalanC
    Issue Type: Bug
    Components: XalanC
    Affects Versions: 1.11
    Environment: Linux 3.13.0-46-generic #77-Ubuntu SMP i686 GNU/Linux
                 sudo apt-get install xalan
    Reporter: Axel Söding-Freiherr von Blomberg
    Assignee: Steven J. Hathaway

The bug can be reproduced with a minimal setup:

stylesheet2.xsl:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml"/>
      <xsl:template name="foo">foo</xsl:template>
    </xsl:stylesheet>

stylesheet1.xsl:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml"/>
      <xsl:include href="stylesheet2.xsl"/>
      <xsl:template match="foo">
      </xsl:template>
    </xsl:stylesheet>

foo.xml:

    <?xml version='1.0' encoding='UTF-8'?>
    <foo>
    </foo>

    $ /usr/bin/Xalan foo.xml stylesheet1.xsl
    segmentation fault
[jira] [Closed] (XALANJ-131) Xalan doesn't work properly when the qName parameter in SAX2 is empty
[ https://issues.apache.org/jira/browse/XALANJ-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikael Ståldal closed XALANJ-131.

Xalan doesn't work properly when the qName parameter in SAX2 is empty
    Key: XALANJ-131
    URL: https://issues.apache.org/jira/browse/XALANJ-131
    Project: XalanJ2
    Issue Type: Bug
    Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
    Components: Xalan
    Affects Versions: 2.0.1
    Environment: Operating System: All
                 Platform: All
    Reporter: Mikael Ståldal

I'm using programmatically generated SAX input to Xalan. Xalan doesn't seem to work properly when the qName parameter to the ContentHandler methods is the empty string. I think that Xalan should work properly even when the qName is not given.

Here is a test program and a stylesheet:

    // DirTest.java
    import java.io.*;
    import org.xml.sax.*;
    import org.xml.sax.helpers.AttributesImpl;
    import javax.xml.transform.*;
    import javax.xml.transform.Source;
    import javax.xml.transform.sax.*;
    import javax.xml.transform.stream.StreamSource;
    import org.apache.xml.serialize.*;

    public class DirTest {
        public static void main(String[] args) throws Exception {
            if (args.length < 3) {
                System.out.println("Syntax: DirTest input_dir stylesheet output");
                return;
            }
            String inDir = args[0];
            String xslFile = args[1];
            String outFile = args[2];

            TransformerFactory tf = TransformerFactory.newInstance();
            // All three features are required to feed SAX events through the transformer.
            if (!(tf.getFeature(SAXTransformerFactory.FEATURE)
                    && tf.getFeature(SAXResult.FEATURE)
                    && tf.getFeature(StreamSource.FEATURE))) {
                System.out.println("The transformer factory "
                        + tf.getClass().getName() + " doesn't support SAX");
                return;
            }
            SAXTransformerFactory tfactory = (SAXTransformerFactory) tf;

            System.out.println("Read stylesheet " + xslFile);
            Templates stylesheet = tfactory.newTemplates(new StreamSource(xslFile));

            System.out.println("Transforming " + inDir + " to " + outFile);
            XMLSerializer ser = new XMLSerializer(
                    new FileOutputStream(outFile), new OutputFormat());
            TransformerHandler th = tfactory.newTransformerHandler(stylesheet);
            th.setResult(new SAXResult(ser.asContentHandler()));
            generateDirListing(new File(inDir), th);
            System.out.println("Finished");
        }

        // Emits SAX events with an empty qName; the commented-out lines pass the qName explicitly.
        static void generateDirListing(File dir, ContentHandler sax) throws SAXException {
            String[] files = dir.list();
            sax.startDocument();
            sax.startElement("", "dirlist", "", new AttributesImpl());
            // sax.startElement("", "dirlist", "dirlist", new AttributesImpl());
            for (int i = 0; i < files.length; i++) {
                File file = new File(dir, files[i]);
                AttributesImpl atts = new AttributesImpl();
                atts.addAttribute("", "filename", "", "CDATA", files[i]);
                // atts.addAttribute("", "filename", "filename", "CDATA", files[i]);
                if (file.isFile()) {
                    sax.startElement("", "file", "", atts);
                    // sax.startElement("", "file", "file", atts);
                    sax.endElement("", "file", "");
                    // sax.endElement("", "file", "file");
                } else if (file.isDirectory()) {
                    sax.startElement("", "directory", "", atts);
                    // sax.startElement("", "directory", "directory", atts);
                    sax.endElement("", "directory", "");
                    // sax.endElement("", "directory", "directory");
                } else
                    ;
            }
            sax.endElement("", "dirlist", "");
            // sax.endElement("", "dirlist", "dirlist");
            sax.endDocument();
        }
    }

    <?xml version="1.0" encoding="iso-8859-1"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:template match="/">
        Directory listing
      </xsl:template>
      <xsl:template match="file">
        <li><xsl:value-of select="@filename"/></li>
      </xsl:template>
    </xsl:stylesheet>
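The report above hinges on the SAX2 rule that qName may be the empty string when qualified names are not available. As a rough illustration of how a consumer can tolerate that, here is a minimal sketch that falls back to localName. The class `QNameFallback`, the helper `effectiveName`, and the `RecordingHandler` are hypothetical names invented for this example; this is not how Xalan resolved the issue.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class QNameFallback {
    /** Returns qName when it is present, otherwise falls back to localName. */
    static String effectiveName(String localName, String qName) {
        return (qName == null || qName.isEmpty()) ? localName : qName;
    }

    /** Hypothetical handler that records the element names it receives. */
    static class RecordingHandler extends DefaultHandler {
        final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            names.add(effectiveName(localName, qName));
        }
    }

    /** Feeds two events, one with an empty qName and one with it filled in. */
    static List<String> demo() {
        RecordingHandler handler = new RecordingHandler();
        handler.startElement("", "dirlist", "", new AttributesImpl());
        handler.startElement("", "file", "file", new AttributesImpl());
        return handler.names;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Both events end up recorded under a usable name, which is the behaviour the reporter expected from the transformer.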
[jira] [Closed] (XALANJ-378) Xalan still doesn't handle an empty qName parameter in SAX in all cases
[ https://issues.apache.org/jira/browse/XALANJ-378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikael Ståldal closed XALANJ-378.

Xalan still doesn't handle an empty qName parameter in SAX in all cases
    Key: XALANJ-378
    URL: https://issues.apache.org/jira/browse/XALANJ-378
    Project: XalanJ2
    Issue Type: Bug
    Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
    Components: transformation, Xalan-interpretive
    Affects Versions: 2.2.x
    Environment: Operating System: All
                 Platform: All
    Reporter: Mikael Ståldal

Xalan still doesn't handle an empty qName parameter in SAX in all cases.
[jira] [Commented] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681738#comment-14681738 ]

Jesper Steen Møller commented on XALANJ-2419:

I followed the instructions on https://xalan.apache.org/xalan-j/downloads.html#buildmyself on my Mac OS X 10.10.4 with Xcode developer tools installed. I had to add execute permissions on test/build.sh, and temporarily change my locale to All American (otherwise the Extension test of javaSample3.xsl fails).

That worked, and I got 2 x CONGRATULATIONS.

I then applied the tests-patch (using svn patch), and then ToStreamTest.runTest() and StreamResultAPITest.runTest() both failed, as was expected. I then applied the fix, and the tests were once again OK.

So, yes, the fix still applies. This was on Java 1.7. Hope this helps!

Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
    Key: XALANJ-2419
    URL: https://issues.apache.org/jira/browse/XALANJ-2419
    Project: XalanJ2
    Issue Type: Bug
    Components: Serialization
    Affects Versions: 2.7.1
    Reporter: Henri Sivonen
    Attachments: XALANJ-2419-fix.txt, XALANJ-2419-tests.txt

org.apache.xml.serializer.ToStream contains the following code:

    else if (m_encodingInfo.isInEncoding(ch)) {
        // If the character is in the encoding, and
        // not in the normal ASCII range, we also
        // just leave it get added on to the clean characters
    }
    else {
        // This is a fallback plan, we should never get here
        // but if the character wasn't previously handled
        // (i.e. isn't in the encoding, etc.) then what
        // should we do? We choose to write out an entity
        writeOutCleanChars(chars, i, lastDirtyCharProcessed);
        writer.write("&#");
        writer.write(Integer.toString(ch));
        writer.write(';');
        lastDirtyCharProcessed = i;
    }

This leads to the wrong (latter) if branch running for surrogates, because isInEncoding() for UTF-8 returns false for surrogates. It is always wrong (regardless of encoding) to escape a surrogate as an NCR.

The practical effect of this bug is that any document with astral characters in it ends up in an ill-formed serialization and does not parse back using an XML parser.
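To make the point about surrogates concrete, here is a minimal sketch of an escaping routine that combines a surrogate pair into one code point before writing a numeric character reference. It is not the attached patch and not the ToStream code. The class `SurrogateNcr` and the method `escapeNonAscii` are hypothetical, and escaping every non-ASCII character is an assumption chosen only to keep the example small.

```java
public class SurrogateNcr {
    /**
     * Escapes every non-ASCII character as a decimal NCR. A high/low surrogate
     * pair is combined into a single code point first, so an astral character
     * yields one reference instead of two references to surrogate values.
     */
    static String escapeNonAscii(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char ch = s.charAt(i);
            if (ch < 0x80) {
                out.append(ch);
                i++;
            } else if (Character.isHighSurrogate(ch)
                    && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                int codePoint = Character.toCodePoint(ch, s.charAt(i + 1));
                out.append("&#").append(codePoint).append(';');
                i += 2;
            } else if (Character.isSurrogate(ch)) {
                // An unpaired surrogate has no valid XML representation.
                throw new IllegalArgumentException(
                        "Unpaired surrogate at index " + i);
            } else {
                out.append("&#").append((int) ch).append(';');
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+1F600 is stored in a Java String as the pair D83D DE00.
        System.out.println(escapeNonAscii("a\uD83D\uDE00b"));
    }
}
```

For a UTF-8 target the astral character could also be written out directly rather than escaped; the sketch only covers the case where a reference is chosen.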
[jira] [Commented] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692362#comment-14692362 ]

Jesper Steen Møller commented on XALANJ-2419:

I noticed a confusing comment in the ToStreamTest, where the test method *testCase2()* asserts that
{code}
reporter.check(actual2, AELIG_OSLASH_ARING, "ISO-8859-1 characters should come out as entities");
{code}
This should read
{code}
reporter.check(actual2, AELIG_OSLASH_ARING, "ISO-8859-1 characters should come out unscathed");
{code}
... as that is also what's asserted.

Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
    Key: XALANJ-2419
    URL: https://issues.apache.org/jira/browse/XALANJ-2419
    Project: XalanJ2
    Issue Type: Bug
    Components: Serialization
    Affects Versions: 2.7.1
    Reporter: Henri Sivonen
    Attachments: XALANJ-2419-fix.txt, XALANJ-2419-tests.txt
[jira] [Updated] (XALANJ-2600) Memory leak in TransformerIdentityImpl
[ https://issues.apache.org/jira/browse/XALANJ-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Samuel Brezáni updated XALANJ-2600:
    Attachment: xalan_memory_leak.jpg

> Memory leak in TransformerIdentityImpl
>
>                 Key: XALANJ-2600
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2600
>             Project: XalanJ2
>          Issue Type: Bug
>      Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
>          Components: Xalan
>    Affects Versions: 2.7.2
>            Reporter: Samuel Brezáni
>            Assignee: Steven J. Hathaway
>         Attachments: TransformerIdentityImpl.java.patch, xalan_memory_leak.jpg
>
> Hi.
> I found a serious memory leak in the Xalan library. It is caused by the
> org.apache.xalan.transformer.TransformerIdentityImpl class.
> I will try to explain how the memory leak is caused:
> A web application is deployed on SAP NetWeaver AS with Java 1.6 (1.6.0_95). The
> application uses the Spring WS library, but also other libraries that depend
> on the Xalan library.
> When the web service is invoked, the TransformerIdentityImpl class is
> used. This class is used because it extends javax.xml.transform.Transformer
> and it is created by the Java core method
> javax.xml.transform.TransformerFactory.newInstance().
> The class com.sun.xml.internal.messaging.saaj.soap.EnvelopeFactory is used for
> handling web services. It also contains a cache (ParserPool) of
> SAXParser objects (com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl). The
> application class loader is used as the key for this cache.
> The EnvelopeFactory object and the SAXParserImpl objects are loaded by the system
> class loader, but the TransformerIdentityImpl class is loaded by the application
> class loader.
> While a web service call is handled, the method
> org.apache.xalan.transformer.TransformerIdentityImpl.transform(Source,
> Result) is invoked. This method uses the SAXParser as a reader. The problem is
> that this method registers itself as the handler for properties (e.g.
> "http://xml.org/sax/handlers/DeclHandler"), but after processing it
> doesn't unregister itself from the SAXParser. This means that objects loaded by
> the system class loader hold references to objects loaded by the application
> class loader.
> These objects stay loaded after the application is undeployed because the cached
> SAXParser still references the TransformerIdentityImpl.
> I prepared a very simple patch to fix this problem. The attachment also contains
> a picture that demonstrates the situation.
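The leak described above follows a general pattern: a short-lived object registers itself on a long-lived, cached object and never removes the registration. The sketch below illustrates the register-then-unregister-in-finally idea with stand-in classes. `CachedReader` and `IdentityTransformer` are hypothetical simplifications, not the Xalan or JDK classes, and whether a real XMLReader accepts null for a given property is an assumption that would need checking against the parser in use. The attached patch itself is not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;

public class UnregisterHandlers {
    static final String DECL_HANDLER = "http://xml.org/sax/handlers/DeclHandler";

    /** Hypothetical stand-in for a pooled parser that outlives the application. */
    static class CachedReader {
        final Map<String, Object> properties = new HashMap<>();

        void setProperty(String name, Object value) {
            if (value == null) {
                properties.remove(name);
            } else {
                properties.put(name, value);
            }
        }

        void parse(String input) {
            // A real reader would deliver SAX events to the registered handlers here.
        }
    }

    /** Hypothetical stand-in for a transformer loaded by the application class loader. */
    static class IdentityTransformer {
        void transform(CachedReader reader, String input) {
            reader.setProperty(DECL_HANDLER, this);
            try {
                reader.parse(input);
            } finally {
                // Drop the back-reference so the cached reader no longer pins this object.
                reader.setProperty(DECL_HANDLER, null);
            }
        }
    }

    /** Returns true if the reader still references a transformer after the call. */
    static boolean readerStillReferencesTransformer() {
        CachedReader reader = new CachedReader();
        new IdentityTransformer().transform(reader, "<doc/>");
        return reader.properties.containsKey(DECL_HANDLER);
    }

    public static void main(String[] args) {
        System.out.println(readerStillReferencesTransformer());
    }
}
```

Using finally means the registration is also removed when parsing throws, which matters for a parser that is returned to a pool.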
[jira] [Created] (XALANJ-2609) A xalan(2.7.2) XSL transformation problem
梁雨石 created XALANJ-2609:

    Summary: A xalan(2.7.2) XSL transformation problem
    Key: XALANJ-2609
    URL: https://issues.apache.org/jira/browse/XALANJ-2609
    Project: XalanJ2
    Issue Type: Bug
    Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
    Components: transformation
    Affects Versions: 2.7.2
    Environment: windows server 2008/2012 jdk1.6
    Reporter: 梁雨石
    Assignee: Steven J. Hathaway
    Attachments: log.txt

We have a problem when using a xalan(2.7.2) XSL transformation to transform an XML document into an HTML file. The error is as follows:

"- java.lang.VerifyError: (class: GregorSamsa, method: WhatsNewJavaScriptTemplate signature: (Lcom/sun/org/apache/xalan/internal/xsltc/DOM;Lcom/sun/org/apache/xml/internal/dtm/DTMAxisIterator;Lcom/sun/org/apache/xml/internal/serializer/SerializationHandler;I)V) Illegal target of jump or branch".

We'd like to know the solution to the problem. The log file has been uploaded as an attachment. Thanks for all your trouble.
[jira] [Commented] (XALANJ-2609) A xalan(2.7.2) XSL transformation problem
[ https://issues.apache.org/jira/browse/XALANJ-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16104391#comment-16104391 ]

梁雨石 commented on XALANJ-2609:

Have you made any progress now?

> A xalan(2.7.2) XSL transformation problem
>
>                 Key: XALANJ-2609
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2609
>             Project: XalanJ2
>          Issue Type: Bug
>      Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
>          Components: transformation
>    Affects Versions: 2.7.2
>         Environment: windows server 2008/2012 jdk1.6
>            Reporter: 梁雨石
>            Assignee: Steven J. Hathaway
>              Labels: XSL, transformation
>         Attachments: log.txt
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
[jira] [Comment Edited] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164487#comment-16164487 ]

Jesper Steen Møller edited comment on XALANJ-2419 at 9/13/17 11:04 AM:

The Xalan project appears quite dormant, which is sad, but understandable. I just came across this quite old posting on the subject: https://intellectualcramps.wordpress.com/2011/06/03/xalan-a-step-closer-to-the-attic/

I suggest one of two courses of action:
* Contact the Xalan PMC (use the mailing list, not just JIRA) and volunteer to help in putting out a new release (i.e. look for bugs with patches, or related Unicode issues, e.g. XALANJ-2610). You can find out about the current PMC members and committers here: https://projects.apache.org/committee.html?xalan - ASF house rules say that you need three positive PMC votes to allow a new release. (Perhaps economic incentives work, i.e. pay existing committers to work on the release.)
* Fork Xalan-J on GitHub or a similar place. You'll likely have to rename the project so Apache's trademarks aren't infringed, but it should be possible to keep the package names, thus allowing for backwards compatibility. (But I'm not a lawyer!)

I won't be able to participate - I don't even code much in Java anymore.

was (Author: jespersm):

The Xalan project appears quite dormant, which is sad, but understandable. I just came across this quite old posting on the subject: https://intellectualcramps.wordpress.com/2011/06/03/xalan-a-step-closer-to-the-attic/

I suggest one of two courses of action:
* Contact the Xalan PMC (use the mailing list, not just JIRA) and volunteer to help in putting out a new release (i.e. look for bugs with patches, or related Unicode issues, e.g. XALANJ-2610). You can find out about the current PMC members and committers here: https://projects.apache.org/committee.html?xalan - ASF house rules say that you need three positive PMC votes to allow a new release. (Perhaps economic incentives work, i.e. pay existing committers to work on the release.)
* Fork Xalan-J on GitHub or a similar place. You'll likely have to rename the project so Apache's trademarks aren't infringed, but it should be possible to keep the package names, thus allowing for backwards compatibility. (But I'm not a lawyer!)

> Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
>
>                 Key: XALANJ-2419
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2419
>             Project: XalanJ2
>          Issue Type: Bug
>          Components: Serialization
>    Affects Versions: 2.7.1
>            Reporter: Henri Sivonen
>         Attachments: XALANJ-2419-fix.txt, XALANJ-2419-tests.txt
[jira] [Commented] (XALANJ-2593) Incorrect showing of supplementary characters in attributes
[ https://issues.apache.org/jira/browse/XALANJ-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16486869#comment-16486869 ]

Jesper Steen Møller commented on XALANJ-2593:

This is also fixed in the patch in XALANJ-2419.

> Incorrect showing of supplementary characters in attributes
>
>                 Key: XALANJ-2593
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2593
>             Project: XalanJ2
>          Issue Type: Bug
>      Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
>          Components: Serialization
>    Affects Versions: 2.7.2
>         Environment: Win 7 x64, Java 1.6
>            Reporter: Eugene Shkel
>            Assignee: Steven J. Hathaway
>            Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In Xalan 2.7.2 the supplementary characters (see
> http://www.oracle.com/technetwork/articles/javase/supplementary-142654.html
> for details) are shown incorrectly in attributes.
> For example, I need to show symbols ㎴ (& # 144308 ; ) or ب (& # 132648 ; ) in
> attribute "y" of element "x"
> Expected result: {code}{code}
> Actual result for Xalan 2.7.2 is:{code} encoding="UTF-8"?>{code}
> Code snippet for test:
> {code}
> public static void main(String[] argv) throws Exception {
>     TransformerFactory tFactory = TransformerFactory.newInstance();
>     StreamSource stylesource = new StreamSource(new StringReader(" version=\"1.0\" encoding=\"UTF-8\"?> xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\; version=\"1.0\" > > />"));
>     Transformer transformer = tFactory.newTransformer(stylesource);
>     StreamSource source = new StreamSource(new StringReader(" version=\"1.0\"?>㎴ - ب"));
>     Result result = new StreamResult(System.out);
>     transformer.transform(source, result);
> }
> {code}
> The problem relates to the method
> org.apache.xml.serializer.ToStream.writeAttrString(Writer, String, String).
> {code}
> if (m_charInfo.shouldMapAttrChar(ch)) {
>     // The character is supposed to be replaced by a String
>     // e.g. '&' --> "&amp;"
>     // e.g. '<' --> "&lt;"
>     accumDefaultEscape(writer, ch, i, stringChars, len, false, true);
> }
> {code}
> This part doesn't process multi-character sequences such as supplementary
> characters within the Java platform, and this leads to the next part of the
> same method being executed:
> {code}
> else {
>     // This is a fallback plan, we should never get here
>     // but if the character wasn't previously handled
>     // (i.e. isn't in the encoding, etc.) then what
>     // should we do? We choose to write out a character ref
>     writer.write("!13 <
[jira] [Updated] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesper Steen Møller updated XALANJ-2419:
    Attachment: (was: XALANJ-2419-fix-v2.txt)

> Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
>
>                 Key: XALANJ-2419
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2419
>             Project: XalanJ2
>          Issue Type: Bug
>          Components: Serialization
>    Affects Versions: 2.7.1
>            Reporter: Henri Sivonen
>            Priority: Major
>         Attachments: XALANJ-2419-fix-v3.txt, XALANJ-2419-tests-v3.txt
[jira] [Commented] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439335#comment-16439335 ]

Jesper Steen Møller commented on XALANJ-2419:

Hi [~thetaphi] - Version 3 adds the fix for normal HTML attribute content as well as URL attributes encoded without URL escapes. A ToHTMLStream test is added, which also tests these corner cases (the UTF-8+URL-escaped byte sequences were as expected, but now have a test).

But how do we get anybody to cut a new release? (Or are you using a jarjar'ed build inside Solr or Lucene?)

> Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
>
>                 Key: XALANJ-2419
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2419
>             Project: XalanJ2
>          Issue Type: Bug
>          Components: Serialization
>    Affects Versions: 2.7.1
>            Reporter: Henri Sivonen
>            Priority: Major
>         Attachments: XALANJ-2419-fix-v3.txt, XALANJ-2419-tests-v3.txt
[jira] [Commented] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438878#comment-16438878 ]

Jesper Steen Møller commented on XALANJ-2419:

[~thetaphi]: Now I see what you mean (perhaps): Yes, there is a very tricky similar bug in the attribute values of ToHTMLStream, but not in the general case (I think it's OK due to line 1440-1447 in writeAttrString, but have *not* tested this.)

I only see the issue for ToHTMLStream in the case of URL attributes such as A#HREF, where the output has explicitly been set to *not* be encoded as a URL (line 1294 in writeAttrURI). The default is to escape HTML attributes containing URLs using URL-encoding, unless overridden with xalan:use-url-escaping=no in the XSLT output options.

(As an aside: I'm no expert, but the UTF-8 encoder inside the URL-encoding (line 1208-1285 in writeAttrURI) seems legit, if a little verbose, instead of just doing String.getBytes(UTF_8) and hexing that.)

My v2 fix above does *not* address the corner case in line 1294.

> Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
>
>                 Key: XALANJ-2419
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2419
>             Project: XalanJ2
>          Issue Type: Bug
>          Components: Serialization
>    Affects Versions: 2.7.1
>            Reporter: Henri Sivonen
>            Priority: Major
>         Attachments: XALANJ-2419-fix-v2.txt, XALANJ-2419-tests-v2.txt
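The aside about "String.getBytes(UTF_8) and hexing that" can be sketched roughly as follows. This is only an illustration of that shorter approach, not the writeAttrURI code. The class `Utf8PercentEncode` and the method `percentEncodeNonAscii` are hypothetical names, and leaving every ASCII character untouched is a simplifying assumption, since real URL escaping also has to treat some ASCII characters specially.

```java
import java.nio.charset.StandardCharsets;

public class Utf8PercentEncode {
    /**
     * Percent-encodes each non-ASCII code point as its UTF-8 bytes.
     * Iterating by code point keeps a surrogate pair together, so an
     * astral character becomes one four-byte sequence.
     */
    static String percentEncodeNonAscii(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            int codePoint = s.codePointAt(i);
            int charCount = Character.charCount(codePoint);
            if (codePoint < 0x80) {
                out.append((char) codePoint);
            } else {
                byte[] bytes = s.substring(i, i + charCount)
                        .getBytes(StandardCharsets.UTF_8);
                for (byte b : bytes) {
                    out.append('%').append(String.format("%02X", b & 0xFF));
                }
            }
            i += charCount;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+00E9 encodes to C3 A9, U+1F600 encodes to F0 9F 98 80.
        System.out.println(percentEncodeNonAscii("caf\u00E9 \uD83D\uDE00"));
    }
}
```

Note that getBytes substitutes a replacement byte for an unpaired surrogate, so a production version would probably want to detect that case explicitly.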
[jira] [Comment Edited] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438864#comment-16438864 ]

Jesper Steen Møller edited comment on XALANJ-2419 at 4/15/18 11:22 PM:

[~thetaphi]: So - I fixed it, but will anybody care? I mean, it's been almost 8 years...

was (Author: jespersm):

[~thetaphi]: I'm sure it could be fixed, but would anybody care? I mean, it's been almost 8 years...

> Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
>
>                 Key: XALANJ-2419
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2419
>             Project: XalanJ2
>          Issue Type: Bug
>          Components: Serialization
>    Affects Versions: 2.7.1
>            Reporter: Henri Sivonen
>            Priority: Major
>         Attachments: XALANJ-2419-fix-v2.txt, XALANJ-2419-tests-v2.txt
[jira] [Updated] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesper Steen Møller updated XALANJ-2419:
    Attachment: XALANJ-2419-tests-v2.txt

> Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
>
>                 Key: XALANJ-2419
>                 URL: https://issues.apache.org/jira/browse/XALANJ-2419
>             Project: XalanJ2
>          Issue Type: Bug
>          Components: Serialization
>    Affects Versions: 2.7.1
>            Reporter: Henri Sivonen
>            Priority: Major
>         Attachments: XALANJ-2419-fix-v2.txt, XALANJ-2419-fix.txt, XALANJ-2419-tests-v2.txt, XALANJ-2419-tests.txt
[jira] [Updated] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesper Steen Møller updated XALANJ-2419: Attachment: (was: XALANJ-2419-fix.txt) > Astral characters written as a pair of NCRs with the surrogate scalar values > when using UTF-8 > - > > Key: XALANJ-2419 > URL: https://issues.apache.org/jira/browse/XALANJ-2419 > Project: XalanJ2 > Issue Type: Bug > Components: Serialization >Affects Versions: 2.7.1 >Reporter: Henri Sivonen >Priority: Major > Attachments: XALANJ-2419-fix-v2.txt, XALANJ-2419-tests-v2.txt, > XALANJ-2419-tests.txt > > > org.apache.xml.serializer.ToStream contains the following code: > else if (m_encodingInfo.isInEncoding(ch)) { > // If the character is in the encoding, and > // not in the normal ASCII range, we also > // just leave it get added on to the clean characters > > } > else { > // This is a fallback plan, we should never get here > // but if the character wasn't previously handled > // (i.e. isn't in the encoding, etc.) then what > // should we do? We choose to write out an entity > writeOutCleanChars(chars, i, lastDirtyCharProcessed); > writer.write(" <
[jira] [Updated] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesper Steen Møller updated XALANJ-2419: Attachment: XALANJ-2419-fix-v2.txt > Astral characters written as a pair of NCRs with the surrogate scalar values > when using UTF-8 > - > > Key: XALANJ-2419 > URL: https://issues.apache.org/jira/browse/XALANJ-2419 > Project: XalanJ2 > Issue Type: Bug > Components: Serialization >Affects Versions: 2.7.1 >Reporter: Henri Sivonen >Priority: Major > Attachments: XALANJ-2419-fix-v2.txt, XALANJ-2419-fix.txt, > XALANJ-2419-tests-v2.txt, XALANJ-2419-tests.txt > > > org.apache.xml.serializer.ToStream contains the following code: > else if (m_encodingInfo.isInEncoding(ch)) { > // If the character is in the encoding, and > // not in the normal ASCII range, we also > // just leave it get added on to the clean characters > > } > else { > // This is a fallback plan, we should never get here > // but if the character wasn't previously handled > // (i.e. isn't in the encoding, etc.) then what > // should we do? We choose to write out an entity > writeOutCleanChars(chars, i, lastDirtyCharProcessed); > writer.write(" <
[jira] [Created] (XALANJ-2614) Serializer 2.7.2 / Xalan 2.7.2 - Bug using Mime-Encoding 'ISO-8859-1'
Jens Annighöfer created XALANJ-2614:
---
Summary: Serializer 2.7.2 / Xalan 2.7.2 - Bug using Mime-Encoding 'ISO-8859-1'
Key: XALANJ-2614
URL: https://issues.apache.org/jira/browse/XALANJ-2614
Project: XalanJ2
Issue Type: Bug
Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
Components: Serialization
Affects Versions: 2.7.2
Environment: Windows 10, Linux; Java 9, Java 10
Reporter: Jens Annighöfer
Assignee: Steven J. Hathaway
Fix For: The Latest Development Code
Attachments: test.zip

We found a problem using Xalan / Serializer with Java 9 and 10 when transforming an XML document with a stylesheet containing an output-encoding.
{code:xml|title=Simple input|borderStyle=solid} This is a test input. {code}
{code:xml|title=Simple stylesheet containing an output-encoding|borderStyle=solid} http://www.w3.org/1999/XSL/Transform; > Tramsformed text: {code}
{code:java|title=Simple transformation code|borderStyle=solid}
@Test
public void test2() throws Exception {
    final InputStream is1 = Java9Test.class.getClassLoader().getResourceAsStream("test/Input.xml");
    assertNotNull(is1);
    final Document input = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(is1);
    assertNotNull(input);
    final InputStream is2 = Java9Test.class.getClassLoader().getResourceAsStream("test/Transform.xsl");
    assertNotNull(is2);
    final OutputStream os = new FileOutputStream("Output-" + System.getProperty("java.version") + ".txt", false);
    StreamSource xsl = new StreamSource(is2);
    Transformer t = TransformerFactory.newInstance().newTransformer(xsl);
    DOMSource src = new DOMSource(input);
    t.transform(src, new StreamResult(os));
}
{code}
Using Java 7 or Java 8 the result is correct: \{{Tramsformed text: This is a test input.}}. Using Java 9 or Java 10 the result is not correct: \{{}} indicating an invalid or unknown encoding.
In Java 7 or Java 8 _org.apache.xml.serializer.Encodings.getEncodingInfo("ISO-8859-1")_ returns "ISO8859_1", which is a valid Java encoding name. In Java 9 or Java 10 the method returns "8859-1", which is not a valid name. The problem is caused by a change to the method _keys()_ in the _java.util.Properties_ class: since Java 9, this method returns the entries of _Encodings.properties_ in a different order. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
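Whether a given Java name from _Encodings.properties_ is accepted by the running JVM can be checked directly with `java.nio.charset.Charset`. The following is only a minimal sketch and not part of Xalan: the class name `CharsetNameCheck` and the helper `isUsable` are made up for illustration, and the four candidate names are the ISO-8859-1 entries quoted in the related reports XALANJ-2618 and XALANJ-2625.

```java
import java.nio.charset.Charset;

public class CharsetNameCheck {

    /** Returns true if the running JVM can resolve the given charset name or alias. */
    public static boolean isUsable(String javaName) {
        try {
            return Charset.isSupported(javaName);
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException (syntactically illegal name) and null names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Java names that Encodings.properties maps to the MIME name ISO-8859-1.
        String[] candidates = {"ISO8859-1", "ISO8859_1", "8859-1", "8859_1"};
        for (String name : candidates) {
            System.out.println(name + " -> " + isUsable(name));
        }
    }
}
```

Running this on each JDK of interest shows which entries are safe to hand to `String.getBytes(String)`; the result depends on the alias table of the JVM in use.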
[jira] [Updated] (XALANJ-2614) Serializer 2.7.2 / Xalan 2.7.2 - Bug using Mime-Encoding 'ISO-8859-1'
[ https://issues.apache.org/jira/browse/XALANJ-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jens Annighöfer updated XALANJ-2614:
Description:
We found a problem using Xalan / Serializer with Java 9 and 10 when transforming an XML document with a stylesheet containing an output-encoding.
{code:xml|title=Simple input|borderStyle=solid} This is a test input. {code}
{code:xml|title=Simple stylesheet containing an output-encoding|borderStyle=solid} http://www.w3.org/1999/XSL/Transform; > Tramsformed text: {code}
{code:java|title=Simple transformation code|borderStyle=solid}
@Test
public void test2() throws Exception {
    final InputStream is1 = Java9Test.class.getClassLoader().getResourceAsStream("test/Input.xml");
    assertNotNull(is1);
    final Document input = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(is1);
    assertNotNull(input);
    final InputStream is2 = Java9Test.class.getClassLoader().getResourceAsStream("test/Transform.xsl");
    assertNotNull(is2);
    final OutputStream os = new FileOutputStream("Output-" + System.getProperty("java.version") + ".txt", false);
    StreamSource xsl = new StreamSource(is2);
    Transformer t = TransformerFactory.newInstance().newTransformer(xsl);
    DOMSource src = new DOMSource(input);
    t.transform(src, new StreamResult(os));
}
{code}
Using Java 7 or Java 8 the result is correct: \{{Tramsformed text: This is a test input.}}. Using Java 9 or Java 10 the result is not correct: {noformat} {noformat} indicating an invalid or unknown encoding.
In Java 7 or Java 8 _org.apache.xml.serializer.Encodings.getEncodingInfo("ISO-8859-1")_ returns "ISO8859_1", which is a valid Java encoding name. In Java 9 or Java 10 the method returns "8859-1", which is not a valid name. The problem is caused by a change to the method _keys()_ in the _java.util.Properties_ class: since Java 9, this method returns the entries of _Encodings.properties_ in a different order.
> Serializer 2.7.2 / Xalan 2.7.2 - Bug using Mime-Encoding 'ISO-8859-1' > - > > Key: XALANJ-2614 > URL: https://issues.apache.org/jira/browse/XALANJ-2614 > Project: XalanJ2 > Issue Type: Bug > Security Level: No security risk; visible to anyone(Ordinary problems in > Xalan projects. Anybody can view the issue.) > Components: Serialization >Affects Versions: 2.7.2 > Environment: Windows 10, Linux > Java 9, Java 10 >Reporter: Jens Annighöfer >Assignee: Steven J. Hathaway >Priority: Critical > Fix For: The Latest
[jira] [Updated] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesper Steen Møller updated XALANJ-2419: Attachment: XALANJ-2419-tests-v3.txt > Astral characters written as a pair of NCRs with the surrogate scalar values > when using UTF-8 > - > > Key: XALANJ-2419 > URL: https://issues.apache.org/jira/browse/XALANJ-2419 > Project: XalanJ2 > Issue Type: Bug > Components: Serialization >Affects Versions: 2.7.1 >Reporter: Henri Sivonen >Priority: Major > Attachments: XALANJ-2419-fix-v2.txt, XALANJ-2419-fix-v3.txt, > XALANJ-2419-tests-v2.txt, XALANJ-2419-tests-v3.txt > > > org.apache.xml.serializer.ToStream contains the following code: > else if (m_encodingInfo.isInEncoding(ch)) { > // If the character is in the encoding, and > // not in the normal ASCII range, we also > // just leave it get added on to the clean characters > > } > else { > // This is a fallback plan, we should never get here > // but if the character wasn't previously handled > // (i.e. isn't in the encoding, etc.) then what > // should we do? We choose to write out an entity > writeOutCleanChars(chars, i, lastDirtyCharProcessed); > writer.write(" <
[jira] [Updated] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesper Steen Møller updated XALANJ-2419: Attachment: XALANJ-2419-fix-v3.txt > Astral characters written as a pair of NCRs with the surrogate scalar values > when using UTF-8 > - > > Key: XALANJ-2419 > URL: https://issues.apache.org/jira/browse/XALANJ-2419 > Project: XalanJ2 > Issue Type: Bug > Components: Serialization >Affects Versions: 2.7.1 >Reporter: Henri Sivonen >Priority: Major > Attachments: XALANJ-2419-fix-v2.txt, XALANJ-2419-fix-v3.txt, > XALANJ-2419-tests-v2.txt, XALANJ-2419-tests-v3.txt > > > org.apache.xml.serializer.ToStream contains the following code: > else if (m_encodingInfo.isInEncoding(ch)) { > // If the character is in the encoding, and > // not in the normal ASCII range, we also > // just leave it get added on to the clean characters > > } > else { > // This is a fallback plan, we should never get here > // but if the character wasn't previously handled > // (i.e. isn't in the encoding, etc.) then what > // should we do? We choose to write out an entity > writeOutCleanChars(chars, i, lastDirtyCharProcessed); > writer.write(" <
[jira] [Comment Edited] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775205#comment-16775205 ]
Jesper Steen Møller edited comment on XALANJ-2419 at 2/22/19 2:43 PM:
--
Ok, now I get what's wrong with the encoding. It's bad. The file `Encodings.properties` contains mappings between several "Java names" (which probably made sense in the last millennium) and MIME names (which are what should be in the XML inputs). There's some logic to only register the first, but since that's iterating from a Properties object, that will NOT represent the order in the file. In other words, unpredictable. That's why it suddenly worked when you specified ISO8859_1. I don't know what changed for Java 11, it could be the hashtable ordering, or the accepted charset names, but the crux is that "8859-1" is NOT an acceptable encoding name in Java 11:
{code:java}
jshell> "\u00e8".getBytes("ISO-8859-1")
$1 ==> byte[1] { -24 }
jshell> "\u00e8".getBytes("8859_1")
$2 ==> byte[1] { -24 }
jshell> "\u00e8".getBytes("8859-1")
| Exception java.io.UnsupportedEncodingException: 8859-1
| at StringCoding.encode (StringCoding.java:427)
| at String.getBytes (String.java:941)
| at (#3:1)
jshell> "\u00e8".getBytes("ISO8859-1")
$4 ==> byte[1] { -24 }
jshell> "\u00e8".getBytes("ISO8859_1")
$5 ==> byte[1] { -24 }
jshell>
{code}
Possible fix: Remove the line "8859-1 ISO-8859-1 0x00FF" and similar patterns from `Encodings.properties`?
was (Author: jespersm): Ok, now I get what's wrong with the encoding. It's bad. The file `Encodings.properties` contains mappings between several "Java names" (which probably made sense in the last millennium) and MIME names (which are what should be in the XML inputs). There's some logic to only register the first, but since that's iterating from a Properties object, that will NOT represent the order in the file. In other words, unpredictable. That's why it suddenly worked when you specified ISO8859_1.
I don't know what changed for Java 11, it could be the hashtable ordering, or the accepted charset names, but the crux is that "8859-1" is NOT an acceptable Java name: {code:java} jshell> "\u00e8".getBytes("ISO-8859-1") $1 ==> byte[1] { -24 } jshell> "\u00e8".getBytes("8859_1") $2 ==> byte[1] { -24 } jshell> "\u00e8".getBytes("8859-1") | Exception java.io.UnsupportedEncodingException: 8859-1 | at StringCoding.encode (StringCoding.java:427) | at String.getBytes (String.java:941) | at (#3:1) jshell> "\u00e8".getBytes("ISO8859-1") $4 ==> byte[1] { -24 } jshell> "\u00e8".getBytes("ISO8859_1") $5 ==> byte[1] { -24 } jshell> {code} Possible fix: Remove the line "8859-1 ISO-8859-1 0x00FF" and similar patterns from `Encodings.property`? > Astral characters written as a pair of NCRs with the surrogate scalar values > when using UTF-8 > - > > Key: XALANJ-2419 > URL: https://issues.apache.org/jira/browse/XALANJ-2419 > Project: XalanJ2 > Issue Type: Bug > Components: Serialization >Affects Versions: 2.7.1 >Reporter: Henri Sivonen >Priority: Major > Attachments: XALANJ-2419-fix-v3.txt, XALANJ-2419-tests-v3.txt > > > org.apache.xml.serializer.ToStream contains the following code: > else if (m_encodingInfo.isInEncoding(ch)) { > // If the character is in the encoding, and > // not in the normal ASCII range, we also > // just leave it get added on to the clean characters > > } > else { > // This is a fallback plan, we should never get here > // but if the character wasn't previously handled > // (i.e. isn't in the encoding, etc.) then what > // should we do? We choose to write out an entity > writeOutCleanChars(chars, i, lastDirtyCharProcessed); > writer.write("&#
[jira] [Commented] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775205#comment-16775205 ]
Jesper Steen Møller commented on XALANJ-2419:
-
Ok, now I get what's wrong with the encoding. It's bad. The file `Encodings.properties` contains mappings between several "Java names" (which probably made sense in the last millennium) and MIME names (which are what should be in the XML inputs). There's some logic to only register the first, but since that's iterating from a Properties object, that will NOT represent the order in the file. In other words, unpredictable. That's why it suddenly worked when you specified ISO8859_1. I don't know what changed for Java 11, it could be the hashtable ordering, or the accepted charset names, but the crux is that "8859-1" is NOT an acceptable Java name:
{code:java}
jshell> "\u00e8".getBytes("ISO-8859-1")
$1 ==> byte[1] { -24 }
jshell> "\u00e8".getBytes("8859_1")
$2 ==> byte[1] { -24 }
jshell> "\u00e8".getBytes("8859-1")
| Exception java.io.UnsupportedEncodingException: 8859-1
| at StringCoding.encode (StringCoding.java:427)
| at String.getBytes (String.java:941)
| at (#3:1)
jshell> "\u00e8".getBytes("ISO8859-1")
$4 ==> byte[1] { -24 }
jshell> "\u00e8".getBytes("ISO8859_1")
$5 ==> byte[1] { -24 }
jshell>
{code}
Possible fix: Remove the line "8859-1 ISO-8859-1 0x00FF" and similar patterns from `Encodings.properties`?
> Astral characters written as a pair of NCRs with the surrogate scalar values > when using UTF-8 > ----- > > Key: XALANJ-2419 > URL: https://issues.apache.org/jira/browse/XALANJ-2419 > Project: XalanJ2 > Issue Type: Bug > Components: Serialization >Affects Versions: 2.7.1 >Reporter: Henri Sivonen >Priority: Major > Attachments: XALANJ-2419-fix-v3.txt, XALANJ-2419-tests-v3.txt > > > org.apache.xml.serializer.ToStream contains the following code: > else if (m_encodingInfo.isInEncoding(ch)) { > // If the character is in the encoding, and > // not in the normal ASCII range, we also > // just leave it get added on to the clean characters > > } > else { > // This is a fallback plan, we should never get here > // but if the character wasn't previously handled > // (i.e. isn't in the encoding, etc.) then what > // should we do? We choose to write out an entity > writeOutCleanChars(chars, i, lastDirtyCharProcessed); > writer.write("&#
[jira] [Commented] (XALANC-803) MinGW build is broken
[ https://issues.apache.org/jira/browse/XALANC-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419673#comment-17419673 ] Bence Blázsovics commented on XALANC-803: - ^^fixed the issue, thank you! ||=== Build finished: 0 error(s), 120 warning(s) (0 minute(s), 44 second(s)) ===| > MinGW build is broken > - > > Key: XALANC-803 > URL: https://issues.apache.org/jira/browse/XALANC-803 > Project: XalanC > Issue Type: Bug > Components: XalanC >Affects Versions: 1.12 >Reporter: Roger Leigh >Assignee: Roger Leigh >Priority: Major > Fix For: 1.13 > > Original Estimate: 48h > Remaining Estimate: 48h > > HANDLE and other macros not defined. Likely need to check if it should be > using the Unix or Windows includes and types, and adjust to make MinGW work. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Commented] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810119#comment-17810119 ]
Cédric Damioli commented on XALANJ-2419:
I totally agree with you in theory, but the fact is that 2.7.2 and 2.7.3 were *not* released from master, or am I wrong here? There is a 2_7_x_maint branch, but it has had no commits in 5 years. I'm afraid we've lost some commits with the loss of 2_7_1_maint. Or were all commits on 2_7_1_maint from the last few years actually backports from master?
> Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
> -
>
> Key: XALANJ-2419
> URL: https://issues.apache.org/jira/browse/XALANJ-2419
> Project: XalanJ2
> Issue Type: Bug
> Components: Serialization
> Affects Versions: 2.7.1
> Reporter: Henri Sivonen
> Assignee: Joe Kesselman
> Priority: Major
> Fix For: The Latest Development Code
>
> Attachments: XALANJ-2419-fix-v3.txt, XALANJ-2419-tests-v3.txt
>
> org.apache.xml.serializer.ToStream contains the following code:
> else if (m_encodingInfo.isInEncoding(ch)) {
> // If the character is in the encoding, and
> // not in the normal ASCII range, we also
> // just leave it get added on to the clean characters
>
> }
> else {
> // This is a fallback plan, we should never get here
> // but if the character wasn't previously handled
> // (i.e. isn't in the encoding, etc.) then what
> // should we do? We choose to write out an entity
> writeOutCleanChars(chars, i, lastDirtyCharProcessed);
> writer.write("&#
[jira] [Commented] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810114#comment-17810114 ]
Cédric Damioli commented on XALANJ-2419:
I may be wrong here, but I think 2.7.2 and 2.7.3 were not released from master but from some maintenance branch, something like 2_7_1_maint IIRC. By the way, I can't find that branch anymore in the GitHub repo. Do you know where it is?
> Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
> -
>
> Key: XALANJ-2419
> URL: https://issues.apache.org/jira/browse/XALANJ-2419
> Project: XalanJ2
> Issue Type: Bug
> Components: Serialization
> Affects Versions: 2.7.1
> Reporter: Henri Sivonen
> Assignee: Joe Kesselman
> Priority: Major
> Attachments: XALANJ-2419-fix-v3.txt, XALANJ-2419-tests-v3.txt
>
> org.apache.xml.serializer.ToStream contains the following code:
> else if (m_encodingInfo.isInEncoding(ch)) {
> // If the character is in the encoding, and
> // not in the normal ASCII range, we also
> // just leave it get added on to the clean characters
>
> }
> else {
> // This is a fallback plan, we should never get here
> // but if the character wasn't previously handled
> // (i.e. isn't in the encoding, etc.) then what
> // should we do? We choose to write out an entity
> writeOutCleanChars(chars, i, lastDirtyCharProcessed);
> writer.write("&#
[jira] [Commented] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809615#comment-17809615 ]
Cédric Damioli commented on XALANJ-2419:
Good to hear, [~kesh...@alum.mit.edu]! I can confirm that the patch provided here actually works, but the issue pointed out by [~maxfortun] still exists. Feel free to ask if you want help reviewing a patch or testing something. I think many of us here would be really glad to see this issue finally resolved!
> Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
> -
>
> Key: XALANJ-2419
> URL: https://issues.apache.org/jira/browse/XALANJ-2419
> Project: XalanJ2
> Issue Type: Bug
> Components: Serialization
> Affects Versions: 2.7.1
> Reporter: Henri Sivonen
> Assignee: Joe Kesselman
> Priority: Major
> Attachments: XALANJ-2419-fix-v3.txt, XALANJ-2419-tests-v3.txt
>
> org.apache.xml.serializer.ToStream contains the following code:
> else if (m_encodingInfo.isInEncoding(ch)) {
> // If the character is in the encoding, and
> // not in the normal ASCII range, we also
> // just leave it get added on to the clean characters
>
> }
> else {
> // This is a fallback plan, we should never get here
> // but if the character wasn't previously handled
> // (i.e. isn't in the encoding, etc.) then what
> // should we do? We choose to write out an entity
> writeOutCleanChars(chars, i, lastDirtyCharProcessed);
> writer.write("&#
[jira] [Commented] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810526#comment-17810526 ] Jesper Steen Møller commented on XALANJ-2419: - Great work getting this forward, Joe! Which test(s) are you seeing the problem with? ISO-8859-1 output should not be affected by the patch. > Astral characters written as a pair of NCRs with the surrogate scalar values > when using UTF-8 > - > > Key: XALANJ-2419 > URL: https://issues.apache.org/jira/browse/XALANJ-2419 > Project: XalanJ2 > Issue Type: Bug > Components: Serialization >Affects Versions: 2.7.1 >Reporter: Henri Sivonen >Assignee: Joe Kesselman >Priority: Major > Fix For: The Latest Development Code > > Attachments: XALANJ-2419-fix-v3.txt, XALANJ-2419-tests-v3.txt > > > org.apache.xml.serializer.ToStream contains the following code: > else if (m_encodingInfo.isInEncoding(ch)) { > // If the character is in the encoding, and > // not in the normal ASCII range, we also > // just leave it get added on to the clean characters > > } > else { > // This is a fallback plan, we should never get here > // but if the character wasn't previously handled > // (i.e. isn't in the encoding, etc.) then what > // should we do? We choose to write out an entity > writeOutCleanChars(chars, i, lastDirtyCharProcessed); > writer.write("&#
[jira] [Commented] (XALANJ-2618) Error in org/apache/xml/serializer/Encodings.properties
[ https://issues.apache.org/jira/browse/XALANJ-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17811445#comment-17811445 ]
Cédric Damioli commented on XALANJ-2618:
Hi [~kesh...@alum.mit.edu], I can confirm that with your change, my use case is now working (tested in Java 17).
> Error in org/apache/xml/serializer/Encodings.properties
> ---
>
> Key: XALANJ-2618
> URL: https://issues.apache.org/jira/browse/XALANJ-2618
> Project: XalanJ2
> Issue Type: Bug
> Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
> Components: Serialization, transformation
> Affects Versions: 2.7.2
> Environment: Java 11
> Reporter: Simon Schaarschmidt
> Assignee: Steven J. Hathaway
> Priority: Major
> Labels: Java11
>
> We transform and serialize using encoding ISO-8859-1. With JDK 1.8 all is fine, but with OpenJDK 11 the result is written (from class ToTextStream) as character references, e.g. "*&#105;&#100;&#61;&#49;*" instead of "*id=1*".
> org/apache/xml/serializer/Encodings.properties (serializer.jar) defines various encodings, e.g.
> {{ISO8859-1 ISO-8859-1 0x00FF}}
> {{ISO8859_1 ISO-8859-1 0x00FF}}
> {{8859-1 ISO-8859-1 0x00FF}}
> {{8859_1 ISO-8859-1 0x00FF}}
> First value: Java encoding name
> Second value: comma-separated preferred MIME names.
> The class org.apache.xml.serializer.Encodings reads this file into a Properties object, processes the definitions to create EncodingInfo objects, and puts them (see method loadEncodingInfo()) into the member fields _encodingTableKeyJava and _encodingTableKeyMime (both Hashtable).
> Putting elements into _encodingTableKeyMime is especially problematic because the mapping is not 1:1, and the last element returned by Properties.keys() replaces the previous EncodingInfo object.
> Up to Java 1.8, the first line above is the last entry in the Enumeration, therefore _encodingTableKeyMime returns the EncodingInfo object with Java encoding "ISO8859-1" for encoding "ISO-8859-1". With Java 11 the elements of the Enumeration returned by Properties.keys() have a different order: the third line above is the last entry! Therefore _encodingTableKeyMime returns the EncodingInfo object with Java encoding "*8859-1*" when asked for encoding "ISO-8859-1". But "8859-1" is not a valid Java encoding name! Method EncodingInfo.inEncoding(char,String) fails internally with an *UnsupportedEncodingException* and returns false.
> The methods in class Encodings first search for the EncodingInfo object in _encodingTableKeyJava and use elements from _encodingTableKeyMime as a fallback.
> I suggest extending the definitions in Encodings.properties with additional lines, e.g.
> {{*ISO-8859-1* ISO-8859-1 0x00FF}}
> and likewise for encodings ISO-8859-2..9. Alternatively, all entries with Java encoding name "8859*" should be removed. (They are not valid Java encoding names: UnsupportedEncodingException!)
> Finally, I think the current mechanism of collecting the EncodingInfo objects using two Hashtables is problematic.
--
This message was sent by Atlassian Jira (v8.20.10#820010)
-
To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org
For additional commands, e-mail: dev-h...@xalan.apache.org
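The last-entry-wins behaviour described in this report can be avoided by processing the definitions in a fixed order and keeping, for each MIME name, the first Java name that the running JVM actually supports. The following is a minimal sketch under stated assumptions and is not the change that was made in Xalan: the class `MimeToJavaName` and its methods are hypothetical, and the input is assumed to be a list of lines in the "javaName mimeName[,mimeName...] highChar" form quoted above, already in the desired priority order.

```java
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class MimeToJavaName {

    /** True if the running JVM resolves this charset name or alias. */
    static boolean supported(String javaName) {
        try {
            return Charset.isSupported(javaName);
        } catch (IllegalArgumentException e) {
            return false; // syntactically illegal charset name
        }
    }

    /**
     * Builds a MIME name -> Java name map from definition lines, in the order given.
     * For each MIME name the first supported Java name wins; later lines never replace it.
     */
    public static Map<String, String> build(List<String> lines) {
        Map<String, String> byMime = new LinkedHashMap<>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 2) {
                continue; // blank or malformed line
            }
            String javaName = fields[0];
            if (!supported(javaName)) {
                continue; // skip names the JVM would reject with UnsupportedEncodingException
            }
            for (String mime : fields[1].split(",")) {
                byMime.putIfAbsent(mime.toUpperCase(Locale.ROOT), javaName);
            }
        }
        return byMime;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "ISO8859-1 ISO-8859-1 0x00FF",
                "ISO8859_1 ISO-8859-1 0x00FF",
                "8859-1 ISO-8859-1 0x00FF",
                "8859_1 ISO-8859-1 0x00FF");
        System.out.println(build(lines));
    }
}
```

Because the result no longer depends on the enumeration order of a `Hashtable`, the same input yields the same mapping on every Java version that supports the chosen name.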
[jira] [Commented] (XALANJ-2625) Text output in ISO-8859-1 in Java 11
[ https://issues.apache.org/jira/browse/XALANJ-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17811569#comment-17811569 ] Cédric Damioli commented on XALANJ-2625: [~kesh...@alum.mit.edu] this one should be resolved as duplicate of XALANJ-2618 > Text output in ISO-8859-1 in Java 11 > - > > Key: XALANJ-2625 > URL: https://issues.apache.org/jira/browse/XALANJ-2625 > Project: XalanJ2 > Issue Type: Bug > Security Level: No security risk; visible to anyone(Ordinary problems in > Xalan projects. Anybody can view the issue.) > Components: Xalan >Affects Versions: 2.7.2 >Reporter: Daniel van den Ouden >Assignee: Gary D. Gregory >Priority: Minor > > We're currently in the process of upgrading our builds from Java 8 to Java 11 > and we've run into the following issue: > Given the following XML > {noformat} > > http://www.w3.org/2001/XMLSchema-instance; > xsi:noNamespaceSchemaLocation="../xsd/DBSettings.xsd"> > > > > > > > > {noformat} > and the following XSL > {noformat} > > xmlns:xsl="http://www.w3.org/1999/XSL/Transform; > xmlns:fo="http://www.w3.org/1999/XSL/Format;> >indent="yes"/> > > db:// > > > : > > / >select="/Settings/Database/User/@password" /> > @ >select="/Settings/Database/Database/@value" /> > > > {noformat} > We would expect the output to be > {noformat} > db://Oracle:fgi_user/fgi@UTF8 > {noformat} > But with Java11, the output becomes > {noformat} > > {noformat} > And the console gets flooded with messages like > {noformat} > Attempt to output character of integral value 100 that is not represented in > specified output encoding of ISO-8859-1. > Attempt to output character of integral value 98 that is not represented in > specified output encoding of ISO-8859-1. > Attempt to output character of integral value 58 that is not represented in > specified output encoding of ISO-8859-1. > Attempt to output character of integral value 47 that is not represented in > specified output encoding of ISO-8859-1. 
> Attempt to output character of integral value 47 that is not represented in specified output encoding of ISO-8859-1.
> {noformat}
> The problem seems to be caused by org.apache.xml.serializer.Encodings.java.
> In loadEncodingInfo(), a properties file is read (org.apache.xml.serializer.Encodings.properties) containing a Java encoding name and the associated MIME name that may appear in a stylesheet. For ISO-8859-1, it contains the following entries in this order:
> {noformat}
> ISO8859-1 ISO-8859-1 0x00FF
> ISO8859_1 ISO-8859-1 0x00FF
> 8859-1 ISO-8859-1 0x00FF
> 8859_1 ISO-8859-1 0x00FF
> {noformat}
> The loadEncodingInfo() method iterates over these entries, but the order differs between Java 8 and Java 11.
> Java 8:
> {noformat}
> ISO8859-1
> 8859_1
> 8859-1
> ISO8859_1
> {noformat}
> Java 11:
> {noformat}
> ISO8859-1
> ISO8859_1
> 8859_1
> 8859-1
> {noformat}
> Every entry is put in the _encodingTableKeyJava map using the Java name as key, and in the _encodingTableKeyMime hashtable using the MIME name as key.
> In our case, the method getEncodingInfo(String encoding) is called with "encoding" having the value "ISO-8859-1". First the _encodingTableKeyJava map is checked; it doesn't contain the key "ISO-8859-1". Then the _encodingTableKeyMime map is checked, which contains the last entry that was processed from the properties file with a matching MIME name. Then the Java name of that entry is used to build a new EncodingInfo object and perform the actual encoding using the String class.
> The problem here is that with Java 11, the last entry from the properties file is "8859-1". This is NOT an alias for the actual ISO-8859-1 encoding. With Java 8, the last entry would be "ISO8859_1", which IS an alias for ISO-8859-1.
> The aliases as I found them are:
> {noformat}
> ISO-8859-1
> 819
> ISO8859-1
> l1
> ISO_8859-1:1987
> ISO_8859-1
> 8859_1
> iso-
[jira] [Commented] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17811574#comment-17811574 ]
Cédric Damioli commented on XALANJ-2725:
It would be great to solve all open encoding issues and then be able to make a release. I think [~maxfortun]'s proposal, even if it doesn't cover all potential cases, may still be a good achievement. What is missing? Do all tests pass? I'll test the PR on my edge cases.
> Possible buffer-boundry issue when serializing surrogate pairs
>
> Key: XALANJ-2725
> URL: https://issues.apache.org/jira/browse/XALANJ-2725
> Project: XalanJ2
> Issue Type: Improvement
> Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
> Components: Serialization
> Reporter: Joe Kesselman
> Assignee: Joe Kesselman
> Priority: Major
> Labels: Surrogates, escaping, unicode, utf
> Attachments: astral-chars-split-buffer.patch
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> XALANJ-2419 addressed a case where "astral" Unicode characters, requiring a surrogate pair (two UTF-16 units), were not being serialized correctly. We have a proposed fix for that.
> There is reported to still be an edge case where a surrogate pair that crosses buffer boundaries might not be handled correctly. [~maxfortun] offered what looks like a reasonable proposed fix (https://github.com/maxfortun/xalan-j/blob/a9bd5591d9f8a523548aeec091e886b64c691628/src/org/apache/xml/serializer/ToStream.java#L1607), but in my testing this was not serializing the surrogate pairs correctly, causing regressions on the tests XALANJ-2419 introduced. I don't know whether that's because we're taking multiple paths through
> But the edge case does appear to be real, and if so we will need some such solution.
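To make the edge case described above concrete, here is a minimal sketch of one way a writer could carry a high surrogate across two buffer chunks and still emit a single numeric character reference. It is not Xalan's ToStream code and not the proposed patch. The class name SurrogateCarrySketch, its methods, the choice to escape every non-ASCII character, and the use of U+FFFD for unpaired surrogates are all assumptions made for illustration.

```java
public class SurrogateCarrySketch {
    private final StringBuilder out = new StringBuilder();
    // High surrogate left over from the previous chunk; 0 means "nothing pending".
    private char pendingHigh;

    /** Consumes one chunk of UTF-16 units; a surrogate pair may span two calls. */
    public void write(char[] chunk, int start, int length) {
        int end = start + length;
        for (int i = start; i < end; i++) {
            char ch = chunk[i];
            if (pendingHigh != 0) {
                char high = pendingHigh;
                pendingHigh = 0;
                if (Character.isLowSurrogate(ch)) {
                    // Pair completed (possibly across a chunk boundary): one NCR.
                    appendNcr(Character.toCodePoint(high, ch));
                    continue;
                }
                // Unpaired high surrogate: assumed policy is to emit U+FFFD.
                appendNcr(0xFFFD);
            }
            if (Character.isHighSurrogate(ch)) {
                pendingHigh = ch; // wait for the low half, here or in the next chunk
            } else if (Character.isLowSurrogate(ch)) {
                appendNcr(0xFFFD); // low surrogate without a preceding high one
            } else if (ch < 0x80) {
                out.append(ch); // ASCII passes through (no markup escaping in this sketch)
            } else {
                appendNcr(ch);
            }
        }
    }

    /** Flushes any dangling high surrogate and returns the accumulated text. */
    public String finish() {
        if (pendingHigh != 0) {
            appendNcr(0xFFFD);
            pendingHigh = 0;
        }
        return out.toString();
    }

    private void appendNcr(int codePoint) {
        out.append("&#").append(codePoint).append(';');
    }
}
```

The point of the sketch is that the pending state lives in a field rather than in a local loop variable, so it survives between write() calls. Whether the real serializer can keep such state on every code path is exactly the open question in this issue.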
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Commented] (XALANJ-2617) Serializer produces separately escaped surrogate pair instead of codepoint
[ https://issues.apache.org/jira/browse/XALANJ-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17811567#comment-17811567 ]
Cédric Damioli commented on XALANJ-2617:
[~kesh...@alum.mit.edu] I think you may also mark this one as resolved, as a duplicate of XALANJ-2419?
> Serializer produces separately escaped surrogate pair instead of codepoint
>
> Key: XALANJ-2617
> URL: https://issues.apache.org/jira/browse/XALANJ-2617
> Project: XalanJ2
> Issue Type: Bug
> Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
> Components: Serialization, Xalan
> Affects Versions: 2.7.1, 2.7.2
> Reporter: Daniel Kec
> Assignee: Steven J. Hathaway
> Priority: Major
> Attachments: JI9053942.java, XALANJ-2617_Fix_missing_surrogate_pairs_support.patch, XALANJ-2617_java.patch, XALANJ-2617_test.patch
>
> When trying to serialize XML with a character consisting of the Unicode surrogate pair "\uD840\uDC0B", I have tried several and none worked. The XML Transformer creates an XML string in which the two surrogates are escaped separately, which makes the XML unparseable, e.g.: SAXParseException; Character reference "" is an invalid XML character. It looks like a bug introduced in the XALANJ-2271 fix.
>
> {code:java|title=Output of Xalan ver. 2.7.2}
> kec@phoebe:~/Downloads$ java -version
> java version "1.8.0_171"
> Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
> Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
> kec@phoebe:~/Downloads$ java -cp /home/kec/.m2/repository/xml-apis/xml-apis/1.4.01/xml-apis-1.4.01.jar:/home/kec/.m2/repository/xalan/xalan/2.7.2/xalan-2.7.2.jar:/home/kec/.m2/repository/xalan/serializer/2.7.2/serializer-2.7.2.jar:. JI9053942
> Character:
> EXPECTED:
> ACTUAL:
> [Fatal Error] :1:50: Character reference "&#
[jira] [Commented] (XALANJ-2618) Error in org/apache/xml/serializer/Encodings.properties
[ https://issues.apache.org/jira/browse/XALANJ-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17811491#comment-17811491 ]
Cédric Damioli commented on XALANJ-2618:
Thanks for asking. I will remove my PR. Yours seems indeed better. I really don't need to be credited (you may mention our discussions and why your PR is better than mine in PR comments, for the record). What we all need is a Xalan release fixing all pending encoding issues.
> Error in org/apache/xml/serializer/Encodings.properties
>
> Key: XALANJ-2618
> URL: https://issues.apache.org/jira/browse/XALANJ-2618
> Project: XalanJ2
> Issue Type: Bug
> Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
> Components: Serialization, transformation
> Affects Versions: 2.7.2
> Environment: Java 11
> Reporter: Simon Schaarschmidt
> Assignee: Steven J. Hathaway
> Priority: Major
> Labels: Java11
>
> We transform and serialize using encoding ISO-8859-1. With JDK 1.8 all is fine, but with OpenJDK 11 the result is written (from class ToTextStream) as character references, e.g. "&#105;&#100;&#61;&#49;" instead of "id=1".
> org/apache/xml/serializer/Encodings.properties (in serializer.jar) defines various encodings, e.g.
> {{ISO8859-1 ISO-8859-1 0x00FF}}
> {{ISO8859_1 ISO-8859-1 0x00FF}}
> {{8859-1 ISO-8859-1 0x00FF}}
> {{8859_1 ISO-8859-1 0x00FF}}
> First value: Java encoding name
> Second value: comma-separated preferred MIME names.
> The class org.apache.xml.serializer.Encodings reads this file into a Properties object, processes the definitions to create EncodingInfo objects, and puts them (see method loadEncodingInfo()) into the member fields _encodingTableKeyJava and _encodingTableKeyMime (both Hashtable).
> Putting elements into _encodingTableKeyMime is especially problematic because the mapping is not 1:1: the last element returned by Properties.keys() replaces the previous EncodingInfo object.
> Up to Java 1.8, the first line above is the last entry in the Enumeration, therefore _encodingTableKeyMime returns the EncodingInfo object with Java encoding "ISO8859-1" for encoding "ISO-8859-1". With Java 11, the elements of the Enumeration returned by Properties.keys() have a different order: the third line above is the last entry! Therefore _encodingTableKeyMime returns the EncodingInfo object with Java encoding "8859-1" when asked for encoding "ISO-8859-1". But "8859-1" is not a valid Java encoding name! Method EncodingInfo.inEncoding(char,String) fails internally with an UnsupportedEncodingException and returns false.
> The methods in class Encodings first search for the EncodingInfo object in _encodingTableKeyJava and use elements from _encodingTableKeyMime as a fallback.
> I suggest extending the definitions in Encodings.properties with additional lines, e.g.
> {{ISO-8859-1 ISO-8859-1 0x00FF}}
> Also for encodings ISO-8859-2..9. Or all entries with Java encoding name "8859*" should be removed. (They are not valid Java encoding names: UnsupportedEncodingException!)
> Finally, I think the current mechanism of collecting the EncodingInfo objects using two Hashtables is problematic.
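The "last entry wins" behaviour described in this issue can be reproduced in isolation. The following is a minimal sketch, not the code of org.apache.xml.serializer.Encodings. The class EncodingLookupSketch and its methods are hypothetical, and the two iteration orders are illustrative lists chosen only so that they end with the entry the report names for each Java version.

```java
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EncodingLookupSketch {

    /** Registers every Java name under the same MIME key; each put overwrites the previous one. */
    public static String lastJavaNameFor(String mimeName, List<String> javaNamesInIterationOrder) {
        Map<String, String> byMime = new HashMap<>();
        for (String javaName : javaNamesInIterationOrder) {
            byMime.put(mimeName, javaName);
        }
        return byMime.get(mimeName);
    }

    /** Asks the running JVM whether it knows a charset under this name. */
    public static boolean jvmSupports(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) { // syntactically illegal charset name
            return false;
        }
    }

    public static void main(String[] args) {
        // Illustrative orders only: the report states which entry comes last, not the full order.
        List<String> olderOrder = Arrays.asList("ISO8859_1", "8859_1", "8859-1", "ISO8859-1");
        List<String> newerOrder = Arrays.asList("ISO8859-1", "ISO8859_1", "8859_1", "8859-1");
        for (List<String> order : Arrays.asList(olderOrder, newerOrder)) {
            String winner = lastJavaNameFor("ISO-8859-1", order);
            System.out.println(order + " -> " + winner + ", supported by this JVM: " + jvmSupports(winner));
        }
    }
}
```

Running main on a given JVM shows which Java name survives for each order and whether that JVM accepts it, which makes it easy to check a candidate Encodings.properties entry before relying on it.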
[jira] [Commented] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17811010#comment-17811010 ]
Cédric Damioli commented on XALANJ-2419:
Hi [~kesh...@alum.mit.edu], I've added a PR for XALANJ-2618, I suppose this will also fix your tests on this ticket
> Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
>
> Key: XALANJ-2419
> URL: https://issues.apache.org/jira/browse/XALANJ-2419
> Project: XalanJ2
> Issue Type: Bug
> Components: Serialization
> Affects Versions: 2.7.1
> Reporter: Henri Sivonen
> Assignee: Joe Kesselman
> Priority: Major
> Fix For: The Latest Development Code
> Attachments: XALANJ-2419-fix-v3.txt, XALANJ-2419-tests-v3.txt
>
> org.apache.xml.serializer.ToStream contains the following code:
>     else if (m_encodingInfo.isInEncoding(ch)) {
>         // If the character is in the encoding, and
>         // not in the normal ASCII range, we also
>         // just leave it get added on to the clean characters
>
>     }
>     else {
>         // This is a fallback plan, we should never get here
>         // but if the character wasn't previously handled
>         // (i.e. isn't in the encoding, etc.) then what
>         // should we do? We choose to write out an entity
>         writeOutCleanChars(chars, i, lastDirtyCharProcessed);
>         writer.write("&#
[jira] [Commented] (XALANJ-2618) Error in org/apache/xml/serializer/Encodings.properties
[ https://issues.apache.org/jira/browse/XALANJ-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17811338#comment-17811338 ]
Cédric Damioli commented on XALANJ-2618:
The Encodings.properties file has three columns. The first one is supposed to hold the Java encoding name, but the 8859-* names (without the ISO- prefix) are *not* proper Java encodings. The ISO-8859-* names, in contrast, are properly recognized and should be kept as is. Why do you think that it would be a regression? All tests seem to pass with these removals.
[jira] [Commented] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810716#comment-17810716 ]
Cédric Damioli commented on XALANJ-2419:
I think I may know this one! It reminds me of an issue with Encodings.properties being loaded in a different order in Java 8 and Java 9+, leading to issues on modern JVMs because it references nonexistent encodings. Could it be related? In my case I modified Encodings.properties by removing all 8859_* encodings and everything worked again.
I'm pretty sure that a Jira issue exists about this one, but I can't find it anymore...
[jira] [Comment Edited] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810716#comment-17810716 ]
Cédric Damioli edited comment on XALANJ-2419 at 1/25/24 7:37 AM:
I think I may know this one! It reminds me of an issue with Encodings.properties being loaded in a different order in Java 8 and Java 9+, leading to issues on modern JVMs because it references nonexistent encodings. Could it be related? In my case I modified Encodings.properties by removing all 8859_* encodings and everything worked again.
See XALANJ-2625 and XALANJ-2618
was (Author: cedric):
I think I may know this one! It reminds me of an issue with Encodings.properties being loaded in a different order in Java 8 and Java 9+, leading to issues on modern JVMs because it references nonexistent encodings. Could it be related? In my case I modified Encodings.properties by removing all 8859_* encodings and everything worked again.
I'm pretty sure that a Jira issue exists about this one, but I can't find it anymore...
[jira] [Commented] (XALANJ-2618) Error in org/apache/xml/serializer/Encodings.properties
[ https://issues.apache.org/jira/browse/XALANJ-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17811363#comment-17811363 ]
Cédric Damioli commented on XALANJ-2618:
I'll test your branch. Does your other work on XALANJ-2419 pass with this one?
[jira] [Commented] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820236#comment-17820236 ]
Cédric Damioli commented on XALANJ-2725:
Hi [~kesh...@alum.mit.edu], I just ran a few tests with your new branch, and all my previously failing tests are OK now. Do all unit tests also pass?
[jira] [Created] (XALANJ-2732) Xalan jar is missing META-INF/services
Cédric Damioli created XALANJ-2732:
Summary: Xalan jar is missing META-INF/services
Key: XALANJ-2732
URL: https://issues.apache.org/jira/browse/XALANJ-2732
Project: XalanJ2
Issue Type: Bug
Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
Affects Versions: The Latest Development Code
Reporter: Cédric Damioli

It seems that the Maven build does not copy xalan/src/main/java/META-INF/services into the jar, resulting in JAXP not finding the appropriate TransformerFactory.
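A quick way to check a built jar for this symptom is to ask a class loader for the JAXP provider-configuration file. The sketch below assumes the standard resource name META-INF/services/javax.xml.transform.TransformerFactory. The class ServicesCheckSketch is hypothetical and not part of Xalan.

```java
import java.net.URL;

public class ServicesCheckSketch {
    /** Provider-configuration file that JAXP looks for when locating a TransformerFactory. */
    public static final String RESOURCE =
            "META-INF/services/javax.xml.transform.TransformerFactory";

    /** Returns the location of the first matching file visible to the loader, or null if none. */
    public static URL findProviderFile(ClassLoader loader) {
        return loader.getResource(RESOURCE);
    }

    public static void main(String[] args) {
        URL url = findProviderFile(ClassLoader.getSystemClassLoader());
        System.out.println(url == null
                ? "no provider file visible on the classpath"
                : "provider file found at " + url);
    }
}
```

Run with the freshly built xalan jar on the classpath. A null result would be consistent with the services directory not having been packaged, although other jars on the classpath can also contribute a file of the same name.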
[jira] [Updated] (XALANJ-2578) Maven build system for Xalan-J
[ https://issues.apache.org/jira/browse/XALANJ-2578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uwe updated XALANJ-2578:
Attachment: xalan-java-trunk.patch
Patch adding Maven POM files

Maven build system for Xalan-J
Key: XALANJ-2578
URL: https://issues.apache.org/jira/browse/XALANJ-2578
Project: XalanJ2
Issue Type: Improvement
Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
Components: Xalan
Affects Versions: The Latest Development Code
Reporter: Uwe
Assignee: Steven J. Hathaway
Labels: build, maven
Fix For: The Latest Development Code
Attachments: xalan-java-trunk.patch
Original Estimate: 36h
Remaining Estimate: 36h

I have developed Maven POM files to build Xalan-J with Maven instead of Ant. The additions attempt to be non-intrusive, leaving the existing Ant build system untouched. For this reason, the POMs are not as lean as they could be (see below). I've run the minitest suite ('ant smoketest.gump'), and it all works fine on a fresh checkout.

This patch is intended as a basis for discussion, as there are still a few questions open:
* Will this replace the Ant build? (preferably yes, we could make the POMs much leaner by moving sources into standard Maven directories; less maintenance)
* What will be the target version number? (see below for details)
* How do we integrate testing? (it's currently in a separate project, and that's a good thing - but running it from Maven would be nice)

Overview
========
There is a project parent POM, which groups the project into the following modules:
* serializer (builds serializer.jar, from the org.apache.xml.serializer.* packages)
* xalan-impl (builds an intermediate jar from the rest of the sources)
* xalan (builds xalan.jar, using the Maven shade plugin to integrate xalan-impl and dependent libraries into an uber-jar - this replicates the output of the Ant build process)

Output artifacts (xalan.jar and serializer.jar) are placed in the 'build' directory, like in the Ant build. The groupId and artifactId are the same as the ones in Maven Central for Xalan-J 2.7.1.

Details
=======
In the Maven build, dependent libraries (BCEL, java_cup, regexp) are pulled from Maven Central and differ slightly in version from what's checked into SVN in the lib directory. The same goes for the tools directory; the Maven tooling uses artifacts from Maven Central and ignores the tools directory altogether.

Since the versioning scheme in the project differs from standard Maven versioning (2.7.D2 for a development version of 2.7.2 vs. its Maven equivalent 2.7.2-SNAPSHOT), I've left it with the Maven standard for now. The current version in the POMs is set to 2.8-SNAPSHOT, as I'd expect a change like this to go into a minor release rather than a bugfix release.

Because the versioning question is still open, the POMs leave Version.java in both serializer.jar and xalan.jar alone. Depending on what we decide to do with project versioning and whether or not we preserve the Ant build (preferably no), the way the Version classes operate will need to change (either read the version from a generated property file, read it from META-INF or use Maven filtering). Currently, the Ant build simply overwrites them and removes them on clean (even though they're checked in).
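As an illustration of the "read it from META-INF" option mentioned above, the sketch below reads the Implementation-Version manifest attribute through java.lang.Package. It assumes the build writes that attribute into the jar manifest, which the patch does not necessarily do. The class VersionSketch and the "unknown" fallback are hypothetical and are not the existing Version classes.

```java
public class VersionSketch {
    /** Returned when no Implementation-Version is available, e.g. when running from class files. */
    public static final String FALLBACK = "unknown";

    /** Looks up the manifest Implementation-Version of the jar that supplied the anchor class. */
    public static String versionOf(Class<?> anchor) {
        Package pkg = anchor.getPackage();
        String version = (pkg == null) ? null : pkg.getImplementationVersion();
        return (version == null) ? FALLBACK : version;
    }

    public static void main(String[] args) {
        System.out.println("version: " + versionOf(VersionSketch.class));
    }
}
```

The trade-off of this approach is that the version is only available once the classes are packaged, so an unpackaged development run reports the fallback value instead.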
[jira] [Created] (XALANC-760) Code analysis revealed multiple potential buffer overflows
Int3 created XALANC-760:
Summary: Code analysis revealed multiple potential buffer overflows
Key: XALANC-760
URL: https://issues.apache.org/jira/browse/XALANC-760
Project: XalanC
Issue Type: Bug
Components: XalanC
Affects Versions: 1.11
Reporter: Int3
Assignee: Steven J. Hathaway

src/xalanc/Harness/XalanXMLFileReporter.cpp
The float at line 490 can exceed 40 bytes in length (max double is 317 bytes)

src/xalanc/Utils/MsgCreator/MsgCreator.cpp
This utility lacks any buffer bounding to protect against buffer overflows

src/xalanc/Utils/MsgCreator/InMemHandler.cpp
This utility lacks any buffer bounding to protect against buffer overflows

src/xalanc/XalanExe/XalanExe.cpp
There is no upper bound on n_maxParams
[jira] [Created] (XALANJ-2598) unhandled exceptions
songwanging created XALANJ-2598:
Summary: unhandled exceptions
Key: XALANJ-2598
URL: https://issues.apache.org/jira/browse/XALANJ-2598
Project: XalanJ2
Issue Type: Improvement
Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
Reporter: songwanging
Assignee: Steven J. Hathaway
Priority: Minor

In method compileExtension() of class Compiler (\src\org\apache\xpath\compiler\Compiler.java):
The catch block catch (WrongNumberArgsException e) takes no action to handle the exception it expects, which makes the block useless. To fix this, we should either add code to the catch block to handle the exception, or delete the catch block.

    compileExtension(){
      ...
      try {
        …
      } catch (WrongNumberArgsException wnae) {
        ; // should never happen
      }
    }

=====

In method getDTM() of class DTMManagerDefault (src\org\apache\xml\dtm\ref\DTMManagerDefault.java):
The catch block catch (Exception e) takes no action to handle the exception it expects, which makes the block useless. To fix this, we should either add code to the catch block to handle the exception, or delete the catch block.

    getDTM(){
      ...
      try {
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", null);
      } catch (Exception e) {}
      ..
    }
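One possible reading of "add more code into the catch block" is sketched below: a "should never happen" branch is turned into a loud failure instead of being silently swallowed. This is not the Xalan source. CatchHandlingSketch, WrongArgsException and checkArgs are hypothetical stand-ins, and whether failing loudly is appropriate for the two methods named in this report is a judgment the maintainers would have to make.

```java
public class CatchHandlingSketch {

    /** Hypothetical stand-in for a checked exception such as WrongNumberArgsException. */
    public static class WrongArgsException extends Exception {
        public WrongArgsException(String message) {
            super(message);
        }
    }

    /** Hypothetical check that accepts exactly one argument. */
    static void checkArgs(int argCount) throws WrongArgsException {
        if (argCount != 1) {
            throw new WrongArgsException("expected 1 argument, got " + argCount);
        }
    }

    /** Surfaces the "impossible" case with its cause attached instead of ignoring it. */
    public static String compile(int argCount) {
        try {
            checkArgs(argCount);
            return "ok";
        } catch (WrongArgsException e) {
            // If this branch really can never run, failing here costs nothing;
            // if it can, the failure is reported rather than hidden.
            throw new IllegalStateException("unexpected argument-count failure", e);
        }
    }
}
```

For best-effort cleanup calls where a failure is genuinely harmless, a comment explaining why the exception is ignored may be a lighter alternative than rethrowing.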
[jira] [Commented] (XALANJ-2189) xalan:evaluate behavior seems to have changed in 2.7 and this change has broken existing applications
[ https://issues.apache.org/jira/browse/XALANJ-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15386352#comment-15386352 ] Boris commented on XALANJ-2189: --- We ran into the same bug. I poked a bit around, created a much smaller testcase and with this I found out - thanks to git bisect - that the commit that broke this was revision 338150 in the svn (which is commit fc90504a in the official git mirror). This commit is pretty big as it included support for JAXP 1.3, but I found the change to Variable.java pretty interesting, because it commented a "hack" in the code that (according to the comment) was "needed to evaluate xpaths from extensions": https://svn.apache.org/viewvc/xalan/java/trunk/src/org/apache/xpath/operations/Variable.java?r1=338103=338150 I guess this was removed because it was a hack and using "getVariableOrParam" in this case was deemed to be a better way, but this is what seems to break the evaluate function. I've still got no clue what exactly is going on, but reverting this part of the change does fix the bug. But of course without really understanding it I don't know what it might break regarding the JAXP 1.3 feature. I ran the minitest and smoketest and they seemed fine. I don't want to argue to go back to using the old hack without really knowing the consequences, but I thought at least I'd share what I found out so far. I'll also append the smaller testcase that I've used in case someone is interested. 
> xalan:evaluate behavior seems to have changed in 2.7 and this change has > broken existing applications > - > > Key: XALANJ-2189 > URL: https://issues.apache.org/jira/browse/XALANJ-2189 > Project: XalanJ2 > Issue Type: Bug > Components: Xalan-extensions >Affects Versions: 2.7 > Environment: Windows XP, JDK 1.4.2_08 >Reporter: Rick Bullotta >Assignee: Ilene Seelemann >Priority: Critical > Attachments: Dataset.xml, SubtotalOutput.xml, SubtotalTransform.xsl > > > See attached source XML and transform as well as what the correct result > should be (when used with Xalan 2.6.x). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Updated] (XALANJ-2189) xalan:evaluate behavior seems to have changed in 2.7 and this change has broken existing applications
[ https://issues.apache.org/jira/browse/XALANJ-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boris updated XALANJ-2189: -- Attachment: xalanj-2189-test-input.xml xalanj-2189-test.xsl > xalan:evaluate behavior seems to have changed in 2.7 and this change has > broken existing applications > - > > Key: XALANJ-2189 > URL: https://issues.apache.org/jira/browse/XALANJ-2189 > Project: XalanJ2 > Issue Type: Bug > Components: Xalan-extensions >Affects Versions: 2.7 > Environment: Windows XP, JDK 1.4.2_08 >Reporter: Rick Bullotta >Assignee: Ilene Seelemann >Priority: Critical > Attachments: Dataset.xml, SubtotalOutput.xml, SubtotalTransform.xsl, > xalanj-2189-test-input.xml, xalanj-2189-test.xsl > > > See attached source XML and transform as well as what the correct result > should be (when used with Xalan 2.6.x). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Created] (XALANJ-2606) A suspicious use of an incrementer in a for loop
JC created XALANJ-2606: -- Summary: A suspicious use of an incrementer in a for loop Key: XALANJ-2606 URL: https://issues.apache.org/jira/browse/XALANJ-2606 Project: XalanJ2 Issue Type: Bug Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.) Reporter: JC Assignee: Steven J. Hathaway Priority: Trivial

In a recent GitHub snapshot, I found a suspicious use of an incrementer in a for loop in src/org/apache/xalan/xsltc/compiler/util/MethodGenerator.java:

{code:java}
501 for (int i = 0; i < slotCount; i++) {
502     Object slotEntries = _variables.get(i);
503     if (slotEntries != null) {
504         if (slotEntries instanceof ArrayList) {
505             ArrayList slotList = (ArrayList) slotEntries;
506
507             for (int j = 0; j < slotList.size(); j++) {
508                 allVarsEverDeclared.add(slotList.get(i));
509             }
510         } else {
511             allVarsEverDeclared.add(slotEntries);
512         }
513     }
514 }
{code}

On line 508, should slotList.get(i) be slotList.get(j)? The inner loop iterates over j, but the element is fetched with the outer index i. I have no idea whether slotCount is always the same as or less than slotList.size(), but I thought it was worth reporting just in case.

-- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
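To make the concern in XALANJ-2606 concrete, here is a minimal, self-contained sketch of the same loop shape with the inner index used for the inner lookup. The class name `SlotFlattenSketch` and the `flatten` method are hypothetical stand-ins, not code from MethodGenerator; the sketch only assumes, as the reported snippet suggests, that each slot holds either null, a single entry, or an ArrayList of entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the reported loop; not the actual Xalan code.
final class SlotFlattenSketch {

    // Collects every entry from every slot into one flat list.
    static List<Object> flatten(List<Object> slots) {
        List<Object> allEntries = new ArrayList<>();
        for (int i = 0; i < slots.size(); i++) {
            Object slotEntries = slots.get(i);
            if (slotEntries == null) {
                continue; // empty slot
            }
            if (slotEntries instanceof ArrayList) {
                ArrayList<?> slotList = (ArrayList<?>) slotEntries;
                for (int j = 0; j < slotList.size(); j++) {
                    // Use the inner index j here. With the outer index i, the
                    // same element would be added slotList.size() times, or
                    // get(i) could throw IndexOutOfBoundsException.
                    allEntries.add(slotList.get(j));
                }
            } else {
                allEntries.add(slotEntries);
            }
        }
        return allEntries;
    }

    public static void main(String[] args) {
        List<Object> slots = new ArrayList<>();
        slots.add(new ArrayList<Object>(Arrays.asList("a", "b")));
        slots.add("c");
        slots.add(null);
        System.out.println(flatten(slots)); // prints [a, b, c]
    }
}
```

With `get(i)` instead of `get(j)`, slot 0 in this example would contribute "a" twice and never "b", which is the kind of silent error the report is worried about.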
[jira] [Created] (XALANJ-2605) xsltc trax TransformerImpl does not correctly reset output properties!
Victor created XALANJ-2605: -- Summary: xsltc trax TransformerImpl does not correctly reset output properties! Key: XALANJ-2605 URL: https://issues.apache.org/jira/browse/XALANJ-2605 Project: XalanJ2 Issue Type: Bug Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.) Components: transformation, XSLTC Affects Versions: 2.7.2, 2.7.1 Environment: Oracle Java 7, 8 Reporter: Victor Assignee: Steven J. Hathaway Priority: Critical There seems to be a bug in com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.reset(). Calling reset() calls setOutputProperties(null), which in turn sets _properties to _propertiesClone without cloning the latter, so _properties and _propertiesClone end up referencing the SAME object. After reset() has been called once on a TransformerImpl, the next call to setOutputProperty(String, String) therefore modifies _propertiesClone as well, and all future calls to reset() no longer restore the original output properties! The solution would be to change the following line: _properties = _propertiesClone; to: _properties = (Properties) _propertiesClone.clone(); Note that this bug affects Java 7 and Java 8 (xalan 2.7.0), and this seems to be the default implementation of Transformer returned by the default TransformerFactory, so it is quite surprising that it wasn't discovered before! Or did I misunderstand the contract of reset()? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
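The aliasing problem described in XALANJ-2605 can be reproduced in isolation. The sketch below is not the real TransformerImpl; `ResetAliasingSketch` is a hypothetical class that only borrows the field names `_properties` and `_propertiesClone` from the report, and the `cloneOnReset` flag is an illustration device to compare the reported behavior with the proposed fix.

```java
import java.util.Properties;

// Hypothetical model of the reset() behavior described in the report.
final class ResetAliasingSketch {
    private Properties _properties;
    private final Properties _propertiesClone; // pristine copy of the defaults
    private final boolean cloneOnReset;        // true = proposed fix

    ResetAliasingSketch(Properties defaults, boolean cloneOnReset) {
        this._properties = defaults;
        this._propertiesClone = (Properties) defaults.clone();
        this.cloneOnReset = cloneOnReset;
    }

    void setOutputProperty(String name, String value) {
        _properties.setProperty(name, value);
    }

    String getOutputProperty(String name) {
        return _properties.getProperty(name);
    }

    void reset() {
        // Without the clone, both fields point at the same object afterwards,
        // so later setOutputProperty calls also change the "pristine" copy.
        _properties = cloneOnReset
                ? (Properties) _propertiesClone.clone()
                : _propertiesClone;
    }

    // Runs reset, modify, reset and reports what "method" is afterwards.
    static String methodAfterSecondReset(boolean cloneOnReset) {
        Properties defaults = new Properties();
        defaults.setProperty("method", "xml");
        ResetAliasingSketch t = new ResetAliasingSketch(defaults, cloneOnReset);
        t.reset();
        t.setOutputProperty("method", "html");
        t.reset();
        return t.getOutputProperty("method");
    }

    public static void main(String[] args) {
        System.out.println(methodAfterSecondReset(false)); // prints html
        System.out.println(methodAfterSecondReset(true));  // prints xml
    }
}
```

The cast is needed because `clone()` on `Properties` is declared to return `Object`.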
[jira] [Commented] (XALANJ-2436) Xalan must not expose bundled classes (bcel, regexp)
[ https://issues.apache.org/jira/browse/XALANJ-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423299#comment-16423299 ] Trejkaz commented on XALANJ-2436: - This seems like it would be a good idea. Right now we have our own dependency on a later BCEL and Xalan's internal copy of an older one, and it worries me that for one user out there, they might be getting the Xalan one ahead of ours on their classpath. > Xalan must not expose bundled classes (bcel, regexp) > > > Key: XALANJ-2436 > URL: https://issues.apache.org/jira/browse/XALANJ-2436 > Project: XalanJ2 > Issue Type: Bug > Components: Xalan >Affects Versions: 2.7.1 > Environment: any >Reporter: Holger Hoffstätte >Priority: Critical > Attachments: XALAN-2436.patch, rewrite-packages.rules > > > I just spent the better part of half a day figuring out what caused the > problem outlined in > https://sourceforge.net/tracker/?func=detail=614693=1902137_id=96405. > Xalan bundles regexp and bcel; however, since one of the recommended ways of > installing xalan is via the endorsed mechanism, this will wreak serious havoc > on any other apps that use bcel. That would be less of a problem if xalan's > version were up to date, but as of 2.7.1 it still includes a version from the > early stone age (see XALANJ-2423). The solution is easy: when building the > aggregate jar, add an ant task to rewrite the bundled packages via jarjar > (http://code.google.com/p/jarjar/). This can be trivially added to the build > and creates a completely self-contained xalan jar that will not blow up the > world when endorsed. > I will attach a trivial rule file for jarjar that rewrites the embedded > packages, which should immediately fix any collision problems. 
For more > information about how to use jarjar, see > http://code.google.com/p/jarjar/wiki/GettingStarted -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Commented] (XALANC-781) Compilation fails with ICU
[ https://issues.apache.org/jira/browse/XALANC-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831419#comment-16831419 ] Imran commented on XALANC-781: -- Any updates on this issue? > Compilation fails with ICU > -- > > Key: XALANC-781 > URL: https://issues.apache.org/jira/browse/XALANC-781 > Project: XalanC > Issue Type: Bug > Components: XalanC >Affects Versions: 1.11 > Environment: GNU/Linux Kernel 3.0.13 x86_64 > gcc (GCC) 8.0.1 20180307 (experimental) > Compilation with -std=gnu++17 >Reporter: Laurent Stacul >Assignee: Steven J. Hathaway >Priority: Major > Labels: easyfix > Attachments: xalanc_fix_compilation_with_icu.patch > > Original Estimate: 0h > Remaining Estimate: 0h > > There are several errors when compiling xalanc with ICU (XALAN_USE_ICU) in > the folder ICUBridge. In several places, the namespace of ICU > classes/functions is not given leading to the following types of error: > {code:java} > xalan-c-1.11/c/src/xalanc/ICUBridge/ICUFormatNumberFunctor.hpp:227:12: error: > 'DecimalFormat' does not name a type;{code} > I provide a patch for such issues. > Stac -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Created] (XALANJ-2627) Adding multiple transform using addTransform consumes huge memory and cause OOM
Subhajit created XALANJ-2627: Summary: Adding multiple transform using addTransform consumes huge memory and cause OOM Key: XALANJ-2627 URL: https://issues.apache.org/jira/browse/XALANJ-2627 Project: XalanJ2 Issue Type: Bug Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.) Reporter: Subhajit Assignee: Gary D. Gregory

We are using Xalan for XSLT transformation. We have around 10 transformers. Our pattern is:

    Transformation.builder(transformationFactory)
        .setUseInterpreter(true)
        .setLogger(log)
        .setSource(some source)
        .setResult(some target)
        .addTransform(Handler1())
        .addTransform(Handler2())
        .addTransform(Handler3())
        .addTransform(Handler4())
        .addTransform(Handler5())
        .addTransform(Handler6())
        .addTransform(Handler7())
        .addTransform(Handler8())
        .addTransform(Handler9())
        .addTransform(Handler10())
        .addTransform(Handler11())
        .addTransform(Handler12())
        .build()
        .transform();

This pattern seems to take a lot of memory. But if we run the transforms individually, 12 times, using the output of one as the input of the next, memory usage is reduced:

    Transformation.builder(transformationFactory)
        .setUseInterpreter(true)
        .setLogger(log)
        .setSource(some source)
        .setResult(some target)
        .addTransform(Handler1())
        .build()
        .transform();

It seems Xalan is holding on to memory unnecessarily with the first pattern.

-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
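For readers who want to try the "one transform at a time" variant from XALANJ-2627 outside the reporter's environment, here is a rough sketch using only the standard JAXP API. The `Transformation.builder(...)` API in the report is not part of JAXP and is not modeled here; the class `SequentialTransformSketch`, its methods, and the tiny rename stylesheets are all hypothetical. The sketch also keeps each intermediate document in memory as a String, which is an assumption made for brevity and says nothing about how the reporter's pipeline passes data between stages.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: run stylesheets one after another, feeding the
// output of each stage into the next stage.
final class SequentialTransformSketch {

    // Builds a tiny stylesheet that turns a root element <from> into <to>.
    static String renameStylesheet(String from, String to) {
        return "<xsl:stylesheet version='1.0'"
                + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
                + "<xsl:template match='/" + from + "'>"
                + "<" + to + "><xsl:value-of select='.'/></" + to + ">"
                + "</xsl:template></xsl:stylesheet>";
    }

    static String runSequentially(String xml, List<String> stylesheets) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            String current = xml;
            for (String xsl : stylesheets) {
                // A fresh Transformer per stage; nothing from earlier stages
                // is referenced once its output String has been produced.
                Transformer stage =
                        factory.newTransformer(new StreamSource(new StringReader(xsl)));
                stage.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
                StringWriter out = new StringWriter();
                stage.transform(new StreamSource(new StringReader(current)),
                        new StreamResult(out));
                current = out.toString();
            }
            return current;
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        List<String> stages = Arrays.asList(
                renameStylesheet("a", "b"),
                renameStylesheet("b", "c"));
        System.out.println(runSequentially("<a>hi</a>", stages));
    }
}
```

Whether this reproduces the memory difference from the report depends on the stylesheets and input sizes involved; the sketch only shows the shape of the sequential approach.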
[jira] [Commented] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17493984#comment-17493984 ] Max commented on XALANJ-2419: - [~jharrop] thanks for pointing to your work here. Saved me time. Would be nice to have it merged in. > Astral characters written as a pair of NCRs with the surrogate scalar values > when using UTF-8 > - > > Key: XALANJ-2419 > URL: https://issues.apache.org/jira/browse/XALANJ-2419 > Project: XalanJ2 > Issue Type: Bug > Components: Serialization >Affects Versions: 2.7.1 >Reporter: Henri Sivonen >Priority: Major > Attachments: XALANJ-2419-fix-v3.txt, XALANJ-2419-tests-v3.txt > > > org.apache.xml.serializer.ToStream contains the following code: > else if (m_encodingInfo.isInEncoding(ch)) { > // If the character is in the encoding, and > // not in the normal ASCII range, we also > // just leave it get added on to the clean characters > > } > else { > // This is a fallback plan, we should never get here > // but if the character wasn't previously handled > // (i.e. isn't in the encoding, etc.) then what > // should we do? We choose to write out an entity > writeOutCleanChars(chars, i, lastDirtyCharProcessed); > writer.write("&#
[jira] [Updated] (XALANJ-2637) Xalan 2.7.2 Vulnerability
[ https://issues.apache.org/jira/browse/XALANJ-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vick updated XALANJ-2637: - Description: I need to remove the vulnerability detected for the Apache Xalan library, version 2.7.2, so the application will be free of this vulnerability (was: I need to remove the vulnerability detected for the Apache Xalan library, version 2.7.2.1, so the application will be free of this vulnerability) > Xalan 2.7.2 Vulnerability > - > > Key: XALANJ-2637 > URL: https://issues.apache.org/jira/browse/XALANJ-2637 > Project: XalanJ2 > Issue Type: Bug > Security Level: No security risk; visible to anyone(Ordinary problems in > Xalan projects. Anybody can view the issue.) > Components: Xalan >Affects Versions: 2.7.2 >Reporter: vick >Assignee: Gary D. Gregory >Priority: Major > Fix For: 2.7.2 > > > I need to remove the vulnerability detected for the Apache Xalan library, > version 2.7.2, so the application will be free of this vulnerability -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Created] (XALANJ-2637) Xalan 2.7.2 Vulnerability
vick created XALANJ-2637: Summary: Xalan 2.7.2 Vulnerability Key: XALANJ-2637 URL: https://issues.apache.org/jira/browse/XALANJ-2637 Project: XalanJ2 Issue Type: Bug Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.) Components: Xalan Affects Versions: 2.7.2 Reporter: vick Assignee: Gary D. Gregory Fix For: 2.7.2 I need to remove the vulnerability detected for the Apache Xalan library, version 2.7.2.1, so the application will be free of this vulnerability -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Created] (XALANJ-2649) Xalan 2.7.3 is missing dependencies (Regression from 2.7.2)
mt created XALANJ-2649: -- Summary: Xalan 2.7.3 is missing dependencies (Regression from 2.7.2) Key: XALANJ-2649 URL: https://issues.apache.org/jira/browse/XALANJ-2649 Project: XalanJ2 Issue Type: Bug Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.) Components: Xalan Affects Versions: 2.7.3 Reporter: mt Assignee: Gary D. Gregory After upgrading from 2.7.2 to 2.7.3 via maven central, we get the following runtime error. It seems like 2.7.3 is missing the dependencies to serializer and xercesImpl . After manually adding a dependency to serializer:2.7.3 , the issue is fixed. This can also be seen in Maven Central: [Maven Central: xalan:xalan:2.7.2 (sonatype.com)|https://central.sonatype.com/artifact/xalan/xalan/2.7.2/dependencies] -> has dependencies on serializer and xercesImpl [Maven Central: xalan:xalan:2.7.3 (sonatype.com)|https://central.sonatype.com/artifact/xalan/xalan/2.7.3/dependencies] -> no dependencies {code:java} java.lang.NoClassDefFoundError: org/apache/xml/serializer/SerializerTrace at java.base/java.lang.ClassLoader.defineClass1(Native Method) at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1012) at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150) at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:862) at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:760) at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681) at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639) at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188) at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520) at org.apache.xalan.processor.ProcessorStylesheetElement.getStylesheetRoot(ProcessorStylesheetElement.java:123) at 
org.apache.xalan.processor.ProcessorStylesheetElement.startElement(ProcessorStylesheetElement.java:74) at org.apache.xalan.processor.StylesheetHandler.startElement(StylesheetHandler.java:623) at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:518) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:374) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDriver.scanRootElementHook(XMLNSDocumentScannerImpl.java:613) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3079) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:836) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:605) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112) at java.xml/com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:542) at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:889) at java.xml/com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:825) at java.xml/com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141) at java.xml/com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1224) at java.xml/com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:637) at org.apache.xalan.processor.TransformerFactoryImpl.newTemplates(TransformerFactoryImpl.java:917) at org.apache.xalan.processor.TransformerFactoryImpl.newTransformer(TransformerFactoryImpl.java:771) {code} -- This message was sent by Atlassian 
Jira (v8.20.10#820010) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Comment Edited] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810437#comment-17810437 ] Max edited comment on XALANJ-2725 at 1/24/24 2:56 PM: -- Hi [~kesh...@alum.mit.edu] , all good points. Considering that I am not an "expert" in xalan, and have no deep understanding in its full complex functionality, my intention was to minimize the potential blast radius that may have been caused by my change. Thus I changed as little code as possible solving a very narrow and very specific problem. I agree that a broken character sequence SHOULD result in an error and not in masking. Will this change break backward compatibility? I am not sure, but it is the right course of action. As far as using a Character class I just followed suit from other areas in the code. As you suggested, we do not really need that class and can get away with a numeric value. I do not think 0 is a valid value, so might as well serve in place of null. Also, I tested my changes live on my use-cases, I did not find the regression tests that you are mentioning as failing. Would you mind pointing me in the right direction? I'd like to run them and see where they fail, maybe I'll catch what I missed. Thanks! was (Author: maxfortun): Hi [~kesh...@alum.mit.edu] , all good points. Considering that I am not an "expert" in xalan, and have no deep understanding in its full complex functionality, my intention was to minimize the potential blast radius that may have been caused by my change. Thus I changed as little code as possible solving a very narrow and very specific problem. I agree that a broken character sequence SHOULD result in an error and not in masking. Will this change break backward compatibility? I am not sure, but it is the right course of action. As far as using a Character class I just followed suit from other areas in the code. 
As you suggested, we do not really need that class and can get away with a numeric value, as you suggested. I do not think 0 is a valid value, so might as well serve in place of null. Also, I tested my changes live on my use-cases, I did not find the regression tests that you are mentioning as failing. Would you mind pointing me in the right direction? I'd like to run them and see where they fail, maybe I'll catch what I missed. Thanks! > Possible buffer-boundry issue when serializing surrogate pairs > -- > > Key: XALANJ-2725 > URL: https://issues.apache.org/jira/browse/XALANJ-2725 > Project: XalanJ2 > Issue Type: Improvement > Security Level: No security risk; visible to anyone(Ordinary problems in > Xalan projects. Anybody can view the issue.) > Components: Serialization >Reporter: Joe Kesselman >Assignee: Joe Kesselman >Priority: Major > Labels: Surrogates, escaping, unicode, utf > Attachments: astral-chars-split-buffer.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > XALANJ-2419 addressed a case where "astral" Unicode characters, requiring a > surrogate pair (two UTF-16 units), were not being serialized correctly. We > have a proposed fix for that. > There is reported to still be an edge case when a surrogate pair which > crosses buffer boundaries might not be handled correctly. [~maxfortun] > offered what looks like a reasonable proposed fix > (https://github.com/maxfortun/xalan-j/blob/a9bd5591d9f8a523548aeec091e886b64c691628/src/org/apache/xml/serializer/ToStream.java#L1607), > but in my testing this was not serializing the surrogate pairs correctly, > causing regression on the tests XALANJ-2419 introduced. I don't know whether > that's because we're taking multiple paths through > But the edge case does appear to be real, and if so we will need some such > solution. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
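The buffer-boundary case discussed in XALANJ-2725 can be illustrated with a small stand-alone sketch. This is not the ToStream code or the attached patch; `SplitSurrogateSketch` is a hypothetical class that assumes two things taken from the discussion above: a pending high surrogate can be remembered as a plain `char` with 0 meaning "none", and a broken surrogate sequence is reported as an error rather than masked. For simplicity it always writes astral characters as a single numeric character reference, whereas a real serializer would write the character directly when the output encoding can represent it.

```java
// Hypothetical sketch: carry a high surrogate across write() calls so a pair
// split over two buffers is still emitted as one numeric character reference.
final class SplitSurrogateSketch {
    private char pendingHigh; // 0 means no high surrogate is pending
    private final StringBuilder out = new StringBuilder();

    void write(char[] chars, int start, int length) {
        for (int i = start; i < start + length; i++) {
            char ch = chars[i];
            if (pendingHigh != 0) {
                if (!Character.isLowSurrogate(ch)) {
                    throw new IllegalStateException(
                            "high surrogate not followed by a low surrogate");
                }
                // Combine the pair into one code point and escape it once.
                out.append("&#")
                   .append(Character.toCodePoint(pendingHigh, ch))
                   .append(';');
                pendingHigh = 0;
            } else if (Character.isHighSurrogate(ch)) {
                pendingHigh = ch; // wait for the low half, possibly in the next buffer
            } else if (Character.isLowSurrogate(ch)) {
                throw new IllegalStateException("unpaired low surrogate");
            } else {
                out.append(ch);
            }
        }
    }

    String result() {
        if (pendingHigh != 0) {
            throw new IllegalStateException("input ended after a high surrogate");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        SplitSurrogateSketch s = new SplitSurrogateSketch();
        char[] first = {'a', '\uD800'};  // buffer ends on the high surrogate
        char[] second = {'\uDC30', 'b'}; // next buffer starts with the low one
        s.write(first, 0, first.length);
        s.write(second, 0, second.length);
        System.out.println(s.result()); // prints a&#65584;b
    }
}
```

The pair `\uD800\uDC30` is U+10030, which is 65584 in decimal, so the split pair comes out as one reference instead of two surrogate-valued ones.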
[jira] [Comment Edited] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810519#comment-17810519 ] Max edited comment on XALANJ-2725 at 1/24/24 6:16 PM: -- [~kesh...@alum.mit.edu] , Found the issue. Modded the patch. Please take a look. The issue was that I overlooked the original look-ahead and did not replace it with a look behind. Also, in case of a char being in encoding, need to serialize as is without escaping. was (Author: maxfortun): [~kesh...@alum.mit.edu] , Found the issue. Modded the patch. Please take a look. > Possible buffer-boundry issue when serializing surrogate pairs > -- > > Key: XALANJ-2725 > URL: https://issues.apache.org/jira/browse/XALANJ-2725 > Project: XalanJ2 > Issue Type: Improvement > Security Level: No security risk; visible to anyone(Ordinary problems in > Xalan projects. Anybody can view the issue.) > Components: Serialization >Reporter: Joe Kesselman >Assignee: Joe Kesselman >Priority: Major > Labels: Surrogates, escaping, unicode, utf > Attachments: astral-chars-split-buffer.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > XALANJ-2419 addressed a case where "astral" Unicode characters, requiring a > surrogate pair (two UTF-16 units), were not being serialized correctly. We > have a proposed fix for that. > There is reported to still be an edge case when a surrogate pair which > crosses buffer boundaries might not be handled correctly. [~maxfortun] > offered what looks like a reasonable proposed fix > (https://github.com/maxfortun/xalan-j/blob/a9bd5591d9f8a523548aeec091e886b64c691628/src/org/apache/xml/serializer/ToStream.java#L1607), > but in my testing this was not serializing the surrogate pairs correctly, > causing regression on the tests XALANJ-2419 introduced. I don't know whether > that's because we're taking multiple paths through > But the edge case does appear to be real, and if so we will need some such > solution. 
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Commented] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810529#comment-17810529 ] Max commented on XALANJ-2725: - For ease of review: [https://github.com/apache/xalan-java/pull/166] > Possible buffer-boundry issue when serializing surrogate pairs > -- > > Key: XALANJ-2725 > URL: https://issues.apache.org/jira/browse/XALANJ-2725 > Project: XalanJ2 > Issue Type: Improvement > Security Level: No security risk; visible to anyone(Ordinary problems in > Xalan projects. Anybody can view the issue.) > Components: Serialization >Reporter: Joe Kesselman >Assignee: Joe Kesselman >Priority: Major > Labels: Surrogates, escaping, unicode, utf > Attachments: astral-chars-split-buffer.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > XALANJ-2419 addressed a case where "astral" Unicode characters, requiring a > surrogate pair (two UTF-16 units), were not being serialized correctly. We > have a proposed fix for that. > There is reported to still be an edge case when a surrogate pair which > crosses buffer boundaries might not be handled correctly. [~maxfortun] > offered what looks like a reasonable proposed fix > (https://github.com/maxfortun/xalan-j/blob/a9bd5591d9f8a523548aeec091e886b64c691628/src/org/apache/xml/serializer/ToStream.java#L1607), > but in my testing this was not serializing the surrogate pairs correctly, > causing regression on the tests XALANJ-2419 introduced. I don't know whether > that's because we're taking multiple paths through > But the edge case does appear to be real, and if so we will need some such > solution. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Commented] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810437#comment-17810437 ] Max commented on XALANJ-2725: - Hi [~kesh...@alum.mit.edu] , all good points. Considering that I am not an "expert" in xalan, and have no deep understanding in its full complex functionality, my intention was to minimize the potential blast radius that may have been caused by my change. Thus I changed as little code as possible solving a very narrow and very specific problem. I agree that a broken character sequence SHOULD result in an error and not in masking. Will this change break backward compatibility? I am not sure, but it is the right course of action. As far as using a Character class I just followed suit from other areas in the code. As you suggested, we do not really need that class and can get away with a numeric value, as you suggested. I do not think 0 is a valid value, so might as well serve in place of null. Also, I tested my changes live on my use-cases, I did not find the regression tests that you are mentioning as failing. Would you mind pointing me in the right direction? I'd like to run them and see where they fail, maybe I'll catch what I missed. Thanks! > Possible buffer-boundry issue when serializing surrogate pairs > -- > > Key: XALANJ-2725 > URL: https://issues.apache.org/jira/browse/XALANJ-2725 > Project: XalanJ2 > Issue Type: Improvement > Security Level: No security risk; visible to anyone(Ordinary problems in > Xalan projects. Anybody can view the issue.) > Components: Serialization >Reporter: Joe Kesselman >Assignee: Joe Kesselman >Priority: Major > Labels: Surrogates, escaping, unicode, utf > Attachments: astral-chars-split-buffer.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > XALANJ-2419 addressed a case where "astral" Unicode characters, requiring a > surrogate pair (two UTF-16 units), were not being serialized correctly. We > have a proposed fix for that. 
> There is reported to still be an edge case when a surrogate pair which > crosses buffer boundaries might not be handled correctly. [~maxfortun] > offered what looks like a reasonable proposed fix > (https://github.com/maxfortun/xalan-j/blob/a9bd5591d9f8a523548aeec091e886b64c691628/src/org/apache/xml/serializer/ToStream.java#L1607), > but in my testing this was not serializing the surrogate pairs correctly, > causing regression on the tests XALANJ-2419 introduced. I don't know whether > that's because we're taking multiple paths through > But the edge case does appear to be real, and if so we will need some such > solution. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Comment Edited] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810492#comment-17810492 ] Max edited comment on XALANJ-2725 at 1/24/24 4:33 PM: -- Thank you for your wonderful instructions. Setting up and running tests was effortless. For what it's worth, I printed out the test's input and output, and in reality they are equivalent, but not equal. Input: 0¤2 output: #65584;¤#65586; The output is actually escaped correctly, but it is not mapped back to the UTF-16 characters. This may be because we fixed only a part of the serialization. Someplace else in the code numeric entities must be serialized back to the actual UTF-16 encoding. was (Author: maxfortun): Thank you for your wonderful instructions. Setting up and running tests was effortless. For what it's worth, I printed out the test's input and output, and in reality they are equivalent, but not equal. Input: 0¤2 output: ¤ The output is actually escaped correctly, but it is not mapped back to the UTF-16 characters. This may be because we fixed only a part of the serialization. Someplace else in the code numeric entities must be serialized back to the actual UTF-16 encoding. > Possible buffer-boundry issue when serializing surrogate pairs > -- > > Key: XALANJ-2725 > URL: https://issues.apache.org/jira/browse/XALANJ-2725 > Project: XalanJ2 > Issue Type: Improvement > Security Level: No security risk; visible to anyone(Ordinary problems in > Xalan projects. Anybody can view the issue.) > Components: Serialization >Reporter: Joe Kesselman >Assignee: Joe Kesselman >Priority: Major > Labels: Surrogates, escaping, unicode, utf > Attachments: astral-chars-split-buffer.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > XALANJ-2419 addressed a case where "astral" Unicode characters, requiring a > surrogate pair (two UTF-16 units), were not being serialized correctly. We > have a proposed fix for that. 
> There is reported to still be an edge case when a surrogate pair which > crosses buffer boundaries might not be handled correctly. [~maxfortun] > offered what looks like a reasonable proposed fix > (https://github.com/maxfortun/xalan-j/blob/a9bd5591d9f8a523548aeec091e886b64c691628/src/org/apache/xml/serializer/ToStream.java#L1607), > but in my testing this was not serializing the surrogate pairs correctly, > causing regression on the tests XALANJ-2419 introduced. I don't know whether > that's because we're taking multiple paths through > But the edge case does appear to be real, and if so we will need some such > solution. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Commented] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810492#comment-17810492 ] Max commented on XALANJ-2725: - Thank you for your wonderful instructions. Setting up and running tests was effortless. For what it's worth, I printed out the test's input and output, and in reality they are equivalent, but not equal. Input: 0¤2 output: ¤ The output is actually escaped correctly, but it is not mapped back to the UTF-16 characters. This may be because we fixed only a part of the serialization. Someplace else in the code numeric entities must be serialized back to the actual UTF-16 encoding. > Possible buffer-boundry issue when serializing surrogate pairs > -- > > Key: XALANJ-2725 > URL: https://issues.apache.org/jira/browse/XALANJ-2725 > Project: XalanJ2 > Issue Type: Improvement > Security Level: No security risk; visible to anyone(Ordinary problems in > Xalan projects. Anybody can view the issue.) > Components: Serialization >Reporter: Joe Kesselman >Assignee: Joe Kesselman >Priority: Major > Labels: Surrogates, escaping, unicode, utf > Attachments: astral-chars-split-buffer.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > XALANJ-2419 addressed a case where "astral" Unicode characters, requiring a > surrogate pair (two UTF-16 units), were not being serialized correctly. We > have a proposed fix for that. > There is reported to still be an edge case when a surrogate pair which > crosses buffer boundaries might not be handled correctly. [~maxfortun] > offered what looks like a reasonable proposed fix > (https://github.com/maxfortun/xalan-j/blob/a9bd5591d9f8a523548aeec091e886b64c691628/src/org/apache/xml/serializer/ToStream.java#L1607), > but in my testing this was not serializing the surrogate pairs correctly, > causing regression on the tests XALANJ-2419 introduced. 
I don't know whether > that's because we're taking multiple paths through > But the edge case does appear to be real, and if so we will need some such > solution. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Comment Edited] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810492#comment-17810492 ] Max edited comment on XALANJ-2725 at 1/24/24 5:14 PM: -- Thank you for your wonderful instructions. Setting up and running tests was effortless. For what it's worth, I printed out the test's input and output, and in reality it is equivalent, but not equal. Input: U+10030 ¤ U+10032 output: &#65584;¤&#65586; The output is actually escaped correctly, but it is not mapped back to the utf16 characters. This may be because we fixed only a part of the serialization. Someplace else in the code numeric entities must be deserialized back to the actual utf16 encoding. was (Author: maxfortun): Thank you for your wonderful instructions. Setting up and running tests was effortless. For what it's worth, I printed out the test's input and output, and in reality it is equivalent, but not equal. Input: U+10030 ¤ U+10032 output: &#65584;¤&#65586; The output is actually escaped correctly, but it is not mapped back to the utf16 characters. This may be because we fixed only a part of the serialization. Someplace else in the code numeric entities must be serialized back to the actual utf16 encoding.
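The "equivalent, but not equal" observation in the comment above can be checked with a few lines of Java. This is a minimal sketch under stated assumptions: NcrEquivalenceSketch and expandDecimalNcrs are hypothetical names, only well-formed decimal references are handled, and nothing here reflects how the Xalan test harness compares results. It shows that a decimal reference and the literal surrogate pair denote the same code point even though the two strings differ.

```java
/** Hypothetical helper (not Xalan code): expands decimal character references. */
final class NcrEquivalenceSketch {
    /** Replaces well-formed decimal references such as "&#65584;" with their characters. */
    static String expandDecimalNcrs(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            // Look for "&#" followed later by ';'; anything else is copied through unchanged.
            int end = s.startsWith("&#", i) ? s.indexOf(';', i) : -1;
            if (end > i + 2) {
                int codePoint = Integer.parseInt(s.substring(i + 2, end));
                out.appendCodePoint(codePoint); // writes a surrogate pair for astral code points
                i = end + 1;
            } else {
                out.append(s.charAt(i++));
            }
        }
        return out.toString();
    }
}
```

Expanding the references in the serialized output gives back the original UTF-16 text, which is the sense in which the two forms are equivalent.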
[jira] [Commented] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810519#comment-17810519 ] Max commented on XALANJ-2725: - Found the issue. Modded the patch. Please take a look.
[jira] [Updated] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max updated XALANJ-2725: Attachment: (was: astral-chars-split-buffer.patch)
[jira] [Updated] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max updated XALANJ-2725: Attachment: astral-chars-split-buffer.patch
[jira] [Comment Edited] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810519#comment-17810519 ] Max edited comment on XALANJ-2725 at 1/24/24 6:08 PM: -- [~kesh...@alum.mit.edu] , Found the issue. Modded the patch. Please take a look. was (Author: maxfortun): Found the issue. Modded the patch. Please take a look.
[jira] [Commented] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17811718#comment-17811718 ] Max commented on XALANJ-2725: - [~kesh...@alum.mit.edu], what do you think about adding an incomplete surrogate pair handling policy? Either via a setter method or via a system property? Or chain both? Check whether a policy class is set via the setter; if not, check the system property; otherwise default to UTF16IncompleteSurrogatePairErrorPolicy. Or some name like that? ToStream already sets one aspect of its behavior via a system property here, so this is not something new: [https://github.com/apache/xalan-java/blob/d83b90e588a5f2499e3eccc7cfcc44708f01494f/serializer/src/main/java/org/apache/xml/serializer/ToStream.java#L111] This will allow us to add UTF16IncompleteSurrogatePairOutputPolicy, or implement a custom one to serialize as the individual use-case would require. I am worried that if we do not provide a facility for catching errors, currently working code will start breaking and upgrades will become a nightmare.
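The lookup order proposed in the comment above (setter first, then system property, then a default) could look roughly like the following Java sketch. Everything in it is an assumption for illustration: the two policy class names are the ones floated in the comment, while the interface IncompleteSurrogatePairPolicy, the PolicyResolver class, and the property key are hypothetical and do not exist in Xalan.

```java
/** Hypothetical policy interface; not part of the Xalan serializer. */
interface IncompleteSurrogatePairPolicy {
    void handle(char loneSurrogate, StringBuilder out);
}

/** Default named in the comment: treat a lone surrogate as an error. */
final class UTF16IncompleteSurrogatePairErrorPolicy implements IncompleteSurrogatePairPolicy {
    public void handle(char loneSurrogate, StringBuilder out) {
        throw new IllegalStateException("incomplete surrogate pair: " + (int) loneSurrogate);
    }
}

/** Lenient alternative named in the comment: write the lone unit as a reference (assumed behaviour). */
final class UTF16IncompleteSurrogatePairOutputPolicy implements IncompleteSurrogatePairPolicy {
    public void handle(char loneSurrogate, StringBuilder out) {
        out.append("&#").append((int) loneSurrogate).append(';');
    }
}

/** Resolves the policy: explicit setter, then system property, then the error default. */
final class PolicyResolver {
    // Hypothetical property key, chosen only for this sketch.
    static final String PROPERTY = "example.serializer.incompleteSurrogatePairPolicy";

    private IncompleteSurrogatePairPolicy explicitPolicy;

    void setPolicy(IncompleteSurrogatePairPolicy policy) {
        this.explicitPolicy = policy;
    }

    IncompleteSurrogatePairPolicy resolve() {
        if (explicitPolicy != null) {
            return explicitPolicy;
        }
        String className = System.getProperty(PROPERTY);
        if (className != null) {
            try {
                // The property is assumed to hold the fully qualified policy class name.
                return (IncompleteSurrogatePairPolicy)
                        Class.forName(className).getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("cannot load policy " + className, e);
            }
        }
        return new UTF16IncompleteSurrogatePairErrorPolicy();
    }
}
```

Keeping the error policy as the fallback means a caller who configures nothing gets the strict behaviour, while existing deployments could opt back into lenient output through the property without code changes.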
[jira] [Commented] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17811980#comment-17811980 ] Max commented on XALANJ-2725: - [~kesh...@alum.mit.edu] , added a split buffer test case: [https://github.com/apache/xalan-test/pull/10]
[jira] [Commented] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810043#comment-17810043 ] Max commented on XALANJ-2419: - [~kesh...@alum.mit.edu] , thank you for working on this. As you suggested, why don't you merge what works and I can try to help you work on the split buffer issue after on a good code? > Astral characters written as a pair of NCRs with the surrogate scalar values > when using UTF-8 > - > > Key: XALANJ-2419 > URL: https://issues.apache.org/jira/browse/XALANJ-2419 > Project: XalanJ2 > Issue Type: Bug > Components: Serialization >Affects Versions: 2.7.1 >Reporter: Henri Sivonen >Assignee: Joe Kesselman >Priority: Major > Attachments: XALANJ-2419-fix-v3.txt, XALANJ-2419-tests-v3.txt > > > org.apache.xml.serializer.ToStream contains the following code: > else if (m_encodingInfo.isInEncoding(ch)) { > // If the character is in the encoding, and > // not in the normal ASCII range, we also > // just leave it get added on to the clean characters > > } > else { > // This is a fallback plan, we should never get here > // but if the character wasn't previously handled > // (i.e. isn't in the encoding, etc.) then what > // should we do? We choose to write out an entity > writeOutCleanChars(chars, i, lastDirtyCharProcessed); > writer.write("&#
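The symptom named in the issue title, an astral character written as a pair of NCRs carrying the surrogate values, can be reproduced outside Xalan with a short Java sketch. It is not the ToStream code quoted above, which is cut off in this message. SurrogateNcrSketch and both method names are hypothetical, and escaping everything outside ASCII is an assumption made only so that the difference is visible.

```java
/** Hypothetical illustration (not Xalan code) of per-unit versus per-code-point escaping. */
final class SurrogateNcrSketch {
    /** Escapes each UTF-16 code unit on its own, so a surrogate pair becomes two references. */
    static String escapePerCodeUnit(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch < 0x80) {
                out.append(ch);
            } else {
                out.append("&#").append((int) ch).append(';');
            }
        }
        return out.toString();
    }

    /** Escapes whole code points, so a surrogate pair becomes a single reference. */
    static String escapePerCodePoint(String s) {
        StringBuilder out = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }
}
```

For U+10030 the first method produces two references to the surrogate values 55296 and 56368, which do not denote a character in XML, while the second produces the single reference 65584.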
[jira] [Comment Edited] (XALANJ-2419) Astral characters written as a pair of NCRs with the surrogate scalar values when using UTF-8
[ https://issues.apache.org/jira/browse/XALANJ-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810043#comment-17810043 ] Max edited comment on XALANJ-2419 at 1/23/24 5:08 PM: -- [~kesh...@alum.mit.edu] , thank you for working on this. As you suggested, why don't you merge what works and I can try to help you work on the split buffer issue after? on a good code? was (Author: maxfortun): [~kesh...@alum.mit.edu] , thank you for working on this. As you suggested, why don't you merge what works and I can try to help you work on the split buffer issue after on a good code?
[jira] [Commented] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812766#comment-17812766 ] Max commented on XALANJ-2725: - [~kesh...@alum.mit.edu] , thank you for the heads up, and for actually working on this at all. This project has been in deep neglect until you gave it some love. Good luck and rooting for your return :)
[jira] [Updated] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max updated XALANJ-2725: Attachment: astral-chars-split-buffer.patch
[jira] [Commented] (XALANJ-2725) Possible buffer-boundry issue when serializing surrogate pairs
[ https://issues.apache.org/jira/browse/XALANJ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17810065#comment-17810065 ] Max commented on XALANJ-2725: - [~kesh...@alum.mit.edu] , I added a patch I tried with your PR. See if your regression tests pass now. Need to figure out a good test case for the split buffer.
[jira] [Updated] (XALANC-728) GSoC Add More EXSLT Functions
[ https://issues.apache.org/jira/browse/XALANC-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven J. Hathaway updated XALANC-728: -- Attachment: GSoCUpdate20120903.zip gsoc_diff Updates to xalan/c/branch/GSoC-2012 work area. gsoc_diff is a difference file on the work area. GSoCUpdate20120903.zip contains the replacement files. There are still some debug activities required. Integration of the XalanDateTime classes is in progress. GSoC Add More EXSLT Functions - Key: XALANC-728 URL: https://issues.apache.org/jira/browse/XALANC-728 Project: XalanC Issue Type: Improvement Components: XPathC Affects Versions: CurrentCVS Reporter: Steven J. Hathaway Assignee: Steven J. Hathaway Labels: XPath, gsoc2012, mentor Attachments: GSoC-2012-07-02.zip, GSoC-2012-07-14.zip, GSoC-2012-08-07.zip, GSoC-2012-08-20.zip, gsoc_diff, GSoCUpdate20120903.zip, Tester.cpp, XalanDatesAndTimes v0.1.zip, XalanDatesAndTimes v0.2.zip Implement a more complete set of EXSLT functions into the Xalan-C XPath environment. See: http://www.exslt.org -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Updated] (XALANC-728) GSoC Add More EXSLT Functions
[ https://issues.apache.org/jira/browse/XALANC-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven J. Hathaway updated XALANC-728: -- Attachment: GSoCUpdate20120903.zip GSoCUpdate20120903.zip contains new and replacement files for the xalan/c/branch/GSoC-2012 tree. These files incorporate the work of Samuel and are built using MS .NET 2010 in the branched source tree. Some debugging is still required. Some XPath wrappers are still required, but the library looks clean. There will need to be some memory management cleanup. - Steve Hathaway GSoC Add More EXSLT Functions - Key: XALANC-728 URL: https://issues.apache.org/jira/browse/XALANC-728 Project: XalanC Issue Type: Improvement Components: XPathC Affects Versions: CurrentCVS Reporter: Steven J. Hathaway Assignee: Steven J. Hathaway Labels: XPath, gsoc2012, mentor Attachments: GSoC-2012-07-02.zip, GSoC-2012-07-14.zip, GSoC-2012-08-07.zip, GSoC-2012-08-20.zip, gsoc_diff, GSoCUpdate20120903.zip, GSoCUpdate20120903.zip, Tester.cpp, XalanDatesAndTimes v0.1.zip, XalanDatesAndTimes v0.2.zip
[jira] [Commented] (XALANC-731) Retire Project XML - Update Xalan Website Links
[ https://issues.apache.org/jira/browse/XALANC-731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13458200#comment-13458200 ] Steven J. Hathaway commented on XALANC-731: --- The Xalan website pages no longer make reference to (dist)/xml/* distribution artifacts. The (archive-dist)/xml/* distribution artifacts should remain available. Steven J. Hathaway Retire Project XML - Update Xalan Website Links --- Key: XALANC-731 URL: https://issues.apache.org/jira/browse/XALANC-731 Project: XalanC Issue Type: Task Reporter: Steven J. Hathaway Assignee: Steven J. Hathaway Update website links to vacate use of (dist)/xml/xalan-c (dist)/xml/xalan-j (dist)/xml/xerces-c and (dist)/xml/xerces-j
[jira] [Commented] (XALANJ-2546) xsl:sort lang attribute ignores parameter value, only hard-coding works
[ https://issues.apache.org/jira/browse/XALANJ-2546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13458426#comment-13458426 ] Adam Jez commented on XALANJ-2546: -- I would like to point out that this issue is still not fixed. Don tested the patch the day after it was committed and unfortunately it did not resolve the issue. Please reopen this defect. xsl:sort lang attribute ignores parameter value, only hard-coding works --- Key: XALANJ-2546 URL: https://issues.apache.org/jira/browse/XALANJ-2546 Project: XalanJ2 Issue Type: Bug Security Level: No security risk; visible to anyone(Ordinary problems in Xalan projects. Anybody can view the issue.) Components: Xalan Affects Versions: 2.7.1 Environment: java version 1.6.0_20, Xalan 2.7.1 Reporter: Don Smith Attachments: sorting-example.zip, XALANJ-2546.diff I have an XSL stylesheet that uses xsl:sort for a list of names. I added the lang attribute to the sort, using a variable passed to the stylesheet for its value: lang="{$locale}". When sorting a list of Russian names, the ordering is incorrect. I can see that the parameter value is present and correct in the stylesheet as it executes by using an xsl:message statement. When I hard-code the value of lang to ru (lang="ru"), the sort works correctly. This defect causes improper sorting in Russian and Polish, a defect in our application.
[jira] [Updated] (XALANC-732) AIX Build - Makefile Errors
[ https://issues.apache.org/jira/browse/XALANC-732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven J. Hathaway updated XALANC-732: -- Attachment: makefile-incl.patch makefile-incl.patch // patches svn rev 1383082 (svn)/xalan/c/trunk/makefile.incl AIX Build - Makefile Errors --- Key: XALANC-732 URL: https://issues.apache.org/jira/browse/XALANC-732 Project: XalanC Issue Type: Bug Components: XalanC Affects Versions: 1.11 Environment: AIX platforms Reporter: Steven J. Hathaway Assignee: Steven J. Hathaway Attachments: makefile-incl.patch The current SVN repository needs to have Makefiles fixed before release. A line in the AIX-specific section references a bad version of the Xerces XML parser library. Xerces 3.1 is used instead of 2.7
[jira] [Updated] (XALANC-733) Ensure that XalanLocator::getSystemId() and getPublicId() do not return NULL
[ https://issues.apache.org/jira/browse/XALANC-733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven J. Hathaway updated XALANC-733: -- Attachment: XalanLocator.patch Here is the patch file - XalanLocator.patch - Steve Ensure that XalanLocator::getSystemId() and getPublicId() do not return NULL Key: XALANC-733 URL: https://issues.apache.org/jira/browse/XALANC-733 Project: XalanC Issue Type: Bug Components: XalanC, XPathC Affects Versions: 1.11 Reporter: Steven J. Hathaway Assignee: Steven J. Hathaway Attachments: XalanLocator.patch The recommended patch to xalanc/PlatformSupport/XalanLocator.hpp ensures that getSystemId() and getPublicId() do not return NULL pointers. XalanC source files that can benefit from the patch include: xalanc/PlatformSupport/ProblemListenerBase.cpp xalanc/PlatformSupport/XSLException.cpp xalanc/XPath/XPathExecutionContextDefault.cpp The patch will be submitted shortly.
[jira] [Commented] (XALANC-733) Ensure that XalanLocator::getSystemId() and getPublicId() do not return NULL
[ https://issues.apache.org/jira/browse/XALANC-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13461033#comment-13461033 ] Steven J. Hathaway commented on XALANC-733: --- XalanLocator-3.patch // committed to SVN trunk. - Steven J. Hathaway Ensure that XalanLocator::getSystemId() and getPublicId() do not return NULL Key: XALANC-733 URL: https://issues.apache.org/jira/browse/XALANC-733 Project: XalanC Issue Type: Bug Components: XalanC, XPathC Affects Versions: 1.11 Reporter: Steven J. Hathaway Assignee: Steven J. Hathaway Attachments: XalanLocator-3.patch, XalanLocator.patch, XalanLocator.patch2 The recommended patch to xalanc/PlatformSupport/XalanLocator.hpp ensures that getSystemId() and getPublicId() do not return NULL pointers. XalanC source files that can benefit from the patch include: xalanc/PlatformSupport/ProblemListenerBase.cpp xalanc/PlatformSupport/XSLException.cpp xalanc/XPath/XPathExecutionContextDefault.cpp The patch will be submitted shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Commented] (XALANC-733) Ensure that XalanLocator::getSystemId() and getPublicId() do not return NULL
[ https://issues.apache.org/jira/browse/XALANC-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13465343#comment-13465343 ] Steven J. Hathaway commented on XALANC-733: --- With the release distribution of 1.11 coming up, I don't have the time to integrate a new source.cpp into the build. I therefore have a patch to the existing source.hpp that should work. The integration effort for a new source file requires not just Makefile integration but four versions of Microsoft Visual Studio .NET project file maintenance. Steven J. Hathaway Xalan Documentation Project Ensure that XalanLocator::getSystemId() and getPublicId() do not return NULL Key: XALANC-733 URL: https://issues.apache.org/jira/browse/XALANC-733 Project: XalanC Issue Type: Bug Components: XalanC, XPathC Affects Versions: 1.11 Reporter: Steven J. Hathaway Assignee: Steven J. Hathaway Attachments: XalanLocator-3.patch, XalanLocator-4.patch, XalanLocator.patch, XalanLocator.patch2 The recommended patch to xalanc/PlatformSupport/XalanLocator.hpp ensures that getSystemId() and getPublicId() do not return NULL pointers. XalanC source files that can benefit from the patch include: xalanc/PlatformSupport/ProblemListenerBase.cpp xalanc/PlatformSupport/XSLException.cpp xalanc/XPath/XPathExecutionContextDefault.cpp The patch will be submitted shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Created] (XALANC-734) Allow runConfigure CFLAGS and CXXFLAGS to inherit environment
Steven J. Hathaway created XALANC-734: - Summary: Allow runConfigure CFLAGS and CXXFLAGS to inherit environment Key: XALANC-734 URL: https://issues.apache.org/jira/browse/XALANC-734 Project: XalanC Issue Type: Bug Components: XalanC Environment: Unix Platform Builds Reporter: Steven J. Hathaway Assignee: Steven J. Hathaway Priority: Trivial Modify the runConfigure shell script to allow CFLAGS and CXXFLAGS to inherit compiler-specific flags from the environment. Some user development platforms require a different set of CFLAGS and CXXFLAGS for C and C++ compilers. This separation will allow compiler-specific flags to be inherited from the environment. Reported by Martin Elzen - of usoft.com -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Closed] (XALANC-642) When building for Cygwin, the install target creates an invalid path from the prefix (-P | --prefix option) supplied to runConfigure
[ https://issues.apache.org/jira/browse/XALANC-642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven J. Hathaway closed XALANC-642. Resolution: Unresolved If Cygwin is to be supported, then this issue may be raised again. When building for Cygwin, the install target creates an invalid path from the prefix (-P | --prefix option) supplied to runConfigure Key: XALANC-642 URL: https://issues.apache.org/jira/browse/XALANC-642 Project: XalanC Issue Type: Bug Components: XalanC Affects Versions: 1.10 Environment: IBM ThinkPad, Windows XP Professional, Cygwin, latest update as of 12/15/06, GNU bash, version 3.2.5(8)-release (i686-pc-cygwin), GNU Make 3.8 Reporter: Will Sappington While building Xalan 1.10.0 in the above environment, give the command ./runConfigure -p cygwin -c gcc -x g++ -P /cygdrive/c/proj/3rdParty/libs/Xalan-c_1_10_0-C/package and run make. After the build completes, run make install (the same happens when running make install from the beginning). This produces the make output below. In summary, the make -C XalanMsgLib install step prepends an additional '/' to the prefix path, so the command mkdir -p //cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/package/lib is issued, which fails with the error message mkdir: cannot create directory `//cygdrive': No such host or network path. Per Dave Bertoni, this is the workaround, posted to the mailing list on 3/14/07: 1. Specify the prefix you want as a configure option to the runConfigure script: ./runConfigure -p cygwin -c gcc -x g++ -C --prefix=/cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/package 2. Do the build (although you don't have to rebuild if you don't want to, since this won't affect the binaries). 3. Open Makefile.incl and search for the prefix you specified. You'll see something like this: prefix = /cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/package/ 4.
Just remove the leading / from the definition: prefix = cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/package/ == make output == $ make install make -C src/xalanc install make[1]: Entering directory `/cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/xml-xalan/c/src/xalanc' Preparing the directory structure for a build ... mkdir -p ../../obj mkdir -p ../../lib mkdir -p ../../bin make -C Utils prepare make[2]: Entering directory `/cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/xml-xalan/c/src/xalanc/ Utils ' mkdir -p ../../../nls mkdir -p ../../../nls/include make[2]: Leaving directory `/cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/xml-xalan/c/src/xalanc/ Utils' make -C Utils locale make[2]: Entering directory `/cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/xml-xalan/c/src/xalanc/ Utils ' make[2]: Nothing to be done for `locale'. make[2]: Leaving directory `/cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/xml-xalan/c/src/xalanc/ Utils' make -C Utils install make[2]: Entering directory `/cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/xml-xalan/c/src/xalanc/ Utils ' /usr/bin/install -c -m 644 ../../../nls/include/LocalMsgIndex.hpp /cygdrive/c/proj/3rdParty/libs/xalan-c_ 1_10_0-C/xml-xalan/c//src/xalanc/PlatformSupport make -C XalanMsgLib install make[3]: Entering directory `/cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/xml-xalan/c/src/xalanc/ Utils /XalanMsgLib' mkdir -p //cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/package/lib mkdir: cannot create directory `//cygdrive': No such host or network path make[3]: *** [install] Error 1 make[3]: Leaving directory `/cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/xml-xalan/c/src/xalanc/ Utils/ XalanMsgLib' make[2]: *** [install] Error 2 make[2]: Leaving directory `/cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/xml-xalan/c/src/xalanc/ Utils' make[1]: *** [install] Error 2 make[1]: Leaving directory `/cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/xml-xalan/c/src/xalanc' make: *** [install] Error 2 
wsappington@NDMA-WSappingto /cygdrive/c/proj/3rdParty/libs/xalan-c_1_10_0-C/xml-xalan/c $
[jira] [Commented] (XALANC-731) Retire Project XML - Update Xalan Website Links
[ https://issues.apache.org/jira/browse/XALANC-731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13471382#comment-13471382 ] Steven J. Hathaway commented on XALANC-731: --- The old Apache XML Project is officially retired by INFRA. All operational Xalan-C/C++ website links point to the archives. The necessary Webpage content required of the old XML project for xalan-c and xalan-j has been rehosted to the Apache XALAN project. The website content is under http://xalan.apache.org/old/ New software distribution trees are being planned. Xalan-C is the first to use the Xalan distribution repository for Version 1.11. See my committer's website for review. http://people.apache.org/~shathaway/docs/xalan/xalan-c/index.html As the website reviews take place and show no significant problems, the code is moved to the production Xalan websites and to the source (svn) repository. Sincerely, Steven J. Hathaway Retire Project XML - Update Xalan Website Links --- Key: XALANC-731 URL: https://issues.apache.org/jira/browse/XALANC-731 Project: XalanC Issue Type: Task Reporter: Steven J. Hathaway Assignee: Steven J. Hathaway Update website links to vacate use of (dist)/xml/xalan-c (dist)/xml/xalan-j (dist)/xml/xerces-c and (dist)/xml/xerces-j -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Commented] (XALANC-717) Crash (pure virtual method call) when included file is not well formed XML
[ https://issues.apache.org/jira/browse/XALANC-717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13487538#comment-13487538 ] Steven J. Hathaway commented on XALANC-717: Reviewing this patch is still on my TODO list. I am also looking at XERCESC-1919, which resolves a similar issue for the XML parser. Crash (pure virtual method call) when included file is not well formed XML Key: XALANC-717 URL: https://issues.apache.org/jira/browse/XALANC-717 Project: XalanC Issue Type: Bug Components: XalanC Affects Versions: 1.10 Reporter: Christian Luidolt Assignee: Steven J. Hathaway Attachments: xalanC-include-crash.diff Create a .xsl file that includes another .xsl file (using xsl:include) which is not well-formed XML, e.g. has garbage at the end of the file. When executing XalanTransformer::transform(), the program crashes (pure virtual method call) because the locator has already been freed. The reason for the crash is that the Locator of the included file has been pushed onto the locator stack by the StylesheetHandler but has not been removed, because the endDocument callback has not been called.
[jira] [Commented] (XALANC-481) XalanC doesn't handle correctly the Unicode surrogate pairs
[ https://issues.apache.org/jira/browse/XALANC-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13487546#comment-13487546 ] Steven J. Hathaway commented on XALANC-481: Previous comment should be UTF16 not UTF8 XalanC doesn't handle correctly the Unicode surrogate pairs Key: XALANC-481 URL: https://issues.apache.org/jira/browse/XALANC-481 Project: XalanC Issue Type: Bug Components: XalanC Affects Versions: CurrentCVS Environment: all Reporter: Dmitry Hayes Priority: Minor Fix For: CurrentCVS For the stylesheet: <?xml version="1.0"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> <xsl:variable name="single_surrogate_pair">&#x10001;</xsl:variable> <xsl:template match="/"> <out lenght="{string-length($single_surrogate_pair)}"></out> </xsl:template> </xsl:stylesheet> we have an output: <?xml version="1.0" encoding="UTF-8"?><out lenght="2"/>
[jira] [Updated] (XALANC-726) Windows Builds - Xalan-C 1.10.1 bugfixes using SVN Repository
[ https://issues.apache.org/jira/browse/XALANC-726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven J. Hathaway updated XALANC-726: -- Fix Version/s: (was: 1.10) CurrentCVS Windows Builds - Xalan-C 1.10.1 bugfixes using SVN Repository - Key: XALANC-726 URL: https://issues.apache.org/jira/browse/XALANC-726 Project: XalanC Issue Type: Task Components: XalanC Affects Versions: 1.10 Reporter: Steven J. Hathaway Assignee: Steven J. Hathaway Priority: Minor Fix For: CurrentCVS Should we try to have a bugfix release build of Xalan-C 1.10.1, maintaining compatibility with Xerces-C XML Version 2.8.0? The current SVN repository builds for Windows no longer support Xerces-C Versions 2.7 and 2.8 XML Parsers. These old parsers use VC6, VC7.1 and VC8 tools. I can probably help with a backwards port if necessary. Support for VC6 is going away because it is no longer used by the Xerces-C Version 3.+ parsers. The current SVN repository is designed for Version 1.11 and Xerces-C Version 3.0 and newer. Xalan-C 1.10 is supported by Xerces-C parsers of version 2.8 or older. A backwards port of project builds would be required to make a Xalan-C 1.10.1 bugfix release Steven J. Hathaway -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Commented] (XALANC-481) XalanC doesn't handle correctly the Unicode surrogate pairs
[ https://issues.apache.org/jira/browse/XALANC-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537212#comment-13537212 ] Steven J. Hathaway commented on XALANC-481: Microsoft and other products that use a 16-bit UTF-16 encoding report string length in UTF-16 (16-bit) code units, not in Unicode characters. Xerces-C emits a count of (16-bit) XMLCh units. XalanC doesn't handle correctly the Unicode surrogate pairs Key: XALANC-481 URL: https://issues.apache.org/jira/browse/XALANC-481 Project: XalanC Issue Type: Bug Components: XalanC Affects Versions: CurrentCVS Environment: all Reporter: Dmitry Hayes Priority: Minor Fix For: CurrentCVS For the stylesheet: <?xml version="1.0"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> <xsl:variable name="single_surrogate_pair">&#x10001;</xsl:variable> <xsl:template match="/"> <out lenght="{string-length($single_surrogate_pair)}"></out> </xsl:template> </xsl:stylesheet> we have an output: <?xml version="1.0" encoding="UTF-8"?><out lenght="2"/>
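The difference between counting UTF-16 code units and counting Unicode characters can be sketched in Java, whose String type is also UTF-16 based. This is only an illustration of the two counts for U+10001, not Xalan-C or Xerces-C code; the class and method names are hypothetical.

```java
public class SurrogateLength {
    // U+10001 lies outside the Basic Multilingual Plane,
    // so UTF-16 has to encode it as a surrogate pair.
    static final String SINGLE_SURROGATE_PAIR =
            new String(Character.toChars(0x10001));

    // Number of 16-bit UTF-16 code units in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points (characters) in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        System.out.println("code units:  " + codeUnits(SINGLE_SURROGATE_PAIR));
        System.out.println("code points: " + codePoints(SINGLE_SURROGATE_PAIR));
    }
}
```

For this string the code-unit count is 2, which matches the value reported in the issue, while the code-point count is 1.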
[jira] [Created] (XALANJ-2568) TransformerImpl causes a
Sami Suuriniemi created XALANJ-2568: Summary: TransformerImpl causes a Key: XALANJ-2568 URL: https://issues.apache.org/jira/browse/XALANJ-2568 Project: XalanJ2 Issue Type: Bug Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.) Components: Xalan Affects Versions: 2.7.1 Reporter: Sami Suuriniemi Assignee: Steven J. Hathaway Priority: Minor In this situation: String sourceFile = "./transform.xsl"; TransformerFactory tFac = TransformerFactory.newInstance(); Transformer transformer = tFac.newTransformer(new StreamSource(sourceFile)); transformer.setOutputProperty("encoding", "UTF-8"); If the source XSL file contains UTF-8 characters while the file itself is not encoded as UTF-8, this causes an odd and hard-to-locate NullPointerException, e.g. java.lang.NullPointerException at org.apache.xalan.transformer.TransformerImpl.setOutputProperty(TransformerImpl.java:966)
[jira] [Updated] (XALANJ-2568) TransformerImpl NullPointerException when faulty encoded XSL file
[ https://issues.apache.org/jira/browse/XALANJ-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sami Suuriniemi updated XALANJ-2568: Summary: TransformerImpl NullPointerException when faulty encoded XSL file (was: TransformerImpl causes a ) TransformerImpl NullPointerException when faulty encoded XSL file Key: XALANJ-2568 URL: https://issues.apache.org/jira/browse/XALANJ-2568 Project: XalanJ2 Issue Type: Bug Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.) Components: Xalan Affects Versions: 2.7.1 Reporter: Sami Suuriniemi Assignee: Steven J. Hathaway Priority: Minor In this situation: String sourceFile = "./transform.xsl"; TransformerFactory tFac = TransformerFactory.newInstance(); Transformer transformer = tFac.newTransformer(new StreamSource(sourceFile)); transformer.setOutputProperty("encoding", "UTF-8"); If the source XSL file contains UTF-8 characters while the file itself is not encoded as UTF-8, this causes an odd and hard-to-locate NullPointerException, e.g. java.lang.NullPointerException at org.apache.xalan.transformer.TransformerImpl.setOutputProperty(TransformerImpl.java:966)
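For reference, the call sequence from the report can be written as a small self-contained program. This is a sketch under stated assumptions: it uses an inline, correctly encoded stylesheet (hypothetical, not the reporter's transform.xsl), it runs against whatever JAXP TransformerFactory is the default on the classpath (which may or may not be Xalan 2.7.1), and it is not claimed to reproduce the NullPointerException. OutputKeys.ENCODING is the standard constant for the "encoding" property.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class EncodingPropertyDemo {
    // Hypothetical stylesheet used only for this illustration.
    static final String XSL =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"/\">"
        + "<out><xsl:value-of select=\"/in\"/></out>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Compiles the stylesheet, sets the output encoding, and transforms xml.
    static String transform(String xml) {
        try {
            TransformerFactory tFac = TransformerFactory.newInstance();
            Transformer transformer =
                    tFac.newTransformer(new StreamSource(new StringReader(XSL)));
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Surface stylesheet or transformation problems as unchecked errors.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<in>hello</in>"));
    }
}
```

Reading the stylesheet from a character stream sidesteps byte-level decoding altogether, which is why this variant is not expected to hit the encoding mismatch described in the report.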
[jira] [Commented] (XALANC-736) Assertion failure in debug mode
[ https://issues.apache.org/jira/browse/XALANC-736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13569696#comment-13569696 ] Steven J. Hathaway commented on XALANC-736: NOTES http://xalan.apache.org/xalan-c/usagepatterns.html#xalantransformer The comments [1] [2] [3] pertain to The Xerces-C and Xalan-C libraries perform their own memory management. Objects created for a Xalan Transformer are owned by the factories and destroyed when the factories are destroyed, unless you take ownership of the objects yourself by explicitly removing them from the factories. [1] XalanTransformer::initialize(); causes static classes and variables to be allocated. XalanTransformer::terminate(); may cause the instance of xslException.getMessage() to be destroyed. This may cause std::cout << ((xslException.getMessage()).c_str()); to fail. [2] XSLTInputSource(...) is an initializer that contains its own default memory management, and therefore 'new' is not required. The objects returned by XSLTInputSource are owned by the XalanTransformer. [3] delete xslIn; delete xmlIn; Terminating the libraries usually destroys the transformer instances and anything found in their factories. Assertion failure in debug mode Key: XALANC-736 URL: https://issues.apache.org/jira/browse/XALANC-736 Project: XalanC Issue Type: Bug Affects Versions: 1.11 Environment: Visual Studio 2010 Binary package of Xerces 3.1.1 Binary package of Xalan 1.11.0 Reporter: Claudia Baier Assignee: Steven J. Hathaway Attachments: 736.diff, XercesXalanDemo.zip I have a problem running applications in debug mode. I try to process an XSLT file which includes another XSLT file. In release mode everything works fine. But in debug mode the application crashes with the message: Debug Assertion Failed! ... Expression: invalid null pointer This happens when I call theXalanTransformer.transform(*parsedXML, *xslIn, std::cout);
[jira] [Updated] (XALANC-736) Assertion failure in debug mode
[ https://issues.apache.org/jira/browse/XALANC-736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven J. Hathaway updated XALANC-736: Attachment: XalanParsedURI.diff XalanParsedURI.cpp URL Parsing Information - Leading to the patch. Components: d_scheme (...:) m_scheme; d_authority (://...) m_authority; (path) (/...) m_path; d_query (?...) m_query; d_fragment (#...) m_fragment. The stylesheets are assembled from elements as callbacks from a Xerces SAX reader and integrated XML parser. The prepared patch fixes the issue of URI parsing where components are of zero length. The issue has been reported only with Microsoft Visual Studio 2010 for debug builds. The release builds do not trigger the assertion. Patches applied to: xalanc/PlatformSupport/XalanParsedURI.cpp Microsoft Visual Studio 2010 (VC10) in debug mode has a bad assertion in the runtime when making XalanDOMString copies and assignments where the RHS.m_data is null and the length is zero. If you ignore the assertion, the method completes properly. The Microsoft debug assertion issue may also affect the copy of any C++ class instance or structure instance that contains a NULL pointer. Copies may work without assertion if, instead of a NULL pointer, you have a pointer to a data buffer with empty content. Modifying the XalanVector as previously proposed could introduce problems elsewhere in the Xalan codebase. I have made a patch to xalanc/PlatformSupport/XalanParsedURI.cpp to keep stylesheet xsl:include ... and xsl:import ... from doing copies and assignments of zero-length XalanDOMStrings when parsing URIs. I didn't want to re-validate the entire Xalan world where the XalanDOMString class is being used. I also did not want to re-validate everywhere the XalanVector template is used. Sincerely, Steven J.
Hathaway Assertion failure in debug mode --- Key: XALANC-736 URL: https://issues.apache.org/jira/browse/XALANC-736 Project: XalanC Issue Type: Bug Affects Versions: 1.11 Environment: Visual Studio 2010 Binary package of Xerces 3.1.1 Binary package of Xalan 1.11.0 Reporter: Claudia Baier Assignee: Steven J. Hathaway Attachments: 736.diff, XalanParsedURI.cpp, XalanParsedURI.diff, XercesXalanDemo.zip I have a problem running applications in debug mode. I try to process a xslt file which includes another xslt file. In release mode everything works fine. But in debug mode the application crashes with the message: Debug Assertion Failed! ... Expression: invalid null pointer This happens when I call theXalanTransformer.transform(*parsedXML, *xslIn, std::cout); -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
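The component breakdown above (scheme, authority, path, query, fragment) and the zero-length case can be illustrated with a short Java sketch. It uses java.net.URI rather than XalanParsedURI, so it only shows the general idea that a relative reference, such as one used in xsl:include, has most components absent and that copying code should treat an absent component as empty. The class and helper names are hypothetical.

```java
import java.net.URI;

public class UriComponents {
    // Returns { scheme, authority, path, query, fragment }.
    // java.net.URI reports an absent component as null.
    static String[] split(String ref) {
        URI u = URI.create(ref);
        return new String[] {
            u.getScheme(), u.getAuthority(), u.getPath(),
            u.getQuery(), u.getFragment()
        };
    }

    // Null-safe accessor: an absent component is treated as an empty string,
    // so later copies never operate on a null reference.
    static String orEmpty(String component) {
        return component == null ? "" : component;
    }

    public static void main(String[] args) {
        String[] refs = { "http://example.org/xsl/main.xsl?v=1#top", "included.xsl" };
        for (String ref : refs) {
            StringBuilder line = new StringBuilder(ref).append(" ->");
            for (String part : split(ref)) {
                line.append(" [").append(orEmpty(part)).append(']');
            }
            System.out.println(line);
        }
    }
}
```

The second reference has only a path; every other component is absent, which corresponds to the zero-length components the patch description refers to.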
[jira] [Assigned] (XALANJ-2568) TransformerImpl NullPointerException when faulty encoded XSL file
[ https://issues.apache.org/jira/browse/XALANJ-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven J. Hathaway reassigned XALANJ-2568: Assignee: (was: Steven J. Hathaway) TransformerImpl NullPointerException when faulty encoded XSL file Key: XALANJ-2568 URL: https://issues.apache.org/jira/browse/XALANJ-2568 Project: XalanJ2 Issue Type: Bug Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.) Components: Xalan Affects Versions: 2.7.1 Reporter: Sami Suuriniemi Priority: Minor In this situation: String sourceFile = "./transform.xsl"; TransformerFactory tFac = TransformerFactory.newInstance(); Transformer transformer = tFac.newTransformer(new StreamSource(sourceFile)); transformer.setOutputProperty("encoding", "UTF-8"); If the source XSL file contains UTF-8 characters while the file itself is not encoded as UTF-8, this causes an odd and hard-to-locate NullPointerException: java.lang.NullPointerException at org.apache.xalan.transformer.TransformerImpl.setOutputProperty(TransformerImpl.java:966)
[jira] [Commented] (XALANJ-2569) Strange NPE when deploying xalan to glassfish
[ https://issues.apache.org/jira/browse/XALANJ-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590251#comment-13590251 ] Jay Xu commented on XALANJ-2569: Hi, I checked some other references and found this issue is due to GF's redeployment mechanism. After org.apache.catalina.loader.WebappClassLoader.ENABLE_CLEAR_REFERENCES=false added to GF's start parameters, the issue no longer happens Strange NPE when deploying xalan to glassfish - Key: XALANJ-2569 URL: https://issues.apache.org/jira/browse/XALANJ-2569 Project: XalanJ2 Issue Type: Bug Security Level: No security risk; visible to anyone(Ordinary problems in Xalan projects. Anybody can view the issue.) Components: Xalan Affects Versions: 2.7.1 Environment: Ubuntu 12.10 with 3.5.0-22-generic x64 OpenJDK Runtime Environment (IcedTea7 2.3.4) (7u9-2.3.4-0ubuntu1.12.10.1) x64 Glassfish 3.1.2.2 with VM args: -XX:+UnlockDiagnosticVMOptions -XX:+UseParallelOldGC -XX:ParallelGCThreads=6 -XX:MaxPermSize=1024m -XX:PermSize=64m -XX:LargePageSizeInBytes=2m -XX:+UseParallelGC -Xmx4096m -Xmn1024m -Xms4096m -javaagent:/opt/glassfish3/glassfish/lib/monitor/flashlight-agent.jar -Dosgi.shell.telnet.maxconn=1 -Djdbc.drivers=org.apache.derby.jdbc.ClientDriver -Dfelix.fileinstall.disableConfigSave=false -Dfelix.fileinstall.dir=/opt/glassfish3/glassfish/modules/autostart/ -Djavax.net.ssl.keyStore=/opt/glassfish3/glassfish/domains/domain1/config/keystore.jks -Dosgi.shell.telnet.port= -Djava.security.policy=/opt/glassfish3/glassfish/domains/domain1/config/server.policy -Djava.awt.headless=true -Dfelix.fileinstall.log.level=2 -Dfelix.fileinstall.poll=5000 -Dcom.sun.aas.instanceRoot=/opt/glassfish3/glassfish/domains/domain1 -Dosgi.shell.telnet.ip=127.0.0.1 -Dcom.sun.enterprise.config.config_environment_factory_class=com.sun.enterprise.config.serverbeans.AppserverConfigEnvironmentFactory -Djava.endorsed.dirs=/opt/glassfish3/glassfish/modules/endorsed:/opt/glassfish3/glassfish/lib/endorsed 
-Dcom.sun.aas.installRoot=/opt/glassfish3/glassfish -Dfelix.fileinstall.bundles.startTransient=true -Djava.ext.dirs=/usr/lib/jvm/java-7-openjdk-amd64/lib/ext:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/ext:/opt/glassfish3/glassfish/domains/domain1/lib/ext -Dfelix.fileinstall.bundles.new.start=true -Djavax.net.ssl.trustStore=/opt/glassfish3/glassfish/domains/domain1/config/cacerts.jks -Dcom.sun.enterprise.security.httpsOutboundKeyAlias=s1as -DANTLR_USE_DIRECT_CLASS_LOADING=true -Djava.security.auth.login.config=/opt/glassfish3/glassfish/domains/domain1/config/login.conf -Dgosh.args=--nointeractive -Dfile.encoding=utf8 -Djava.library.path=/opt/glassfish3/glassfish/lib:/usr/java/packages/lib/amd64:/usr/lib/jni:/lib:/usr/lib Reporter: Jay Xu Priority: Critical We are using docx4j 2.8.1, which depends on xalan 2.7.1, to generate PDFs. When deploying our project to GF, after a period of time (it worked well just after starting GF), xalan threw an NPE that looks like this: Caused by: java.lang.NullPointerException at org.apache.xml.serializer.OutputPropertiesFactory.getDefaultMethodProperties(OutputPropertiesFactory.java:260) at org.apache.xalan.templates.OutputProperties.<init>(OutputProperties.java:83) at org.apache.xalan.transformer.TransformerIdentityImpl.<init>(TransformerIdentityImpl.java:88) at org.apache.xalan.processor.TransformerFactoryImpl.newTransformer(TransformerFactoryImpl.java:823) at com.sun.xml.bind.v2.runtime.JAXBContextImpl.createTransformer(JAXBContextImpl.java:728) at com.sun.xml.bind.v2.runtime.XMLSe|#] Looking at line 260 of OutputPropertiesFactory.java, an NPE seems impossible here in my opinion: ... synchronized (m_synch_object) ...
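An NPE on a synchronized line is possible whenever the lock reference itself is null. One plausible reading of this thread, and it is an assumption here rather than something confirmed by the reporter, is that the container's reference clearing on redeployment nulled a static field that was later used as the monitor. The sketch below only demonstrates the language behavior; the class and field are hypothetical stand-ins, not the Xalan serializer code.

```java
public class NullMonitorDemo {
    // Stand-in for a static lock object held by a library class.
    // Deliberately not final so that it can be cleared for the demonstration.
    static Object lock = new Object();

    // Returns true if the monitor could be entered,
    // false if the lock reference was null.
    static boolean enter() {
        try {
            synchronized (lock) {
                return true;
            }
        } catch (NullPointerException e) {
            // Java throws NullPointerException when the expression in a
            // synchronized statement evaluates to null.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("before clearing: " + enter());
        lock = null; // simulates an external actor clearing the static field
        System.out.println("after clearing:  " + enter());
    }
}
```

This would be consistent with the comment that disabling ENABLE_CLEAR_REFERENCES made the problem disappear, although the thread does not state the mechanism explicitly.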
[jira] [Created] (XALANC-741) Patch for VS 2012
Steven J. Hathaway created XALANC-741: - Summary: Patch for VS 2012 Key: XALANC-741 URL: https://issues.apache.org/jira/browse/XALANC-741 Project: XalanC Issue Type: Bug Components: XalanC Affects Versions: 1.11 Environment: Microsoft Visual Studio 2012 Reporter: Steven J. Hathaway Assignee: Steven J. Hathaway I am preparing patches to build Xalan-C with MS Studio 2012. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Updated] (XALANC-741) Patch for VS 2012
[ https://issues.apache.org/jira/browse/XALANC-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven J. Hathaway updated XALANC-741: -- Attachment: ms-xalan-vs2012.zip ms-scripts.zip Files: ms-scripts.zip = batch scripts to launch VS 2012 ms-xalan-vs2012.zip = VC11 projects for VS 2012 Patch for VS 2012 - Key: XALANC-741 URL: https://issues.apache.org/jira/browse/XALANC-741 Project: XalanC Issue Type: Bug Components: XalanC Affects Versions: 1.11 Environment: Microsoft Visual Studio 2012 Reporter: Steven J. Hathaway Assignee: Steven J. Hathaway Labels: patch Attachments: ms-scripts.zip, ms-xalan-vs2012.zip I am preparing patches to build Xalan-C with MS Studio 2012. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@xalan.apache.org For additional commands, e-mail: dev-h...@xalan.apache.org
[jira] [Updated] (XALANJ-2570) Argument type mismatch when using Java extension
[ https://issues.apache.org/jira/browse/XALANJ-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Holger Rehn updated XALANJ-2570: Description: In a stylesheet I'm calling the method append() on a StringBuilder passed in as a parameter: <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:java="http://xml.apache.org/xslt/java"> <xsl:param name="SB"/> <xsl:template match="/"> <xsl:message><xsl:value-of select="java:append( $SB, 'text' )"/></xsl:message> </xsl:template> </xsl:stylesheet> Problem #1: In the example above the MethodResolver may not choose the appropriate method. It works with Java 7 build 1.7.0-b147 (which chooses StringBuilder.append(String)), but not with the latest Java 7 Update 17 (which chooses StringBuilder.append(CharSequence)), causing an IllegalArgumentException to be thrown. MethodResolver.scoreMatch() returns the same score for both (and other) methods because the value is of class XObject.CLASS_STRING. Problem #2: MethodResolver.convert() cannot handle values of type CharSequence properly and ends up converting the String value into a Double (NaN). Please find attached a patch (against trunk, SVN revision 1383083) that addresses problem #2 and improves MethodResolver.scoreMatch() so that a value exactly matching the target type scores better than a value that is only assignable to the target type. was: In a stylesheet I'm calling method append() on a StringBuilder provided as parameter: <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:java="http://xml.apache.org/xslt/java"> <xsl:param name="SB"/> <xsl:template match="/"> <xsl:message><xsl:value-of select="java:append( $SB, 'text' )"/></xsl:message> </xsl:template> </xsl:stylesheet> Problem #1: In the example above the MethodResolver may not choose the appropriate method. It does work with Java 7 build 1.7.0-b147 (choosing StringBuilder.append(String)), but not with the latest Java 7 Update 17 (choosing StringBuilder.append(CharSequence)). Method MethodResolver.scoreMatch() returns the same score for both (and other) methods because the value is of class XObject.CLASS_STRING. Problem #2: MethodResolver.convert() is not able to handle values of type CharSequence properly and ends up converting the String value into a Double (NaN). Please find attached a patch (against trunk, SVN revision 1383083) addressing problem #2 and improving MethodResolver.scoreMatch() to provide a better score if a value exactly matches the target type, compared to a value only assignable to the target type. Argument type mismatch when using Java extension Key: XALANJ-2570 URL: https://issues.apache.org/jira/browse/XALANJ-2570 Project: XalanJ2 Issue Type: Bug Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.) Components: Xalan-extensions Affects Versions: The Latest Development Code, 2.7.1, 2.7.D2, 2.7.2 Reporter: Holger Rehn Assignee: Steven J. Hathaway Priority: Blocker Attachments: XalanJ-2570.diff
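The scoring idea described in this report (an exact type match should beat a merely assignable one) can be sketched in plain Java reflection. This is not Xalan's MethodResolver and not the attached patch; the class OverloadScoreSketch, its methods score() and pick(), and the numeric score values are hypothetical names and choices made up for illustration only.

```java
import java.lang.reflect.Method;

// Hypothetical sketch; this is NOT Xalan's MethodResolver or the attached patch.
final class OverloadScoreSketch {

    // Higher is better: 2 = exact type match, 1 = only assignable, 0 = no match.
    // The concrete values are an assumption made for this example.
    static int score(Class<?> paramType, Object arg) {
        if (paramType == arg.getClass()) {
            return 2;
        }
        return paramType.isInstance(arg) ? 1 : 0;
    }

    // Picks the best-scoring public one-argument overload of the named method.
    // Compiler-generated bridge methods are skipped so each overload counts once.
    static Method pick(Class<?> target, String name, Object arg) {
        Method best = null;
        int bestScore = 0;
        for (Method m : target.getMethods()) {
            if (!m.getName().equals(name) || m.getParameterCount() != 1 || m.isBridge()) {
                continue;
            }
            int s = score(m.getParameterTypes()[0], arg);
            if (s > bestScore) {
                best = m;
                bestScore = s;
            }
        }
        return best; // null when no overload accepts the argument
    }

    public static void main(String[] args) throws Exception {
        StringBuilder sb = new StringBuilder();
        Method m = pick(StringBuilder.class, "append", "text");
        m.invoke(sb, "text");
        System.out.println(m.getParameterTypes()[0].getSimpleName() + " -> " + sb);
    }
}
```

With a String argument, append(String) is the only exact match, so the choice no longer depends on the order in which reflection happens to return append(CharSequence) and append(Object).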
[jira] [Commented] (XALANJ-2567) xsl:sort sorts Polish incorrectly
[ https://issues.apache.org/jira/browse/XALANJ-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614329#comment-13614329 ] Don Smith commented on XALANJ-2567: Any update on this? We have a customer in Poland who would like names sorted correctly. Thanks! xsl:sort sorts Polish incorrectly Key: XALANJ-2567 URL: https://issues.apache.org/jira/browse/XALANJ-2567 Project: XalanJ2 Issue Type: Bug Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.) Components: Xalan, XSLTC Affects Versions: The Latest Development Code, 2.7.1 Environment: java version 1.6.0_29, latest development code from http://svn.apache.org/repos/asf/xalan/java/trunk Reporter: Don Smith Attachments: sorting-example.zip Sorting of the Polish alphabet is incorrect. See the correct order at https://en.wikipedia.org/wiki/Polish_alphabet, specifically the Ł character, which follows L. Using the files in the attached zip file, I sort the alphabet using xsl:sort with a lang attribute of "pl". The Ł character is sorted at the end instead of between L and M.
[jira] [Resolved] (XALANJ-2567) xsl:sort sorts Polish incorrectly
[ https://issues.apache.org/jira/browse/XALANJ-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Don Smith resolved XALANJ-2567. Resolution: Fixed Fix Version/s: The Latest Development Code I just downloaded and built the latest source. The sorting now works correctly with my sample code (zip file attached). xsl:sort sorts Polish incorrectly Key: XALANJ-2567 URL: https://issues.apache.org/jira/browse/XALANJ-2567 Project: XalanJ2 Issue Type: Bug Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.) Components: Xalan, XSLTC Affects Versions: The Latest Development Code, 2.7.1 Environment: java version 1.6.0_29, latest development code from http://svn.apache.org/repos/asf/xalan/java/trunk Reporter: Don Smith Fix For: The Latest Development Code Attachments: sorting-example.zip, sorting-example.zip Sorting of the Polish alphabet is incorrect. See the correct order at https://en.wikipedia.org/wiki/Polish_alphabet, specifically the Ł character, which follows L. Using the files in the attached zip file, I sort the alphabet using xsl:sort with a lang attribute of "pl". The Ł character is sorted at the end instead of between L and M.
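The symptom reported here (Ł ending up last instead of between L and M) is what plain code-unit ordering produces, since Ł is U+0141. The following is a minimal Java sketch of the difference, not taken from Xalan or from the attached sample files; the class PolishSortSketch and its hand-written collation rule, which covers only the five letters used, are assumptions for illustration.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; names and the tiny rule string are made up for illustration.
final class PolishSortSketch {

    // Natural String ordering compares UTF-16 code units, so U+0141 sorts after Z.
    static List<String> sortByCodeUnit(List<String> in) {
        List<String> out = new ArrayList<>(in);
        Collections.sort(out);
        return out;
    }

    // Hand-written rule that places U+0141 between L and M.
    // It only covers the letters used in this example.
    static List<String> sortWithRules(List<String> in) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator("< K < L < \u0141 < M < Z");
            List<String> out = new ArrayList<>(in);
            Collections.sort(out, collator);
            return out;
        } catch (ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<String> letters = Arrays.asList("Z", "M", "\u0141", "L", "K");
        System.out.println(sortByCodeUnit(letters));
        System.out.println(sortWithRules(letters));
    }
}
```

In real code one would more likely ask for a locale-aware collator via Collator.getInstance(new Locale("pl")); whether that yields the Polish order depends on the locale data available to the runtime, which is why the sketch spells out its own rule instead.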
[jira] [Commented] (XALANJ-2550) GSoC: Complete support for StAXSource / StAXResult.
[ https://issues.apache.org/jira/browse/XALANJ-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621096#comment-13621096 ] Michael Glavassevich commented on XALANJ-2550: Hi Pulasthi, Yes, this project is still available for GSoC 2013. Thanks. GSoC: Complete support for StAXSource / StAXResult. Key: XALANJ-2550 URL: https://issues.apache.org/jira/browse/XALANJ-2550 Project: XalanJ2 Issue Type: New Feature Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.) Affects Versions: 2.7.1 Reporter: Michael Glavassevich Labels: gsoc, gsoc2013, mentor StAXSource [1] and StAXResult [2] were introduced in JAXP 1.4. Xalan-J does not yet support these interfaces. The goal of this project is to implement support for StAXSource/StAXResult in the TransformerFactory and Transformer as required for JAXP 1.4. [1] http://docs.oracle.com/javase/6/docs/api/javax/xml/transform/stax/StAXSource.html [2] http://docs.oracle.com/javase/6/docs/api/javax/xml/transform/stax/StAXResult.html
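For readers unfamiliar with the JAXP 1.4 interfaces this project targets, here is a minimal sketch of the caller-side usage: an identity transform that reads from a StAXSource. The class StaxSourceSketch and its copy() method are hypothetical names. The sketch assumes that whatever TransformerFactory is found on the classpath accepts StAXSource (the implementation bundled with the JDK is expected to); per this issue, Xalan-J 2.7.1 does not yet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

// Hypothetical sketch of caller-side usage; not part of Xalan-J.
final class StaxSourceSketch {

    // Copies an XML document through an identity Transformer, reading it via StAX.
    static String copy(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            // newTransformer() without a stylesheet yields an identity transform.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new StAXSource(reader), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrapped to keep the sketch short; real code would handle these separately.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(copy("<greeting>hello</greeting>"));
    }
}
```

A StAXResult would be used symmetrically on the output side, wrapping an XMLStreamWriter or XMLEventWriter.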