[ https://issues.apache.org/jira/browse/XERCESJ-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849196#comment-17849196 ]
Mahmoud M. Almazari commented on XERCESJ-1205:
----------------------------------------------

[~tkrammer]/[~radu_coravu], the provided solutions work fine. However, I have the following concerns:

* The provided solution loads all entities into the entity manager on every parse. In many applications thousands of entities are declared but only a few are used, so I propose loading entities only when necessary.
* I agree with [~radu_coravu] that the publicId should be used to identify the external entity. However, XMLDTDScannerImpl#scanEntityDecl is actually using only the publicId without the systemId.
* XMLDTDScannerImpl#scanEntityDecl also adds unparsed entities, which are identified by the {{notation}}. The provided solutions do not consider them.
* [~radu_coravu], I'm wondering if we need to reuse the DTD grammar when parsing XML files with different internal subsets only. Basically, when caching is turned off and parsing begins, the convention in the entity manager is to report a warning for a duplicate entity declaration if the entity is already loaded.
* The code above uses the {{addExternalEntity}} function, which is affected by the current entity, {{fCurrentEntity}}. I believe that after caching we should instead use the cached and stored {{baseSystemID}}, so that the result is not affected by the current entity in the entity manager while documents are being parsed.

I have attached a possible fix: [^XERCESJ-1205.patch]

[~mukulg], [~mrgla...@ca.ibm.com], [~ankitp], [~jan.tosovsky.cz], caching DTD grammars is vital, and it's unfortunate that the issue persists. Your input is invaluable in resolving this.

Thanks for your dedication!
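To make the first and last concerns concrete, here is a minimal Java sketch of on-demand entity loading. It is not taken from the attached patch and uses no Xerces classes: {{LazyEntityTable}}, {{EntityDecl}} and all of their members are hypothetical names chosen for illustration. It assumes the cached grammar can hand over a read-only map of its entity declarations, each carrying the {{baseSystemId}} recorded at caching time, and it copies a declaration into the per-parse table only when the entity is first referenced.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of lazy entity loading; none of these types exist in Xerces. */
final class LazyEntityTable {

    /** One entity declaration as it might be stored alongside a cached DTD grammar. */
    static final class EntityDecl {
        final String name;
        final String publicId;
        final String systemId;
        final String baseSystemId; // base recorded when the DTD was cached
        final String notation;     // non-null only for unparsed entities

        EntityDecl(String name, String publicId, String systemId,
                   String baseSystemId, String notation) {
            this.name = name;
            this.publicId = publicId;
            this.systemId = systemId;
            this.baseSystemId = baseSystemId;
            this.notation = notation;
        }

        boolean isUnparsed() {
            return notation != null;
        }
    }

    // Declarations kept with the cached grammar; shared between parses, never modified here.
    private final Map<String, EntityDecl> cached;
    // Declarations active for the current parse only.
    private final Map<String, EntityDecl> active = new HashMap<>();

    LazyEntityTable(Map<String, EntityDecl> cachedDecls) {
        this.cached = cachedDecls;
    }

    /** Registers a declaration from the current document; returns false if the name is already active. */
    boolean declare(EntityDecl decl) {
        return active.putIfAbsent(decl.name, decl) == null;
    }

    /** Looks up an entity, copying it from the cached declarations on first use. */
    EntityDecl lookup(String name) {
        EntityDecl decl = active.get(name);
        if (decl == null) {
            decl = cached.get(name);
            if (decl != null) {
                active.put(name, decl);
            }
        }
        return decl;
    }

    /** Number of entities actually loaded for the current parse. */
    int activeCount() {
        return active.size();
    }

    /** Clears per-parse state; the cached declarations are left untouched. */
    void reset() {
        active.clear();
    }

    public static void main(String[] args) {
        Map<String, EntityDecl> fromCache = new HashMap<>();
        fromCache.put("copy", new EntityDecl("copy", null, "ents/copy.ent", "file:///dtd/", null));
        fromCache.put("logo", new EntityDecl("logo", null, "logo.png", "file:///dtd/", "png"));

        LazyEntityTable table = new LazyEntityTable(fromCache);
        System.out.println("loaded before any reference: " + table.activeCount());
        table.lookup("copy");
        System.out.println("loaded after one reference: " + table.activeCount());
    }
}
```

In this sketch a declaration registered through {{declare}} (for example, one from the document's internal subset) shadows a cached declaration of the same name without the cached one ever being loaded, and unparsed entities keep their {{notation}}. Whether the real entity manager can be adapted this way is an open question for the patch review.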
> Entity resolution does not work with DTD grammar caching resolved
> -----------------------------------------------------------------
>
>                 Key: XERCESJ-1205
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1205
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DTD
>    Affects Versions: 2.8.1
>         Environment: JDK1.5. The issue appears on various machines, Windows,
> Linux, Mac OSX. I don't believe it is platform specific.
>            Reporter: Tin Pavlinic
>            Assignee: Michael Glavassevich
>            Priority: Major
>         Attachments: XERCESJ-1205.patch, XERCESJ-1465.patch, bug.zip,
> entitypatch-r1813171.patch
>
>
> We have a DTD which defines some entities. We are parsing multiple documents
> against this DTD. If grammar caching is enabled, the entities are unresolved
> when the grammar is loaded from the cache, instead of the DTD.
> It seems that they are cleared every time a document is parsed and are only
> loaded when a DTD is loaded and not from the cache.

--
This message was sent by Atlassian Jira (v8.20.10#820010)