[ 
https://issues.apache.org/jira/browse/XERCESJ-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849196#comment-17849196
 ] 

Mahmoud M. Almazari commented on XERCESJ-1205:
----------------------------------------------

[~tkrammer]/[~radu_coravu], the provided solutions work fine. However, I have 
the following concerns: 
 * The provided solution loads all entities into the entity manager each time. 
In many applications, thousands of entities are present, but only a few are 
utilized. Therefore, I propose loading entities only when necessary.
 * I agree with [~radu_coravu] that the publicId should be used to identify the 
external entity. However, XMLDTDScannerImpl#scanEntityDecl is actually using 
only the publicId without the systemId.
 * XMLDTDScannerImpl#scanEntityDecl also adds unparsed entities, which are 
identified by the {{notation}}. However, the provided solutions do not 
consider them.
 * [~radu_coravu], I'm wondering if we need to reuse the DTD grammar when 
parsing XML files with different internal subsets only. Basically, when caching 
is turned off and parsing begins, if the entity is already loaded, the 
convention in the entity manager is to report a warning for the duplicate 
entity.
 * The code above uses the {{addExternalEntity}} function, which is affected 
by the current entity ({{fCurrentEntity}}). However, I believe we should 
use the {{baseSystemID}} that was stored with the cached grammar, so that 
resolution is not affected by the current entity in the entity manager when 
parsing documents.
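
To make the points above concrete, here is a minimal sketch of the 
load-on-demand idea. It is not the attached patch and does not use any Xerces 
classes: {{EntityDecl}}, {{LazyEntityTable}} and {{LazyEntityDemo}} are 
hypothetical names, and the example URLs are made up. It assumes each cached 
declaration keeps its own base system ID and notation, and that declarations 
made by the document itself (for example in its internal subset) are 
registered before any lookup, so the first declaration wins.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical types for illustration only; none of these are Xerces classes.

/** Small demo: only the entity that is actually referenced gets loaded. */
final class LazyEntityDemo {
    public static void main(String[] args) {
        String base = "http://example.org/dtds/main.dtd"; // made-up URL
        Map<String, EntityDecl> cached = new HashMap<>();
        cached.put("chap1", new EntityDecl("chap1", null, "ents/chap1.xml", base, null));
        cached.put("logo", new EntityDecl("logo", null, "img/logo.gif", base, "gif"));

        LazyEntityTable table = new LazyEntityTable(cached);
        EntityDecl decl = table.lookup("chap1");
        System.out.println(decl.expandedSystemId());
        System.out.println(table.loadedCount());
    }
}

/** An external entity declaration as it might be stored with a cached DTD grammar. */
final class EntityDecl {
    final String name;
    final String publicId;     // may be null
    final String systemId;     // may be relative
    final String baseSystemId; // captured when the DTD was first scanned
    final String notation;     // non-null only for unparsed entities

    EntityDecl(String name, String publicId, String systemId,
               String baseSystemId, String notation) {
        this.name = name;
        this.publicId = publicId;
        this.systemId = systemId;
        this.baseSystemId = baseSystemId;
        this.notation = notation;
    }

    boolean isUnparsed() {
        return notation != null;
    }

    /** Resolves systemId against the stored base, not against the current entity. */
    String expandedSystemId() {
        return URI.create(baseSystemId).resolve(systemId).toString();
    }
}

/** Per-document table that copies a cached declaration only on first use. */
final class LazyEntityTable {
    private final Map<String, EntityDecl> cached; // shared, treated as read-only
    private final Map<String, EntityDecl> active = new HashMap<>();

    LazyEntityTable(Map<String, EntityDecl> cached) {
        this.cached = cached;
    }

    /** Registers a declaration from the document; returns false on a duplicate. */
    boolean declare(EntityDecl decl) {
        return active.putIfAbsent(decl.name, decl) == null;
    }

    /** Returns the declaration, loading it from the cached grammar if needed. */
    EntityDecl lookup(String name) {
        EntityDecl decl = active.get(name);
        if (decl == null) {
            decl = cached.get(name);
            if (decl != null) {
                active.put(name, decl);
            }
        }
        return decl;
    }

    int loadedCount() {
        return active.size();
    }
}
```

In this sketch a {{false}} return from {{declare}} is the place where a 
duplicate-entity warning could be reported, and unparsed entities travel 
through the same path because the notation is part of the stored declaration.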

 

I have attached a possible fix: [^XERCESJ-1205.patch]

[~mukulg], [~mrgla...@ca.ibm.com], [~ankitp], [~jan.tosovsky.cz], caching DTD 
grammars is vital, and it's unfortunate that the issue persists. Your input is 
invaluable in resolving this. Thanks for your dedication!

> Entity resolution does not work with DTD grammar caching resolved
> -----------------------------------------------------------------
>
>                 Key: XERCESJ-1205
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1205
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: DTD
>    Affects Versions: 2.8.1
>         Environment: JDK1.5. The issue appears on various machines, Windows, 
> Linux, Mac OSX. I don't believe it is platform specific.
>            Reporter: Tin Pavlinic
>            Assignee: Michael Glavassevich
>            Priority: Major
>         Attachments: XERCESJ-1205.patch, XERCESJ-1465.patch, bug.zip, 
> entitypatch-r1813171.patch
>
>
> We have a DTD which defines some entities. We are parsing multiple documents 
> against this DTD. If grammar caching is enabled, the entities are unresolved 
> when the grammar is loaded from the cache, instead of the DTD. 
> It seems that they are cleared every time a document is parsed and are only 
> loaded when a DTD is loaded and not from the cache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: j-dev-unsubscr...@xerces.apache.org
For additional commands, e-mail: j-dev-h...@xerces.apache.org
