[
https://issues.apache.org/jira/browse/XERCESJ-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619078#comment-16619078
]
Yitzhak Khabinsky commented on XERCESJ-1701:
--------------------------------------------
[~mukul_gandhi]
As I said, we don't have any influence on the system of origin.
We don't know how XML data feeds are generated.
Saxonica solved this exact problem by changing the *hashing formula.*
Since then, Saxon has had no problem enforcing the uniqueness constraint, even
for *multi-gigabyte* XML files.
I provided a link to the ticket in their system earlier.
Please touch base with Michael Kay to find out which hashing formula they
switched to.
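
For illustration only, here is a minimal Java sketch of how a composite key can be checked for uniqueness in roughly linear time when the hash covers all of the key fields. It is not Saxon's formula (which is not stated in this thread) and not Xerces code; the class name CompositeKeyCheck, the nested Key class, and the method firstDuplicate are hypothetical names invented for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: not Saxon's or Xerces's actual implementation.
class CompositeKeyCheck {

    /** Holds the field values of one selected element (one key tuple). */
    static final class Key {
        private final String[] fields;
        private final int hash;

        Key(String... fields) {
            this.fields = fields.clone();
            // Combine ALL fields into the hash so tuples that differ in any
            // field tend to land in different buckets.
            this.hash = Arrays.hashCode(this.fields);
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && Arrays.equals(fields, ((Key) o).fields);
        }

        @Override
        public int hashCode() {
            return hash;
        }
    }

    /** Returns the index of the first row that repeats an earlier key, or -1. */
    static int firstDuplicate(List<String[]> rows) {
        Set<Key> seen = new HashSet<>();
        for (int i = 0; i < rows.size(); i++) {
            // add() returns false when an equal key is already present.
            if (!seen.add(new Key(rows.get(i)))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"US", "2018-01-01", "1", "Card"},
            new String[] {"US", "2018-01-01", "2", "Card"},
            new String[] {"US", "2018-01-01", "1", "Card"});
        System.out.println(firstDuplicate(rows));
    }
}
```

With a hash set, each lookup costs expected constant time as long as the hash distributes the tuples well; a hash that ignores some fields, or a linear scan over previously seen tuples, would degrade toward quadratic time on large inputs.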
> Xerces-J 2.12.0: XSD 1.1 PK constraint scalability issue
> --------------------------------------------------------
>
> Key: XERCESJ-1701
> URL: https://issues.apache.org/jira/browse/XERCESJ-1701
> Project: Xerces2-J
> Issue Type: Bug
> Components: JAXP (javax.xml.validation)
> Affects Versions: 2.12.0
> Environment: Windows 10, 64-bit
> Reporter: Yitzhak Khabinsky
> Priority: Major
> Labels: XSD, key
> Fix For: 2.12.0
>
> Attachments: SubscriberCountFact.zip
>
>
> Hello,
> The test case is very simple:
> * XML file, size 700 MB
> * XSD file
> XSD is enforcing the following:
> * XML structure
> * Data elements/attributes data types
> * *PK constraint, composite primary key based on four elements*
> * No asserts/assertions/CTAs
> <xs:key name="PK">
>   <xs:selector xpath="r"/>
>   <xs:field xpath="CountryCode"/>
>   <xs:field xpath="Date"/>
>   <xs:field xpath="AnalyticsArrangementKey"/>
>   <xs:field xpath="PaymentType"/>
> </xs:key>
>
> Saxon-EE (Java) completes the XSD validation in 2 minutes.
> Xerces-J 2.12.0 cannot finish it at all, even after running for many hours.
> If I comment out the *xs:key* constraint, Xerces finishes the validation
> without problems.
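
A small harness along these lines can exercise the same kind of key constraint through javax.xml.validation. This is a sketch under stated assumptions: the inline schema is a reduced stand-in for the attached XSD, the root element name "root" and the xs:string/xs:date types are guesses, and the default JAXP SchemaFactory (XSD 1.0) is used rather than Xerces-J 2.12.0 in XSD 1.1 mode. The class name KeyValidationDemo and the method isValid are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical reproduction harness; the schema below is an assumed,
// reduced stand-in for the XSD attached to the issue.
class KeyValidationDemo {

    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='root'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='r' maxOccurs='unbounded'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='CountryCode' type='xs:string'/>"
      + "<xs:element name='Date' type='xs:date'/>"
      + "<xs:element name='AnalyticsArrangementKey' type='xs:string'/>"
      + "<xs:element name='PaymentType' type='xs:string'/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element>"
      + "</xs:sequence></xs:complexType>"
      + "<xs:key name='PK'>"
      + "<xs:selector xpath='r'/>"
      + "<xs:field xpath='CountryCode'/>"
      + "<xs:field xpath='Date'/>"
      + "<xs:field xpath='AnalyticsArrangementKey'/>"
      + "<xs:field xpath='PaymentType'/>"
      + "</xs:key>"
      + "</xs:element>"
      + "</xs:schema>";

    /** Returns true if the document validates, false on any validation error. */
    static boolean isValid(String xml) {
        Schema schema;
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            // A schema that fails to compile is a bug in this sketch, not in the input.
            throw new IllegalStateException("schema did not compile", e);
        }
        Validator validator = schema.newValidator();
        try {
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String row(String country, String date, String key, String payment) {
        return "<r><CountryCode>" + country + "</CountryCode><Date>" + date
            + "</Date><AnalyticsArrangementKey>" + key
            + "</AnalyticsArrangementKey><PaymentType>" + payment
            + "</PaymentType></r>";
    }

    public static void main(String[] args) {
        String a = row("US", "2018-01-01", "1", "Card");
        String b = row("US", "2018-01-01", "2", "Card");
        System.out.println(isValid("<root>" + a + b + "</root>"));
        System.out.println(isValid("<root>" + a + a + "</root>"));
    }
}
```

To time a large feed, the StringReader-based source passed to validate could be replaced with a StreamSource over the XML file, so that the document is streamed rather than held in memory as a string.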