On 20/09/2019 14:37, Sebastian Hellmann wrote:
Hi Andy,

I would prefer:

// keeping NOT NFC Warnings

// setErrorWarning(iriFactoryInst, ViolationCodes.NOT_NFC, false, false);

// turning off NOT_NFKC warnings

setErrorWarning(iriFactoryInst, ViolationCodes.NOT_NFKC, false, false);

  // ** Applies to various unicode blocks.

setErrorWarning(iriFactory, ViolationCodes.COMPATIBILITY_CHARACTER, false, false);


Reason:

I am really trying to understand the Unicode details, but it is tough and I am doing this on the side.

Ditto.

jena-iri and its checking is vast and complicated.

Long term, I'd like to switch to a simpler and more maintainable codebase for the basic use of parsing. But that's an aspiration (An "aspiration" is a "plan" without a schedule :-)

Checking for NFC seems practical as this should be a good goal and it is in the RDF spec and information about it is decent. NOT_NFKC or the Compatibility Character warning is confusing.

Agreed - I'm comfortable with that set of choices.

(the various negations of the NOT tests are also confusing!)


I am also wondering whether having both NOT_NFC and NOT_NFKC means that at least one warning is always produced. Turning all three off would be acceptable.

Or maybe I am reading this wrong: does it first check whether it is a compatibility char and then require it to be in NFKC for backward compatibility?

It's three separate tests. (AbsLexer.java if anyone wants to look)

It's using java.text.Normalizer and some special code for compatibility.
I don't understand Unicode compatibility; the comment is that it is the difference between NFD and NFKD.
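
To make the "three separate tests" concrete, here is a minimal standalone sketch using only java.text.Normalizer. It is not the jena-iri code from AbsLexer.java: the class and method names are hypothetical, and the compatibility check is just a literal reading of the comment mentioned above (NFD and NFKD give different results), applied to a whole string.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of jena-iri.
final class NormalizationChecks {

    // True if the string is already in Normalization Form C.
    static boolean isNfc(CharSequence s) {
        return Normalizer.isNormalized(s, Normalizer.Form.NFC);
    }

    // True if the string is already in Normalization Form KC.
    static boolean isNfkc(CharSequence s) {
        return Normalizer.isNormalized(s, Normalizer.Form.NFKC);
    }

    // Assumption: "compatibility character" is read here as
    // "canonical (NFD) and compatibility (NFKD) decomposition differ".
    static boolean hasCompatibilityChar(CharSequence s) {
        String nfd = Normalizer.normalize(s, Normalizer.Form.NFD);
        String nfkd = Normalizer.normalize(s, Normalizer.Form.NFKD);
        return !nfd.equals(nfkd);
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://example.org/cafe",        // plain ASCII
            "http://example.org/cafe\u0301",  // 'e' followed by a combining acute accent
            "http://example.org/\uFB01le"     // the "fi" ligature, U+FB01
        };
        for (String s : samples) {
            System.out.println(s
                + " NFC=" + isNfc(s)
                + " NFKC=" + isNfkc(s)
                + " compat=" + hasCompatibilityChar(s));
        }
    }
}
```

Under this reading, a plain ASCII IRI passes all three checks, so having both NOT_NFC and NOT_NFKC does not force a warning on every input. The ligature sample is in NFC but not in NFKC, which is the kind of case where only the NFKC and compatibility tests would fire.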


Config fine-tuning:

Regarding changing the config ourselves for further fine-tuning, we would copy the class into the current code and adjust it, I assume. Or do you think it is possible to make `private static void setErrorWarning` public?

When an IRIFactory is first used, it gets frozen and you can't change the warnings/errors anymore. And it gets used in the class it is created in, so an app can't change it in-place.
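
As a rough illustration of that "frozen on first use" behaviour, here is a self-contained sketch. It is not the IRIFactory implementation: the class, its methods, and the choice of IllegalStateException are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a factory whose settings freeze on first use.
final class FreezableChecker {
    private final Set<String> warnings = new HashSet<>();
    private boolean frozen = false;

    // Configuration is only allowed before the checker has been used.
    void setWarning(String code, boolean enabled) {
        if (frozen) {
            throw new IllegalStateException("configuration is frozen after first use");
        }
        if (enabled) {
            warnings.add(code);
        } else {
            warnings.remove(code);
        }
    }

    // Any use of the checker freezes its configuration.
    boolean warnsOn(String code) {
        frozen = true;
        return warnings.contains(code);
    }

    public static void main(String[] args) {
        FreezableChecker checker = new FreezableChecker();
        checker.setWarning("NOT_NFKC", false);             // fine: not used yet
        System.out.println(checker.warnsOn("NOT_NFKC"));   // first use freezes the settings
        try {
            checker.setWarning("NOT_NFC", false);          // too late
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If the owning class uses the checker during its own static initialization, as described above, application code never gets a window in which setWarning would still be accepted.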

Tinkering with the configuration of IRIResolver to allow it to change the IRIFactory can be done, but we are very near the 3.13.0 release so I'm keen to only make simple changes (turning warnings/errors off; these have been tried out before). Even a change like adjusting the IRIFactory risks getting into weird Java initializer situations. That is something better done earlier in a release cycle.

    Andy


-- Sebastian




On 20.09.19 14:48, Andy Seaborne wrote:


On 20/09/2019 10:53, Sebastian Hellmann wrote:
On 20.09.19 11:23, Andy Seaborne wrote:

https://issues.apache.org/jira/browse/JENA-864

Personally, I think we should just turn the warnings off.


That would be great. They are harmful in that they flood Stack Overflow and issue trackers with questions from confused people (I googled before writing).

https://github.com/apache/jena/pull/607

    Andy
