On 20/09/2019 14:37, Sebastian Hellmann wrote:
Hi Andy,
I would prefer:
// keeping NOT_NFC warnings
// setErrorWarning(iriFactoryInst, ViolationCodes.NOT_NFC, false, false);
// turning off NOT_NFKC warnings
setErrorWarning(iriFactoryInst, ViolationCodes.NOT_NFKC, false, false);
// ** Applies to various unicode blocks.
setErrorWarning(iriFactoryInst, ViolationCodes.COMPATIBILITY_CHARACTER,
false, false);
Reason:
I am really trying to understand Unicode details, but it is tough and I
am doing this on the side.
Ditto.
jena-iri and its checking is vast and complicated.
Long term, I'd like to switch to a simpler and more maintainable
codebase for the basic use of parsing. But that's an aspiration (An
"aspiration" is a "plan" without a schedule :-)
Checking for NFC seems practical: it is a good goal, it is in the RDF
spec, and the information available about it is decent. The NOT_NFKC and
Compatibility Character warnings are confusing.
Agreed - I'm comfortable with that set of choices.
(the various negations of NOT tests are also confusing!)
I am also wondering whether having both NOT_NFC and NOT_NFKC means that
at least one warning is always produced. Turning all three off would be
acceptable.
Or maybe I am reading this wrong, it first checks whether it is a
compatibility char and then requires it to be in NFKC for backward
compatibility?
It's three separate tests. (AbsLexer.java if anyone wants to look)
It's using java.text.Normalizer and some special code for compatibility.
I don't understand Unicode compatibility; the comment is that it is the
difference between NFD and NFKD.
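
To make the three tests concrete, here is a minimal sketch using only
java.text.Normalizer. It is not the AbsLexer code; the class and method
names are made up for illustration, and the "compatibility" check is only
one reading of the "difference between NFD and NFKD" comment.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of jena-iri.
final class NormalizationCheck {

    // True if the string is already in Normalization Form C.
    static boolean isNfc(String s) {
        return Normalizer.isNormalized(s, Normalizer.Form.NFC);
    }

    // True if the string is already in Normalization Form KC.
    static boolean isNfkc(String s) {
        return Normalizer.isNormalized(s, Normalizer.Form.NFKC);
    }

    // Assumption: treat a string as containing compatibility characters
    // when its NFD and NFKD forms differ.
    static boolean differsUnderCompatibility(String s) {
        String nfd = Normalizer.normalize(s, Normalizer.Form.NFD);
        String nfkd = Normalizer.normalize(s, Normalizer.Form.NFKD);
        return !nfd.equals(nfkd);
    }

    public static void main(String[] args) {
        // Plain ASCII, "e" + combining acute accent, and the "fi" ligature U+FB01.
        String[] samples = { "abc", "e\u0301", "\uFB01" };
        for (String s : samples) {
            System.out.printf("nfc=%b nfkc=%b compat=%b%n",
                    isNfc(s), isNfkc(s), differsUnderCompatibility(s));
        }
    }
}
```

Plain ASCII passes all three checks, so having NOT_NFC and NOT_NFKC both
enabled does not force a warning on every IRI. The ligature U+FB01 is in
NFC but not in NFKC, which is the kind of case where only the NOT_NFKC and
compatibility checks would fire.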
Config fine-tuning:
Regarding changing the config ourselves for further fine-tuning, we
would copy the class into the current code and adjust it, I assume. Or
do you think it is possible to make `private static void
setErrorWarning` public?
When an IRIFactory is first used, it gets frozen and you can't change
the warnings/errors anymore. And it gets used in the class it is created
in, so an app can't change it in place.
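
The freeze-on-first-use behaviour can be sketched roughly as below. This
is a toy illustration with a hypothetical class, not the jena-iri
implementation; the names, the map of settings and the exception message
are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a factory whose settings lock on first use.
final class FreezableFactory {
    private final Map<String, Boolean> warnings = new HashMap<>();
    private boolean frozen = false;

    // Settings can only be changed until the factory has been used.
    void setWarning(String code, boolean enabled) {
        if (frozen) {
            throw new IllegalStateException("Cannot change settings after first use");
        }
        warnings.put(code, enabled);
    }

    // First use freezes the configuration.
    String create(String iri) {
        frozen = true;
        return iri;
    }

    // Warnings default to enabled unless explicitly turned off.
    boolean isWarningEnabled(String code) {
        return warnings.getOrDefault(code, true);
    }

    public static void main(String[] args) {
        FreezableFactory factory = new FreezableFactory();
        factory.setWarning("NOT_NFKC", false);   // allowed: not used yet
        factory.create("http://example/");       // first use freezes it
        try {
            factory.setWarning("NOT_NFC", false); // too late
        } catch (IllegalStateException e) {
            System.out.println("frozen: " + e.getMessage());
        }
    }
}
```

If the class that creates such a factory also uses it during its own
static initialization, outside code never gets a chance to call the setter
before the freeze.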
Tinkering with the configuration of IRIResolver to allow it to change
the IRIFactory can be done, but we are very near the 3.13.0 release so
I'm keen to only make simple changes (turning warnings/errors off; these
have been tried out before). Adjusting the IRIFactory risks getting into
weird Java initializer situations, even for a change as small as that.
It is something better done earlier in a release cycle.
Andy
-- Sebastian
On 20.09.19 14:48, Andy Seaborne wrote:
On 20/09/2019 10:53, Sebastian Hellmann wrote:
On 20.09.19 11:23, Andy Seaborne wrote:
https://issues.apache.org/jira/browse/JENA-864
Personally, I think we should just turn the warnings off.
That would be great. They are harmful in that they cause flooding on
Stack Overflow and in issue trackers, as people are confused
(I googled before I wrote).
https://github.com/apache/jena/pull/607
Andy