Hi Dan,
On 12/12/2019 16:30, Stoner, Dan F wrote:
> I found some oddities and I am not exactly sure where to go next.
>
> We are noticing the following while processing meta.xml in darwin core
> archives produced by IPT (and other servers):
>
> Schema validation failed, continuing unvalidated
> XMLSyntaxError: Element '{http://rs.tdwg.org/dwc/text/}coreid': This element
> is not expected. Expected is ( {http://rs.tdwg.org/dwc/text/}coreId )
There's some background to this on this issue:
https://github.com/tdwg/dwc/issues/143
The schema itself and the documentation were conflicting, and this was
fixed (in my and Tim's opinion) the wrong way, by changing the schema.
*I've just pushed a commit to fix it the right way,* i.e. reflecting 99%
of actual usage and leaving the schema as it was for almost a decade.
Although we do accept either, we still see only 31 datasets registered
in GBIF with "coreId" rather than "coreid".
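
For illustration only, here is a minimal Python sketch of how a consumer
could tolerate both spellings when reading meta.xml. This is not the GBIF
or IPT code; the function name find_core_id_elements and the sample
archive are made up for the example. Only the namespace and the two
element names come from the error message above.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the validation error quoted above.
DWC_TEXT_NS = "http://rs.tdwg.org/dwc/text/"


def find_core_id_elements(meta_xml):
    """Return (spelling, index) for each coreid/coreId element in an extension.

    Hypothetical helper: accepts either spelling instead of rejecting one.
    """
    root = ET.fromstring(meta_xml)
    found = []
    for ext in root.iter("{%s}extension" % DWC_TEXT_NS):
        for spelling in ("coreid", "coreId"):
            for el in ext.findall("{%s}%s" % (DWC_TEXT_NS, spelling)):
                found.append((spelling, el.get("index")))
    return found


# Made-up sample meta.xml, using the lowercase spelling seen in most archives.
SAMPLE = """<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Occurrence">
    <files><location>occurrence.txt</location></files>
    <id index="0"/>
  </core>
  <extension rowType="http://rs.gbif.org/terms/1.0/Multimedia">
    <files><location>multimedia.txt</location></files>
    <coreid index="0"/>
  </extension>
</archive>"""

if __name__ == "__main__":
    print(find_core_id_elements(SAMPLE))
```

Reporting which spelling was found, rather than normalising it silently,
would also make it easy to count how many archives use each form.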
> It seems like most consumers are not actually validating meta.xml using the
> schema, and the producers are generating files out of compliance with the
> schema.
>
> Most of the Darwin Core archives I have manually inspected and tried to
> validate contain meta.xml with lowercase "i" in coreid despite the Standard
> indicating capital "I" in coreId.
>
>
> I poked at the GBIF Darwin Core Validator 3 code repo and found this:
>
> schema.meta=https://raw.githubusercontent.com/tdwg/dwc/master/standard/documents/text/tdwg_dwc_text.xsd,http://rs.tdwg.org/dwc/text/tdwg_dwc_text.xsd
>
>
> The first link leads to 404, the second leads to an xsd that contains the
> proper coreId. So maybe the Validator is not being "strict" about validation
> against the schema?
I suspect the validator has been running for so long that, when its
process was originally started, both URLs were still valid, and the
schemas they served either both had coreid or had one spelling each.
Cheers,
Matt