Hi Dan,

On 12/12/2019 16:30, Stoner, Dan F wrote:
> I found some oddities and I am not exactly sure where to go next.
>
> We are noticing the following while processing meta.xml in darwin core 
> archives produced by IPT (and other servers):
>
> Schema validation failed, continuing unvalidated
> XMLSyntaxError: Element '{http://rs.tdwg.org/dwc/text/}coreid': This element 
> is not expected. Expected is ( {http://rs.tdwg.org/dwc/text/}coreId )

There's some background to this in this issue: 
https://github.com/tdwg/dwc/issues/143

The schema itself and the documentation conflicted, and this was 
fixed (in my and Tim's opinion) the wrong way, by changing the schema.

*I've just pushed a commit to fix it the right way,* i.e. reflecting 99% 
of actual usage and leaving the schema as it had been for almost a decade.

Although we do accept either, we still see only 31 datasets registered 
in GBIF with "coreId" rather than "coreid".
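
For what it's worth, here is a minimal sketch of what "accepting either" 
could look like in a consumer. It is Python with the standard library only; 
the function name is hypothetical, and this is an illustration under those 
assumptions, not the code GBIF actually runs.

```python
import xml.etree.ElementTree as ET

# Namespace used by Darwin Core Archive meta.xml descriptors.
DWC_TEXT_NS = "http://rs.tdwg.org/dwc/text/"

# Both spellings seen in the wild, as fully qualified tag names.
CORE_ID_TAGS = {
    f"{{{DWC_TEXT_NS}}}coreid",
    f"{{{DWC_TEXT_NS}}}coreId",
}


def core_id_indexes(meta_xml: str) -> list:
    """Return the 'index' attribute of each extension's coreid/coreId.

    Hypothetical helper: tolerates either spelling instead of
    rejecting the archive outright.
    """
    root = ET.fromstring(meta_xml)
    indexes = []
    for extension in root.iter(f"{{{DWC_TEXT_NS}}}extension"):
        for child in extension:
            if child.tag in CORE_ID_TAGS:
                indexes.append(int(child.get("index")))
    return indexes
```

A strict consumer would instead validate against the XSD and fail on 
whichever spelling the schema does not list.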

> It seems like most consumers are not actually validating meta.xml using the 
> schema, and the producers are generating files out of compliance with the 
> schema.
>
> Most of the Darwin Core archives I have manually inspected and tried to 
> validate contain meta.xml with lowercase "i" in coreid despite the Standard 
> indicating capital "I" in coreId.
>
>
> I poked at the GBIF Darwin Core Validator 3 code repo and found this:
>
> schema.meta=https://raw.githubusercontent.com/tdwg/dwc/master/standard/documents/text/tdwg_dwc_text.xsd,http://rs.tdwg.org/dwc/text/tdwg_dwc_text.xsd
>
>
> The first link leads to 404, the second leads to an xsd that contains the 
> proper coreId.  So maybe the Validator is not being "strict" about validation 
> against the schema?

I suspect the validator process has been running for so long that, when 
it was originally started, both URLs were valid, and the schemas they 
served either both had coreid or had one spelling each.

Cheers,

Matt


_______________________________________________
IPT mailing list
IPT@lists.gbif.org
https://lists.gbif.org/mailman/listinfo/ipt
