Forwarding a message from Andy Seaborne to dbpedia-discussion

Richard


Begin forwarded message:

> From: Andy Seaborne <[email protected]>
> Date: 15 April 2010 13:43:38 IST
> To: [email protected]
> Subject: [pedantic-web] DBPedia 3.5 parsing report
> Reply-To: [email protected]
>
> Does this count as pedantic? :-)
>
> I ran the files from http://downloads.dbpedia.org/3.5/en/ through an
> N-Triples parser with checking.
>
> The report is here (it's 25K lines long, which isn't many given
> the size of the data):
>
> http://www.openjena.org/~afs/DBPedia35-parse-log-2010-04-15.txt
>
> It covers both strict errors and warnings of ill-advised forms.
>
> A few examples:
>
> Bad IRI: <=?(''[[Nepenthes>
> Bad IRI: <http://www.european-athletics.org‎>
>
> Bad lexical forms for the value space:
> "1967-02-31"^^http://www.w3.org/2001/XMLSchema#date
> (there is no February the 31st)
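
A rough illustration of this kind of lexical-form check. The report itself used Apache Xerces; the Python sketch below is only a stand-in, and the function name is made up. It covers just the plain YYYY-MM-DD form and ignores the optional timezone and the extended year forms that xsd:date also allows.

```python
import re
from datetime import date

# Plain YYYY-MM-DD only; this is a simplification of the xsd:date lexical space.
_DATE_RE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def is_valid_date_lexical(lexical):
    """Return True if the string is a well-formed date that exists in the calendar."""
    match = _DATE_RE.fullmatch(lexical)
    if match is None:
        return False
    year, month, day = (int(group) for group in match.groups())
    try:
        # datetime.date rejects impossible dates such as February 31st.
        date(year, month, day)
    except ValueError:
        return False
    return True


print(is_valid_date_lexical("1967-02-31"))
print(is_valid_date_lexical("1967-02-28"))
```

The first call should report the literal from the log as invalid, since the pattern matches but the calendar check fails.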
>
> Warning about well-known ports of other protocols:
> http://stream1.securenetsystems.net:443
>
> Warning about explicit port 80:
>
> http://bibliotecadigitalhispanica.bne.es:80/
>
> and use of . and .. in absolute URIs, all of which are from the
> standard list of IRI warnings.
>
> Bad IRI: <http://dbpedia.org/resource/..> Code: 8/NON_INITIAL_DOT_SEGMENT
> in PATH: The path contains a segment /../
> not at the beginning of a relative reference, or it contains a /./
> These should be removed.
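
The three warnings above (port of another protocol, explicit default port, dot segments) can be approximated with the Python standard library. This is only a sketch under assumptions, not how the Jena IRI checker works: the port tables are a small hand-picked sample, and the warning strings are invented here rather than taken from the IRI library's codes.

```python
from urllib.parse import urlsplit

# Assumed, deliberately tiny tables; a real checker would know many more.
DEFAULT_PORTS = {"http": 80, "https": 443}
WELL_KNOWN_PORTS = {21: "ftp", 22: "ssh", 25: "smtp", 80: "http", 443: "https"}


def iri_warnings(iri):
    """Return a list of hypothetical warning strings for an absolute IRI."""
    parts = urlsplit(iri)
    scheme = parts.scheme.lower()
    warnings = []

    port = parts.port  # None when no port is written in the IRI
    if port is not None:
        if DEFAULT_PORTS.get(scheme) == port:
            warnings.append("explicit default port")
        elif port in WELL_KNOWN_PORTS and WELL_KNOWN_PORTS[port] != scheme:
            warnings.append("well-known port of another protocol")

    # Any "." or ".." path segment in an absolute IRI is flagged.
    if any(segment in (".", "..") for segment in parts.path.split("/")):
        warnings.append("dot segment in path")

    return warnings


for example in (
    "http://stream1.securenetsystems.net:443",
    "http://bibliotecadigitalhispanica.bne.es:80/",
    "http://dbpedia.org/resource/..",
):
    print(example, iri_warnings(example))
```

Each of the three example IRIs from the report should produce exactly one warning from this sketch.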
>
>    Andy
>
> Software used:
>
> The IRI checker, by Jeremy Carroll, is available from
> http://www.openjena.org/iri/ and Maven.
>
> The lexical form checking is done by Apache Xerces.
>
> The N-Triples parser is the one from TDB v0.8.5, which bundles the
> above two together.
>
>

