On 09/09/15 21:43, Maria Jackson wrote:
Thanks a lot for your reply.

Jena seems to be taking a lot of time to load YAGO even though I am using
an SSD and 64 GB of RAM. My program has been running for the past 11 days.
Is there some way to increase the loading speed without discarding the
already loaded triples?

How are you loading the triples?
What storage layer are you using?

(TDB had a bulk loader back then IIRC.)

And how many triples are you attempting to load?

        Andy


On Thu, Sep 10, 2015 at 12:32 AM, Andy Seaborne <[email protected]> wrote:

On 09/09/15 15:55, Maria Jackson wrote:

Thanks a lot, but just to clarify: will this cause a problem while querying?
I mean, will the query engine be able to retrieve all the triples for
which the warning appeared?


Probably not.  It's a warning, not an error.  The only issue might be if
you use the IRI in a query as a constant.  You can test that with your
setup. [*]
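One way to run that test is an ASK query that uses the flagged IRI as a constant in subject or object position. The Python sketch below only builds the query text; the helper name ask_query_for is made up for this example, and how the query is executed (command line, API, endpoint) depends on your setup.

```python
# Characters that the SPARQL grammar does not allow inside <...> (IRIREF),
# in addition to control characters and space (code points <= 0x20).
FORBIDDEN_IN_IRIREF = set('<>"{}|^`\\')


def ask_query_for(iri):
    """Build an ASK query that tests whether `iri` occurs as subject or object.

    Hypothetical helper for illustration; it does not run the query.
    """
    bad = [c for c in iri if c in FORBIDDEN_IN_IRIREF or ord(c) <= 0x20]
    if bad:
        raise ValueError("cannot be written as <...> in SPARQL: %r" % bad)
    term = "<" + iri + ">"
    return "ASK { { " + term + " ?p ?o } UNION { ?s ?p " + term + " } }"


# Shortened stand-in for one of the IRIs from the warnings.
print(ask_query_for("wikicategory__Category:Example"))
```

If the query engine answers true for an IRI taken from one of the warnings, that IRI can be matched as a constant in your setup.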

The problem is the "wikicategory__Category:" part: there is a "_" in the
scheme name.  A scheme must start with a letter (A-Z, a-z); after that only
letters, digits, "+", "-" and "." are allowed.
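As a rough illustration of the scheme rule, here is a minimal Python sketch that checks the text before the first ":" against the RFC 3986 scheme grammar (a letter, then letters, digits, "+", "-" or "."). The function name scheme_problem is made up for this example, and this is not a claim about how Jena's IRI checker is implemented.

```python
def scheme_problem(iri):
    """Return None if `iri` has no scheme or a legal one, else a description
    of the first character that breaks the RFC 3986 scheme grammar:

        scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
    """
    head, sep, _rest = iri.partition(":")
    # No ':' at all, or a '/', '?' or '#' before it: the ':' does not end a
    # scheme, so this is a relative reference and there is nothing to check.
    if not sep or any(c in head for c in "/?#"):
        return None
    if head == "":
        return "empty scheme before ':'"
    for pos, ch in enumerate(head):
        if pos == 0:
            ok = ch.isascii() and ch.isalpha()
        else:
            ok = ch.isascii() and (ch.isalnum() or ch in "+-.")
        if not ok:
            return "illegal character %r at offset %d in scheme %r" % (ch, pos, head)
    return None


# Shortened form of the IRI from the warning: the first '_' is the culprit.
print(scheme_problem("wikicategory__Category:Unincorporated_communities"))
# prints: illegal character '_' at offset 12 in scheme 'wikicategory__Category'
```

Everything up to the first ":" is read as the scheme, which is why a name such as "wikicategory__Category" gets checked against these rules at all.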

[*]
I don't know what ARQ 2.8.9 is - didn't it go 2.8.8 => 2.9.0-incubating?
Is it some YAGO custom build?

         Andy



On Wed, Sep 9, 2015 at 6:33 PM, Rob Vesse <[email protected]> wrote:

No Jena is not skipping those triples

The warning(s) mean that those triples contain faulty data that may not be
properly interoperable with standards compliant RDF systems (including
possibly other parts of Jena itself).

I don't understand why you don't have the option of switching Jena
versions. (Although in this case the root cause is that the input data is
bad, so upgrading Jena, while advisable, would not fix the issue.)

Rob

On 09/09/2015 14:25, "Maria Jackson" <[email protected]> wrote:

I am trying to load YAGO in Jena ARQ 2.8.9 (it's an old version of Jena
which was developed by some other developer -- so I don't have the option
of switching to a new version of Jena). I am ending up getting a lot of
errors of the following form:

WARN  [line: 548250488, col: 38] Bad IRI:

<wikicategory__Category:Unincorporated_communities_in__(3)_Category:Unincorporated_communities_in_Branson_micropolitan_area>
Code: 0/ILLEGAL_CHARACTER in SCHEME: The character violates the grammar
rules for URIs/IRIs.


All triples in my Turtle file end with a full stop (.). So, does this
warning mean that Jena is skipping these triples? Also, what are the
implications of this warning when I query Jena using SPARQL?










