I am loading the triples using:

    java -cp lib/tdb-0.8.11-SNAPSHOT.jar:lib/* tdb.tdbloader -loc store dataset.ttl

I am trying to load: 2006601080 triples

On Thu, Sep 10, 2015 at 3:39 AM, Andy Seaborne <[email protected]> wrote:

> On 09/09/15 21:43, Maria Jackson wrote:
>
>> Thanks a lot for your reply.
>>
>> Jena seems to be taking a lot of time to load YAGO even though I am using
>> SSD and RAM=64 GB. My program has been running for the past 11 days. Is
>> there some way by which I may increase the loading speed without
>> discarding the already loaded triples?
>
> How are you loading the triples?
> What storage layer are you using?
>
> (TDB had a bulk loader back then IIRC.)
>
> And how many triples are you attempting to load?
>
> Andy
>
>> On Thu, Sep 10, 2015 at 12:32 AM, Andy Seaborne <[email protected]> wrote:
>>
>>> On 09/09/15 15:55, Maria Jackson wrote:
>>>
>>>> Thanks a lot but just to clarify. Will this cause a problem while
>>>> querying -- I mean will the query engine be able to retrieve all the
>>>> triples for which the warning appeared?
>>>
>>> Probably not. It's a warning, not an error. The only issue might be if
>>> you use the IRI in a query as a constant. You can test that with your
>>> setup. [*]
>>>
>>> The problem is the "wikicategory__Category:" part
>>>
>>> _ in the scheme name. Only A-Z then A-Z,0-9 are allowed in the scheme
>>> name
>>>
>>> [*]
>>> I don't know what ARQ 2.8.9 is - didn't it go 2.8.8 => 2.9.0-incubating
>>> -- some YAGO custom build?
>>>
>>> Andy
>>>
>>>> On Wed, Sep 9, 2015 at 6:33 PM, Rob Vesse <[email protected]> wrote:
>>>>
>>>>> No, Jena is not skipping those triples.
>>>>>
>>>>> The warning(s) mean that those triples contain faulty data that may
>>>>> not be properly interoperable with standards compliant RDF systems
>>>>> (including possibly other parts of Jena itself).
>>>>>
>>>>> I don't understand why you don't have the option of switching Jena
>>>>> versions? (Although in this case the root cause is that the input
>>>>> data is bad so upgrading Jena while advisable would not fix the issue)
>>>>>
>>>>> Rob
>>>>>
>>>>> On 09/09/2015 14:25, "Maria Jackson" <[email protected]> wrote:
>>>>>
>>>>>> I am trying to load YAGO in Jena ARQ 2.8.9 (It's an old version of
>>>>>> Jena which was developed by some other developer -- so I don't have
>>>>>> an option of switching to a new version of Jena). I am ending up
>>>>>> getting a lot of errors of the following form:
>>>>>>
>>>>>> WARN [line: 548250488, col: 38] Bad IRI:
>>>>>> <wikicategory__Category:Unincorporated_communities_in__(3)_Category:Unincorporated_communities_in_Branson_micropolitan_area>
>>>>>> Code: 0/ILLEGAL_CHARACTER in SCHEME: The character violates the
>>>>>> grammar rules for URIs/IRIs.
>>>>>>
>>>>>> All triples in my turtle file end with a full stop (.). So, does this
>>>>>> warning mean that Jena is skipping these triples? Also, what are the
>>>>>> implications of this warning when I query Jena using SPARQL?
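
For anyone who wants to see why the parser complains about the IRI quoted above, here is a minimal sketch of the scheme rule. It is not Jena's IRI checker; it only applies the RFC 3986 grammar for the scheme part (a letter, followed by letters, digits, "+", "-" or "."), and it makes the simplifying assumption that everything before the first ":" is the scheme. The helper names `split_scheme` and `scheme_is_valid` are made up for this illustration.

```python
import re

# RFC 3986, section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def split_scheme(iri):
    """Return the text before the first ':' (None if there is no ':').

    Simplification: a real parser would not treat this text as a scheme
    when a '/', '?' or '#' appears before the first ':'.
    """
    head, sep, _ = iri.partition(":")
    return head if sep else None


def scheme_is_valid(iri):
    """True if the text before the first ':' matches the scheme grammar."""
    scheme = split_scheme(iri)
    return scheme is not None and SCHEME_RE.fullmatch(scheme) is not None


# The IRI from the warning in this thread, without the angle brackets.
bad = ("wikicategory__Category:Unincorporated_communities_in__(3)_Category:"
       "Unincorporated_communities_in_Branson_micropolitan_area")

print(split_scheme(bad))                         # text the parser sees as the scheme
print(scheme_is_valid(bad))                      # underscores are not allowed there
print(scheme_is_valid("http://example.org/x"))   # an ordinary absolute IRI
```

The text before the first colon is "wikicategory__Category", and the underscores in it are what trigger ILLEGAL_CHARACTER in SCHEME. The triple is still loaded, as Rob and Andy say; the data is just not a well-formed IRI.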
