I am loading the triples using:

java -cp lib/tdb-0.8.11-SNAPSHOT.jar:lib/* tdb.tdbloader -loc store dataset.ttl

I am trying to load 2006601080 triples (about 2 billion).
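
Regarding the "Bad IRI" warning quoted further down in this thread, here is a minimal Java sketch of the scheme check, assuming the grammar from RFC 3986 section 3.1 (a letter followed by letters, digits, "+", "-" or "."). The class name SchemeCheck, its method names, and the shortened test string are made up for illustration. This is not Jena's own IRI checker, and it treats everything before the first ':' as the scheme, which is a simplification.

```java
import java.util.regex.Pattern;

public class SchemeCheck {
    // RFC 3986, section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
    private static final Pattern SCHEME =
            Pattern.compile("[A-Za-z][A-Za-z0-9+.-]*");

    /** Returns the text before the first ':', or null if there is no ':'. */
    static String schemeOf(String iri) {
        int colon = iri.indexOf(':');
        return colon < 0 ? null : iri.substring(0, colon);
    }

    /** True if the candidate scheme matches the RFC 3986 scheme grammar. */
    static boolean isValidScheme(String scheme) {
        return scheme != null && SCHEME.matcher(scheme).matches();
    }

    public static void main(String[] args) {
        // Hypothetical shortened form of the IRI from the warning.
        String bad = "wikicategory__Category:Example";
        String good = "http://example.org/resource";

        // '_' is not allowed in a scheme, so the first check fails.
        System.out.println(schemeOf(bad) + " -> " + isValidScheme(schemeOf(bad)));
        System.out.println(schemeOf(good) + " -> " + isValidScheme(schemeOf(good)));
    }
}
```

With this check, "wikicategory__Category" is rejected because of the underscores, while "http" is accepted.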

On Thu, Sep 10, 2015 at 3:39 AM, Andy Seaborne <[email protected]> wrote:

> On 09/09/15 21:43, Maria Jackson wrote:
>
>> Thanks a lot for your reply.
>>
>> Jena seems to be taking a lot of time to load YAGO even though I am using
>> an SSD and 64 GB of RAM. My program has been running for the past 11 days.
>> Is there some way to increase the loading speed without discarding the
>> already loaded triples?
>>
>
> How are you loading the triples?
> What storage layer are you using?
>
> (TDB had a bulk loader back then IIRC.)
>
> And how many triples are you attempting to load?
>
>         Andy
>
>
>
>> On Thu, Sep 10, 2015 at 12:32 AM, Andy Seaborne <[email protected]> wrote:
>>
>>> On 09/09/15 15:55, Maria Jackson wrote:
>>>
>>>> Thanks a lot, but just to clarify: will this cause a problem while
>>>> querying? I mean, will the query engine be able to retrieve all the
>>>> triples for which the warning appeared?
>>>>
>>>>
>>> Probably not.  It's a warning, not an error.  The only issue might be if
>>> you use the IRI in a query as a constant.  You can test that with your
>>> setup. [*]
>>>
>>> The problem is the "wikicategory__Category:" part: there is a "_" in the
>>> scheme name. A scheme name must start with a letter, followed only by
>>> letters, digits, "+", "-" or ".".
>>>
>>> [*]
>>> I don't know what ARQ 2.8.9 is - didn't it go 2.8.8 => 2.9.0-incubating
>>> -- some YAGO custom build?
>>>
>>>          Andy
>>>
>>>
>>>
>>> On Wed, Sep 9, 2015 at 6:33 PM, Rob Vesse <[email protected]> wrote:
>>>>
>>>>> No, Jena is not skipping those triples.
>>>>>
>>>>> The warning(s) mean that those triples contain faulty data that may not
>>>>> be properly interoperable with standards-compliant RDF systems (including
>>>>> possibly other parts of Jena itself).
>>>>>
>>>>> I don't understand why you don't have the option of switching Jena
>>>>> versions. (Although in this case the root cause is that the input data
>>>>> is bad, so upgrading Jena, while advisable, would not fix the issue.)
>>>>>
>>>>> Rob
>>>>>
>>>>> On 09/09/2015 14:25, "Maria Jackson" <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I am trying to load YAGO in Jena ARQ 2.8.9 (it's an old version of Jena
>>>>>> which was developed by some other developer, so I don't have the option
>>>>>> of switching to a new version of Jena). I am ending up getting a lot of
>>>>>> errors of the following form:
>>>>>>
>>>>>> WARN  [line: 548250488, col: 38] Bad IRI:
>>>>>> <wikicategory__Category:Unincorporated_communities_in__(3)_Category:Unincorporated_communities_in_Branson_micropolitan_area>
>>>>>> Code: 0/ILLEGAL_CHARACTER in SCHEME: The character violates the grammar
>>>>>> rules for URIs/IRIs.
>>>>>>
>>>>>>
>>>>>> All triples in my Turtle file end with a full stop (.). So, does this
>>>>>> warning mean that Jena is skipping these triples? Also, what are the
>>>>>> implications of this warning when I query Jena using SPARQL?
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
