Hi Karel,

On 09/19/2011 11:33 AM, karel braeckman wrote:
> Hi Guys,
>
> I have let the dbpintegrator run over the weekend, but it got stuck
> again (same behavior as mentioned before: Virtuoso stuck at 100% CPU
> use). The good news is we think we found the problem.
>
> We found something strange in the update file with triples to delete
> where things got stuck. The file uses variables, but some of the
> variables are used more than once. For instance, the file
> 000981.removed.nt uses the ?o0 and ?o1 variables twice:
>
> <http://dbpedia.org/resource/Gaziantep>
> <http://www.w3.org/2000/01/rdf-schema#comment>  ?o0 .
> <http://dbpedia.org/resource/Public_domain>
> <http://www.w3.org/2000/01/rdf-schema#comment>  ?o0 .
> <http://dbpedia.org/resource/Gaziantep>
> <http://dbpedia.org/ontology/abstract>  ?o1 .
> <http://dbpedia.org/resource/Public_domain>
> <http://dbpedia.org/ontology/abstract>  ?o1 .

I see the problem.

> This is probably the reason why the DELETE query for this file has a
> higher complexity and tripped Virtuoso up. We changed the code of
> dbpintegrator slightly to perform a Sparql query per line of this file
> instead of one query for the entire file. The tool is running without
> any problems so far.

This is a good solution. I'll also check the problem in the extraction
framework itself and use incrementing variable numbers, so that the
numbers do not coincide.
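
The two ideas above (fresh variable numbers and one query per pattern) could be sketched in Python as follows. This is only a sketch, assuming the deletion file contains nothing but IRIs and variables, as in the excerpt quoted above. The function names, the regular expression and the graph IRI are illustrative and are not taken from dbpintegrator or the extraction framework. The DELETE WHERE form is standard SPARQL 1.1 Update and may differ from the syntax the tool actually sends to Virtuoso.

```python
import re

# Matches "<subject> <predicate> object ." where the object is either a
# variable (?name) or an IRI. Patterns may be wrapped across lines.
PATTERN = re.compile(r"(<[^>]*>)\s+(<[^>]*>)\s+(\?\w+|<[^>]*>)\s*\.")


def renumber_variables(text):
    """Return the triple patterns in `text`, giving every variable
    occurrence a fresh name (?o0, ?o1, ...) so no name is used twice."""
    patterns = []
    counter = 0
    for subject, predicate, obj in PATTERN.findall(text):
        if obj.startswith("?"):
            obj = f"?o{counter}"
            counter += 1
        patterns.append(f"{subject} {predicate} {obj} .")
    return patterns


def delete_queries(patterns, graph):
    """Build one SPARQL 1.1 DELETE WHERE request per triple pattern."""
    return [
        f"DELETE WHERE {{ GRAPH <{graph}> {{ {pattern} }} }}"
        for pattern in patterns
    ]
```

Calling delete_queries(renumber_variables(file_content), "http://example.org/graph") with a placeholder graph IRI yields one small request per pattern, so no request joins patterns that merely happen to share a variable name.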

> So perhaps something is going wrong in creating the files with the
> deletion triples?
> Best regards,
> Karel
>
> On Fri, Sep 16, 2011 at 5:53 PM, karel braeckman
> <[email protected]>  wrote:
>> Hi Mohamed,
>>
>> You were right, I checked my settings and the tool does start at the
>> correct date. I must have done something wrong earlier.
>>
>> Best regards,
>> Karel
>>
>> On Fri, Sep 16, 2011 at 5:26 PM, Mohamed Morsey
>> <[email protected]>  wrote:
>>> Hi Karel,
>>>
>>> On 09/16/2011 04:51 PM, karel braeckman wrote:
>>>> Hi guys,
>>>>
>>>> First of all, thanks for all the suggestions. I changed my settings
>>>> according to your suggestions, and at first the problem was the same
>>>> (100% CPU for quite a while when deleting triples) but after a while
>>>> (~10 minutes) Virtuoso completed the action and now the tool seems to
>>>> be running ok. I'm afraid I don't know which of the settings finally
>>>> got it to work.
>>> Nice to hear that :).
>>>
>>>> I have one more problem with the dbpintegrator tool however. I set the
>>>> date in my lastDownloadDate.dat to 2011-09-10-00-000000, but the tool
>>>> seems to start at the first file of the current hour
>>>> (2011-09-16-16-000001), could this be a bug?
>>> I've tried the dbpintegrator tool on my machine starting from the point you
>>> mention and it seems to work properly, so please recheck your settings and
>>> let me know if the problem still exists.
>>>
>>>> @Mohamed:
>>>> It really did take two days to fill the store with the DBpedia live
>>>> dump. Initially it was fast, but it got slower and slower. There
>>>> already was the default DBpedia dump (not the live version) inserted
>>>> into another graph, maybe the amount of triples is just too large? How
>>>> long should it take (more or less) to load the live dump into Virtuoso,
>>>> do you think?
>>> Not 100% sure but it should take something like 3-4 hours.
>>>
>>>> Virtuoso was running for a few weeks the first time I tried to run the
>>>> sync tool. Since then, I restarted it a few times after changing
>>>> config files and trying to debug things.
>>> Exactly, this is what I meant; a restart could be helpful.
>>>
>>>> @Patrick, @Kingsley:
>>>> The version I am using is Version: 06.01.3127, Build: Mar 16 2011 of
>>>> VOS. I downloaded and compiled it (on Ubuntu 10.04.2 LTS).
>>>>
>>>> The lastDownloadDate.dat file contains 2011-09-16-16-000555 at the
>>>> moment of writing (the tool is working now).
>>>>
>>>> Best regards and thanks for the hints,
>>>> Karel
>>>>
>>>> On Fri, Sep 16, 2011 at 3:53 PM, Patrick van Kleef
>>>> <[email protected]>    wrote:
>>>>> Hi Karel,
>>>>>
>>>>>> Forgot to mention the machine details:
>>>>>>
>>>>>> 24GB RAM
>>>>>> 2x quadcore Xeon E5540 2.5GHz
>>>>>> Virtuoso data is on SSD disks
>>>>> Your parameters look ok, but you may want to try adding the following:
>>>>>
>>>>>         [Parameters]
>>>>>         ...
>>>>>         DefaultIsolation = 2
>>>>>         ...
>>>>>
>>>>> which sets a different transaction isolation level that is more suitable
>>>>> for situations where updates/deletes and queries are done on the same
>>>>> server.
>>>>>
>>>>>
>>>>> Did you also set your Linux kernel swappiness parameter as per the
>>>>> following Tips and Tricks article:
>>>>>
>>>>>
>>>>> http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtTipsAndTricksGuideRDFPerformanceTuning
>>>>>
>>>>> If not, then your Linux kernel may start swapping out parts of your
>>>>> Virtuoso process pages in favor of filesystem cache, which will
>>>>> seriously hurt Virtuoso's performance.
>>>>>
>>>>> In general you should make sure your system never starts swapping.
>>>>>
>>>>>
>>>>> Can you tell me the exact version of VOS you are using on your system and
>>>>> whether you are using the OS supplied version or if you compiled and
>>>>> installed it yourself. Note that the current version of Virtuoso
>>>>> OpenSource is 6.1.3 from:
>>>>>
>>>>>         http://sourceforge.net/projects/virtuoso/files/
>>>>>
>>>>> However, if you are running an older version and are not afraid to do a
>>>>> build yourself, I would like to give you access to a prerelease of the
>>>>> upcoming 6.1.4, which has a number of new optimisations and fixes that
>>>>> may be of benefit.
>>>>>
>>>>>
>>>>> Lastly on the subject of this dbpintegrator part, can you tell me the
>>>>> content of the file:
>>>>>
>>>>>         lastDownloadDate.dat
>>>>>
>>>>>
>>>>> Patrick
>>>>>
>>>>>
>>>>>
>>>
>>> --
>>> Kind Regards
>>> Mohamed Morsey
>>> Department of Computer Science
>>> University of Leipzig
>>>
>>>
> ------------------------------------------------------------------------------
> BlackBerry® DevCon Americas, Oct. 18-20, San Francisco, CA
> Learn about the latest advances in developing for the
> BlackBerry® mobile platform with sessions, labs & more.
> See new tools and technologies. Register for BlackBerry® DevCon today!
> http://p.sf.net/sfu/rim-devcon-copy1
> _______________________________________________
> Dbpedia-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>


-- 
Kind Regards
Mohamed Morsey
Department of Computer Science
University of Leipzig

