Hi again,

On 22 Aug 2014, at 17:44, Jörn Hees <j_h...@cs.uni-kl.de> wrote:

> On 22 Aug 2014, at 17:51, Hugh Williams <hwilli...@openlinksw.com> wrote:
> 
>> What I would not expect though is for the memory consumption to continue to 
>> increase until the server is killed due to oom error which would imply a 
>> possible memory leak, which is why I recommend building with the develop/7  
>> build where there have been improvement in memory management.
> 
> I currently just used the stable 7.1.0 release. I'll try with the dev build 
> again and report back...

So I've been running the import on a fresh dev build since my last email, and 
I'm now at a total memory consumption of 31218/32177 MB (buffers: 15 MB, cache: 
the remaining ~700 MB).

The Virtuoso process has allocated 31.5 GB (VIRT), 30.1 GB (RES) and 3.812 MB 
(SHR) of memory.

I'm not sure if I really have to run the importer until it's killed for running 
out of memory (as I said, it becomes pretty slow after a while and is currently 
only seeking around at 200 KB/s) or if this is enough already. Since 
NumberOfBuffers is set to 2720000 as recommended, I guess that anything above 
21 GB is suspicious... we're at > 31 GB now.
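
To spell out where the ~21 GB figure comes from, here is a quick 
back-of-the-envelope sketch. It assumes that each Virtuoso buffer caches one 
8 KB page and that top reports sizes in binary units; the variable names are 
just for illustration.

```python
# Rough estimate of the buffer pool size implied by NumberOfBuffers.
# Assumption: one buffer holds one 8 KB database page.
PAGE_SIZE_BYTES = 8 * 1024
number_of_buffers = 2_720_000  # NumberOfBuffers as quoted above

buffer_pool_bytes = number_of_buffers * PAGE_SIZE_BYTES
buffer_pool_gib = buffer_pool_bytes / 1024**3

# RES of the Virtuoso process as reported by top (assumed to be GiB).
resident_gib = 30.1
excess_gib = resident_gib - buffer_pool_gib

print(f"buffer pool:                 {buffer_pool_gib:.2f} GiB")
print(f"resident beyond buffer pool: {excess_gib:.2f} GiB")
```

So the buffer pool alone should account for a bit under 21 GB, and the 
process is holding several GB more than that.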


I've also split up the input file into 100M-line chunks so that I can track the 
progress a bit better...
14 of these are completely loaded now, so 1.4 G triples; the 15th is currently 
running.
These are the start times as reported in DB.DBA.LOAD_LIST. I added a column for 
loaded triples (not necessarily unique):
2014.8.22 19:59 0
2014.8.22 20:09 100M
2014.8.22 20:22 200M
2014.8.22 20:39 300M
2014.8.22 20:53 400M
2014.8.22 21:11 500M
2014.8.22 21:31 600M
2014.8.22 22:03 700M
2014.8.22 22:39 800M
2014.8.22 23:32 900M
2014.8.23 00:17 1G
2014.8.23 02:47 1.1G
2014.8.23 08:51 1.2G
2014.8.23 18:02 1.3G
2014.8.24 16:16 1.4G

The import times for 100M triples seem to be roughly:
- 10 minutes initially
- 30 minutes after 600M loaded triples
- 45 minutes after 900M triples
- 2h:30 after 1G triples (I'm guessing that this is when the configured memory 
limit is hit)
- 6h after 1.1G triples
- 10h after 1.2G triples
- 22h after 1.3G triples
- >22h after 1.4G triples
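
For anyone who wants to check these numbers, here is a small sketch that 
derives the per-chunk durations from the start times listed above. The 
function name is made up for this example; the timestamps are copied 
verbatim from the list.

```python
from datetime import datetime

# Start times from DB.DBA.LOAD_LIST, one per 100M-line chunk (copied from above).
START_TIMES = """\
2014.8.22 19:59
2014.8.22 20:09
2014.8.22 20:22
2014.8.22 20:39
2014.8.22 20:53
2014.8.22 21:11
2014.8.22 21:31
2014.8.22 22:03
2014.8.22 22:39
2014.8.22 23:32
2014.8.23 00:17
2014.8.23 02:47
2014.8.23 08:51
2014.8.23 18:02
2014.8.24 16:16
""".splitlines()


def chunk_durations_minutes(lines):
    """Return the minutes elapsed between consecutive start times."""
    stamps = [datetime.strptime(line, "%Y.%m.%d %H:%M") for line in lines]
    return [
        int((later - earlier).total_seconds() // 60)
        for earlier, later in zip(stamps, stamps[1:])
    ]


minutes = chunk_durations_minutes(START_TIMES)
for index, duration in enumerate(minutes, start=1):
    hours, mins = divmod(duration, 60)
    print(f"chunk {index:2d}: {hours:2d}h {mins:02d}m")
```

Each printed line is the time between one chunk's start and the next chunk's 
start, so the chunk that is still running is not included.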

The last 4 lines sadly don't give me the impression that this scales anywhere 
near linearly once Virtuoso runs out of fast random access memory and has to 
rely on block storage :-/ Is there maybe a setting which allows Virtuoso to 
fall back to a merge-sort-like approach, e.g. creating sorted temp DBs and then 
merging them bottom up? Wouldn't this scale way beyond the available RAM size 
and avoid the seek & wait pattern I observe?!?
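
To make clear what I mean by a merge-sort-like approach, here is a minimal 
sketch of a generic external merge sort. This is not how Virtuoso works and 
not an existing Virtuoso option, just an illustration of the idea; the 
function name and the run_size parameter are made up, and it merges all runs 
in a single pass rather than in several bottom-up levels.

```python
import heapq
import itertools
import tempfile


def external_sort(lines, run_size):
    """Yield newline-terminated lines in sorted order using bounded memory.

    Only run_size lines are held in RAM at once; each sorted run is spilled
    to a temporary file and all runs are then merged sequentially.
    """
    runs = []
    iterator = iter(lines)
    while True:
        # Phase 1: sort one memory-sized chunk and write it out as a run.
        chunk = sorted(itertools.islice(iterator, run_size))
        if not chunk:
            break
        run = tempfile.TemporaryFile(mode="w+")
        run.writelines(chunk)
        run.seek(0)
        runs.append(run)
    try:
        # Phase 2: k-way merge; every run file is read front to back,
        # so the disk sees sequential reads instead of random seeks.
        yield from heapq.merge(*runs)
    finally:
        for run in runs:
            run.close()


if __name__ == "__main__":
    sample = ["c\n", "a\n", "e\n", "b\n", "d\n"]
    print("".join(external_sort(sample, run_size=2)), end="")
```

The point is that both phases only touch the disk sequentially, which is why 
I'd expect it to degrade more gracefully than random index updates once the 
data no longer fits in RAM.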


Is there anything else I can do to help debug this? Can I stop the import?

Cheers,
Jörn


