I've been taking the new tdbloader2 out for a spin with some fairly large 
datasets. In total, I have about 3 billion triples I am trying to load. I have 
87 Turtle files that average around 1-2GB each. I am running the job under 
Ubuntu 10.10 on a quad-core system with 6GB of RAM. The load process runs very 
fast up until about 26M triples, then performance drops sharply from about 100k 
down to about 400, and it eventually runs out of memory.

I am using TDB 0.8.9. I tried to tweak the memory settings, but that only 
prolongs the problem. I am assuming that 1-2GB files are a likely culprit, but 
I wanted to be sure. Also, does tdbloader2 have a preference for N-Triples over 
Turtle?

Ryan-

Ryan J. McDonough
Architect
Service Platforms
NOKIA INC.
