New hint: I commented out the
    ((LuceneIndexService) indexService).enableCache( "uri", 500000 );
call and loading started to work. Could this cache setting be interfering with commit()?
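For context: enableCache( "uri", 500000 ) asks the index service to hold up to 500,000 recently used uri-to-node entries in memory, which on a small heap can be a large slice of the available space. The general behavior of such a size-capped, least-recently-used cache can be sketched in plain Java (this is an illustration of the concept only, not Neo4j's actual implementation):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A size-capped LRU cache: once more than maxEntries mappings exist,
// the least recently accessed entry is evicted on the next insert.
public class CappedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public CappedCache(int maxEntries) {
        super(16, 0.75f, true);        // accessOrder = true gives LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;    // evict the eldest entry once over the cap
    }
}
```

With a cap of 500,000 entries, each entry still costs real heap (key string, value, map overhead), so a large cap plus a modest -Xmx can by itself push the JVM toward OutOfMemoryError during bulk inserts.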
> It does not solve the problem. I removed the if (count >= 10000) { ... }
> block and committed after every rc.add, but the program still fails on the
> last file. Moreover, I ran the program with rc.add( file, "",
> RDFFormat.NTRIPLES, context ), committing only once after all files were
> loaded, and it finished successfully.
>
>
>
> file_: links_uscensus_en.nt
> [WARNING] an additional exception was thrown
> java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:592)
> at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:283)
> at java.lang.Thread.run(Thread.java:613)
> Caused by: java.lang.OutOfMemoryError: Java heap space
> at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:172)
> at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:136)
> at org.apache.lucene.index.CompoundFileReader$CSIndexInput.readInternal(CompoundFileReader.java:247)
> at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:157)
> at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
> at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:80)
> at org.apache.lucene.index.SegmentTermDocs.read(SegmentTermDocs.java:144)
> at org.apache.lucene.search.TermScorer.nextDoc(TermScorer.java:130)
> at org.apache.lucene.search.TermScorer.score(TermScorer.java:74)
> at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:248)
> at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:173)
> at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:113)
> at org.apache.lucene.search.Hits.<init>(Hits.java:80)
> at org.apache.lucene.search.Searcher.search(Searcher.java:52)
> at org.apache.lucene.search.Searcher.search(Searcher.java:42)
> at org.neo4j.index.lucene.LuceneIndexService.searchForNodes(LuceneIndexService.java:387)
> at org.neo4j.index.lucene.LuceneIndexService.getNodes(LuceneIndexService.java:272)
> at org.neo4j.index.lucene.LuceneIndexService.getNodes(LuceneIndexService.java:228)
> at org.neo4j.index.lucene.LuceneIndexService.getSingleNode(LuceneIndexService.java:405)
> at org.neo4j.rdf.store.representation.standard.AbstractUriBasedExecutor.lookupNode(AbstractUriBasedExecutor.java:162)
> at org.neo4j.rdf.store.representation.standard.AbstractUriBasedExecutor.lookupOrCreateNode(AbstractUriBasedExecutor.java:177)
> at org.neo4j.rdf.store.representation.standard.VerboseQuadExecutor.handleAddObjectRepresentation(VerboseQuadExecutor.java:262)
> at org.neo4j.rdf.store.representation.standard.VerboseQuadExecutor.addToNodeSpace(VerboseQuadExecutor.java:70)
> at org.neo4j.rdf.store.RdfStoreImpl.addStatement(RdfStoreImpl.java:89)
> at org.neo4j.rdf.store.RdfStoreImpl.addStatements(RdfStoreImpl.java:69)
> at org.neo4j.rdf.sail.GraphDatabaseSailConnectionImpl.internalAddStatement(GraphDatabaseSailConnectionImpl.java:623)
> at org.neo4j.rdf.sail.GraphDatabaseSailConnectionImpl.innerAddStatement(GraphDatabaseSailConnectionImpl.java:440)
> at org.neo4j.rdf.sail.GraphDatabaseSailConnectionImpl.addStatement(GraphDatabaseSailConnectionImpl.java:478)
> at org.openrdf.repository.sail.SailRepositoryConnection.addWithoutCommit(SailRepositoryConnection.java:228)
> at org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:460)
> at gov.lanl.memento.core.Test.get(Test.java:229)
> at gov.lanl.memento.core.Test.main(Test.java:103)
> [INFO]
> ------------------------------------------------------------------------
> [ERROR] BUILD ERROR
> [INFO]
> ------------------------------------------------------------------------
> [INFO] An exception occured while executing the Java class. Java heap space
>
> [INFO]
> ------------------------------------------------------------------------
> [INFO] Trace
> org.apache.maven.lifecycle.LifecycleExecutionException: An exception occured while executing the Java class. Java heap space
> at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:583)
> at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:512)
> at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:482)
> at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:330)
> at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:291)
> at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:142)
> at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:336)
> at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:129)
> at org.apache.maven.cli.MavenCli.main(MavenCli.java:287)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:592)
> at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
> at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
> at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
> at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
> Caused by: org.apache.maven.plugin.MojoExecutionException: An exception occured while executing the Java class. Java heap space
> at org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:338)
> at org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:451)
> at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:558)
> ... 16 more
> Caused by: java.lang.OutOfMemoryError: Java heap space
> at org.neo4j.kernel.impl.cache.AdaptiveCacheManager.adaptCaches(AdaptiveCacheManager.java:237)
> at org.neo4j.kernel.impl.cache.AdaptiveCacheManager$AdaptiveCacheWorker.run(AdaptiveCacheManager.java:218)
> [INFO]
> ------------------------------------------------------------------------
> [INFO] Total time: 4 minutes 33 seconds
> [INFO] Finished at: Thu Apr 15 09:34:27 MDT 2010
> [INFO] Final Memory: 12M/61M
> [INFO] ------------------------------------------------------------------------
>
>
>
>
>> You increment the "count" variable even if the line is empty, which means
>> it could sometimes skip past the 10000 mark; change the condition to
>> "if (count >= 10000)" instead. Also, how much heap have you given the JVM?
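The failure mode described above can be shown with a self-contained demo (the class and method names here are made up for illustration): count is incremented for every line, including empty ones, but the equality check sits inside the non-empty branch, so the exact batch value can be stepped over and the commit never fires; ">=" cannot be skipped that way.

```java
import java.util.List;

public class BatchBug {
    // Buggy variant: counts every line, but checks "count == batch" only on
    // non-empty lines, so an empty line can carry count straight past batch.
    static int commitsWithEquals(List<String> lines, int batch) {
        int commits = 0, count = 0;
        for (String line : lines) {
            count++;                       // empty lines are counted too
            if (!line.trim().isEmpty()) {
                if (count == batch) {      // exact match can be skipped entirely
                    commits++;
                    count = 0;
                }
            }
        }
        return commits;
    }

    // Suggested fix: same loop with ">=", which cannot be stepped over.
    static int commitsWithAtLeast(List<String> lines, int batch) {
        int commits = 0, count = 0;
        for (String line : lines) {
            count++;
            if (!line.trim().isEmpty()) {
                if (count >= batch) {
                    commits++;
                    count = 0;
                }
            }
        }
        return commits;
    }
}
```

With lines ["a", "", "b", "c"] and a batch size of 2, the buggy variant never commits (count is 2 only at the empty line, where the check is skipped), while the ">=" variant commits once.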
>>
>> 2010/4/14 Lyudmila L. Balakireva <[email protected]>
>>
>>> Hi,
>>> I was loading on a per-file basis and committing after each file. The
>>> file with 22 million records finished in 5 hours, but the one with 66
>>> million did not finish in 3 days.
>>> I rewrote the program to read each file line by line and commit after a
>>> certain number of records, hoping to control memory better, but now the
>>> program fails with an "out of memory" error even for a small dataset
>>> (80,000 records). (With the per-file approach the small dataset loaded
>>> without problems.)
>>> My snippet:
>>> for (File file : files) {
>>>     SimpleTimer timer = new SimpleTimer();
>>>     FileInputStream in = new FileInputStream(file);
>>>     BufferedReader br = new BufferedReader(new InputStreamReader(in));
>>>     String strLine;
>>>     int count = 0;
>>>     while ((strLine = br.readLine()) != null) {
>>>         count = count + 1;
>>>         if (strLine.trim().length() != 0) {
>>>             String[] result = strLine.split("\\s");
>>>             rc.add(f.createURI(stripeN3(result[0])),
>>>                    f.createURI(stripeN3(result[1])),
>>>                    f.createURI(stripeN3(result[2])), context);
>>>             if (count == 10000) {
>>>                 //rc.add(file, "", RDFFormat.NTRIPLES, context);
>>>                 rc.commit();
>>>                 count = 0;
>>>             }
>>>         }
>>>     }
>>>     br.close();
>>>     in.close();
>>>     rc.commit();
>>>     timer.end();
>>> }
>>> sumtimer.end();
>>> rc.commit();
>>> rc.close();
>>> }
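The batching loop above can be reduced to the following self-contained shape; the Sesame RepositoryConnection is stood in by a hypothetical stub here, blank lines are skipped before counting, and ">=" is used so the batch mark cannot be stepped over:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

public class BatchedLoad {
    // Hypothetical stub in place of the real RepositoryConnection:
    // it only records how many statements were added and committed.
    static class StubConnection {
        int added, commits;
        void add(String statement) { added++; }
        void commit() { commits++; }
    }

    // Adds each non-empty line as a statement and commits every
    // batchSize statements, plus once at the end for the partial batch.
    static StubConnection load(Reader input, int batchSize) {
        StubConnection rc = new StubConnection();
        try (BufferedReader br = new BufferedReader(input)) {
            String line;
            int count = 0;
            while ((line = br.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue;                  // blank lines are not counted
                }
                rc.add(line);
                if (++count >= batchSize) {    // ">=" so the mark can't be skipped
                    rc.commit();
                    count = 0;
                }
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        rc.commit();                           // flush the final partial batch
        return rc;
    }
}
```

For input "a\nb\n\nc\nd\ne" with a batch size of 2, this adds 5 statements and commits 3 times (two full batches plus the trailing single statement).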
>>>
>>> What can cause the problem?
>>> [INFO] Trace
>>> org.apache.maven.lifecycle.LifecycleExecutionException: An exception occured while executing the Java class. Java heap space
>>> at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:583)
>>> at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:512)
>>> at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:482)
>>> at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:330)
>>> at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:291)
>>> at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:142)
>>> at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:336)
>>> at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:129)
>>> at org.apache.maven.cli.MavenCli.main(MavenCli.java:287)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> at java.lang.reflect.Method.invoke(Method.java:592)
>>> at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
>>> at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
>>> at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
>>> at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
>>> Caused by: org.apache.maven.plugin.MojoExecutionException: An exception occured while executing the Java class. Java heap space
>>> at org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:338)
>>> at org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:451)
>>> at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:558)
>>> ... 16 more
>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>> at java.nio.ByteBuffer.wrap(ByteBuffer.java:350)
>>> at java.nio.ByteBuffer.wrap(ByteBuffer.java:373)
>>> at org.neo4j.kernel.impl.transaction.XidImpl.getNewGlobalId(XidImpl.java:55)
>>> at org.neo4j.kernel.impl.transaction.TransactionImpl.<init>(TransactionImpl.java:67)
>>> at org.neo4j.kernel.impl.transaction.TxManager.begin(TxManager.java:497)
>>> at org.neo4j.kernel.EmbeddedGraphDbImpl.beginTx(EmbeddedGraphDbImpl.java:238)
>>> at org.neo4j.kernel.EmbeddedGraphDatabase.beginTx(EmbeddedGraphDatabase.java:139)
>>> at org.neo4j.index.impl.GenericIndexService.beginTx(GenericIndexService.java:105)
>>> at org.neo4j.index.impl.IndexServiceQueue.run(IndexServiceQueue.java:221)
>>> [INFO]
>>> ------------------------------------------------------------------------
>>> [INFO] Total time: 5 minutes 14 seconds
>>>
>>>
>>> Thank you for the help,
>>> Lyudmila
>>>
>>>
>>>
>>>
>>> > There are some problems at the moment regarding insertion speed:
>>> >
>>> > o We haven't yet created an RDF store which can use a BatchInserter
>>> > (which could also be tweaked to skip checking whether each statement
>>> > already exists before adding it, and so on).
>>> > o The sail layer on top of the neo4j-rdf component contains
>>> > functionality which allows a thread to have more than one running
>>> > transaction at the same time. This was added due to some users'
>>> > requirements, but it slows things down by roughly a factor of 2 (not
>>> > sure about the exact number).
>>> >
>>> > I would like to see both these issues resolved soon; once they are
>>> > fixed, insertion speeds will be quite nice!
>>> > 2010/4/9 Lyudmila L. Balakireva <[email protected]>
>>> >
>>> >> Hi,
>>> >> How do I optimize loading into the VerboseQuadStore?
>>> >> I am running a test similar to the test example from neo rdf sail,
>>> >> and it is very slow. The files are 3 GB - 7 GB in size.
>>> >> Thanks,
>>> >> Luda
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Mattias Persson, [[email protected]]
>>> > Hacker, Neo Technology
>>> > www.neotechnology.com
>>> >
>>>
>>>
>>
>>
>>
>> --
>> Mattias Persson, [[email protected]]
>> Hacker, Neo Technology
>> www.neotechnology.com
>>
>
>
_______________________________________________
Neo mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user