That does not solve the problem. I removed the if (count>=10000) { block
and now commit after every rc.add, but the program still fails on the
last file.
Moreover, I ran the program with rc.add( file, "",
RDFFormat.NTRIPLES,context); committing only once, after all files were
loaded, and it finished successfully.
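For reference, the batching pattern discussed in this thread (count only the statements actually added, commit once the batch is full, then commit the remainder) could be sketched roughly as below. This is only an illustration under assumptions: BatchedLoader, load, addStatement and commit are hypothetical names, and the two callbacks stand in for rc.add(...) and rc.commit() so the sketch runs without the repository classes.

```java
import java.util.Arrays;
import java.util.function.Consumer;

public class BatchedLoader {
    /**
     * Passes each non-empty line to addStatement and calls commit once per
     * batchSize added statements, plus once at the end for any remainder.
     * Returns the number of commits issued.
     */
    static int load(Iterable<String> lines, int batchSize,
                    Consumer<String> addStatement, Runnable commit) {
        int pending = 0;   // statements added since the last commit
        int commits = 0;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;  // blank lines do not count toward the batch
            }
            addStatement.accept(line);
            pending++;
            if (pending >= batchSize) {  // >= so the mark can never be skipped
                commit.run();
                commits++;
                pending = 0;
            }
        }
        if (pending > 0) {  // commit whatever is left over
            commit.run();
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        // Three statements and one blank line, batch size 2.
        Iterable<String> lines = Arrays.asList(
                "<a> <b> <c> .", "", "<d> <e> <f> .", "<g> <h> <i> .");
        int commits = load(lines, 2,
                line -> { /* rc.add(...) would go here (assumption) */ },
                () -> { /* rc.commit() would go here (assumption) */ });
        System.out.println("commits: " + commits);
    }
}
```

Keeping the counter tied to added statements rather than to lines read means the commit check cannot be stepped over by blank lines.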
file_: links_uscensus_en.nt
[WARNING] an additional exception was thrown
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:592)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:283)
at java.lang.Thread.run(Thread.java:613)
Caused by: java.lang.OutOfMemoryError: Java heap space
at
org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:172)
at
org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:136)
at
org.apache.lucene.index.CompoundFileReader$CSIndexInput.readInternal(CompoundFileReader.java:247)
at
org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:157)
at
org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:80)
at
org.apache.lucene.index.SegmentTermDocs.read(SegmentTermDocs.java:144)
at org.apache.lucene.search.TermScorer.nextDoc(TermScorer.java:130)
at org.apache.lucene.search.TermScorer.score(TermScorer.java:74)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:248)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:173)
at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:113)
at org.apache.lucene.search.Hits.<init>(Hits.java:80)
at org.apache.lucene.search.Searcher.search(Searcher.java:52)
at org.apache.lucene.search.Searcher.search(Searcher.java:42)
at
org.neo4j.index.lucene.LuceneIndexService.searchForNodes(LuceneIndexService.java:387)
at
org.neo4j.index.lucene.LuceneIndexService.getNodes(LuceneIndexService.java:272)
at
org.neo4j.index.lucene.LuceneIndexService.getNodes(LuceneIndexService.java:228)
at
org.neo4j.index.lucene.LuceneIndexService.getSingleNode(LuceneIndexService.java:405)
at
org.neo4j.rdf.store.representation.standard.AbstractUriBasedExecutor.lookupNode(AbstractUriBasedExecutor.java:162)
at
org.neo4j.rdf.store.representation.standard.AbstractUriBasedExecutor.lookupOrCreateNode(AbstractUriBasedExecutor.java:177)
at
org.neo4j.rdf.store.representation.standard.VerboseQuadExecutor.handleAddObjectRepresentation(VerboseQuadExecutor.java:262)
at
org.neo4j.rdf.store.representation.standard.VerboseQuadExecutor.addToNodeSpace(VerboseQuadExecutor.java:70)
at org.neo4j.rdf.store.RdfStoreImpl.addStatement(RdfStoreImpl.java:89)
at org.neo4j.rdf.store.RdfStoreImpl.addStatements(RdfStoreImpl.java:69)
at
org.neo4j.rdf.sail.GraphDatabaseSailConnectionImpl.internalAddStatement(GraphDatabaseSailConnectionImpl.java:623)
at
org.neo4j.rdf.sail.GraphDatabaseSailConnectionImpl.innerAddStatement(GraphDatabaseSailConnectionImpl.java:440)
at
org.neo4j.rdf.sail.GraphDatabaseSailConnectionImpl.addStatement(GraphDatabaseSailConnectionImpl.java:478)
at
org.openrdf.repository.sail.SailRepositoryConnection.addWithoutCommit(SailRepositoryConnection.java:228)
at
org.openrdf.repository.base.RepositoryConnectionBase.add(RepositoryConnectionBase.java:460)
at gov.lanl.memento.core.Test.get(Test.java:229)
at gov.lanl.memento.core.Test.main(Test.java:103)
[INFO]
------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO]
------------------------------------------------------------------------
[INFO] An exception occured while executing the Java class. Java heap space
[INFO]
------------------------------------------------------------------------
[INFO] Trace
org.apache.maven.lifecycle.LifecycleExecutionException: An exception
occured while executing the Java class. Java heap space
at
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:583)
at
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:512)
at
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:482)
at
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:330)
at
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:291)
at
org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:142)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:336)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:129)
at org.apache.maven.cli.MavenCli.main(MavenCli.java:287)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:592)
at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
Caused by: org.apache.maven.plugin.MojoExecutionException: An exception
occured while executing the Java class. Java heap space
at org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:338)
at
org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:451)
at
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:558)
... 16 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at
org.neo4j.kernel.impl.cache.AdaptiveCacheManager.adaptCaches(AdaptiveCacheManager.java:237)
at
org.neo4j.kernel.impl.cache.AdaptiveCacheManager$AdaptiveCacheWorker.run(AdaptiveCacheManager.java:218)
[INFO]
------------------------------------------------------------------------
[INFO] Total time: 4 minutes 33 seconds
[INFO] Finished at: Thu Apr 15 09:34:27 MDT 2010
[INFO] Final Memory: 12M/61M
[INFO] ------------------------------------------
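The trace above goes through ExecJavaMojo, so the class appears to run inside Maven's own JVM, and the "Final Memory: 12M/61M" line suggests that JVM was working with a small heap. A minimal sketch for checking what heap the program actually sees, using the standard java.lang.Runtime calls, might look like the following. HeapReport and toMiB are illustrative names, not part of the program in this thread. If the reported maximum turns out to be small, the usual way to raise it for a Maven-launched run is the MAVEN_OPTS environment variable with an -Xmx setting; whether that applies here is an assumption based on the trace.

```java
public class HeapReport {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory: upper limit the JVM will try to use for the heap (-Xmx)
        System.out.println("max heap (MiB):   " + toMiB(rt.maxMemory()));
        // totalMemory: heap currently reserved by the JVM
        System.out.println("total heap (MiB): " + toMiB(rt.totalMemory()));
        // freeMemory: unused part of the currently reserved heap
        System.out.println("free heap (MiB):  " + toMiB(rt.freeMemory()));
    }
}
```

Calling the same three Runtime methods at the start of the loader would show the limits under the exact launch configuration that fails.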
> You increment the "count" variable even if the line is empty, which means
> the check can skip the 10000 mark, because it only runs for non-empty
> lines. Change the condition to "if(count >=10000)" instead. Also, how much
> heap have you given the JVM?
>
> 2010/4/14 Lyudmila L. Balakireva <[email protected]>
>
>> Hi,
>> I was loading file by file and committing after each file.
>> Even so, the 22 mln file finished in 5 hours, while the 66 mln file did
>> not finish in 3 days.
>> I rewrote the program to read each file and commit after a fixed number
>> of records, hoping to control memory better,
>> but the program fails with an "out of memory" error even for a small
>> dataset (80 000). (With the per-file approach the small dataset loaded
>> without problems.)
>> My snippet:
>> for ( File file : files )
>> {
>>     SimpleTimer timer = new SimpleTimer();
>>     FileInputStream in = new FileInputStream(file);
>>     BufferedReader br = new BufferedReader(new InputStreamReader(in));
>>     String strLine;
>>     int count=0;
>>     while ((strLine = br.readLine()) != null) {
>>         count = count+1;
>>
>>         if (strLine.trim().length() != 0)
>>         {
>>             String[] result = strLine.split("\\s");
>>
>>             rc.add(f.createURI(stripeN3(result[0])),
>>                    f.createURI(stripeN3(result[1])),
>>                    f.createURI(stripeN3(result[2])),
>>                    context) ;
>>
>>             if (count==10000)
>>             {
>>                 //rc.add( file, "", RDFFormat.NTRIPLES,context);
>>                 rc.commit();
>>                 count = 0;
>>             }
>>         }
>>     }
>>
>>     br.close();
>>     in.close();
>>     rc.commit();
>>
>>     timer.end();
>> }
>>
>> sumtimer.end();
>> rc.commit();
>> rc.close();
>> }
>>
>> What could be causing this problem?
>> [INFO] Trace
>> org.apache.maven.lifecycle.LifecycleExecutionException: An exception
>> occured while executing the Java class. Java heap space
>> at
>>
>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:583)
>> at
>>
>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:512)
>> at
>>
>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:482)
>> at
>>
>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:330)
>> at
>>
>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:291)
>> at
>>
>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:142)
>> at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:336)
>> at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:129)
>> at org.apache.maven.cli.MavenCli.main(MavenCli.java:287)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>>
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> at
>>
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> at java.lang.reflect.Method.invoke(Method.java:592)
>> at
>> org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
>> at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
>> at
>> org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
>> at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
>> Caused by: org.apache.maven.plugin.MojoExecutionException: An exception
>> occured while executing the Java class. Java heap space
>> at
>> org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:338)
>> at
>>
>> org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:451)
>> at
>>
>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:558)
>> ... 16 more
>> Caused by: java.lang.OutOfMemoryError: Java heap space
>> at java.nio.ByteBuffer.wrap(ByteBuffer.java:350)
>> at java.nio.ByteBuffer.wrap(ByteBuffer.java:373)
>> at
>> org.neo4j.kernel.impl.transaction.XidImpl.getNewGlobalId(XidImpl.java:55)
>> at
>>
>> org.neo4j.kernel.impl.transaction.TransactionImpl.<init>(TransactionImpl.java:67)
>> at
>> org.neo4j.kernel.impl.transaction.TxManager.begin(TxManager.java:497)
>> at
>> org.neo4j.kernel.EmbeddedGraphDbImpl.beginTx(EmbeddedGraphDbImpl.java:238)
>> at
>>
>> org.neo4j.kernel.EmbeddedGraphDatabase.beginTx(EmbeddedGraphDatabase.java:139)
>> at
>>
>> org.neo4j.index.impl.GenericIndexService.beginTx(GenericIndexService.java:105)
>> at
>> org.neo4j.index.impl.IndexServiceQueue.run(IndexServiceQueue.java:221)
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] Total time: 5 minutes 14 seconds
>>
>>
>> Thank you for the help,
>> Lyudmila
>>
>>
>>
>>
>> > There are some problems at the moment regarding insertion speeds.
>> >
>> > o We haven't yet created an rdf store which can use a BatchInserter
>> >   (which could also be tweaked to skip checking whether a statement
>> >   already exists before adding it, and all that).
>> > o The other one is that the sail layer on top of the neo4j-rdf
>> >   component contains functionality which allows a thread to have more
>> >   than one running transaction at the same time. This was added due to
>> >   some users' requirements, but slows it down by a factor of 2 or
>> >   something (not sure about this).
>> >
>> > I would like to see both these issues resolved soon, and when they are
>> > fixed insertion speeds will be quite nice!
>> >
>> > 2010/4/9 Lyudmila L. Balakireva <[email protected]>
>> >
>> >> Hi,
>> >> How can I optimize loading into the VerboseQuadStore?
>> >> I am running a test similar to the test example from neo rdf sail,
>> >> and it is very slow. The files are 3G - 7G in size.
>> >> Thanks,
>> >> Luda
>> >> _______________________________________________
>> >> Neo mailing list
>> >> [email protected]
>> >> https://lists.neo4j.org/mailman/listinfo/user
>> >>
>> >
>> >
>> >
>> > --
>> > Mattias Persson, [[email protected]]
>> > Hacker, Neo Technology
>> > www.neotechnology.com
>> >
>>
>>
>
>
>
> --
> Mattias Persson, [[email protected]]
> Hacker, Neo Technology
> www.neotechnology.com
>