Funny you should suggest this. Ten minutes ago I started a new test run that
does exactly that. If any test case takes more than 5 minutes to import, I
run gc() three times with 1s sleeps in between. OK, so not your 5s sleep,
but let's see if it helps.
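For reference, the cleanup step between the two phases looks roughly like
this (just a sketch; `inserter` stands in for whatever BatchInserter handle
you hold, and the shutdown calls are commented out to keep it self-contained):

```java
// Sketch of the post-batch cleanup being tested: drop the inserter
// reference, then run a few System.gc() passes with 1s sleeps in between,
// so the heap the batch inserter used is actually reclaimed before the
// normal GraphDatabaseService starts.
public class PostBatchCleanup {

    static void forceGc(int passes, long sleepMillis) throws InterruptedException {
        for (int i = 0; i < passes; i++) {
            System.gc();               // request a collection
            Thread.sleep(sleepMillis); // give the collector time to run
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // inserter.shutdown();  // assumed BatchInserter handle
        // inserter = null;      // drop the reference so it is collectable
        forceGc(3, 1000);
        System.out.println("free heap: "
                + Runtime.getRuntime().freeMemory() / (1024 * 1024) + "M");
    }
}
```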

I also reduced Xmx to 1600M and reduced the mmap settings by a similar
amount, hopefully reducing the app's overall memory consumption somewhat.
Let's see what happens this time.
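The shape of the reduced configuration is something like the following
(the values below are illustrative only, scaled down from the test defaults
quoted further down in this thread, not the exact numbers from this run):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative reduced memory-mapping configuration, passed to both the
// BatchInserter and the GraphDatabaseService. The specific sizes here are
// made-up examples of the kind of reduction described above.
public class ReducedConfig {

    public static Map<String, String> create() {
        Map<String, String> config = new HashMap<String, String>();
        config.put("neostore.nodestore.db.mapped_memory", "40M");
        config.put("neostore.relationshipstore.db.mapped_memory", "120M");
        config.put("neostore.propertystore.db.mapped_memory", "150M");
        config.put("neostore.propertystore.db.strings.mapped_memory", "200M");
        config.put("neostore.propertystore.db.arrays.mapped_memory", "10M");
        config.put("dump_configuration", "false");
        return config;
    }
}
```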

On Sun, Feb 27, 2011 at 6:45 PM, Michael Hunger
<michael.hun...@neotechnology.com> wrote:

> you can try to null the batch inserter and all its external deps that you
> control, and add several System.gc() calls with some Thread.sleep(5000) in
> between; that should free the heap
> you can also output the runtime's free memory, or even better have a
> jconsole running concurrently to see the heap allocation history (it might
> even show mmap)
>
> Michael
>
> Sent from my iBrick4
>
>
> Am 27.02.2011 um 18:33 schrieb Craig Taverner <cr...@amanzi.com>:
>
> >>
> >> What about the "IOException Operation not permitted" ?
> >> Can you check the access rights on your store?
> >>
> >
> > They look fine (644 and 755). Also, it would seem strange for the access
> > rights to change in the middle of a run. The database is being written to
> > continuously for about 5 hours successfully before this error. I also note
> > that I have 20GB of free space, so running out of disk space seems
> > unlikely. Having said that, I will do another run with a parallel check
> > for disk space as well.
> >
> >> While googling I saw that you had a similar problem in November, which
> >> Johan answered. From the answer it seems that the kernel adapts its
> >> memory usage and segmentation to the store size. So, as the store size
> >> before the import was zero, some of the adjustments that normally take
> >> place for such a large store probably won't be done.
> >>
> >
> > I create both the batch inserter and the graph database service with a
> > set configuration, as at the top of the file at
> > https://github.com/neo4j/neo4j-spatial/blob/master/src/test/java/org/neo4j/gis/spatial/Neo4jTestCase.java
> >
> >> So your suggestion to run the batch insert in a first VM run and the
> >> API work in a second one makes a lot of sense to me, because the kernel
> >> is then able to optimize memory usage at startup (if you didn't supply
> >> a config file).
> >>
> >
> > I will perhaps try that tomorrow. I would first need to extract the test
> > code to a place I can use from a console app. But I also noticed that
> > Mattias thought that two JVMs would not help.
> >
> >> Regarding the test issue: I would really love to have this code
> >> elsewhere and just used in the tests; then it could be used by other
> >> people too, and it would perhaps also make it easier to reproduce your
> >> problem with just the data file.
> >>
> >
> > I can do that. I'm short of time right now, but will see if I can get to
> > it soon. It should be relatively simple to extract to the OSMDataset, so
> > other users can call it. Basically the code traverses both the GIS
> > (layers) view of the OSM data model and the OSM view (ways, nodes,
> > changesets, users), and produces some statistics on what is found. That
> > could be generally interesting. The one messy part is that the code also
> > makes a number of assertions for expected patterns, and this only makes
> > sense in the JUnit test. I would need to save the stats to a map and
> > return that to the JUnit code so it can make the assertions later.
> >
> >> Can you point me to the data file used, and attach the test case that
> >> you probably modified locally? Then I'd try this on my machine.
> >>
> >
> > I've just pushed the code to github. The test class is TestOSMImport.
> > Currently it skips a test if the test data is missing, and there is only
> > data for two specific test cases in the code base (Billesholm and Malmö).
> > To get it to run the big tests, simply download denmark.osm and/or
> > croatia.osm from downloads.cloudmade.com. At the moment croatia.osm
> > imports fine, at reasonable performance, but denmark.osm is the one
> > giving the problems.
> >
> >> Looks like the memory-mapped buffer configuration needs to be tweaked.
> >>
> >
> > From Johan's previous answer, combined with something I read on the wiki,
> > it seems that the batch inserter needs different mmap settings than the
> > normal API. I read that the batch inserter uses the heap for its mmap,
> > while the normal API does not. If I understand correctly, this means that
> > when using the batch inserter we have to use smaller mmap settings,
> > otherwise we might fill the heap too soon?
> >
> > In any case, it seems like keeping the mmap settings relatively small
> > should avoid this problem, although it might not give the best
> > performance? Have I understood correctly?
> >
> >> On Windows, heap buffers are used by default, and auto-configuration
> >> will look at how much heap is available. Getting out-of-memory
> >> exceptions is an indication that the configuration passed in is using
> >> more memory than the available heap.
> >>
> >
> > I am currently using -Xmx2048m on a 4GB RAM machine with 32-bit Java,
> > and the settings:
> >
> >    static {
> >        NORMAL_CONFIG.put( "neostore.nodestore.db.mapped_memory", "50M" );
> >        NORMAL_CONFIG.put( "neostore.relationshipstore.db.mapped_memory", "150M" );
> >        NORMAL_CONFIG.put( "neostore.propertystore.db.mapped_memory", "200M" );
> >        NORMAL_CONFIG.put( "neostore.propertystore.db.strings.mapped_memory", "300M" );
> >        NORMAL_CONFIG.put( "neostore.propertystore.db.arrays.mapped_memory", "10M" );
> >        NORMAL_CONFIG.put( "dump_configuration", "false" );
> >    }
> >
> >
> > These settings do not seem too high, but if the normal graph database
> > service allocates memory outside the heap, and the heap has already been
> > filled by the batch inserter, perhaps that is where the problem lies?
> > Perhaps we do need a way of freeing memory more aggressively after the
> > batch insertion phase, to get the heap down before allowing the normal
> > API access to the memory?
> >
> > Regards, Craig
> > _______________________________________________
> > Neo4j mailing list
> > User@lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user
