[email protected] wrote:
> Hi,
>
> All writes occur in the parent process only. The child (normally) only
> reopens the environment and performs a few short reads.
>
> But, it's the actual opening of the env in the forked child that is causing
> the database growth. I tried to close the env straight after opening it in
> the child (without performing any reads), and have encountered the same
> issues.

I don't see anything like that here. There's nothing in
mdb_env_open(MDB_NOSYNC) that can even affect the size of the DB file. I don't
think there's anything we can investigate without sample code that reproduces
the situation.

> Hope that makes sense,
> Dimitrij
> On 30 Jul 2013 21:19, "Howard Chu" <[email protected]> wrote:
>
>> dimitrij.denissenko@blacksquaremedia.com wrote:
>>
>>> Full_Name: Dimitrij Denissenko
>>> Version:
>>> OS: Ubuntu 12.04
>>> URL:
>>> Submission from: (NULL) (62.30.100.0)
>>>
>>>
>>> Hi,
>>>
>>> I found an interesting issue with LMDB. I have populated the DB with a
>>> bunch of records and it uses ~30M on disk (after sync). Then I added a
>>> background process to my app and populated the database again with the
>>> same record set. Surprisingly, the resulting size on disk was >70M.
>>>
>>> The background process is forked periodically to perform some maintenance
>>> tasks, here is my (simplified) code:
>>>
>>> /* Close env before forking */
>>> mdb_env_close(env);
>>>
>>> if ((childpid = fork()) == 0) {
>>>      /* Child */
>>>      rc = mdb_env_open(env, ".", MDB_NOSYNC, 0644);
>>>      ...
>>> } else {
>>>      /* Parent */
>>>      rc = mdb_env_open(env, ".", MDB_NOSYNC, 0644);
>>>      ...
>>> }
>>>
>>> I could narrow it down to the mdb_env_open call in the child. If I add
>>> exit(0) before the mdb_env_open line, the DB size remains consistently at
>>> ~30M. The data size seems to grow proportionally to the number of forks
>>> performed during data load. What could be causing the growth? What can I
>>> do to prevent it?
>>>
>>> Thanks in advance
>>>
>>> PS: I tried it with MDB_FIXMAP and without, same result.
>>>
>>
>> Without seeing more of your code, it's impossible to tell. Are you adding
>> the data on both sides of the fork? In the above code snippet, where are
>> your mdb_put calls occurring? Are both the parent and child processes
>> writing identical data?
>>
>> --
>>   -- Howard Chu
>>   CTO, Symas Corp.           http://www.symas.com
>>   Director, Highland Sun     http://highlandsun.com/hyc/
>>   Chief Architect, OpenLDAP  http://www.openldap.org/project/
>

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
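
Since the thread stalls on the lack of a reproducer, here is a rough sketch of a harness one could build a test case around. It is not the reporter's code and it deliberately uses only POSIX calls, so it makes no claim about what LMDB does. The names file_size_bytes, run_fork_cycles, child_fn and noop_child are made up for illustration. It also simplifies the reported setup: the parent waits for each child instead of continuing to write while the child runs.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback type: the work a forked child performs. */
typedef int (*child_fn)(void *arg);

/* Size of `path` in bytes, or -1 if stat() fails. */
static long long file_size_bytes(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}

/* Control case: a child that does nothing at all. */
static int noop_child(void *arg)
{
    (void)arg;
    return 0;
}

/*
 * Fork `cycles` times. Each child runs fn(arg) and exits with status 0
 * if fn returned 0, or 1 otherwise; the parent waits for it.
 * Returns 0 if every child exited with status 0, -1 otherwise.
 */
static int run_fork_cycles(int cycles, child_fn fn, void *arg)
{
    assert(fn != NULL);

    for (int i = 0; i < cycles; i++) {
        pid_t pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(fn(arg) == 0 ? 0 : 1); /* child: skip atexit handlers */

        int status = 0;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
            return -1;
    }
    return 0;
}
```

To turn this into an actual test case, one would compare file_size_bytes("data.mdb") before and after run_fork_cycles, once with noop_child and once with a callback that calls mdb_env_create, mdb_env_open(env, ".", MDB_NOSYNC, 0644) and mdb_env_close. The callback creates a fresh handle because, as far as I know, mdb_env_close frees the MDB_env, so the handle should not be passed to mdb_env_open again as the simplified snippet does.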
