[h2] Re: compactRewriteFully safety with DB writes
Hi Andrei, sorry to be unclear. I'll try to summarise:

* I have a small file-based MVStore (~30MB on disk with LZW).
* For the reasons described previously, I can't estimate the memory size of entry values in a way that's meaningful, so instead the size reported for entry values is fixed at 1024 and the cache is made very large. The idea is that entries read from the store, and entries added to it, are never flushed from the cache; instead we flush it manually every now and then. This should work, as the dataset is far smaller than the available heap.
* This approach does work stably with 1.4.197, staying at around 150MB of heap usage over a long period of many writes, reads, and deletes.
* When switching to 1.4.200, the same approach results in 800MB+ of heap usage and growing. I stopped the experiment when I started getting low-heap warnings (>90% usage).

Looking at the code for the LIRS cache, I can see it has gained weak references, as you pointed out, so I now wonder whether the JVM in my experiment would have taken action at some point, perhaps cleared those references and reduced heap usage. However, I'd like to understand where all that heap is going, and whether I can avoid triggering those warnings.

One thing that may be atypical in this workload is that there are a *lot* of deletes, typically balanced by a lot of adds (the total data size tends to stay the same over time). If those deleted entries stay on the heap, that would definitely cause this.

Hope that's clearer.

Cheers,

Matt.
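To put rough numbers on the fixed-size trick described above (a back-of-the-envelope sketch, not from the thread itself; MVStore's real accounting also counts page overhead, and `cacheSize` is configured in megabytes):

```java
public class CacheSizing {
    // Nominal number of entries the cache can hold before evicting anything,
    // given MVStore's cacheSize (in MB) and a fixed per-entry size estimate.
    static long maxEntries(long cacheSizeMb, long reportedEntryBytes) {
        return cacheSizeMb * 1024 * 1024 / reportedEntryBytes;
    }

    public static void main(String[] args) {
        // Values from this thread: getMemory() pinned at 1024, cacheSize = 32000 (~32GB).
        System.out.println("entries before eviction: " + maxEntries(32_000, 1024));
        // With an on-disk store of only ~30MB, the cache never comes close to
        // this capacity, so under this scheme nothing is ever evicted.
    }
}
```

With these settings the cache nominally holds tens of millions of entries, which is why the dataset never leaves memory by design.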
[h2] Re: compactRewriteFully safety with DB writes
Hi Matt,

After reading your last message, I still fail to understand what exactly "no longer works" with 1.4.200. If your concern is the increase in heap usage, I would say it should never be a problem on its own, and in this case it is somewhat expected, because the cache may now keep weak references to items that would simply have been dropped in the previous version.

You also mention an attempt to run an in-memory configuration: does that fail with 1.4.200?
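Matt's question about whether the JVM "would have taken action at some point" can be illustrated at the JDK level. A hedged sketch follows (which reference type H2 1.4.200's cache actually uses internally is version-specific; this only demonstrates that an object held solely through a weak reference stays on the heap until a garbage collection actually runs):

```java
import java.lang.ref.WeakReference;

public class WeakRefDemo {
    public static void main(String[] args) throws InterruptedException {
        // Simulate a cache entry that is only weakly reachable after being "dropped":
        // the byte[] has no strong references once the constructor returns.
        WeakReference<byte[]> ref = new WeakReference<>(new byte[1024 * 1024]);
        System.out.println("reachable before GC: " + (ref.get() != null));

        // The object continues to occupy heap until a collection runs.
        for (int i = 0; i < 50 && ref.get() != null; i++) {
            System.gc();
            Thread.sleep(10);
        }
        System.out.println("reachable after GC:  " + (ref.get() != null));
    }
}
```

So elevated heap-usage readings after an upgrade can simply mean the collector has not yet been forced to reclaim weakly-held items, which matches Andrei's "it should never be a problem on its own".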
[h2] Re: compactRewriteFully safety with DB writes
Sorry to keep replying to myself, but another question.

After trialling the latest H2 (1.4.200) with our existing data, I saw heap usage shoot up from the 150-200MB where it hovers with 1.4.197 to 840MB and rising after a fairly short run time.

Now, I expect this is entirely caused by my setup, since our DataType.getMemory() simply returns 1024, and our MVStore cache size is set to 32000 (~32GB). I chose this due to the above issues with object-size estimation, so that data would effectively never leave the cache; we manage this long-term by flushing the cache every 24 hours.

Since the dataset in this case fits in a 35MB .mv file (it is compressed with LZW, though), the all-in-memory idea seemed reasonable, and indeed it ran stably with 1.4.197.

I can see from the source code for Page and MVMap that a lot of work has gone into this area, but after some time looking at it I'm still not sure how to proceed, other than trying other numbers for getMemory() and/or the cache size.

Do you have any ideas why my hacky solution no longer works, and/or suggestions on how I might make it work with 1.4.200?

Cheers,

Matt.

On Monday, July 19, 2021 at 11:24:30 AM UTC+9:30 Matthew Phillips wrote:

> Hi Andrei,
>
> thanks again. We have a use-case that's very heavy on writes/deletes, and the DB was apparently getting very fragmented. Using regular compact() with an 80 fill rate has halved the size, and we've done away with the nightly "big compact" task. So for this use case it's working very well so far.
>
> The need for a cache flush is due to the fact that we store many versions of the same data, most of them as small diffs against a previous version. We reconstitute these versions into Clojure persistent maps, which means there's a very high level of structural sharing between objects. So when adding two maps A & B to the cache, the memory actually used by those maps is almost certainly nothing like getMemory(A) + getMemory(B), due to (probable) structural sharing between them.
>
> We did try some heuristic workarounds, but were always off by so much that performance suffered badly (either flushing the cache when we didn't need to, or writing too often). So we use an effectively infinite cache, flush overnight, and let it re-fill. Not ideal, but we don't have a better solution right now.
>
> Cheers,
>
> Matt.
>
> On Sunday, July 18, 2021 at 11:49:43 PM UTC+9:30 andrei...@gmail.com wrote:
>
>> Hi Matt,
>> IMHO, the best way to compact is the off-line one, MVStoreTool.compact(), and it can take only seconds (your mileage may vary, of course).
>> If you cannot afford a one-minute off-line interruption, then you can try just letting it run and do its own maintenance in the background (assuming autoCommitDelay > 0).
>> If I knew some "best/better" way to do on-line compaction, it would probably be there already, as a background maintenance procedure. I expect that the existing one will fit the bill, unless your update rate is quite high.
>> BTW, flushing the cache does look like a futile exercise, indeed.
>>
>> On Thursday, July 15, 2021 at 7:36:05 PM UTC-4 matt...@gmail.com wrote:
>>
>>> Hello Andrei,
>>>
>>> thanks very much for your reply.
>>>
>>> Yes, I'm aware I'm on an old version: if it's not broken, don't fix it ;) Version 1.4.197 has been rock-solid for us for years, and I'm always loath to change things for no reason. But you have given me a good reason, so I'll give the latest MVStore a try.
>>>
>>> Can you recommend the best way to 'manually' compact the database in the latest release?
>>>
>>> And just to be sure: could there be any data-loss issues from flushing the cache?
>>>
>>> Cheers,
>>>
>>> Matt.
>>>
>>> On Friday, July 16, 2021 at 4:45:12 AM UTC+9:30 andrei...@gmail.com wrote:
>>>
>>>> Hi Matt,
>>>>
>>>> If you are experiencing a problem which looks and smells like a concurrency issue, then there is definitely a good reason to suspect a concurrency issue. 8-)
>>>> The real question here is: if you care enough about those problems, why are you still on version 1.4.197? MVStore's concurrency / synchronization has been totally re-designed since then (and we are talking years here); for example, you will not even find the MVStore.compactRewriteFully() method anymore, but instead it might just do all that space management itself, so you won't need that background operation at all.
>>>> In any case, I would not expect anyone to look at 1.4.197 issues at this point. On the other hand, if you find a similar problem with the current trunk version and can reproduce it, I will be more than happy to work on it.
>>>>
>>>> Cheers, Andrei.
>>>>
>>>> On Thursday, July 15, 2021 at 3:39:40 AM UTC-4 matt...@gmail.com wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I'm trying to track down a perplexing problem when using an MVStore, where it appears
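The structural-sharing problem quoted above, where getMemory(A) + getMemory(B) wildly overestimates real heap usage, can be shown with a toy persistent list (an illustration only; Clojure's persistent maps share tree nodes in the same spirit, just with wider branching):

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class SharingDemo {
    // A minimal persistent cons list: prepending shares the entire tail.
    record Node(int value, Node next) {}

    // Per-value size estimate, analogous to a naive DataType.getMemory().
    static int countNodes(Node n) {
        int count = 0;
        for (Node cur = n; cur != null; cur = cur.next()) count++;
        return count;
    }

    // Actual unique nodes on the heap, identity-based so sharing is detected.
    static int countUnique(Node... heads) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Node head : heads)
            for (Node cur = head; cur != null; cur = cur.next()) seen.add(cur);
        return seen.size();
    }

    public static void main(String[] args) {
        Node shared = new Node(3, new Node(2, new Node(1, null)));
        Node a = new Node(10, shared); // "version A": a small diff on shared data
        Node b = new Node(20, shared); // "version B": another small diff

        // Summing per-value estimates counts the shared tail twice.
        System.out.println("naive estimate: " + (countNodes(a) + countNodes(b)));
        System.out.println("actual unique:  " + countUnique(a, b));
    }
}
```

Here the naive per-entry sum reports 8 nodes while only 5 exist, and the gap grows with every version that diffs against shared structure, which is exactly why per-entry size heuristics kept missing by so much.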
Re: [h2] H2 in-memory not able to load 7 lakh records and giving heap-space issues
Try using nioMemFS to store the data on the native heap.

--
You received this message because you are subscribed to the Google Groups "H2 Database" group. To unsubscribe from this group and stop receiving emails from it, send an email to h2-database+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/h2-database/CAFYHVnUhgJAP3CLUED_jD5r62odjngxLGoyovV03EO7j4AfKYQ%40mail.gmail.com.
Re: [h2] Autonomous commit - commit single savepoint or transaction
You're going to need to use two connections to achieve that.