Thank you, Quincey and John, for the helpful information. I will keep the metadata cache options in mind, but right now my biggest concern is the raw data cache. I currently have one file with many (up to ~1000) datasets, all of which I have open simultaneously. Since there is a separate raw data chunk cache for each dataset, I think this explains the significant cache memory usage I am seeing.
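For concreteness, here is the back-of-the-envelope sketch I am using to
bound the worst case. It only reads the library defaults off a fresh
property list; I am assuming each dataset's cache can fill its entire
rdcc_nbytes limit (documented to default to 1 MB):

    #include <hdf5.h>
    #include <stdio.h>

    int main(void)
    {
        /* Read the library-default raw data chunk cache parameters
         * from a fresh file access property list. */
        hid_t  fapl = H5Pcreate(H5P_FILE_ACCESS);
        int    mdc_nelmts;   /* unused in 1.8, kept for the API */
        size_t rdcc_nslots, rdcc_nbytes;
        double rdcc_w0;

        H5Pget_cache(fapl, &mdc_nelmts, &rdcc_nslots, &rdcc_nbytes,
                     &rdcc_w0);

        /* Worst case: every open dataset fills its own cache. */
        printf("per-dataset chunk cache limit: %lu bytes\n",
               (unsigned long)rdcc_nbytes);
        printf("worst case for 1000 open datasets: ~%lu MB\n",
               (unsigned long)(1000 * rdcc_nbytes / (1024 * 1024)));

        H5Pclose(fapl);
        return 0;
    }

With the 1 MB default that works out to roughly 1 GB for my ~1000
datasets, which matches what I am observing.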
I don't suppose there is a way to share a raw data cache across multiple
datasets, is there? When does memory allocation happen for the raw data
cache? Does it get allocated at dataset open/create time, or only as
values are written to and read from the dataset? Are there any other
options for affecting the behavior of the raw data cache? All I can find
are H5Pget/set_[chunk_]cache.

Thanks,
Ethan

On Tue, Jul 20, 2010 at 1:38 PM, John Mainzer <[email protected]> wrote:

> >From: Quincey Koziol <[email protected]>
> >Date: Tue, 20 Jul 2010 07:50:52 -0500
> >To: HDF Users Discussion List <[email protected]>
> >Subject: Re: [Hdf-forum] Cache memory usage
> >
> >Hi Ethan,
> >
> >On Jul 19, 2010, at 8:41 PM, Ethan Dreyfuss wrote:
> >
> >> I am trying to get a handle on how much memory is being used by
> >> HDF5 for caching, and have a couple of questions:
> >>
> >> Do the cache limits apply globally (per process), per file, per
> >> dataset, or in some other way? Specifically, when trying to compute
> >> the total memory usage, should I just add the memory for the raw
> >> data chunk cache and the metadata cache, or do I need to multiply
> >> one or both by the number of files/datasets/other?
> >
> > The metadata cache is per file and the raw data chunk cache is per
> > dataset.
> >
> >> Is there any good way to measure actual cache memory usage, or am I
> >> limited to using top to check process memory usage and computing
> >> values based on cache parameters?
> >
> > Hmm, you can check the metadata cache, but I don't think there's a
> > query function for the chunk cache currently. Also, you can manually
> > garbage collect the internal HDF5 library memory allocations with
> > H5garbage_collect(), but we don't have a way to query that usage
> > right now either. Probably valgrind or top would still be reasonable
> > now...
> >
> > Quincey
>
> Hi Ethan,
>
> I believe I can add a bit to the above:
>
> As Quincey indicated, HDF5 creates one metadata cache per open file.
> You can use H5Fget_mdc_size() to get the current cache size for a
> given file, but note that the cache's footprint in memory will
> typically be two to three times its current size.
>
> Note also that unless configured otherwise, the metadata cache will
> attempt to resize itself so as to be big enough to contain the current
> working set -- and no larger. Thus, depending on your access pattern,
> you may see the metadata cache size grow and shrink. That said, you
> have to work pretty hard to get it to grow beyond several MB.
>
> For further information on the metadata cache, please see the portion
> of the special topics section of the User's Guide that addresses the
> metadata cache. Reading and understanding this portion of the
> documentation is pretty much essential if you want to take direct
> control of the metadata cache without shooting yourself in the foot.
>
> The chunk cache is not my specialty, so I will not attempt to add to
> Quincey's comments.
>
> I hope this helps.
>
> Best regards,
>
> John Mainzer
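P.S. In case it helps anyone else following this thread, below is a
minimal, untested sketch (against 1.8.x) of the two knobs discussed
above: capping a dataset's chunk cache at open time with
H5Pset_chunk_cache(), and polling the metadata cache with
H5Fget_mdc_size(). The file name, dataset name, and 64 KB cap are
placeholders, not recommendations.

    #include <hdf5.h>
    #include <stdio.h>

    int main(void)
    {
        hid_t file = H5Fopen("data.h5", H5F_ACC_RDONLY, H5P_DEFAULT);

        /* Cap this dataset's raw data chunk cache at 64 KB instead of
         * the 1 MB default; 521 is just a prime number of hash slots. */
        hid_t dapl = H5Pcreate(H5P_DATASET_ACCESS);
        H5Pset_chunk_cache(dapl, 521, 64 * 1024,
                           H5D_CHUNK_CACHE_W0_DEFAULT);
        hid_t dset = H5Dopen2(file, "/some_dataset", dapl);

        /* ... reads and writes ... */

        /* Ask the metadata cache how big it currently is. */
        size_t max_size, min_clean, cur_size;
        int    nentries;
        H5Fget_mdc_size(file, &max_size, &min_clean, &cur_size,
                        &nentries);
        printf("metadata cache: %lu of %lu bytes in %d entries\n",
               (unsigned long)cur_size, (unsigned long)max_size,
               nentries);

        /* Return internal free-list memory to the system. */
        H5garbage_collect();

        H5Dclose(dset);
        H5Pclose(dapl);
        H5Fclose(file);
        return 0;
    }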
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
