Yeah, we did try to scrutinize our opens/closes, and we actually did find a missing close that was causing a small leak. Unfortunately, even the code above still runs slow, and I think the only open handle there is fileSpaceId (which I've since made sure to close in the demo program, for good measure).
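(An aside for anyone else chasing leaked identifiers, as Kirk describes further down: the library can report what is still open against a file via H5Fget_obj_count() / H5Fget_obj_ids(). The sketch below is illustrative only; the file name "example.h5", the dataset name "/vectors", and the helper function are made up and are not from the original posts.)

#include <hdf5.h>
#include <iostream>
#include <vector>

// Print how many identifiers are still open against a file, and their names.
// Note: with H5F_OBJ_ALL the count includes the file identifier itself.
static void reportOpenHandles(hid_t fileId) {
    ssize_t count = H5Fget_obj_count(fileId, H5F_OBJ_ALL);
    std::cout << "open identifiers: " << count << std::endl;
    if (count <= 0) return;

    std::vector<hid_t> ids(count);
    H5Fget_obj_ids(fileId, H5F_OBJ_ALL, (size_t)count, &ids[0]);
    for (ssize_t i = 0; i < count; ++i) {
        char name[256] = "";
        H5Iget_name(ids[i], name, sizeof(name)); // object's path within the file, if any
        std::cout << "  id " << ids[i] << " -> " << name << std::endl;
    }
}

int main() {
    // "example.h5" and "/vectors" are placeholders, not names from this thread.
    hid_t fileId = H5Fopen("example.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
    hid_t dsetId = H5Dopen2(fileId, "/vectors", H5P_DEFAULT);

    reportOpenHandles(fileId); // should report 2: the file itself and the dataset

    H5Dclose(dsetId);          // a forgotten H5Dclose()/H5Aclose() would show up above
    H5Fclose(fileId);
    return 0;
}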
On Thu, May 27, 2010 at 1:17 PM, <[email protected]> wrote:

FYI... I had thought that I was experiencing a very similar problem under Linux. As my loop progressed, my performance writing via hyperslab grew worse and worse. After further troubleshooting and profiling I discovered that I had missed some H5Dclose() and H5Aclose() calls. Fixing that made a HUGE difference in my test case (from 52 seconds to approximately 2.5 seconds for about 8 MB of data).

Kirk


Hi Ken,

On May 27, 2010, at 12:20 PM, Ken Sullivan wrote:

> Hi, sorry to not get back sooner; I've found a couple of other interesting
> things. The speed issue doesn't seem to exist running on Linux; the same
> code runs in a blink. On Windows, the really, really slow runs (several
> minutes) only seem to happen when running from within Visual Studio. When
> run from the command line it's slow, e.g. 15 seconds for 4000 vectors, but
> not minutes slow, and the time doesn't seem to grow as it did before under
> Visual Studio.

Hmm, OK, I'll note that in our bug report. Sounds pretty Windows specific though...

Quincey

> #ifdef __cplusplus
> extern "C" {
> #endif
> #include <hdf5.h>
> #ifdef __cplusplus
> }
> #endif
> #include <vector>
> #include <iostream>
> #include <stdlib.h>
> #include <math.h>
>
> using namespace std;
>
> int main() {
>     unsigned long long totalNumVecs = 500000;
>     unsigned long long vecLength = 128;
>     hid_t baseType = H5T_NATIVE_FLOAT;
>
>     unsigned long long roughNumVecsToGet = 4000;
>     unsigned long long skipRate = (unsigned long long)ceilf((float)totalNumVecs / (float)roughNumVecsToGet);
>     vector<unsigned long long> vecInds;
>     for (unsigned long long rowInd = 0; rowInd < totalNumVecs; rowInd += skipRate) {
>         vecInds.push_back(rowInd);
>     }
>
>     int rank = 2;
>     hsize_t dims[2];
>     dims[0] = totalNumVecs;
>     dims[1] = vecLength;
>     hid_t fileSpaceId = H5Screate_simple(rank, dims, NULL);
>
>     hsize_t fileBlockCount[2];
>     hsize_t fileOffset[2];
>
>     hsize_t selectionDims[2];
>     selectionDims[0] = 1;
>     fileBlockCount[0] = 1;
>     fileOffset[0] = vecInds[0];
>     for (int ir = 1; ir < rank; ++ir) {
>         selectionDims[ir] = dims[ir];
>         fileBlockCount[ir] = 1;
>         fileOffset[ir] = 0;
>     }
>
>     cout << "begin hyperslab building" << endl;
>     H5Sselect_hyperslab(fileSpaceId, H5S_SELECT_SET, (const hsize_t*) fileOffset, NULL, (const hsize_t*) fileBlockCount, selectionDims);
>     unsigned long long numVecsToRead = vecInds.size();
>     for (hsize_t id = 1; id < numVecsToRead; ++id) {
>         if ((id % 50) == 0) {
>             cout << id << "/" << numVecsToRead << endl;
>         }
>         fileOffset[0] = vecInds[id];
>         H5Sselect_hyperslab(fileSpaceId, H5S_SELECT_OR, (const hsize_t*) fileOffset, NULL, (const hsize_t*) fileBlockCount, selectionDims);
>     }
>     cout << "end hyperslab building" << endl;
>
>     H5Sclose(fileSpaceId); // close the dataspace (added per the follow-up at the top of this thread)
>     return 0;
> }
>
> Thanks,
> Ken
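(An aside on the test program above: because its row indices are generated at a fixed skipRate, that particular selection could also be built with a single strided H5Sselect_hyperslab() call instead of the OR loop. This of course does not help when the rows to read sit at arbitrary, irregular positions, which is the real use case in this thread. A rough sketch, reusing the names from the demo program:)

// Hedged sketch: one-shot strided selection, valid only because the demo's
// row indices form a regular progression (0, skipRate, 2*skipRate, ...).
hsize_t start[2]  = { 0, 0 };
hsize_t stride[2] = { (hsize_t) skipRate, (hsize_t) vecLength };  // step between selected rows
hsize_t count[2]  = { (hsize_t) vecInds.size(), 1 };              // number of 1 x vecLength blocks
hsize_t block[2]  = { 1, (hsize_t) vecLength };                   // each block is one full row
H5Sselect_hyperslab(fileSpaceId, H5S_SELECT_SET, start, stride, count, block);

(This builds the same union of rows in one call, so the quadratic cost of repeatedly OR-ing selections never comes into play.)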
On Wed, May 26, 2010 at 8:17 AM, Quincey Koziol <[email protected]> wrote:

Hi Ken,

On May 25, 2010, at 5:36 PM, Ken Sullivan wrote:

> Hi, I'm running into slow performance when selecting several (>1000)
> non-consecutive rows from a 2-dimensional matrix, typically ~500,000 x 100.
> The bottleneck is the for loop where each row vector index is OR'ed into the
> hyperslab, i.e.:
>
> LOG4CXX_INFO(logger, "TIME begin hyperslab building"); // print out with time stamp
> // select file buffer hyperslabs
> H5Sselect_hyperslab(fileSpaceId, H5S_SELECT_SET, (const hsize_t*) fileOffset, NULL, (const hsize_t*) fileBlockCount, selectionDims);
> for (hsize_t id = 1; id < numVecsToRead; ++id) {
>     LOG4CXX_INFO(logger, id << "/" << numVecsToRead);
>     fileOffset[0] = fileLocs1Dim[id];
>     H5Sselect_hyperslab(fileSpaceId, H5S_SELECT_OR, (const hsize_t*) fileOffset, NULL, (const hsize_t*) fileBlockCount, selectionDims);
> }
> LOG4CXX_INFO(logger, "TIME end hyperslab building");
>
> One interesting thing is that the time between loop iterations increases as
> the loop progresses, e.g. no time at all between 1-2-3-4-5, but seconds
> between 1000-1001-1002. So the time to select the hyperslab grows worse than
> linearly and can become amazingly time consuming, e.g. >10 minutes (!) for a
> few thousand rows. The read itself is very quick.

Drat! Sounds like we've got an O(n^2) algorithm (or worse) somewhere in the code that combines two selections. Can you send us a standalone program that demonstrates the problem, so we can file an issue for this and get it fixed?

> My current workaround is to check whether the number of vectors to select
> exceeds a heuristically determined threshold above which reading the entire
> file (half a million row vectors) and copying out the requested vectors takes
> less time than running the hyperslab selection. Generally that number works
> out to ~500 vectors / 0.5 seconds.
>
> While poking around the code, I found a similar function,
> H5Scombine_hyperslab(), which is only compiled if NEW_HYPERSLAB_API is
> defined. Using it significantly reduced the selection time; in particular the
> time for each OR-ing seemed constant, so 2000 vectors took twice as long as
> 1000, not many times as long as with H5Sselect_hyperslab(). However, it is
> still tens of seconds for a few-thousand-vector selection, so it's still much
> quicker to read everything and copy (~0.5 seconds). Reading all and copying is
> not an ideal solution, as it requires malloc/free of ~250 MB unnecessarily,
> and if I use H5Scombine_hyperslab() the crossover number goes up, i.e. above
> 500, so the fallback is less likely to be needed. I'm a bit nervous, however,
> about using this undocumented code.
>
> So... am I doing something wrong? Is there a speedy way to select a hyperslab
> consisting of 100s or 1000s of non-consecutive vectors? Is NEW_HYPERSLAB_API
> safe?

Currently, the NEW_HYPERSLAB_API is not tested or supported, so I wouldn't use it.

Quincey
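(For completeness, the read-everything-and-copy fallback Ken describes above might look roughly like the sketch below. The dataSetId handle, the H5Dread call over the whole dataset, and the 500-row crossover constant are illustrative assumptions, not code from the thread; the demo program above never actually opens a dataset.)

// Hedged sketch of the crossover fallback: if many scattered rows are wanted,
// read the whole totalNumVecs x vecLength float dataset and copy out the rows
// instead of building a huge OR'ed hyperslab selection.
// "dataSetId" is a hypothetical open dataset handle; kCrossover is the
// heuristic threshold (~500 rows) mentioned in the thread.
// (Requires #include <algorithm> for std::copy, in addition to the demo's includes.)
const size_t kCrossover = 500;
if (vecInds.size() > kCrossover) {
    std::vector<float> all(totalNumVecs * vecLength);       // ~250 MB for 500,000 x 128 floats
    H5Dread(dataSetId, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL, H5P_DEFAULT, &all[0]);

    std::vector<float> wanted(vecInds.size() * vecLength);  // just the requested rows
    for (size_t i = 0; i < vecInds.size(); ++i) {
        std::copy(all.begin() + vecInds[i] * vecLength,
                  all.begin() + (vecInds[i] + 1) * vecLength,
                  wanted.begin() + i * vecLength);
    }
}

(The cost of this fallback is exactly the temporary ~250 MB allocation Ken mentions, which is why the hyperslab-selection slowdown is worth fixing rather than working around.)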
> > [email protected] > > http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org > > > > > > _______________________________________________ > Hdf-forum is for HDF software users discussion. > [email protected] > http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org >
