Hi Ken,
On May 27, 2010, at 12:20 PM, Ken Sullivan wrote:
> Hi, sorry not to get back sooner; I've found a couple of other interesting
> things. The speed issue doesn't seem to exist on Linux, where the same
> code runs in a blink. On Windows, the really, really slow runs (several
> minutes) only seem to happen when running from within Visual Studio. When
> run from the command line it's slow, e.g. 15 seconds for 4000 vectors, but
> not minutes slow, and the time doesn't seem to grow as I saw before with
> Visual Studio.
Hmm, OK, I'll note that in our bug report. Sounds pretty
Windows-specific, though...
Quincey
> #ifdef __cplusplus
> extern "C" {
> #endif
> #include <hdf5.h>
> #ifdef __cplusplus
> }
> #endif
> #include <vector>
> #include <iostream>
> #include <stdlib.h>
> #include <math.h>
>
> using namespace std;
>
> int main() {
> unsigned long long totalNumVecs = 500000;
> unsigned long long vecLength = 128;
> hid_t baseType = H5T_NATIVE_FLOAT;
>
> unsigned long long roughNumVecsToGet = 4000;
> unsigned long long skipRate = (unsigned long long)ceilf((float)totalNumVecs
> / (float)roughNumVecsToGet);
> vector<unsigned long long> vecInds;
> // use an unsigned counter to avoid a signed/unsigned comparison
> for (unsigned long long rowInd = 0; rowInd < totalNumVecs; rowInd += skipRate) {
> vecInds.push_back(rowInd);
> }
>
> int rank = 2;
> hsize_t dims[2];
> dims[0] = totalNumVecs;
> dims[1] = vecLength;
> hid_t fileSpaceId = H5Screate_simple(rank, dims, NULL);
>
> hsize_t fileBlockCount[2];
> hsize_t fileOffset[2];
>
> hsize_t selectionDims[2];
> selectionDims[0] = 1;
> fileBlockCount[0] = 1;
> fileOffset[0] = vecInds[0];
> for(int ir = 1; ir < rank; ++ir) {
> selectionDims[ir] = dims[ir];
> fileBlockCount[ir] = 1;
> fileOffset[ir] = 0;
> }
>
> cout << "begin hyperslab building" << endl;
> H5Sselect_hyperslab(fileSpaceId, H5S_SELECT_SET, fileOffset, NULL,
> fileBlockCount, selectionDims);
> unsigned long long numVecsToRead = vecInds.size();
> for (hsize_t id=1; id < numVecsToRead; ++id) {
> if ( (id % 50) == 0) {
> cout << id << "/" << numVecsToRead << endl;
> }
> fileOffset[0] = vecInds[id];
> H5Sselect_hyperslab(fileSpaceId, H5S_SELECT_OR, fileOffset, NULL,
> fileBlockCount, selectionDims);
> }
> cout << "end hyperslab building" << endl;
>
> H5Sclose(fileSpaceId);
>
> return 0;
> }
>
>
> Thanks,
> Ken
>
>
> On Wed, May 26, 2010 at 8:17 AM, Quincey Koziol <[email protected]> wrote:
> Hi Ken,
>
> On May 25, 2010, at 5:36 PM, Ken Sullivan wrote:
>
> > Hi, I'm running into slow performance when selecting many (>1000)
> > non-consecutive rows from a 2-dimensional matrix, typically ~500,000 x 100.
> > The bottleneck is the for loop where each row vector index is OR'ed into
> > the hyperslab, i.e.:
> >
> > LOG4CXX_INFO(logger,"TIME begin hyperslab building"); //print out with
> > time stamp
> > //select file buffer hyperslabs
> > H5Sselect_hyperslab(fileSpaceId, H5S_SELECT_SET, (const hsize_t*)
> > fileOffset, NULL, (const hsize_t*) fileBlockCount, selectionDims);
> > for (hsize_t id = 1; id < numVecsToRead; ++id) {
> > LOG4CXX_INFO(logger, id << "/" << numVecsToRead);
> > fileOffset[0] = fileLocs1Dim[id];
> > H5Sselect_hyperslab(fileSpaceId, H5S_SELECT_OR, (const hsize_t*)
> > fileOffset, NULL, (const hsize_t*) fileBlockCount, selectionDims);
> > }
> > LOG4CXX_INFO(logger,"TIME end hyperslab building");
> >
> > One interesting thing is that the time per iteration increases as the
> > loop progresses, e.g. no time at all between 1-2-3-4-5, but seconds
> > between 1000-1001-1002. So the time to select the hyperslab grows worse
> > than linearly, and can become amazingly time consuming, e.g. >10 minutes
> > (!) for a few thousand rows. The read itself is very quick.
>
> Drat! Sounds like we've got an O(n^2) algorithm (or worse) somewhere
> in the code that combines two selections. Can you send us a standalone
> program that demonstrates the problem, so we can file an issue for this, and
> get it fixed?
>
> > My current workaround is to check whether the number of vectors to select
> > is greater than a heuristically determined threshold above which reading
> > the entire file (half a million row vectors) and copying the requested
> > vectors is faster than running the hyperslab selection. Generally the
> > threshold works out to ~500 vecs/0.5 seconds.
> >
> > While poking around the code, I found a similar function,
> > H5Scombine_hyperslab(), that is only compiled if NEW_HYPERSLAB_API is
> > defined. Using this significantly reduced the selection time; in
> > particular, the time for each OR-ing seemed constant, so 2000 vectors took
> > twice as long as 1000, rather than many times longer as with
> > H5Sselect_hyperslab(). However, it's still tens of seconds for a
> > few-thousand-vector selection, so it's still much quicker to read all and
> > copy (~1/2 second). Reading all and copying is not an ideal solution, as
> > it requires malloc/free of ~250MB unnecessarily, though if I use
> > H5Scombine_hyperslab() the crossover number goes up, i.e. above 500, and
> > the workaround is less likely to be needed. I'm a bit nervous, however,
> > about using this undocumented code.
> >
> > So...am I doing something wrong? Is there a speedy way to select a
> > hyperslab consisting of 100s or 1000s of non-consecutive vectors? Is
> > NEW_HYPERSLAB_API safe?
>
> Currently, the NEW_HYPERSLAB_API is not tested or supported, so I
> wouldn't use it.
>
> Quincey
>
>
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
>