Hi Thomas, I've taken another look at your BitField code. I think that your functions nextSetBit and nextClearBit do not work as intended in every case. I assume that the check for data[i] == 0 is to quickly skip all empty longs and then search the long that has some bits set in the inner loop. Unfortunately in case when the first long is not zero (but does not have any bits set after the fromIndex) it will drop into the inner loop and then search the array of longs bit by bit again even if some longs are zeroes. You probably want to clear the bits of the first long below the "fromIndex" before comparing it to zero. This way your outer loop will always work as intended.
Your function getLastSetBit() is also searching bit by bit as well. You may actually want to take a look at the JDK implementation of the "java.util.BitSet" class. They also use an array of long for their storage and use several bit manipulation tricks to speed things up. Some helpful links: http://graphics.stanford.edu/~seander/bithacks.html http://www.jjj.de/bitwizardry/bitwizardrypage.html Alex public int nextSetBit(int fromIndex) { int i = fromIndex >> ADDRESS_BITS; int max = data.length; int maxAddress = data.length << ADDRESS_BITS; for (; i < max; i++) { if (data[i] == 0) { continue; } for (int j = Math.max(fromIndex, i << ADDRESS_BITS); j < maxAddress; j++) { if (get(j)) { return j; } } } return -1; } On Oct 14, 4:01 pm, "Thomas Mueller" <[EMAIL PROTECTED]> wrote: > Hi, > > I had a look at the BitField class as well. I still use a long array > for getLastSetBit(), nextSetBit(), nextClearBit(). But set and get a > byte at a time. My tests shows it's about 8 time faster. This will be > included in the next release. If you want, you can already test it, > the source code is in subversion > athttp://code.google.com/p/h2database/source/checkout > > Of course you can also send your version, it's always interesting to > compare two approaches. > > Regards, > Thomas > > On Tue, Oct 14, 2008 at 3:30 PM, Alex <[EMAIL PROTECTED]> wrote: > > > Hello Thomas, > > > I could send you my quick and dirty version of the BitField class. In > > our initial tests with large databases (300+ MB) the BitField > > operations were taking 70% of the Database.open time. > > The actual file IO was not even on the picture. When we used the > > modified BitField ( and a couple of small changes to the DiskFile > > class) the file IO came up and became the slowest operation. > > > Alex > > > On Oct 13, 3:19 pm, "Thomas Mueller" <[EMAIL PROTECTED]> > > wrote: > >> Hi, > > >> Thank for your help! I will have a look how this can be improved. > >> Maybe it makes sense to use int or byte, but I want to make sure it > >> does in fact get faster so I will need to write some benchmarks as > >> well. > > >> Regards, > >> Thomas > > >> On Sat, Oct 11, 2008 at 12:00 AM, Alex <[EMAIL PROTECTED]> wrote: > > >> > We were looking at some profiles running on large databases and > >> > noticed that under some circumstances BitField operations are taking a > >> > large percentage of time when opening the database. > > >> > Particularly the BitField class is implemented using "long" data type > >> > for bit storage. However when it gets used from the DiskFile class > >> > getSummary() and initFromSummary() methods they are using bytes. The > >> > code there basically moves the data from the longs into the bytes bit > >> > by bit which is very inefficient. It would be very easy to modify the > >> > BitField class to utilize the byte[] for storage and use this byte[] > >> > directly to store/restore the allocation table. > > >> > We also noticed that sometimes the BitField.setRange method is also > >> > taking considerable time. > >> > I guess there should be a way to optimize it as well using byte/long > >> > operations rather than iterating the range bit by bit. > > >> > Below are the code snippets in question: > > >> > initFromSummary: > >> > ====================================================== > >> > for (int i = 0, x = 0; i < b2 / 8; i++) { > >> > int mask = in.read(); > >> > if (init) { > >> > for (int j = 0; j < 8; j++, x++) { > >> > if (used.get(x) != ((mask & (1 << j)) != > >> > 0)) { > >> > throw Message.getInternalError("Redo > >> > failure, block: " + x + " expected in-use bit: " + used.get(x)); > >> > } > >> > } > >> > } else { > >> > for (int j = 0; j < 8; j++, x++) { > >> > if ((mask & (1 << j)) != 0) { > >> > used.set(x); > >> > } > >> > } > >> > } > >> > } > >> > ====================================================== > > >> > getSummary() > > >> > ====================================================== > >> > for (int i = 0, x = 0; i < blocks / 8; i++) { > >> > int mask = 0; > >> > for (int j = 0; j < 8; j++) { > >> > if (used.get(x)) { > >> > mask |= 1 << j; > >> > } > >> > x++; > >> > } > >> > out.write(mask); > >> > } > >> > ==================================================== --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "H2 Database" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/h2-database?hl=en -~----------~----~----~----~------~----~------~--~---
