The data structure is available at the second link that I originally provided (once again, it is https://docs.google.com/Doc?docid=0Aabrz3MPBDdhZGdrbnEzejdfM2M3am5wM2Mz&hl=en). This does not have O(1) time insertion at the front as yet as it was unclear to me whether or not it was agreed upon: _________________ From: Osvaldo Doederlein <opin...@gmail.com> Date: Mon, Mar 29, 2010 at 10:08 AM Subject: Re: A List implementation backed by multiple small arrays rather than the traditional single large array. To: Martin Buchholz <marti...@google.com> Cc: "Kevin L. Stern" <kevin.l.st...@gmail.com>, core-libs-dev@openjdk.java.net
Initially, it would be good enough to replace only java.util.ArrayList with minimal overhead. ArrayList does not support efficient add-at-front or other enhancements of ArrayDeque; but ArrayList is still a much more important and popular collection, it's the primary "straight replacement for primitive arrrays" and I guess it should continue with that role. _________________ As a disclaimer, I'm still tinkering with this so I'll be updating the document at the provided link as I find improvements. Thoughts? Thanks, Kevin On Thu, Apr 1, 2010 at 10:28 PM, Martin Buchholz <marti...@google.com>wrote: > Hi Kevin, > > You're probably the only one on this list who has > seriously read the paper. It is not surprising that > taking a research paper into production would > discover bugs - the research never had to undergo > rigorous testing. (I like the Java culture of > combining spec + implementation + test suite) > > I suggest you ask the authors directly about the bug. > They would probably also be interested to hear > about your implementation. > > Are you aware of Integer.numberOfLeadingZeros? > > http://download.java.net/jdk7/docs/api/java/lang/Integer.html#numberOfLeadingZeros(int)<http://download.java.net/jdk7/docs/api/java/lang/Integer.html#numberOfLeadingZeros%28int%29> > > Martin > > On Wed, Mar 31, 2010 at 19:34, Kevin L. Stern <kevin.l.st...@gmail.com> > wrote: > > I'm almost convinced now that the paper is incorrect. The code below > gives > > me the appropriate index into the index array and the offset into the > data > > block. That being said, remember when I mentioned that this will include > a > > bit more work to access an element than a simple bit shift and a bit > mask? > > Well this is more than a bit more - we'll be doing this each time an > index > > is requested. I'll spend some time trying to twiddle the bits to see if > I > > can eliminate/combine some of the operations. > > > > for (int r = 1; r < 33; r++) { > > int k = lg(r); > > int floorKO2 = k >> 1; > > int powFloorKO2 = (1 << floorKO2); > > int p = ((1 << floorKO2) - 1) << 1; > > int ceilKO2; > > if ((k & 1) == 1) { > > ceilKO2 = floorKO2 + 1; > > p += powFloorKO2; > > } else { > > ceilKO2 = floorKO2; > > } > > int e = r & ((1 << ceilKO2) - 1); > > int b = (r >> ceilKO2) & (powFloorKO2 - 1); > > > > System.out.println((r - 1) + " " + (p + b) + " " + e); > > } > > > > Kevin > > > > On Wed, Mar 31, 2010 at 7:08 PM, Kevin L. Stern <kevin.l.st...@gmail.com > > > > wrote: > >> > >> I realize that 2 * (2^(k/2) - 1) only works for even numbered > superblocks, > >> the odd numbered superblocks need an additional term added (the number > of > >> data blocks in SB_[k-1]) to jive with my interpretation; anyhow, I also > came > >> across an alternative characterization of superblock in the paper which > >> states that data blocks are grouped within a superblock when they are > the > >> same size - to me, though, that implies that my example structure below > >> would be > >> > >> SB_0: [1] > >> SB_1: [2][2][2] > >> SB_2: [4][4][4][4][4][4] > >> > >> which seems to contradict my understanding of (1) below. I must be > >> reading this upside down. > >> > >> On Wed, Mar 31, 2010 at 6:36 PM, Kevin L. Stern < > kevin.l.st...@gmail.com> > >> wrote: > >>> > >>> What am I missing here? In "Resizable arrays in optimal time and > space" > >>> the authors define their data structure with the following property: > >>> > >>> (1) "When superblock SB_k is fully allocated, it consists of > >>> 2^(floor(k/2)) data blocks, each of size 2^(ceil(k/2))." > >>> > >>> Since the superblock is zero-based indexed this implies the following > >>> structure: > >>> > >>> SB_0: [1] > >>> SB_1: [2] > >>> SB_2: [2][2] > >>> SB_3: [4][4] > >>> SB_4: [4][4][4][4] > >>> [...] > >>> > >>> Let's have a look at Algorithm 3, Locate(i), with i = 3: > >>> > >>> r = 100 (the binary expansion of i + 1) > >>> k = |r| - 1 = 2 > >>> p = 2^k - 1 = 3 > >>> > >>> What concerns me is their statement that p represents "the number of > data > >>> blocks in superblocks prior to SB_k." There are only two data blocks > in > >>> superblocks prior to SB_2, not three. Given (1) above, unless I'm > >>> misinterpreting it, the number of data blocks in superblocks prior to > SB_k > >>> should be: > >>> > >>> 2 * Sum[i=0->k/2-1] 2^i = 2 * (2^(k/2) - 1) > >>> > >>> This, of course, seems to work out much better in my example above, > >>> giving the correct answer to my interpretation of their data structure, > but > >>> I have a hard time believing that this is their mistake rather than my > >>> misinterpretation. > >>> > >>> Thoughts? > >>> > >>> Kevin > >>> > >>> On Tue, Mar 30, 2010 at 5:20 PM, Martin Buchholz <marti...@google.com> > >>> wrote: > >>>> > >>>> On Tue, Mar 30, 2010 at 04:25, Kevin L. Stern < > kevin.l.st...@gmail.com> > >>>> wrote: > >>>> > Hi Martin, > >>>> > > >>>> > Thanks much for your feedback. The first approach that comes to > mind > >>>> > to > >>>> > implement O(1) time front as well as rear insertion is to create a > >>>> > cyclic > >>>> > list structure with a front/rear pointer - to insert at the front > >>>> > requires > >>>> > decrementing the front pointer (modulo the size) and to insert at > the > >>>> > rear > >>>> > requires incrementing the rear pointer (modulo the size). We need > to > >>>> > resize > >>>> > when the two pointers bump into each other. Could you explain more > >>>> > about > >>>> > your suggestion of introducing an arraylet that is shared by the > front > >>>> > and > >>>> > the rear? > >>>> > >>>> It was a half-baked idea - I don't know if there's a way to turn it > into > >>>> something useful. I was thinking of the ArrayDeque implementation, > >>>> where all the elements live in a single array. > >>>> > >>>> > It's not clear to me how that would help and/or be a better > >>>> > approach than the cyclic list. Anyhow, the paper that you > reference, > >>>> > "Resizable arrays in optimal time and space", gives a deque so if we > >>>> > take > >>>> > that approach then the deque is specified. > >>>> > >>>> Technically, ArrayList also supports the Deque operations - > >>>> just not efficiently. > >>> > >> > > > > >