Hi Martin,

I had intended to address your request for absolute O(1) operations in the previous email. The approach suggested in [Brodnik99resizablearrays] amounts to making ArrayList's add worst-case O(1) by keeping around a second array of size (3/2)*n and copying a constant number of entries from the main array into it each time add is called. Although this distributes the work done during a resize across the n operations required to enter a resize-required state, it comes at the expense of additional memory usage and slower add operations. My thought is that this would be a fine approach for a real-time application that requires hard guarantees on performance, but a liability in the many Java applications that do not require such guarantees. I look forward to hearing your thoughts on the matter, though.
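For concreteness, here is a rough sketch of the incremental-copy scheme described above (my own illustration, not code from the paper or from the ChunkedArrayList draft): the larger array is pre-allocated when the backing array fills, and each subsequent add moves a constant number of old elements over, so no single add ever pays for a full O(n) copy.

```java
// Sketch only: a list with worst-case O(1) add via incremental resizing.
class IncrementalArrayList<E> {
    private Object[] current = new Object[8];
    private Object[] next;    // larger array being populated incrementally
    private int copied;       // old elements already moved into 'next'
    private int size;

    public void add(E e) {
        if (next == null && size == current.length) {
            // Start a resize: allocate ~(3/2)*n up front, migrate lazily.
            next = new Object[current.length * 3 / 2 + 1];
            copied = 0;
        }
        if (next != null) {
            // Copy two old elements per add; this completes the migration
            // before 'next' itself can fill up.
            for (int i = 0; i < 2 && copied < current.length; i++) {
                next[copied] = current[copied];
                copied++;
            }
            next[size++] = e;
            if (copied == current.length) {  // migration done; swap arrays
                current = next;
                next = null;
            }
        } else {
            current[size++] = e;
        }
    }

    @SuppressWarnings("unchecked")
    public E get(int index) {
        if (index < 0 || index >= size)
            throw new IndexOutOfBoundsException(String.valueOf(index));
        // During a migration, elements below the old capacity are still
        // valid in 'current'; newer elements live only in 'next'.
        if (next != null && index >= current.length) return (E) next[index];
        return (E) current[index];
    }

    public int size() { return size; }
}
```

This makes the memory cost of the approach visible: during a migration both arrays are live, which is exactly the overhead mentioned above.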
Kevin

On Tue, Apr 13, 2010 at 6:18 AM, Kevin L. Stern <kevin.l.st...@gmail.com> wrote: > Hi Martin, > > It's interesting to note that the old circular list trick will not suffice > to turn this data structure into a deque since we might be copying all n > elements back to the front (index 0) position every n^(1/2) operations (add > wouldn't amortize to O(1)). We could use the old two-stacks trick (push > elements onto one stack and, when the other stack becomes empty, flip the > bottom half of the elements over to it), mentioned in > [Brodnik99resizablearrays], but I find this to be a bit CS 101. In > [Brodnik99resizablearrays] the authors suggest a method for making all > blocks roughly the same size, allowing us to expand/shrink capacity at the > beginning or the end; this is the approach that I will take to create a > deque. > > The FAQ for the Sun Contributor Agreement Q3 ( > http://www.sun.com/software/opensource/contributor_agreement.jsp#sa_3) > indicates that one should check with the project to determine where the SCA > should be sent. Do you know where I would find this information? > > Kevin > > @MISC{Brodnik99resizablearrays, > author = {Andrej Brodnik and Svante Carlsson and Erik D. Demaine and J. > Ian Munro and Robert Sedgewick}, > title = {Resizable Arrays in Optimal Time and Space}, > year = {1999} > > } > > On Sun, Apr 11, 2010 at 4:17 PM, Martin Buchholz <marti...@google.com> wrote: > >> Hi Kevin, >> >> Thanks for your continuing work on this. >> >> I like the test results, and agree with your analysis. >> I'm especially happy that you're beating >> ArrayList at some operations. >> >> I'd like to see O(1) addition at the beginning, >> and to implement both List and Deque (I regret >> our not having done this with ArrayDeque). >> >> An additional property that would be nice to >> have (but don't try too hard) >> is to provide some kind of real-time >> guarantees on the cost of an individual operation, >> not just amortized time. E.g. 
ArrayList.add >> is worst-case O(n), making it unsuitable for use >> in some real-time applications. >> >> I will help get your changes into the obvious >> software distributions. I assume you're happy >> with having this class included in any of >> Doug Lea's jsr166, guava-libraries, or the JDK itself. >> You should sign a Sun contributor agreement, >> or whatever the Oracle equivalent is, >> if you have not done so yet. >> >> Doug Lea likes public domain, >> guava-libraries likes the Apache license. >> >> We should give various people a chance to give >> a thumbs up on the design of this class - >> Doug Lea, Josh Bloch. >> >> Martin >> >> On Sun, Apr 11, 2010 at 09:32, Kevin L. Stern <kevin.l.st...@gmail.com> >> wrote: >> > Hello Martin, >> > >> > I spent some time this weekend trying to bring out bugs in the >> > implementation; I believe the latest version to be in decent shape. I >> have >> > also gathered some data on the performance of ChunkedArrayList over >> > ArrayList using the latest 1.6 JDK, which I've included below (note that >> the >> > numbers represent the time spent performing the specified operation with >> > ChunkedArrayList over the time spent with ArrayList, so 1.00 indicates >> > equivalent performance, < 1.00 indicates that ChunkedArrayList is less >> > costly and > 1.00 indicates that ArrayList is less costly). I've >> noticed >> > relatively significant variability in a few of the numbers when I switch >> > hardware, though the data do seem to represent rough performance >> > expectations. For my test I generated x elements and then timed the >> process >> > of adding them to ArrayList/ChunkedArrayList, then I performed a get >> > operation on each for indices 0 through x-1 and finally I used the >> iterator >> > mechanism to retrieve the first through xth element (of course, I >> performed >> > each of these operations multiple times throwing away the timing for the >> > first few iterations to warm up the JVM). 
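A rough illustration of that add/get/iterate ratio methodology (my own sketch, not Kevin's actual harness; a ChunkedArrayList would be plugged in as the second supplier, which is not shown here):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Sketch: time add, indexed get, and iteration on a List implementation,
// keeping the best of several rounds (early rounds warm up the JVM).
class ListRatioBench {
    static volatile long sink;  // prevents dead-code elimination of the loops

    public static long timeOps(Supplier<List<Integer>> fresh, int n) {
        long best = Long.MAX_VALUE;
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            List<Integer> list = fresh.get();
            for (int i = 0; i < n; i++) list.add(i);          // add
            long sum = 0;
            for (int i = 0; i < n; i++) sum += list.get(i);   // indexed access
            for (int x : list) sum += x;                      // iterator
            sink = sum;
            best = Math.min(best, System.nanoTime() - t0);
        }
        return best;
    }

    public static void main(String[] args) {
        int n = 100_000;
        // Substitute ChunkedArrayList::new for the first supplier to
        // reproduce the candidate-over-baseline ratios discussed here.
        double ratio = (double) timeOps(ArrayList::new, n)
                     / timeOps(ArrayList::new, n);
        System.out.printf("candidate over baseline: %.2f%n", ratio);
    }
}
```

Absolute numbers from a harness like this vary substantially by JVM and hardware, which is consistent with the variability Kevin reports.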
>> > >> > Regarding the question of whether or not this belongs in java.util, I >> would >> > suggest that if it is desirable from a GC point of view to eliminate the >> > large backing array from ArrayList then your suggestion of achieving >> this by >> > way of a data structure that is both time and space optimal is a >> > particularly elegant solution as it not only guarantees that no backing >> > array will be larger than sqrt(n) elements but it also provides dynamic >> > shrinking behavior, has less maximum memory overhead than ArrayList, and >> > copies (asymptotically) fewer elements during a resize than ArrayList. >> Of >> > course, this data structure does not do everything better than >> ArrayList; in >> > particular, indexed access is more costly, due to the required >> decomposition >> > of the index into backing array index and offset and the additional >> memory >> > indirection, and insertion-at-an-index is more costly, due to the >> multiple >> > array copies necessary to complete the shift. That being said, I think >> that >> > the additional cost of indexed access is partially mitigated by the >> > availability of iterator and listIterator, whose implementations do not >> use >> > the index decomposition procedure, and the additional cost of >> > insertion-at-an-index is partially mitigated by the fact that >> > insertion-at-an-index is already an undesirable operation on ArrayList >> due >> > to its linear time complexity. 
>> > >> > Kevin >> > >> > 1000000 elements: >> > Client JVM: >> > Add to ChunkedArrayList over ArrayList: 1.30 >> > Indexed access ChunkedArrayList over ArrayList: 1.80 >> > Iterator ChunkedArrayList over ArrayList: 0.52 >> > >> > Server JVM: >> > Add to ChunkedArrayList over ArrayList: 0.81 >> > Indexed access ChunkedArrayList over ArrayList: 2.87 >> > Iterator ChunkedArrayList over ArrayList: 1.31 >> > >> > 100000 elements: >> > Client JVM: >> > Add to ChunkedArrayList over ArrayList: 0.96 >> > Indexed access ChunkedArrayList over ArrayList: 1.86 >> > Iterator ChunkedArrayList over ArrayList: 0.48 >> > >> > Server JVM: >> > Add to ChunkedArrayList over ArrayList: 0.96 >> > Indexed access ChunkedArrayList over ArrayList: 1.89 >> > Iterator ChunkedArrayList over ArrayList: 2.68 >> > >> > 10000 elements: >> > Client JVM: >> > Add to ChunkedArrayList over ArrayList: 1.04 >> > Indexed access ChunkedArrayList over ArrayList: 2.33 >> > Iterator ChunkedArrayList over ArrayList: 0.53 >> > >> > Server JVM: >> > Add to ChunkedArrayList over ArrayList: 0.97 >> > Indexed access ChunkedArrayList over ArrayList: 2.45 >> > Iterator ChunkedArrayList over ArrayList: 2.52 >> > >> > 1000 elements: >> > Client JVM: >> > Add to ChunkedArrayList over ArrayList: 0.99 >> > Indexed access ChunkedArrayList over ArrayList: 2.27 >> > Iterator ChunkedArrayList over ArrayList: 0.54 >> > >> > Server JVM: >> > Add to ChunkedArrayList over ArrayList: 0.84 >> > Indexed access ChunkedArrayList over ArrayList: 1.23 >> > Iterator ChunkedArrayList over ArrayList: 1.11 >> > >> > >> > On Fri, Apr 9, 2010 at 7:42 PM, Martin Buchholz <marti...@google.com> >> wrote: >> >> >> >> My feeling on whether to support O(1) at both ends >> >> is that any flavor of this that ends up in the JDK eventually >> >> should really do this. 
My idea is that we can >> >> wholeheartedly recommend this collection class >> >> for overall good behavior without any of the surprising >> >> performance traps of existing collection classes. >> >> >> >> But for the preliminary version, it makes sense to >> >> support only O(1) at one end, if it simplifies the >> >> implementation. Random access will of course >> >> be worse than ArrayList, but by how much? >> >> We can do some benchmarking and look for >> >> micro-optimizations now. >> >> >> >> Kevin, what is your own personal feeling? >> >> Is the algorithm correct, and efficient enough? >> >> Do you think your new collection belongs in java.util? >> >> >> >> Martin >> >> >> >> On Sun, Apr 4, 2010 at 04:12, Kevin L. Stern <kevin.l.st...@gmail.com> >> >> wrote: >> >> > The data structure is available at the second link that I originally >> >> > provided (once again, it is >> >> > >> >> > >> https://docs.google.com/Doc?docid=0Aabrz3MPBDdhZGdrbnEzejdfM2M3am5wM2Mz&hl=en >> ). >> >> > This does not have O(1) time insertion at the front as yet as it was >> >> > unclear >> >> > to me whether or not it was agreed upon: >> >> > _________________ >> >> > From: Osvaldo Doederlein <opin...@gmail.com> >> >> > Date: Mon, Mar 29, 2010 at 10:08 AM >> >> > Subject: Re: A List implementation backed by multiple small arrays >> >> > rather >> >> > than the traditional single large array. >> >> > To: Martin Buchholz <marti...@google.com> >> >> > Cc: "Kevin L. Stern" <kevin.l.st...@gmail.com>, >> >> > core-libs-dev@openjdk.java.net >> >> > >> >> > Initially, it would be good enough to replace only >> java.util.ArrayList >> >> > with >> >> > minimal overhead. ArrayList does not support efficient add-at-front >> or >> >> > other >> >> > enhancements of ArrayDeque; but ArrayList is still a much more >> important >> >> > and >> >> > popular collection, it's the primary "straight replacement for >> primitive >> >> > arrays" and I guess it should continue with that role. 
>> >> > _________________ >> >> > >> >> > As a disclaimer, I'm still tinkering with this so I'll be updating >> the >> >> > document at the provided link as I find improvements. >> >> > >> >> > Thoughts? >> >> > >> >> > Thanks, >> >> > >> >> > Kevin >> >> > >> >> > On Thu, Apr 1, 2010 at 10:28 PM, Martin Buchholz < >> marti...@google.com> >> >> > wrote: >> >> >> >> >> >> Hi Kevin, >> >> >> >> >> >> You're probably the only one on this list who has >> >> >> seriously read the paper. It is not surprising that >> >> >> taking a research paper into production would >> >> >> discover bugs - the research never had to undergo >> >> >> rigorous testing. (I like the Java culture of >> >> >> combining spec + implementation + test suite) >> >> >> >> >> >> I suggest you ask the authors directly about the bug. >> >> >> They would probably also be interested to hear >> >> >> about your implementation. >> >> >> >> >> >> Are you aware of Integer.numberOfLeadingZeros? >> >> >> >> >> >> >> >> >> >> http://download.java.net/jdk7/docs/api/java/lang/Integer.html#numberOfLeadingZeros(int)<http://download.java.net/jdk7/docs/api/java/lang/Integer.html#numberOfLeadingZeros%28int%29> >> >> >> >> >> >> Martin >> >> >> >> >> >> On Wed, Mar 31, 2010 at 19:34, Kevin L. Stern < >> kevin.l.st...@gmail.com> >> >> >> wrote: >> >> >> > I'm almost convinced now that the paper is incorrect. The code >> below >> >> >> > gives >> >> >> > me the appropriate index into the index array and the offset into >> the >> >> >> > data >> >> >> > block. That being said, remember when I mentioned that this will >> >> >> > include a >> >> >> > bit more work to access an element than a simple bit shift and a >> bit >> >> >> > mask? >> >> >> > Well this is more than a bit more - we'll be doing this each time >> an >> >> >> > index >> >> >> > is requested. I'll spend some time trying to twiddle the bits to >> see >> >> >> > if >> >> >> > I >> >> >> > can eliminate/combine some of the operations. 
>> >> >> > >> >> >> > for (int r = 1; r < 33; r++) { >> >> >> > int k = lg(r); >> >> >> > int floorKO2 = k >> 1; >> >> >> > int powFloorKO2 = (1 << floorKO2); >> >> >> > int p = ((1 << floorKO2) - 1) << 1; >> >> >> > int ceilKO2; >> >> >> > if ((k & 1) == 1) { >> >> >> > ceilKO2 = floorKO2 + 1; >> >> >> > p += powFloorKO2; >> >> >> > } else { >> >> >> > ceilKO2 = floorKO2; >> >> >> > } >> >> >> > int e = r & ((1 << ceilKO2) - 1); >> >> >> > int b = (r >> ceilKO2) & (powFloorKO2 - 1); >> >> >> > >> >> >> > System.out.println((r - 1) + " " + (p + b) + " " + e); >> >> >> > } >> >> >> > >> >> >> > Kevin >> >> >> > >> >> >> > On Wed, Mar 31, 2010 at 7:08 PM, Kevin L. Stern >> >> >> > <kevin.l.st...@gmail.com> >> >> >> > wrote: >> >> >> >> >> >> >> >> I realize that 2 * (2^(k/2) - 1) only works for even numbered >> >> >> >> superblocks, >> >> >> >> the odd numbered superblocks need an additional term added (the >> >> >> >> number >> >> >> >> of >> >> >> >> data blocks in SB_[k-1]) to jive with my interpretation; anyhow, >> I >> >> >> >> also >> >> >> >> came >> >> >> >> across an alternative characterization of superblock in the paper >> >> >> >> which >> >> >> >> states that data blocks are grouped within a superblock when they >> >> >> >> are >> >> >> >> the >> >> >> >> same size - to me, though, that implies that my example structure >> >> >> >> below >> >> >> >> would be >> >> >> >> >> >> >> >> SB_0: [1] >> >> >> >> SB_1: [2][2][2] >> >> >> >> SB_2: [4][4][4][4][4][4] >> >> >> >> >> >> >> >> which seems to contradict my understanding of (1) below. I must >> be >> >> >> >> reading this upside down. >> >> >> >> >> >> >> >> On Wed, Mar 31, 2010 at 6:36 PM, Kevin L. Stern >> >> >> >> <kevin.l.st...@gmail.com> >> >> >> >> wrote: >> >> >> >>> >> >> >> >>> What am I missing here? 
In "Resizable arrays in optimal time >> and >> >> >> >>> space" >> >> >> >>> the authors define their data structure with the following >> >> >> >>> property: >> >> >> >>> >> >> >> >>> (1) "When superblock SB_k is fully allocated, it consists of >> >> >> >>> 2^(floor(k/2)) data blocks, each of size 2^(ceil(k/2))." >> >> >> >>> >> >> >> >>> Since the superblock is zero-based indexed this implies the >> >> >> >>> following >> >> >> >>> structure: >> >> >> >>> >> >> >> >>> SB_0: [1] >> >> >> >>> SB_1: [2] >> >> >> >>> SB_2: [2][2] >> >> >> >>> SB_3: [4][4] >> >> >> >>> SB_4: [4][4][4][4] >> >> >> >>> [...] >> >> >> >>> >> >> >> >>> Let's have a look at Algorithm 3, Locate(i), with i = 3: >> >> >> >>> >> >> >> >>> r = 100 (the binary expansion of i + 1) >> >> >> >>> k = |r| - 1 = 2 >> >> >> >>> p = 2^k - 1 = 3 >> >> >> >>> >> >> >> >>> What concerns me is their statement that p represents "the >> number >> >> >> >>> of >> >> >> >>> data >> >> >> >>> blocks in superblocks prior to SB_k." There are only two data >> >> >> >>> blocks >> >> >> >>> in >> >> >> >>> superblocks prior to SB_2, not three. Given (1) above, unless >> I'm >> >> >> >>> misinterpreting it, the number of data blocks in superblocks >> prior >> >> >> >>> to >> >> >> >>> SB_k >> >> >> >>> should be: >> >> >> >>> >> >> >> >>> 2 * Sum[i=0->k/2-1] 2^i = 2 * (2^(k/2) - 1) >> >> >> >>> >> >> >> >>> This, of course, seems to work out much better in my example >> above, >> >> >> >>> giving the correct answer to my interpretation of their data >> >> >> >>> structure, but >> >> >> >>> I have a hard time believing that this is their mistake rather >> than >> >> >> >>> my >> >> >> >>> misinterpretation. >> >> >> >>> >> >> >> >>> Thoughts? >> >> >> >>> >> >> >> >>> Kevin >> >> >> >>> >> >> >> >>> On Tue, Mar 30, 2010 at 5:20 PM, Martin Buchholz >> >> >> >>> <marti...@google.com> >> >> >> >>> wrote: >> >> >> >>>> >> >> >> >>>> On Tue, Mar 30, 2010 at 04:25, Kevin L. 
Stern >> >> >> >>>> <kevin.l.st...@gmail.com> >> >> >> >>>> wrote: >> >> >> >>>> > Hi Martin, >> >> >> >>>> > >> >> >> >>>> > Thanks much for your feedback. The first approach that comes >> to >> >> >> >>>> > mind >> >> >> >>>> > to >> >> >> >>>> > implement O(1) time front as well as rear insertion is to >> create >> >> >> >>>> > a >> >> >> >>>> > cyclic >> >> >> >>>> > list structure with a front/rear pointer - to insert at the >> >> >> >>>> > front >> >> >> >>>> > requires >> >> >> >>>> > decrementing the front pointer (modulo the size) and to >> insert >> >> >> >>>> > at >> >> >> >>>> > the >> >> >> >>>> > rear >> >> >> >>>> > requires incrementing the rear pointer (modulo the size). We >> >> >> >>>> > need >> >> >> >>>> > to >> >> >> >>>> > resize >> >> >> >>>> > when the two pointers bump into each other. Could you >> explain >> >> >> >>>> > more >> >> >> >>>> > about >> >> >> >>>> > your suggestion of introducing an arraylet that is shared by >> the >> >> >> >>>> > front >> >> >> >>>> > and >> >> >> >>>> > the rear? >> >> >> >>>> >> >> >> >>>> It was a half-baked idea - I don't know if there's a way to >> turn >> >> >> >>>> it >> >> >> >>>> into >> >> >> >>>> something useful. I was thinking of the ArrayDeque >> >> >> >>>> implementation, >> >> >> >>>> where all the elements live in a single array. >> >> >> >>>> >> >> >> >>>> > It's not clear to me how that would help and/or be a better >> >> >> >>>> > approach than the cyclic list. Anyhow, the paper that you >> >> >> >>>> > reference, >> >> >> >>>> > "Resizable arrays in optimal time and space", gives a deque >> so >> >> >> >>>> > if >> >> >> >>>> > we >> >> >> >>>> > take >> >> >> >>>> > that approach then the deque is specified. >> >> >> >>>> >> >> >> >>>> Technically, ArrayList also supports the Deque operations - >> >> >> >>>> just not efficiently. >> >> >> >>> >> >> >> >> >> >> >> > >> >> >> > >> >> > >> >> > >> > >> > >> > >