OK, so my initial assumption of N-1 compositions was correct and I just made a mistake somewhere e=when walking through the algorithm.
So, what you are saying is that the cost of a compose operation is proportional to the size (number of items) in the two BufferedDocOps. -Tad On Thu, Feb 11, 2010 at 7:04 PM, Alexandre Mah <[email protected]> wrote: > The number of compose calls is always N-1. The difference in time > complexity is due to the amount of work the compositions need to do, > not the number of compositions that are done. > > The cost of composing two operations of sizes x and y is x + y, and > the resulting operation has size roughly x + y. > > Suppose you have operations A, B, C, D, E, F, G, and H of sizes a, b, > c, d, e, f, g, and h respectively. > > If you compose the operations in the following way: > ((((((AB)C)D)E)F)G)H > then the cost of composing A and B is (approximately) a + b, the cost > of composing AB with C is a + b + c, the cost of composing ABC with D > is a + b + c + d, and so on. > (To make the illustration simpler, if we just consider the case where > a = b = c = d = e = f = g = 1, then the cost is roughly 2 + 3 + 4 + 5 > + 6 + 7 + 8, and it's easy to see that this cost is quadratic if we > generalise this calculation to an arbritrary number of operations.) > > However, if you compose the operations in the following way: > ((AB)(CD))((EF)(GH)) > then the cost of composing A and B is (approximately) a + b, the cost > of composing C and D is c + d, the cost of composing AB with CD is a + > b + c + d, and so on. The total cost of this composition would be 3 * > (a + b + c + d + e + f + g + h), where the 3 in this expression can be > more usefully thought of as log 8 (due to there being 8 operations). > So composing in this way gives O(n log n) time complexity. > > Note that both ways of composing N operations perform exactly N-1 > compositions, but one has n^2 time complexity and the other has n log > n time complexity (where n is the total size of the N operations). > > Basing the DocOpCollector on the algorithm for incrementing a binary > number is just an elegant way to perform the more efficient way of > composing. > > For example, adding operations A, B, C, D, E, and F to the collector > gives the following sequence of states for the operations List > (DocOpCollector's private member field), as the operations are added > one by one. > > [A] > [null, BA] > [C, BA] > [null, null, DCBA] > [E, null, DCBA] > [null, FE, DCBA] > > (Here, I've reversed the composition notation so that the first > operation is on the right.) > > You'll notice that this sequence of states can be interpreted as the > little-endian binary representations of 1, 2, 3, 4, 5, and 6, if you > consider null entries to be 0 and non-null entries to be 1. > > Also, note that every one of the original operations will have an > influence on at most log N of the N-1 compositions performed, since > every time an operation takes part in a composition, it is "promoted" > to reside in a location that represents a more significant binary > digit. So the time complexity for composing the N operations of total > size n is O(n log N), which is actually a bit better than O(n log n). > > Alex > > On 2/12/10, Tad Glines <[email protected]> wrote: >> Ok, I see it now. For number of adds N where N is between 1 and 7 the >> number of compose calls is N-1, but after that the number of calls to >> compose per add starts dropping such that when N = 16 the number of >> compose calls is only 13. >> >> -Tad >> >> On Thu, Feb 11, 2010 at 5:29 PM, Alexandre Mah <[email protected]> wrote: >>> On 2/12/10, Tad Glines <[email protected]> wrote: >>>> The construction of DocOpCollector seems unnecessarily complex. >>>> >>>> Here's the relevant snippet: >>>> >>>> private final List<BufferedDocOp> operations = new >>>> ArrayList<BufferedDocOp>(); >>>> >>>> public void add(BufferedDocOp operation) { >>>> ListIterator<BufferedDocOp> iterator = operations.listIterator(); >>>> while (iterator.hasNext()) { >>>> BufferedDocOp nextOperation = iterator.next(); >>>> if (nextOperation == null) { >>>> iterator.set(operation); >>>> return; >>>> } >>>> iterator.set(null); >>>> operation = compose(nextOperation, operation); >>>> } >>>> operations.add(operation); >>>> } >>>> >>>> public BufferedDocOp composeAll() { >>>> BufferedDocOp result = null; >>>> for (BufferedDocOp operation : operations) { >>>> if (operation != null) { >>>> result = (result != null) ? compose(operation, result) : >>>> operation; >>>> } >>>> } >>>> operations.clear(); >>>> return result; >>>> } >>>> >>>> This seems like it could be more simply (and efficiently) expresses as: >>>> private BufferedDocOp collectedOp = null; >>>> >>>> public void add(BufferedDocOp operation) { >>>> if (collectedOp == null) >>>> collectedOp = operation; >>>> else >>>> collectedOp = compose(collectedOp, operation); >>>> } >>>> >>>> public BufferedDocOp finish() { >>>> BufferedDocOp result = collectedOp; >>>> collectedOp = null; >>>> return result; >>>> } >>>> >>>> Does anyone see a problem with the proposed change? >>> >>> The solution you propose has an O(n^2) time complexity for composing a >>> sequence of operations of total size n, whereas the current code has >>> an O(n log n) time complexity. >>> >>> Perhaps I can explain how the DocOpCollector works, since I think it's >>> pretty interesting. >>> If you interpret the operations List as a (little-endian) binary >>> number, where a null in the list indicates a 0 and a non-null >>> operation indicates a 1, then adding an operation to the collector >>> works exactly like incrementing its binary number representation. >>> >>> An interesting byproduct of this algorithm is that interpreting the >>> operations List as a binary number will tell you exactly how many >>> operations have been collected so far. >>> >>> I hope that clears things up. >>> >>> Alex >>> >>> -- >>> You received this message because you are subscribed to the Google Groups >>> "Wave Protocol" group. >>> To post to this group, send email to [email protected]. >>> To unsubscribe from this group, send email to >>> [email protected]. >>> For more options, visit this group at >>> http://groups.google.com/group/wave-protocol?hl=en. >>> >>> >> >> -- >> You received this message because you are subscribed to the Google Groups >> "Wave Protocol" group. >> To post to this group, send email to [email protected]. >> To unsubscribe from this group, send email to >> [email protected]. >> For more options, visit this group at >> http://groups.google.com/group/wave-protocol?hl=en. >> >> > > -- > You received this message because you are subscribed to the Google Groups > "Wave Protocol" group. > To post to this group, send email to [email protected]. > To unsubscribe from this group, send email to > [email protected]. > For more options, visit this group at > http://groups.google.com/group/wave-protocol?hl=en. > > -- You received this message because you are subscribed to the Google Groups "Wave Protocol" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/wave-protocol?hl=en.
