Interesting case :-) The implementation of InternalPackedArray is essentially the same as plain JavaScript "Array", the biggest difference is that InternalPackedArray is not exposed to websites and therefore not monkey-patchable. Array.shift() is inherently slow, because it requires all remaining elements to be moved. (Array.push()/Array.pop() are much faster.) V8 uses a nasty trick to make Array.shift() fast: object left-trimming, where the object header is moved, so that the elements themselves can stay in place. This is an ugly hack and we would like to get rid of it. Left-trimming is not possible in "large object space", a special area of V8's heap for large objects. The limit for what constitutes a "large object" is the regular heap's page size, which is 512KB. Arrays over-allocate their backing store in steps; when you start with "new Array()" and keep push()ing onto it, the sequence of backing store capacities is 4, 23, 52, 95, 160, 257, 403, 622, 950, 1442, 2180, 3287, 4948, 7439, 11176, 16781, 25189, 37801, *56719*, 85096, 127661, ... So when the 56720th element is pushed, the backing store is grown to 85096 * 8 bytes = 680768 bytes > 512KB, so the new backing store must be allocated in large-object space. Every subsequent shift() on that array is an O(n) operation. The reason the threshold is four times as high on your Android tablet is because we ship 32-bit Chrome builds to Android (so you can fit 2x as many pointers into the same space), and older V8 versions use 1MB pages (so you can fit 2x as large objects into regular pages).
Long story short: • your patch looks fine • worrying about deopts is probably unnecessary, but only measuring will tell for sure • then again switching between two implementations is probably also unnecessary (encapsulating the behavior you want into one class is probably cleaner) • you can split at 32768, sure, or at any random integer between 1 and 50000 (I wouldn't rely on the specific numbers I quoted above, because that's an implementation detail and could change at any time) • ideally you don't use Array.shift() at all, and instead have a "cursor" index for the next element to be consumed; drop the Queue's "front" when cursor === front.length (this would probably work well with a chunk size in the [100..1000] range or so) • if the chunks you're queuing are indeed typically 1K or bigger, then you probably don't need to bother with arrays at all, and can just use a plain simple linked list (because a few pointers of overhead per element are negligible if they give you guaranteed O(1) push/shift performance regardless of queue length), like so: class Queue { constructor() { this.size = 0; this.front = null; this.back = null; } get length() { return this.size; } push(element) { ++this.size; if (this.size === 1) { this.front = this.back = { value: element, next: null }; } else { var node = { value: element, next: null }; this.back.next = node; this.back = node; } } shift() { --this.size; var result = this.front.value; this.front = this.front.next; return result; } } On Wed, Jan 18, 2017 at 8:24 AM, Jochen Eisinger <joc...@chromium.org> wrote: > +Domenic Denicola <dome...@chromium.org> > > On Wed, Jan 18, 2017 at 4:25 AM Adam Rice <ri...@chromium.org> wrote: > >> I work on the Chrome implementation of ReadableStream and WritableStream >> <https://streams.spec.whatwg.org/>. They are implemented in Javascript >> using v8 extras. >> >> They currently use an InternalPackedArray to implement a queue structure, >> however I have found a scalability issue. The benchmarks and repro can be >> found at http://crbug.com/681493 ("ReadableStream, WritableStream get >> dramatically slower when queue grows to 56720 chunks"). >> >> I have a proposed fix at http://crrev.com/2637863002. If possible, I >> would like someone who is familiar with the implementation of >> InternalPackedArray to review it. >> >> Particular things I'd like to know: >> >> - Is it worth worrying about deopt or would it be better to use >> InternalPackedArray for small queues and only switch to Queue for larger >> ones? >> - Is 32768 a good size to split the arrays at? Can we be reasonably >> sure that it is small enough to get good behaviour on all platforms and >> architectures? >> >> and if possible >> >> - why does performance change so dramatically at a threshold? >> >> Thanks, >> Adam Rice >> >> -- >> -- >> v8-users mailing list >> v8-users@googlegroups.com >> http://groups.google.com/group/v8-users >> --- >> You received this message because you are subscribed to the Google Groups >> "v8-users" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to v8-users+unsubscr...@googlegroups.com. >> For more options, visit https://groups.google.com/d/optout. >> > -- > -- > v8-users mailing list > v8-users@googlegroups.com > http://groups.google.com/group/v8-users > --- > You received this message because you are subscribed to the Google Groups > "v8-users" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to v8-users+unsubscr...@googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -- -- v8-users mailing list v8-users@googlegroups.com http://groups.google.com/group/v8-users --- You received this message because you are subscribed to the Google Groups "v8-users" group. To unsubscribe from this group and stop receiving emails from it, send an email to v8-users+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.