This perf testing is definitely a good way to approach things, Harbs. I
can't reconcile it with my experience from the past, but I was using far
more 'rudimentary' tests using times for simple loops over much larger
ranges, and using the Date class for timing (precision was one reason why I
used much larger loops)

I have to concede for sure that the TypedArray approach is slower. I could
have sworn it was faster when I did comparisons as I was working on this
(which was quite some time ago now). It may have been a result of the way I
was 'testing' as well perhaps.
I don't expect things to change I guess, but it would theoretically be
better testing with the compiled version of BinaryData. Things like the
function call overhead from getTypedArray() vs. referencing something
directly and the repeated_position++ versus one _position op to increment
(e.g. for DataView) in some cases mean there is more work at the byte
level. But I don't expect this to change things based on these tests.

Maybe I can troubleshoot the MD5 thing in the coming weeks. One of the
other things I wondered about for BinaryData is whether it should have it's
own 'autogrow' strategy internally for the buffer with sequential writes.
This is not really PAYG though, so I guess not. It definitely performs much
better on writing with a preallocated buffer length, so maybe that is the
recommended approach for users.



On Mon, Jun 25, 2018 at 8:45 AM Harbs <[email protected]> wrote:

> I switched float and double functions to always use DataView. If we could
> eliminate DataView instantiation we could probably improve performance with
> that. I tried a single instance, but I got bounds errors.
>
> All read and write functions should work correctly now (the DM5 error
> not-withstanding). There should be test cases to cover everything. (Unless
> I missed some edge cases.) It’s possible that the float methods could be
> improved with math, but I’m done for now. If anyone wants to improve things
> further, feel free…
>
> Harbs
>
> > On Jun 24, 2018, at 4:46 AM, Greg Dove <[email protected]> wrote:
> >
> > I think DataView might be a lot faster than it was a couple of years ago.
> > At least it seems to be (on Chrome/windows on the machine I am currently
> > using) (and it should only need a single instance like the Uint8Array,
> for
> > the current buffer, I think, instead of 'new DataView' )
> >
> > On Sun, Jun 24, 2018 at 1:19 PM Greg Dove <[email protected]> wrote:
> >
> >>
> >> I tried to add in the extra test, but messed up I think because it did
> not
> >> copy over the original tests in the revision (even though I was
> 'adding' a
> >> new test). I have not used jsperf before, so probably did something
> stupid.
> >> On the face of it, it certainly does look like the byte level stuff is
> >> faster.
> >>
> >>
> >> On Sun, Jun 24, 2018 at 10:32 AM Greg Dove <[email protected]> wrote:
> >>
> >>>
> >>> Harbs, for the typed array test, try using the ArrayBuffer in the
> >>> constructor instead of the Uint8Array, I think you might see a
> difference
> >>> here.
> >>>
> >>>
> >>> On Sun, Jun 24, 2018 at 10:09 AM Harbs <[email protected]> wrote:
> >>>
> >>>> I created a jsperf which compares different read methods, and the one
> I
> >>>> changed it to seems fastest by far:
> >>>> https://jsperf.com/typedarray-read-performance/1 <
> >>>> https://jsperf.com/typedarray-read-performance/1>
> >>>>
> >>>> The only optimization which might give slight gains is to use shift
> >>>> operators instead of multiplication. Enabling the
> --enable-benchmarking
> >>>> flag in Chrome seemed to indicate that there’s up to a 10% performance
> >>>> increase of the shift operators over multiplication. Logically, shift
> >>>> operators should be cheaper. In Safari, the shift operator gave a
> >>>> performance boost of 3 times the multiplication.
> >>>>
> >>>> I only tested 16 bit uints. It might be worthwhile adding ints and 32
> >>>> bit.
> >>>>
> >>>> Thanks,
> >>>> Harbs
> >>>>
> >>>>> On Jun 22, 2018, at 8:58 AM, Alex Harui <[email protected]>
> >>>> wrote:
> >>>>>
> >>>>> It is definitely true that I don't know how expensive instantiation
> of
> >>>> small instances is in the browser.  Might even be different in
> different
> >>>> browsers.  There should also be function call overhead in calling the
> >>>> constructor as well.  On the other hand, maybe using the typed arrays
> helps
> >>>> with the browser's type inferencing and there is some advantage there
> that
> >>>> also assumes the equivalent of word/long alignments.
> >>>>>
> >>>>> If TypeArrays really aren't that expensive to instantiate, it may be
> >>>> that the way to handle floats/doubles is to copy bytes into a new
> >>>> BinaryData and convert that to the appropriate TypedArray.  That
> should
> >>>> ensure that there won't be alignment issues.
> >>>>>
> >>>>> IMO, this discussion shows the importance of providing choices.  You
> >>>> can have a much simpler variant that assumes certain endian and thus
> save a
> >>>> whole bunch of byte swapping.  You might in fact have a faster
> >>>> implementation if you can guarantee/assume alignment.
> >>>>>
> >>>>> HTH,
> >>>>> -Alex
> >>>>>
> >>>>> On 6/21/18, 3:01 PM, "Greg Dove" <[email protected]> wrote:
> >>>>>
> >>>>>   ' If you have the time to resolve these remaining issues, that
> >>>> would be
> >>>>>   awesome.'
> >>>>>
> >>>>>   Happy to commit to doing this by mid-July if that works. I do have
> >>>> downtime
> >>>>>   in July that I can use for Royale, but I also need to set up for
> >>>> Royale
> >>>>>   development again because I have changed machines.
> >>>>>
> >>>>>
> >>>>>   On Fri, Jun 22, 2018 at 9:13 AM Harbs <[email protected]>
> >>>> wrote:
> >>>>>
> >>>>>> Cool.
> >>>>>>
> >>>>>> FYI there are still some issues to resolve:
> >>>>>>
> >>>>>> 1. readFloat and readDouble should have the same issues as the
> >>>> various int
> >>>>>> read methods. I did not fix that yet because calculating floats and
> >>>> doubles
> >>>>>> is harder than ints. I’m not sure of the best way to handle that.
> One
> >>>> way
> >>>>>> is to create a new temporary ArrayBuffer with only the necessary
> >>>> bytes and
> >>>>>> read that. Another would be to use DataViews for floats and
> doubles. A
> >>>>>> third would be to actually do the math for floating point numbers…
> >>>>>>
> >>>>>> 2. I did not fix any of the write methods. They should have the same
> >>>> issue
> >>>>>> as the read methods.
> >>>>>>
> >>>>>> If you have the time to resolve these remaining issues, that would
> be
> >>>>>> awesome. :-)
> >>>>>>
> >>>>>> Harbs
> >>>>>>
> >>>>>>> On Jun 21, 2018, at 11:55 PM, Greg Dove <[email protected]>
> wrote:
> >>>>>>>
> >>>>>>> Thanks Harbs - that short explanation makes it very clear.  Thanks
> >>>> for
> >>>>>>> fixing my broken implementation :). I am guilty of not covering
> 'real
> >>>>>>> world' combinations in the tests, which was what I used to guide
> >>>>>>> development when I was working on this originally.
> >>>>>>>
> >>>>>>> I am pretty sure I did have also have an interim implementation
> >>>> doing the
> >>>>>>> bytewise reads/writes and dealing with endian-ness branching along
> >>>> the
> >>>>>> way,
> >>>>>>> then I switched to using the typed arrays when I saw a performance
> >>>> boost.
> >>>>>>> I wish I had written something like your new test before I did that
> >>>> :).
> >>>>>>> I did a quick search for this and read up on the issue about the
> >>>> offset.
> >>>>>> I
> >>>>>>> guess that also partly explains why the typed array performance is
> >>>>>> 'faster'
> >>>>>>> (possibly due to the alignment I mean). And I did not do memory
> >>>>>> profiling,
> >>>>>>> so maybe Alex is right - there could also have been a massive GC
> dump
> >>>>>> after
> >>>>>>> my large loop tests, something that I was not seeing because I was
> >>>> only
> >>>>>>> focused on speed at the time.
> >>>>>>>
> >>>>>>> I had to add the 'sysEndian' flag approach internally when I
> >>>> switched to
> >>>>>>> using the typed arrays (and only reordering the result if it was
> >>>>>> different
> >>>>>>> to the 'native' endian used in the typed arrays). If everything is
> >>>> based
> >>>>>> on
> >>>>>>> bytewise approach now, that flag should no longer be needed I think
> >>>> (you
> >>>>>>> may already have done that, I did not look at your changes in
> detail
> >>>> yet
> >>>>>> -
> >>>>>>> will check them out this weekend).
> >>>>>>>
> >>>>>>> Obviously having this working correctly comes first, but with a
> >>>>>>> comprehensive set of tests, we do have the ability to more easily
> >>>> respond
> >>>>>>> to changes in the browser engines over time and update the
> >>>> implementation
> >>>>>>> if, for example, DataView becomes faster across the board. I did
> see
> >>>> it
> >>>>>>> working faster in one of the browsers I tested in, but I am not
> >>>> confident
> >>>>>>> that I kept a good record of all this...
> >>>>>>>
> >>>>>>> I will look in more detail this weekend, but for now it sounds like
> >>>> you
> >>>>>>> have resolved the issue - well done! I hope I can get into some
> more
> >>>> work
> >>>>>>> on Royale again soon.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Thu, 21 Jun 2018, 21:04 Harbs, <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> I just added a unit test for this.
> >>>>>>>>
> >>>>>>>> The test will fail with a RTE using the old implementation.
> >>>>>>>>
> >>>>>>>> Explanation: After reading the first byte, the position is 1 which
> >>>>>> doesn’t
> >>>>>>>> align with a 16IntArray.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Harbs
> >>>>>>>>
> >>>>>>>>> On Jun 21, 2018, at 1:48 AM, Greg Dove <[email protected]>
> >>>> wrote:
> >>>>>>>>>
> >>>>>>>>> I am not sure what the int16/int-aligned issue is yet (I have not
> >>>> read
> >>>>>>>> back
> >>>>>>>>> through all the relevant history) but if we can set up a unit
> test
> >>>> for
> >>>>>>>> it,
> >>>>>>>>> this stuff all becomes much clearer I think.
> >>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
>
>

Reply via email to