Peter,

Very reasonable and worth considering. The main reason the power-of-2 copy works well is that the source is (almost) always cached.
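
For readers following the thread, here is a minimal sketch of what power-of-2 copying means for a repeat operation. It is not the code under review: the class name RepeatDoubling and the byte[] signature are assumptions made for illustration. Each pass copies the already-filled prefix of the destination onto the region just after it, so the filled length doubles until the buffer is full.

```java
// Hypothetical illustration; not the String::repeat implementation under review.
final class RepeatDoubling {
    static byte[] repeat(byte[] src, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative: " + count);
        }
        if (src.length == 0 || count == 0) {
            return new byte[0];
        }
        // Throws ArithmeticException if the result length overflows an int.
        int limit = Math.multiplyExact(src.length, count);
        byte[] dst = new byte[limit];

        // Seed the destination with one copy of the source.
        System.arraycopy(src, 0, dst, 0, src.length);
        int copied = src.length;

        // Copy the filled prefix onto the unfilled region, doubling the
        // filled length each pass; the last pass copies only the remainder.
        while (copied < limit) {
            int n = Math.min(copied, limit - copied);
            System.arraycopy(dst, 0, dst, copied, n);
            copied += n;
        }
        return dst;
    }
}
```

The number of arraycopy calls grows with the logarithm of count, but the source range of each call also doubles, which is the behaviour the quoted message asks about.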

I thought about this a bit at the beginning and wondered about introducing Arrays.fill(T[] dst, T[] src), where dst is filled repeatedly from src. This could then become an intrinsic and “do the right thing” based on hardware. String::repeat can be adapted later. I didn’t pursue it because I lacked other use cases.

File a performance issue against the implementation (once checked in) and we can see what the performance team thinks.

Cheers,

-- Jim

> On Mar 1, 2018, at 7:50 AM, Peter Levart <peter.lev...@gmail.com> wrote:
>
> Hi,
>
> On 03/01/2018 03:13 AM, Paul Sandoz wrote:
>> Hi Jim,
>>
>> Looks good. I like the power of 2 copying.
>
> Is this really the fastest way? Say you are doing this:
>
> String s = ... 64 bytes ...
>
> s.repeat(16384);
>
> ...the last arraycopy will be copying 512 KiB from one part of memory to the
> other part. It means that the source 512 KiB range will not fit into L1
> cache. Neither fully into L2 cache.
>
> The fastest way might be to employ power-of-two copying until the range
> reaches L1 caches size / 2 for example (16 K ?) and then use the same source
> range repeatedly as a "stamp" until the rounded end.
>
> What do you think?
>
> Regards, Peter
>
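
The "stamp" variant suggested in the quoted message could be sketched roughly as follows. This is only an illustration under assumptions: the class name RepeatStamp and the stampLimit parameter are hypothetical, and the 16 KiB figure is the guess from the message above, not a measured value. The filled prefix doubles only while it stays within stampLimit; after that, the same prefix is reused as the source for every remaining copy.

```java
// Hypothetical illustration of the "stamp" idea; not code from the JDK.
final class RepeatStamp {
    static byte[] repeat(byte[] src, int count, int stampLimit) {
        if (count < 0 || stampLimit <= 0) {
            throw new IllegalArgumentException("count must be >= 0 and stampLimit > 0");
        }
        if (src.length == 0 || count == 0) {
            return new byte[0];
        }
        // Throws ArithmeticException if the result length overflows an int.
        int limit = Math.multiplyExact(src.length, count);
        byte[] dst = new byte[limit];

        // Seed the destination with one copy of the source.
        System.arraycopy(src, 0, dst, 0, src.length);
        int copied = src.length;

        // Phase 1: power-of-two copying while the doubled prefix still fits
        // within both stampLimit and the destination.
        while (copied <= stampLimit / 2 && copied <= limit - copied) {
            System.arraycopy(dst, 0, dst, copied, copied);
            copied += copied;
        }

        // Phase 2: reuse the same prefix as a fixed-size stamp. The stamp
        // length is a multiple of src.length, so the pattern stays aligned.
        int stamp = copied;
        while (copied < limit) {
            int n = Math.min(stamp, limit - copied);
            System.arraycopy(dst, 0, dst, copied, n);
            copied += n;
        }
        return dst;
    }
}
```

With a 64-byte source, a count of 16384 and a stampLimit of 16 * 1024, phase 1 stops at a 16 KiB prefix and phase 2 stamps it 63 more times. Whether that beats plain doubling would need to be measured on real hardware.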