Peter,

Very reasonable and worth considering. The main reason the power of 2 copy 
works well is that the source is (almost) always cached.
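
For readers following along, a minimal sketch of the power of 2 copy on a byte array. The class and method names are hypothetical and this is not the actual String::repeat code, only an illustration of the doubling idea under that assumption:

```java
import java.nio.charset.StandardCharsets;

public class RepeatSketch {
    // Hypothetical helper, not the JDK implementation.
    static byte[] repeat(byte[] src, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative: " + count);
        }
        if (count == 0 || src.length == 0) {
            return new byte[0];
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        int limit = Math.multiplyExact(src.length, count);
        byte[] dst = new byte[limit];

        // Seed the destination with one copy of the source.
        System.arraycopy(src, 0, dst, 0, src.length);
        int copied = src.length;

        // Double the filled prefix while a full doubling still fits.
        // The source of each copy is the prefix that was just written.
        for (; copied < limit - copied; copied <<= 1) {
            System.arraycopy(dst, 0, dst, copied, copied);
        }

        // Copy whatever remains (possibly zero bytes) from the prefix.
        System.arraycopy(dst, 0, dst, copied, limit - copied);
        return dst;
    }

    public static void main(String[] args) {
        byte[] out = repeat("ab".getBytes(StandardCharsets.ISO_8859_1), 5);
        System.out.println(new String(out, StandardCharsets.ISO_8859_1)); // prints "ababababab"
    }
}
```

Each pass reads the prefix written by the previous pass, which is why the source tends to be in cache while the copies are small.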

I thought about this a bit at the beginning and wondered about introducing 
Arrays.fill(T[] dst, T[] src) where dst is filled repeatedly from src. This can 
then become an intrinsic and “do the right thing” based on hardware. 
String::repeat can be adapted later. I didn’t pursue it because I lacked other 
use cases.
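
A rough sketch of what a pure Java version of such a fill could look like, combined with the capped "stamp" idea from the quoted message below. Everything here is an assumption for illustration: the class name, the byte[] specialization (the proposal above is generic in T), and the 16 KiB cap, which is a guess at half of a 32 KiB L1 data cache and not a measured value:

```java
public class FillSketch {
    // Assumed stamp cap: 16 KiB. A guess, not a tuned or measured constant.
    static final int MAX_STAMP = 16 * 1024;

    // Hypothetical analogue of Arrays.fill(dst, src): fills dst with
    // repeated copies of src, truncating the last copy if needed.
    static void fill(byte[] dst, byte[] src) {
        if (src.length == 0) {
            throw new IllegalArgumentException("src is empty");
        }
        int limit = dst.length;

        // Seed dst with one (possibly truncated) copy of src.
        int copied = Math.min(src.length, limit);
        System.arraycopy(src, 0, dst, 0, copied);

        // Phase 1: double the filled prefix while the doubled prefix
        // stays within the cap and still fits in dst.
        while (copied <= MAX_STAMP / 2 && copied < limit - copied) {
            System.arraycopy(dst, 0, dst, copied, copied);
            copied <<= 1;
        }

        // Phase 2: reuse the same prefix as a stamp until dst is full.
        // The stamp length is a multiple of src.length, so the pattern stays aligned.
        int stamp = copied;
        while (copied < limit) {
            int n = Math.min(stamp, limit - copied);
            System.arraycopy(dst, 0, dst, copied, n);
            copied += n;
        }
    }
}
```

Whether the cap helps at all is exactly the kind of question the performance team would need to measure. An intrinsic could pick the strategy per platform.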

File a performance bug against the implementation (once checked in) and we can 
see what the performance team thinks.

Cheers,

— Jim


> On Mar 1, 2018, at 7:50 AM, Peter Levart <peter.lev...@gmail.com> wrote:
> 
> Hi,
> 
> On 03/01/2018 03:13 AM, Paul Sandoz wrote:
>> Hi Jim,
>> 
>> Looks good. I like the power of 2 copying.
> 
> Is this really the fastest way? Say you are doing this:
> 
> String s = ... 64 bytes ...
> 
> s.repeat(16384);
> 
> ...the last arraycopy will be copying 512 KiB from one part of memory to the 
> other part. It means that the source 512 KiB range will not fit into L1 
> cache, nor fully into L2 cache.
> 
> The fastest way might be to employ power-of-two copying until the range 
> reaches L1 cache size / 2 for example (16 K ?) and then use the same source 
> range repeatedly as a "stamp" until the rounded end.
> 
> What do you think?
> 
> Regards, Peter
> 
