Re: RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon [v4]

2023-05-30 Thread Chang Peng
On Tue, 30 May 2023 22:20:23 GMT, David Holmes wrote: > What testing was done on this fix before integration? I don't even see GitHub > Actions being run. @dholmes-ora I did see earlier that GitHub Actions ran (in the 'Checks' tab) and finished, and I believe the Windows failure is not

Re: RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon [v4]

2023-05-30 Thread David Holmes
On Mon, 29 May 2023 02:20:07 GMT, Chang Peng wrote: >> At the Java level of the Vector API, a vector mask is represented as a boolean array with >> 0x00/0x01 values (8 bits per element), i.e. the in-memory format. When it >> is loaded into a vector register, e.g. Neon, the in-memory format will be >>

Re: RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon [v4]

2023-05-30 Thread Tobias Hartmann
On Mon, 29 May 2023 02:20:07 GMT, Chang Peng wrote: >> At the Java level of the Vector API, a vector mask is represented as a boolean array with >> 0x00/0x01 values (8 bits per element), i.e. the in-memory format. When it >> is loaded into a vector register, e.g. Neon, the in-memory format will be >>

Re: RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon [v4]

2023-05-29 Thread Eric Liu
On Mon, 29 May 2023 02:20:07 GMT, Chang Peng wrote: >> At the Java level of the Vector API, a vector mask is represented as a boolean array with >> 0x00/0x01 values (8 bits per element), i.e. the in-memory format. When it >> is loaded into a vector register, e.g. Neon, the in-memory format will be >>

Re: RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon [v4]

2023-05-28 Thread Chang Peng
> At the Java level of the Vector API, a vector mask is represented as a boolean array with > 0x00/0x01 values (8 bits per element), i.e. the in-memory format. When it > is loaded into a vector register, e.g. Neon, the in-memory format will be > converted to the in-register format, with a value of 0/-1 for each lane

Re: RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon [v3]

2023-05-25 Thread Andrew Haley
On Thu, 18 May 2023 09:50:13 GMT, Chang Peng wrote: >> At the Java level of the Vector API, a vector mask is represented as a boolean array with >> 0x00/0x01 values (8 bits per element), i.e. the in-memory format. When it >> is loaded into a vector register, e.g. Neon, the in-memory format will be >>

Re: RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon [v3]

2023-05-22 Thread Eric Liu
On Thu, 18 May 2023 09:50:13 GMT, Chang Peng wrote: >> At the Java level of the Vector API, a vector mask is represented as a boolean array with >> 0x00/0x01 values (8 bits per element), i.e. the in-memory format. When it >> is loaded into a vector register, e.g. Neon, the in-memory format will be >>

Re: RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon [v3]

2023-05-18 Thread Chang Peng
On Mon, 15 May 2023 10:59:11 GMT, Andrew Haley wrote: > > > This looks like it might be removed by loop opts. I think you might need > > > a blackhole somewhere. > > > > > > `m` will be updated in every iteration of this loop, so `m` is not actually a > > loop invariant. I can see the

Re: RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon [v3]

2023-05-18 Thread Chang Peng
> At the Java level of the Vector API, a vector mask is represented as a boolean array with > 0x00/0x01 values (8 bits per element), i.e. the in-memory format. When it > is loaded into a vector register, e.g. Neon, the in-memory format will be > converted to the in-register format, with a value of 0/-1 for each lane

Re: RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon [v2]

2023-05-18 Thread Chang Peng
> At the Java level of the Vector API, a vector mask is represented as a boolean array with > 0x00/0x01 values (8 bits per element), i.e. the in-memory format. When it > is loaded into a vector register, e.g. Neon, the in-memory format will be > converted to the in-register format, with a value of 0/-1 for each lane

Re: RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon

2023-05-15 Thread Andrew Haley
On Mon, 15 May 2023 10:04:22 GMT, Chang Peng wrote: > > This looks like it might be removed by loop opts. I think you might need a > > blackhole somewhere. > > `m` will be updated in every iteration of this loop, so `m` is not actually a > loop invariant. I can see the assembly code of this

Re: RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon

2023-05-15 Thread Chang Peng
On Mon, 15 May 2023 08:57:30 GMT, Andrew Haley wrote: > This looks like it might be removed by loop opts. I think you might need a > blackhole somewhere. `m` will be updated in every iteration of this loop, so `m` is not actually a loop invariant. I can see the assembly code of this

Re: RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon

2023-05-15 Thread Chang Peng
On Mon, 15 May 2023 08:56:37 GMT, Andrew Haley wrote: > That makes sense. Is it likely that there are more of these combined > operations on vector masks that could be matched? If so, it might make sense > to do the job earlier, in the C2 optimizer. Thanks for your review. I have tried to

Re: RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon

2023-05-15 Thread Andrew Haley
On Mon, 15 May 2023 02:58:46 GMT, Chang Peng wrote: > At the Java level of the Vector API, a vector mask is represented as a boolean array with > 0x00/0x01 values (8 bits per element), i.e. the in-memory format. When it > is loaded into a vector register, e.g. Neon, the in-memory format will be >

RFR: 8307795: AArch64: Optimize VectorMask.truecount() on Neon

2023-05-14 Thread Chang Peng
At the Java level of the Vector API, a vector mask is represented as a boolean array with 0x00/0x01 values (8 bits per element), i.e. the in-memory format. When it is loaded into a vector register, e.g. Neon, the in-memory format will be converted to the in-register format, with a value of 0/-1 for each lane (lane
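
The two mask formats described above can be illustrated with a small scalar sketch in plain Java. This is only a model of the data layouts for byte-sized lanes, not the HotSpot or Neon implementation, and it makes no claim about what the patch itself emits. The class and method names (`MaskTrueCountSketch`, `trueCountInMemory`, `toInRegister`, `trueCountInRegister`) are hypothetical and chosen for illustration.

```java
public class MaskTrueCountSketch {

    // In-memory format: one byte per lane, 0x00 = false, 0x01 = true.
    // Counting true lanes is a plain sum of the bytes.
    static int trueCountInMemory(byte[] mask) {
        int sum = 0;
        for (byte b : mask) {
            sum += b;
        }
        return sum;
    }

    // Model of the conversion to the in-register format for byte lanes:
    // 0 stays 0, 1 becomes -1 (all bits of the lane set).
    static byte[] toInRegister(byte[] mask) {
        byte[] lanes = new byte[mask.length];
        for (int i = 0; i < mask.length; i++) {
            lanes[i] = (byte) -mask[i];
        }
        return lanes;
    }

    // In-register format: every true lane contributes -1 to the sum,
    // so the true count is the negated sum of the lanes.
    static int trueCountInRegister(byte[] lanes) {
        int sum = 0;
        for (byte lane : lanes) {
            sum += lane;
        }
        return -sum;
    }

    public static void main(String[] args) {
        byte[] inMemory = {1, 0, 1, 1, 0, 0, 1, 0};
        byte[] inRegister = toInRegister(inMemory);
        System.out.println(trueCountInMemory(inMemory));
        System.out.println(trueCountInRegister(inRegister));
    }
}
```

Both methods return the same count for the same mask; the sketch only shows that either representation carries enough information to count true lanes with a single reduction.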