On Fri, 9 Oct 2020 13:37:57 GMT, Kevin Rushforth <k...@openjdk.org> wrote:

>> The performance improvements of the first change were self-evident, but
>> I agree that the current changes need to be explained.
>> 
>> BitSet Features
>> * When using BitSet, memory usage (N/8) is wasted in the case of removing 
>> the last one in the list. This is often caused
>>   by user actions.
>> 
>> PR Features
>> * The proposal records the run length of the flag, so the memory usage can 
>> be kept small in the case of consecutive
>>   deletion indexes. The worst case is the case where the deletion indexes 
>> are not contiguous, which results in 4*N memory
>>   consumption, but this happens very rarely. It's also the same as the 
>> memory usage of getSelectedIndices(). (Memory
>>   usage is further improved by setting it to int[]).
>> * Also, the delete loop is less CPU intensive and faster than BitSet's flag 
>> scan.
>> 
>> I will attach a test, but you'll have to wait until my next leisure time.
>
> I understand that this will improve the performance of this method in some 
> cases (but not all as you correctly point
> out), but what I really meant by my questions was: When does this matter to 
> an application's overall performance and
> what is the performance gain you would expect to see?  As a general rule, we 
> do not make this sort of performance
> optimization without a compelling reason. Improving the performance of a 
> method is not by itself compelling unless
> there is a demonstrable (not theoretical) improvement to applications. I look 
> forward to seeing the test program that
> shows the performance problem with the current implementation and allows us 
> to measure the improvement with your
> proposed patch.

### Improved space efficiency

* The BitSet approach has a problem where the last delete index determines the 
size of the internal BitSet array; the
  Run-Length approach eliminates this memory waste.

* The Run-Lengths approach has its drawbacks, but the worst case can be avoided 
by switching to the BitSet approach when
  the array is expanded.

* **The BitSet approach (original source) has the wrong initial size.** The 
exact initialization size can be determined by
    evaluating from the end of the index. As a result, the expansion of the 
internal array can be suppressed and
    unnecessary memory copy does not occur. Therefore, there is still room for 
optimization in the BitSet approach.

* The worst case of the BitSet approach is likely to occur, and the worst case 
of the Run-Length approach is unlikely.
  Because the worst case of the BitSet approach happens at least one, but 
Run-Length requires N / 2 deletions.

* The point at which the space efficiency of Run-Length and BitSet switches can 
be determined during the execution of the
  Run-Length approach.

#### Memory consumption to remove the last item in the list.

**The BitSet approach**

ceil(N/64)*8

* 12,504 B for 100,000 items
* 1,256 B for 10,000 items
* 1,28 B for 1000 items
* 16 B in 100 items

**Run-Lengths Approach**
For consecutive indexes, regardless of N

4 * 4
= 16 B

4 * (N +1) in the worst case

* The Run-Lengths approach has its drawbacks, but the worst case can be avoided 
by switching to the BitSet approach at
  the time of array expansion (hybrid approach).

-------------

PR: https://git.openjdk.java.net/jfx/pull/305

Reply via email to