On Fri, 9 Oct 2020 13:37:57 GMT, Kevin Rushforth <k...@openjdk.org> wrote:
>> The performance improvements of the first change were self-evident, but >> I agree that the current changes need to be explained. >> >> BitSet Features >> * When using BitSet, memory usage (N/8) is wasted in the case of removing >> the last one in the list. This is often caused >> by user actions. >> >> PR Features >> * The proposal records the run length of the flag, so the memory usage can >> be kept small in the case of consecutive >> deletion indexes. The worst case is the case where the deletion indexes >> are not contiguous, which results in 4*N memory >> consumption, but this happens very rarely. It's also the same as the >> memory usage of getSelectedIndices(). (Memory >> usage is further improved by setting it to int[]). >> * Also, the delete loop is less CPU intensive and faster than BitSet's flag >> scan. >> >> I will attach a test, but you'll have to wait until my next leisure time. > > I understand that this will improve the performance of this method in some > cases (but not all as you correctly point > out), but what I really meant by my questions was: When does this matter to > an application's overall performance and > what is the performance gain you would expect to see? As a general rule, we > do not make this sort of performance > optimization without a compelling reason. Improving the performance of a > method is not by itself compelling unless > there is a demonstrable (not theoretical) improvement to applications. I look > forward to seeing the test program that > shows the performance problem with the current implementation and allows us > to measure the improvement with your > proposed patch. ### Improved space efficiency * The BitSet approach has a problem where the last delete index determines the size of the internal BitSet array; the Run-Length approach eliminates this memory waste. * The Run-Lengths approach has its drawbacks, but the worst case can be avoided by switching to the BitSet approach when the array is expanded. * **The BitSet approach (original source) has the wrong initial size.** The exact initialization size can be determined by evaluating from the end of the index. As a result, the expansion of the internal array can be suppressed and unnecessary memory copy does not occur. Therefore, there is still room for optimization in the BitSet approach. * The worst case of the BitSet approach is likely to occur, and the worst case of the Run-Length approach is unlikely. Because the worst case of the BitSet approach happens at least one, but Run-Length requires N / 2 deletions. * The point at which the space efficiency of Run-Length and BitSet switches can be determined during the execution of the Run-Length approach. #### Memory consumption to remove the last item in the list. **The BitSet approach** ceil(N/64)*8 * 12,504 B for 100,000 items * 1,256 B for 10,000 items * 1,28 B for 1000 items * 16 B in 100 items **Run-Lengths Approach** For consecutive indexes, regardless of N 4 * 4 = 16 B 4 * (N +1) in the worst case * The Run-Lengths approach has its drawbacks, but the worst case can be avoided by switching to the BitSet approach at the time of array expansion (hybrid approach). ------------- PR: https://git.openjdk.java.net/jfx/pull/305