[ 
https://issues.apache.org/jira/browse/COLLECTIONS-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Claude Warren updated COLLECTIONS-820:
--------------------------------------
    Description: 
    Support a merge without checking and then provide a method add with 
checking for a change in the filter. This will duplicate a lot of (simple) 
code. But it provides a nice method to check if a filter contains an object and 
do something if it did not, rather than calling contains before then doing a 
merge, or doing a merge and checking cardinality change afterwards.

For a bitmap bloom filter this can be done branchless by accumulating all the 
zero bits that are set to 1 using:

long[] flag = {0};
int[] idx = new int[1];
bitMapProducer.forEachBitMap(value -> {
    int i = idx[0]++;
    flag[0] |= (~bitMap[i] & value);
    bitMap[i] |= value;
    return true;
});
return flag[0] != 0;

For a filter with an explicit cardinality you just check the cardinality before 
and after the merge.

h3. 
!https://avatars.githubusercontent.com/u/886334?s=48&v=4|width=24,height=24!  
*[aherbert|https://github.com/aherbert]*  [on 27 
Feb|https://github.com/apache/commons-collections/pull/258#discussion_r813419143]

 

With the provision of the {{copy}} method the {{merge}} method seems redundant 
with {{{}mergeInPlace{}}}. It is equivalent to two lines of code:

 
{{BloomFilter filter2 = filter.copy(); }}
{{filter2.mergeInPlace(other);}}
 
I think {{merge}} should be the collections equivalent of {{{}add{}}}. This 
acts in place. Very few collections operations return new collections unless 
they are views of the current collection, e.g. List.subList. Note that the 
CountingBloomFilter has {{add}} which acts in place. There is no {{add}} 
returning a new CountingBloomFilter and then an {{{}addInPlace{}}}. So there is 
some inconsistency here.

IMO the more common operation is to merge a filter in place. So this should be 
the action of merge. Then the mergeInPlace can be dropped or replaced with 
{{copyAndMerge}} or {{{}mergeToCopy{}}}. You then also have the boolean result 
of merge to consider. Should the copy not be returned if the merge returns 
false? Since it is 2 lines of code I would not bother and drop this.

The BloomFilter then has {{merge}} and CountingBloomFilter has {{{}add{}}}, 
both act in place (the common operation). The user can create a copy if desired 
for a merge if that is useful to them.

Another comment on {{merge}} suggests changing it to support {{add}} and 
{{merge}} as both have their own use cases.

  was:
h3. 
!https://avatars.githubusercontent.com/u/886334?s=48&v=4|width=24,height=24!  
*[aherbert|https://github.com/aherbert]*  [on 27 
Feb|https://github.com/apache/commons-collections/pull/258#discussion_r813419143]

 

With the provision of the {{copy}} method the {{merge}} method seems redundant 
with {{{}mergeInPlace{}}}. It is equivalent to two lines of code:

 
{{BloomFilter filter2 = filter.copy(); }}
{{filter2.mergeInPlace(other);}}
 
I think {{merge}} should be the collections equivalent of {{{}add{}}}. This 
acts in place. Very few collections operations return new collections unless 
they are views of the current collection, e.g. List.subList. Note that the 
CountingBloomFilter has {{add}} which acts in place. There is no {{add}} 
returning a new CountingBloomFilter and then an {{{}addInPlace{}}}. So there is 
some inconsistency here.

IMO the more common operation is to merge a filter in place. So this should be 
the action of merge. Then the mergeInPlace can be dropped or replaced with 
{{copyAndMerge}} or {{{}mergeToCopy{}}}. You then also have the boolean result 
of merge to consider. Should the copy not be returned if the merge returns 
false? Since it is 2 lines of code I would not bother and drop this.

The BloomFilter then has {{merge}} and CountingBloomFilter has {{{}add{}}}, 
both act in place (the common operation). The user can create a copy if desired 
for a merge if that is useful to them.

Another comment on {{merge}} suggests changing it to support {{add}} and 
{{merge}} as both have their own use cases.


> BloomFilter: renaming merge/mergeInPlace
> ----------------------------------------
>
>                 Key: COLLECTIONS-820
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-820
>             Project: Commons Collections
>          Issue Type: Improvement
>          Components: Collection
>    Affects Versions: 4.5
>            Reporter: Claude Warren
>            Priority: Minor
>              Labels: bloom-filter
>
>     Support a merge without checking and then provide a method add with 
> checking for a change in the filter. This will duplicate a lot of (simple) 
> code. But it provides a nice method to check if a filter contains an object 
> and do something if it did not, rather than calling contains before then 
> doing a merge, or doing a merge and checking cardinality change afterwards.
> For a bitmap bloom filter this can be done branchless by accumulating all the 
> zero bits that are set to 1 using:
> long[] flag = {0};
> int[] idx = new int[1];
> bitMapProducer.forEachBitMap(value -> {
>     int i = idx[0]++;
>     flag[0] |= (~bitMap[i] & value);
>     bitMap[i] |= value;
>     return true;
> });
> return flag[0] != 0;
> For a filter with an explicit cardinality you just check the cardinality 
> before and after the merge.
> h3. 
> !https://avatars.githubusercontent.com/u/886334?s=48&v=4|width=24,height=24!  
> *[aherbert|https://github.com/aherbert]*  [on 27 
> Feb|https://github.com/apache/commons-collections/pull/258#discussion_r813419143]
>  
> With the provision of the {{copy}} method the {{merge}} method seems 
> redundant with {{{}mergeInPlace{}}}. It is equivalent to two lines of code:
>  
> {{BloomFilter filter2 = filter.copy(); }}
> {{filter2.mergeInPlace(other);}}
>  
> I think {{merge}} should be the collections equivalent of {{{}add{}}}. This 
> acts in place. Very few collections operations return new collections unless 
> they are views of the current collection, e.g. List.subList. Note that the 
> CountingBloomFilter has {{add}} which acts in place. There is no {{add}} 
> returning a new CountingBloomFilter and then an {{{}addInPlace{}}}. So there 
> is some inconsistency here.
> IMO the more common operation is to merge a filter in place. So this should 
> be the action of merge. Then the mergeInPlace can be dropped or replaced with 
> {{copyAndMerge}} or {{{}mergeToCopy{}}}. You then also have the boolean 
> result of merge to consider. Should the copy not be returned if the merge 
> returns false? Since it is 2 lines of code I would not bother and drop this.
> The BloomFilter then has {{merge}} and CountingBloomFilter has {{{}add{}}}, 
> both act in place (the common operation). The user can create a copy if 
> desired for a merge if that is useful to them.
> Another comment on {{merge}} suggests changing it to support {{add}} and 
> {{merge}} as both have their own use cases.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

Reply via email to