[jira] [Resolved] (COLLECTIONS-821) BloomFilter: replace ArrayTracker
[ https://issues.apache.org/jira/browse/COLLECTIONS-821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claude Warren resolved COLLECTIONS-821. --- Resolution: Done Fixed with https://github.com/apache/commons-collections/pull/317 > BloomFilter: replace ArrayTracker > - > > Key: COLLECTIONS-821 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-821 > Project: Commons Collections > Issue Type: Improvement > Components: Collection >Affects Versions: 4.5 >Reporter: Claude Warren >Assignee: Claude Warren >Priority: Minor > Labels: bloom-filter > > *[aherbert|https://github.com/aherbert]* [on 27 Feb|https://github.com/apache/commons-collections/pull/258#discussion_r813388387] > The Hasher.ArrayTracker class could be replaced with an open addressed hash > table. I have created an implementation that is very efficient and can handle > up to 2^29 unique indices (with a load factor of 0.5). It is approximately 7x > faster than TreeSet for add operations, even without explicitly setting the > upper capacity. It is only outperformed by a BitMap type array when the > density of indices exceeds 2-4 bits per long which matches up to how the > IndexFilter is choosing the implementation type based on size. I added the > simple array tracker to the benchmark as it is still faster when the number > of hash functions is very small (e.g. 5). > The same int hash table can be used for the SparseBloomFilter but it would > not support a saturated filter if the number of bits is above 2^29. The > second caveat is that iteration of indices in order is slow, they are > naturally unsorted. Iteration in unsorted raw form is fast (as fast as > TreeMap) but sorted requires extracting the indices and sorting them which is > about 4-6x slower than TreeMap iteration. The sorted form is required to > support the BitMapProducer interface. 
This then becomes a question of whether > to optimise a sparse filter for merge and contains operations or for bitmap > iteration (required for set operations). > Currently this and the BitMapTracker are public. I would move them to package > private implementation details (i.e. out of this class). This allows them to > be changed later. All access should be through {{IndexTracker.create}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
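The open-addressed table described in the issue above can be pictured with a minimal Java sketch. This is not the implementation from the pull request: the class name `IntIndexSetSketch`, the multiplicative mixing step, and linear probing are all assumptions made for illustration. It keeps the load factor at or below 0.5, and since the largest power-of-two `int[]` has 2^30 slots, that load factor presumably accounts for the 2^29 limit mentioned in the comment.

```java
import java.util.Arrays;

/** Hypothetical sketch of an open-addressed set of non-negative int indices. */
final class IntIndexSetSketch {
    private static final int EMPTY = -1;      // safe sentinel: indices are non-negative
    private static final int MAX_CAPACITY = 1 << 30;
    private int[] table = new int[16];        // capacity is always a power of two
    private int size;

    IntIndexSetSketch() {
        Arrays.fill(table, EMPTY);
    }

    /** Adds the index; returns true if it was not already present. */
    boolean add(int index) {
        if (index < 0) {
            throw new IllegalArgumentException("negative index: " + index);
        }
        if (!insert(table, index)) {
            return false;
        }
        size++;
        if (size > (table.length >>> 1)) {
            grow();                            // keep the load factor at or below 0.5
        }
        return true;
    }

    int size() {
        return size;
    }

    /** Sorted view: the indices must be extracted and sorted, which is the slow path. */
    int[] sortedIndices() {
        int[] out = new int[size];
        int n = 0;
        for (int value : table) {
            if (value != EMPTY) {
                out[n++] = value;
            }
        }
        Arrays.sort(out);
        return out;
    }

    /** Linear probing; terminates because at least half of the slots are always empty. */
    private static boolean insert(int[] slots, int index) {
        int mask = slots.length - 1;
        int pos = mix(index) & mask;
        while (slots[pos] != EMPTY) {
            if (slots[pos] == index) {
                return false;
            }
            pos = (pos + 1) & mask;
        }
        slots[pos] = index;
        return true;
    }

    /** Assumed mixing function; any reasonable integer scrambler would do here. */
    private static int mix(int x) {
        int h = x * 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    private void grow() {
        if (table.length == MAX_CAPACITY) {
            throw new IllegalStateException("table cannot grow beyond 2^30 slots");
        }
        int[] bigger = new int[table.length << 1];
        Arrays.fill(bigger, EMPTY);
        for (int value : table) {
            if (value != EMPTY) {
                insert(bigger, value);
            }
        }
        table = bigger;
    }

    public static void main(String[] args) {
        IntIndexSetSketch set = new IntIndexSetSketch();
        System.out.println(set.add(42));   // first sighting of 42
        System.out.println(set.add(42));   // duplicate
        set.add(7);
        System.out.println(Arrays.toString(set.sortedIndices()));
    }
}
```

The `sortedIndices` method shows why ordered iteration costs more than raw iteration in such a table: the entries sit in hash order, so a sorted view needs a copy and a sort.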
[GitHub] [commons-collections] Claudenw merged pull request #317: moved IndexFilter to its own file.
Claudenw merged PR #317: URL: https://github.com/apache/commons-collections/pull/317 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@commons.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Work logged] (LANG-1662) Let ReflectionToStringBuilder only reflect given field names
[ https://issues.apache.org/jira/browse/LANG-1662?focusedWorklogId=788237=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-788237 ] ASF GitHub Bot logged work on LANG-1662: Author: ASF GitHub Bot Created on: 06/Jul/22 13:28 Start Date: 06/Jul/22 13:28 Worklog Time Spent: 10m Work Description: GutoVeronezi commented on PR #849: URL: https://github.com/apache/commons-lang/pull/849#issuecomment-1176222681 Hi, @kinow, thanks for the reply. @garydgregory, it feels a bit weird for us here on the other side proposing things to contribute back to commons-lang and interacting with reviewers to see somebody "taking over the changes". If the PR is missing something or there is a better way to do something, I would expect the reviewers to point what can be improved, or at least it being merged with some changes/rebase and so on (as described by @kinow https://github.com/apache/commons-lang/pull/849#issuecomment-1175649542). This way of applying the changes (closing the PR and pushing it as yours, without co-authoring) discourages new contributors, because we do not receive the credit for all the work we did (neither in Git history nor in GitHub Insights). Anyways, thanks for accepting my changes, I will now work on our side to use the new release of commons-lang as soon as it is released. 
Issue Time Tracking --- Worklog Id: (was: 788237) Remaining Estimate: 2h 10m (was: 2h 20m) Time Spent: 3h 50m (was: 3h 40m) > Let ReflectionToStringBuilder only reflect given field names > > > Key: LANG-1662 > URL: https://issues.apache.org/jira/browse/LANG-1662 > Project: Commons Lang > Issue Type: Improvement > Components: lang.builder.* >Reporter: Daniel Augusto Veronezi Salvador >Priority: Minor > Fix For: 3.13.0 > > Original Estimate: 6h > Time Spent: 3h 50m > Remaining Estimate: 2h 10m > > *ReflectionToStringBuilder* has methods to exclude fields from *toString*; if > we have an object with several fields and want to reflect only a few, we > have to list all the fields that we don't want to reflect and pass them to > *excludeFieldNames*. > Would it be valid to implement a way to pass the fields that we want and reflect > only the selected fields? > >
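To make the request above concrete, here is a minimal sketch of an include-only reflective toString written with plain `java.lang.reflect`. It is not the commons-lang code and does not use the ReflectionToStringBuilder API; the class `IncludeFieldsToStringSketch`, the method `describe`, and the `Account` type are hypothetical names introduced only for this example.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.StringJoiner;

/** Hypothetical sketch: reflect only the fields whose names are listed. */
final class IncludeFieldsToStringSketch {

    static String describe(Object target, String... includeFieldNames) {
        Set<String> include = new HashSet<>(Arrays.asList(includeFieldNames));
        StringJoiner joiner =
                new StringJoiner(",", target.getClass().getSimpleName() + "[", "]");
        // Only fields declared directly on the class are visited in this sketch.
        for (Field field : target.getClass().getDeclaredFields()) {
            if (!include.contains(field.getName())) {
                continue;                      // everything not listed is skipped
            }
            field.setAccessible(true);
            try {
                joiner.add(field.getName() + "=" + field.get(target));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return joiner.toString();
    }

    /** Example type with one field that should never be printed. */
    static final class Account {
        private final String user = "alice";
        private final String password = "secret";
        private final int loginCount = 3;
    }

    public static void main(String[] args) {
        // Lists the one wanted field instead of excluding the other two.
        System.out.println(describe(new Account(), "user"));
    }
}
```

With an exclude list, adding a new sensitive field to `Account` would expose it until someone remembers to exclude it; with an include list the new field stays hidden by default, which is the convenience the issue asks for.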
[GitHub] [commons-lang] GutoVeronezi commented on pull request #849: [LANG-1662] Let ReflectionToStringBuilder only reflect given field names
GutoVeronezi commented on PR #849: URL: https://github.com/apache/commons-lang/pull/849#issuecomment-1176222681 Hi, @kinow, thanks for the reply. @garydgregory, it feels a bit weird for us here on the other side proposing things to contribute back to commons-lang and interacting with reviewers to see somebody "taking over the changes". If the PR is missing something or there is a better way to do something, I would expect the reviewers to point what can be improved, or at least it being merged with some changes/rebase and so on (as described by @kinow https://github.com/apache/commons-lang/pull/849#issuecomment-1175649542). This way of applying the changes (closing the PR and pushing it as yours, without co-authoring) discourages new contributors, because we do not receive the credit for all the work we did (neither in Git history nor in GitHub Insights). Anyways, thanks for accepting my changes, I will now work on our side to use the new release of commons-lang as soon as it is released.
[GitHub] [commons-jcs] tvand merged pull request #97: Bump derby from 10.11.1.1 to 10.14.2.0 in /commons-jcs-jcache-openjpa
tvand merged PR #97: URL: https://github.com/apache/commons-jcs/pull/97
[GitHub] [commons-jcs] dependabot[bot] commented on pull request #96: Bump derby from 10.11.1.1 to 10.16.1.1
dependabot[bot] commented on PR #96: URL: https://github.com/apache/commons-jcs/pull/96#issuecomment-1176169513 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let me know by commenting `@dependabot ignore this major version` or `@dependabot ignore this minor version`. You can also ignore all major, minor, or patch releases for a dependency by adding an [`ignore` condition](https://docs.github.com/en/code-security/supply-chain-security/configuration-options-for-dependency-updates#ignore) with the desired `update_types` to your config file. If you change your mind, just re-open this PR and I'll resolve any conflicts on it.
[GitHub] [commons-jcs] tvand closed pull request #96: Bump derby from 10.11.1.1 to 10.16.1.1
tvand closed pull request #96: Bump derby from 10.11.1.1 to 10.16.1.1 URL: https://github.com/apache/commons-jcs/pull/96
[GitHub] [commons-jcs] tvand commented on pull request #96: Bump derby from 10.11.1.1 to 10.16.1.1
tvand commented on PR #96: URL: https://github.com/apache/commons-jcs/pull/96#issuecomment-1176169482 Incompatible change
[GitHub] [commons-jcs] dependabot[bot] commented on pull request #98: Bump tomcat-catalina from 7.0.72 to 7.0.81 in /commons-jcs-jcache-extras
dependabot[bot] commented on PR #98: URL: https://github.com/apache/commons-jcs/pull/98#issuecomment-1176168528 OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let me know by commenting `@dependabot ignore this major version` or `@dependabot ignore this minor version`. You can also ignore all major, minor, or patch releases for a dependency by adding an [`ignore` condition](https://docs.github.com/en/code-security/supply-chain-security/configuration-options-for-dependency-updates#ignore) with the desired `update_types` to your config file. If you change your mind, just re-open this PR and I'll resolve any conflicts on it.
[GitHub] [commons-jcs] tvand closed pull request #98: Bump tomcat-catalina from 7.0.72 to 7.0.81 in /commons-jcs-jcache-extras
tvand closed pull request #98: Bump tomcat-catalina from 7.0.72 to 7.0.81 in /commons-jcs-jcache-extras URL: https://github.com/apache/commons-jcs/pull/98
[GitHub] [commons-jcs] tvand commented on pull request #98: Bump tomcat-catalina from 7.0.72 to 7.0.81 in /commons-jcs-jcache-extras
tvand commented on PR #98: URL: https://github.com/apache/commons-jcs/pull/98#issuecomment-1176168491 Incompatible change
[GitHub] [commons-jcs] tvand merged pull request #95: Bump actions/cache from 3.0.3 to 3.0.4
tvand merged PR #95: URL: https://github.com/apache/commons-jcs/pull/95
[jira] [Closed] (VFS-797) JIRA Fix VFS-741 is causing an error in reading the files from azure host using azsb protocol. Error is: org.apache.commons.vfs2.FileSystemException: Invalid descenden
[ https://issues.apache.org/jira/browse/VFS-797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary D. Gregory closed VFS-797. --- Resolution: Information Provided Closing: no more feedback. > JIRA Fix VFS-741 is causing an error in reading the files from azure host > using azsb protocol. Error is: > org.apache.commons.vfs2.FileSystemException: Invalid descendent file name > ".//somepath/subfolder/file.txt" > -- > > Key: VFS-797 > URL: https://issues.apache.org/jira/browse/VFS-797 > Project: Commons VFS > Issue Type: Bug >Reporter: New User >Assignee: Gary D. Gregory >Priority: Minor > > I have updated the VFS version from 2.2 to 2.7.0. Since then I have been > getting an error in my application when reading files from an Azure host > using the 'azsb' protocol. By default, it adds the prefix ./ to the filename, > which throws an exception when resolving the filename. > > The error that is thrown is as below: > Caused by: org.apache.commons.vfs2.FileSystemException: Could not find files > in "azsb://somehost/somepath/subfolder". Caused by: > org.apache.commons.vfs2.FileSystemException: Could not find files in > "azsb://somehost/somepath/subfolder". at > org.apache.commons.vfs2.provider.AbstractFileObject.findFiles(AbstractFileObject.java:1015) > at > org.apache.commons.vfs2.provider.AbstractFileObject.listFiles(AbstractFileObject.java:1663) > at > org.apache.commons.vfs2.provider.AbstractFileObject.findFiles(AbstractFileObject.java:990) > > Caused by: org.apache.commons.vfs2.FileSystemException: Invalid descendent > file name ".//somepath/subfolder/file.txt". 
at > org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveName(DefaultFileSystemManager.java:806) > at > org.apache.commons.vfs2.provider.AbstractFileObject.getChildren(AbstractFileObject.java:1116) > at > org.apache.commons.vfs2.provider.AbstractFileObject.traverse(AbstractFileObject.java:93) > at > org.apache.commons.vfs2.provider.AbstractFileObject.findFiles(AbstractFileObject.java:1012) > ... 7 common frames omitted >
[jira] [Comment Edited] (VFS-797) JIRA Fix VFS-741 is causing an error in reading the files from azure host using azsb protocol. Error is: org.apache.commons.vfs2.FileSystemException: Invalid d
[ https://issues.apache.org/jira/browse/VFS-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17299812#comment-17299812 ] Gary D. Gregory edited comment on VFS-797 at 7/6/22 11:39 AM: -- Hi [~alir3] Do you have a reproducible test we can run? Is com.bsc.intg.svcs.core.vfs.BSCVFSProvider your code or is the source available? 2.2 to 2.7.0 is a big jump, may you try intermediate versions to narrow it down? TY for your report. was (Author: garydgregory): Hi [~alir3] Do you have a reproducible test we can run? Is com.bsc.intg.svcs.core.vfs.BSCVFSProvider your code or is the source available? 2.2 to 2.7.0 is a big jump, may you try intermediate versions to narrow it down? TY for you report. > JIRA Fix VFS-741 is causing an error in reading the files from azure host > using azsb protocol. Error is: > org.apache.commons.vfs2.FileSystemException: Invalid descendent file name > ".//somepath/subfolder/file.txt" > -- > > Key: VFS-797 > URL: https://issues.apache.org/jira/browse/VFS-797 > Project: Commons VFS > Issue Type: Bug >Reporter: New User >Assignee: Gary D. Gregory >Priority: Minor > > I have updated the VFS version from 2.2 to 2.7.0. Since then I have been > getting an error in my application when reading files from an Azure host > using the 'azsb' protocol. By default, it adds the prefix ./ to the filename, > which throws an exception when resolving the filename. > > The error that is thrown is as below: > Caused by: org.apache.commons.vfs2.FileSystemException: Could not find files > in "azsb://somehost/somepath/subfolder". Caused by: > org.apache.commons.vfs2.FileSystemException: Could not find files in > "azsb://somehost/somepath/subfolder". 
at > org.apache.commons.vfs2.provider.AbstractFileObject.findFiles(AbstractFileObject.java:1015) > at > org.apache.commons.vfs2.provider.AbstractFileObject.listFiles(AbstractFileObject.java:1663) > at > org.apache.commons.vfs2.provider.AbstractFileObject.findFiles(AbstractFileObject.java:990) > > Caused by: org.apache.commons.vfs2.FileSystemException: Invalid descendent > file name ".//somepath/subfolder/file.txt". at > org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveName(DefaultFileSystemManager.java:806) > at > org.apache.commons.vfs2.provider.AbstractFileObject.getChildren(AbstractFileObject.java:1116) > at > org.apache.commons.vfs2.provider.AbstractFileObject.traverse(AbstractFileObject.java:93) > at > org.apache.commons.vfs2.provider.AbstractFileObject.findFiles(AbstractFileObject.java:1012) > ... 7 common frames omitted >
[GitHub] [commons-collections] Claudenw commented on a diff in pull request #317: moved IndexFilter to its own file.
Claudenw commented on code in PR #317: URL: https://github.com/apache/commons-collections/pull/317#discussion_r914728425 ## src/main/java/org/apache/commons/collections4/bloomfilter/IndexFilter.java: ## @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.commons.collections4.bloomfilter; + +import java.util.function.IntPredicate; + +/** + * A convenience class for Hasher implementations to filter out duplicate indices. + * + * If the index is negative the behavior is not defined. + * + * This is conceptually a unique filter implemented as a {@code IntPredicate}. + * @since 4.5 + */ +public final class IndexFilter implements IntPredicate { +private final IntPredicate tracker; +private final int size; +private final IntPredicate consumer; + +/** + * Creates an instance optimized for the specified shape. + * @param shape The shape that is being generated. + * @param consumer The consumer to accept the values. + * @return an IndexFilter optimized for the specified shape. + */ +public static IndexFilter create(Shape shape, IntPredicate consumer) { Review Comment: OK. It is changed now. -- This is an automated message from the Apache Git Service. 
[jira] [Commented] (NET-408) problem connecting to ProFTPD with FTPES
[ https://issues.apache.org/jira/browse/NET-408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17563042#comment-17563042 ] Stefan Cordes commented on NET-408: --- In the meantime (2009) vsftpd released a server version that can work around the missing implementation in the FTPSClient: *require_ssl_reuse*=NO [https://scarybeastsecurity.blogspot.com/2009/02/vsftpd-210-released.html] _> If your SSL FTP client does not re-use sessions, you can turn this off but you would do better to change FTP clients._ > problem connecting to ProFTPD with FTPES > > > Key: NET-408 > URL: https://issues.apache.org/jira/browse/NET-408 > Project: Commons Net > Issue Type: Bug > Components: FTP >Affects Versions: 2.2, 3.0 > Environment: ProFTPD 1.3.3d on SUSE Linux Enterprise Server 10.1 > 32bit, Kernel 2.6.16.46-0.12-default (config file attached) > ProFTPD 1.3.3d on OpenSUSE 64bit Linux 2.6.34.8-0.2-desktop > Java 1.5 >Reporter: Michael Voigt >Priority: Major > Attachments: BCFTPSClient.java, FTPSClientWithTLSResumption.zip, > PTFTPSClient.java, ftpes.jpg, proftpd.conf > > > I have a problem with the FTPClient connecting to a ProFTPD server. > If the server uses the configuration option "TLSProtocol TLSv1", I > cannot connect to it at all. I receive the following error message: > - javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection > On the server side I see in the log: > unable to accept TLS connection: protocol error: > - (1) error:14094416:SSL routines:SSL3_READ_BYTES:sslv3 alert > certificate unknown > - TLS/TLS-C negotiation failed on control channel > If the server uses the configuration option "TLSProtocol SSLv23", I > can connect to it but I can't transfer any files. 
In the server log I > see: > - starting TLS negotiation on data connection > - TLSv1/SSLv3 renegotiation accepted, using cipher RC4-MD5 (128 bits) > - client did not reuse SSL session, rejecting data connection (see > TLSOption NoSessionReuseRequired) > - unable to open data connection: TLS negotiation failed > If I add the NoSessionReuseRequired parameter to the ProFTPD config > everything works fine. > Here is my code: >FTPClient ftpClient = new FTPClient(); >ftpClient = new FTPSClient("TLS"); >// this throws an exception with TLSProtocol TLSv1 >ftpClient.connect(host, port); >int reply = ftpClient.getReplyCode(); >if (!FTPReply.isPositiveCompletion(reply)) { >ftpClient.disconnect(); >log.error("The FTP Server did not return a positive > completion reply!"); >throw new > FtpTransferException(ECCUtils.ERROR_FTP_CONNECTION); >} >boolean loginSuccessful = ftpClient.login(userName, password); >if (!loginSuccessful) { >log.error("Login to the FTP Server failed! The > credentials are not valid."); >throw new > FtpTransferException(ECCUtils.ERROR_FTP_LOGIN); >} >ftpClient.execPBSZ(0); >ftpClient.execPROT("P"); >boolean success = ftpClient.storeFile(fileName, fis); >if (!success) { >// this is false if "NoSessionReuseRequired" is not set >} > Now my question is if it is generally possible to connect to a server > with "TLSProtocol TLSv1" or "TLSProtocol SSLv23" without the > "NoSessionReuseRequired" parameter? Could someone provide a piece of > example code for this?
[GitHub] [commons-collections] aherbert commented on a diff in pull request #317: moved IndexFilter to its own file.
aherbert commented on code in PR #317: URL: https://github.com/apache/commons-collections/pull/317#discussion_r914529367 ## src/main/java/org/apache/commons/collections4/bloomfilter/IndexFilter.java: ## @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.commons.collections4.bloomfilter; + +import java.util.function.IntPredicate; + +/** + * A convenience class for Hasher implementations to filter out duplicate indices. + * + * If the index is negative the behavior is not defined. + * + * This is conceptually a unique filter implemented as a {@code IntPredicate}. + * @since 4.5 + */ +public final class IndexFilter implements IntPredicate { +private final IntPredicate tracker; +private final int size; +private final IntPredicate consumer; + +/** + * Creates an instance optimized for the specified shape. + * @param shape The shape that is being generated. + * @param consumer The consumer to accept the values. + * @return an IndexFilter optimized for the specified shape. + */ +public static IndexFilter create(Shape shape, IntPredicate consumer) { Review Comment: I do not think this matters. The current choice is an optimisation based on memory usage. 
It could be changed in the future, for example based on a JMH benchmark test for performance. All that matters is that the test code exercises all paths and the returned predicate functions as expected.
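The review thread above concerns a factory that hides which duplicate tracker is chosen and returns only an `IntPredicate`. A rough Java sketch of that idea follows. It is not the code under review: the class name `UniqueIndexFilterSketch`, the plain `int` parameters standing in for the `Shape`, and the exact memory comparison are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.function.IntPredicate;

/** Hypothetical sketch of a unique-index filter that hides its tracker choice. */
final class UniqueIndexFilterSketch implements IntPredicate {
    private final IntPredicate tracker;   // returns true the first time an index is seen
    private final int numberOfBits;
    private final IntPredicate consumer;

    private UniqueIndexFilterSketch(IntPredicate tracker, int numberOfBits, IntPredicate consumer) {
        this.tracker = tracker;
        this.numberOfBits = numberOfBits;
        this.consumer = consumer;
    }

    /** Callers see only an IntPredicate, so the tracker type can change later. */
    static IntPredicate create(int numberOfBits, int numberOfHashFunctions, IntPredicate consumer) {
        // Assumed rule: pick whichever tracker needs less memory.
        long arrayBytes = (long) numberOfHashFunctions * Integer.BYTES;
        long bitMapBytes = ((numberOfBits + 63L) / 64L) * Long.BYTES;
        IntPredicate tracker;
        if (arrayBytes < bitMapBytes) {
            tracker = new ArrayTracker(numberOfHashFunctions);
        } else {
            BitSet seen = new BitSet(numberOfBits);
            tracker = index -> {
                boolean isNew = !seen.get(index);
                seen.set(index);
                return isNew;
            };
        }
        return new UniqueIndexFilterSketch(tracker, numberOfBits, consumer);
    }

    @Override
    public boolean test(int index) {
        if (index >= numberOfBits) {
            throw new IndexOutOfBoundsException("index " + index + " >= " + numberOfBits);
        }
        // Duplicates are dropped but still report true so the producer keeps going.
        return !tracker.test(index) || consumer.test(index);
    }

    /** Linear scan over a small array; cheap when there are only a few hash functions. */
    private static final class ArrayTracker implements IntPredicate {
        private int[] seen;
        private int count;

        ArrayTracker(int expected) {
            seen = new int[Math.max(1, expected)];
        }

        @Override
        public boolean test(int index) {
            for (int i = 0; i < count; i++) {
                if (seen[i] == index) {
                    return false;
                }
            }
            if (count == seen.length) {
                seen = Arrays.copyOf(seen, count * 2);
            }
            seen[count++] = index;
            return true;
        }
    }

    public static void main(String[] args) {
        IntPredicate filter = create(1000, 3, i -> {
            System.out.println("accepted " + i);
            return true;
        });
        filter.test(5);
        filter.test(7);
        filter.test(5);   // duplicate, not forwarded to the consumer
    }
}
```

Because the factory's return type is the interface, a test can verify only behavior (duplicates are suppressed on every code path) and not which tracker was selected, which is the trade-off raised in the thread.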
[GitHub] [commons-collections] Claudenw commented on a diff in pull request #317: moved IndexFilter to its own file.
Claudenw commented on code in PR #317: URL: https://github.com/apache/commons-collections/pull/317#discussion_r914490496 ## src/main/java/org/apache/commons/collections4/bloomfilter/IndexFilter.java: ## @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.commons.collections4.bloomfilter; + +import java.util.function.IntPredicate; + +/** + * A convenience class for Hasher implementations to filter out duplicate indices. + * + * If the index is negative the behavior is not defined. + * + * This is conceptually a unique filter implemented as a {@code IntPredicate}. + * @since 4.5 + */ +public final class IndexFilter implements IntPredicate { +private final IntPredicate tracker; +private final int size; +private final IntPredicate consumer; + +/** + * Creates an instance optimized for the specified shape. + * @param shape The shape that is being generated. + * @param consumer The consumer to accept the values. + * @return an IndexFilter optimized for the specified shape. + */ +public static IndexFilter create(Shape shape, IntPredicate consumer) { Review Comment: Changing to returning the IntPredicate means that the test code can not test the selection of the tracker type. 
Is this acceptable?
[jira] [Reopened] (COLLECTIONS-762) Add Simple Bloom filter with Generic type
[ https://issues.apache.org/jira/browse/COLLECTIONS-762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Herbert reopened COLLECTIONS-762: -- > Add Simple Bloom filter with Generic type > - > > Key: COLLECTIONS-762 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-762 > Project: Commons Collections > Issue Type: Improvement > Components: Collection >Affects Versions: 4.4 >Reporter: Claude Warren >Priority: Minor > Fix For: 4.5 > > > Provide a class that implements a Bloom filter that accepts a generic object > as a parameter to the merge() and contains() methods. > Implementation should use a Function style implementation to convert from the > generic type to a Builder.
[jira] [Resolved] (COLLECTIONS-762) Add Simple Bloom filter with Generic type
[ https://issues.apache.org/jira/browse/COLLECTIONS-762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Herbert resolved COLLECTIONS-762. -- Resolution: Done > Add Simple Bloom filter with Generic type > - > > Key: COLLECTIONS-762 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-762 > Project: Commons Collections > Issue Type: Improvement > Components: Collection >Affects Versions: 4.4 >Reporter: Claude Warren >Priority: Minor > Fix For: 4.5 > > > Provide a class that implements a Bloom filter that accepts a generic object > as a parameter to the merge() and contains() methods. > Implementation should use a Function style implementation to convert from the > generic type to a Builder.
[jira] [Resolved] (COLLECTIONS-749) Better documentation needed for HashFunctionIdentity.Signedness
[ https://issues.apache.org/jira/browse/COLLECTIONS-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Herbert resolved COLLECTIONS-749. -- Resolution: Done > Better documentation needed for HashFunctionIdentity.Signedness > --- > > Key: COLLECTIONS-749 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-749 > Project: Commons Collections > Issue Type: Improvement > Components: Collection >Affects Versions: 4.5 >Reporter: Claude Warren >Priority: Minor > Fix For: 4.5 > > Time Spent: 20m > Remaining Estimate: 0h > > From Alex Herbert: > HashFunctionIdentity.Signedness > > This is not fully documented as to what the sign applies to. There is no > javadoc on the enum values for SIGNED and UNSIGNED. The current javadoc > states "Identifies the signedness of the calculations for this function". > > I assume this applies to the Hash computation in 'long > HashFunction.apply(byte[], int)' > > Does this mean the hash algorithm has a variant that can treat the > bytes/seed as signed or unsigned? If so, which one, because 2 enum values > cannot cover all 4 possibilities? Since there is no javadoc it is > unclear exactly what this property is supposed to indicate.
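The practical stake behind the signedness question above can be shown with a short Java sketch. It illustrates only one possible reading, namely that the property describes how the 64-bit hash value is interpreted when it is reduced to a bit index. The issue itself leaves the meaning open, and the class and method names here are hypothetical.

```java
/** Hypothetical sketch: one 64-bit hash, two interpretations, two different indices. */
final class SignednessSketch {

    /** Treats the hash as a signed long and maps it into [0, numberOfBits). */
    static int signedIndex(long hash, int numberOfBits) {
        return (int) Math.floorMod(hash, (long) numberOfBits);
    }

    /** Treats the same 64 bits as an unsigned value before reducing. */
    static int unsignedIndex(long hash, int numberOfBits) {
        return (int) Long.remainderUnsigned(hash, (long) numberOfBits);
    }

    public static void main(String[] args) {
        long hash = -1L;   // all 64 bits set: -1 signed, 2^64 - 1 unsigned
        System.out.println(signedIndex(hash, 10));     // prints 9
        System.out.println(unsignedIndex(hash, 10));   // prints 5
    }
}
```

For non-negative hash values both methods agree; they diverge only when the top bit is set. Two filters built from the same hash function under different interpretations would then set different bits, which is why the javadoc needs to state what the enum values mean.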
[jira] [Resolved] (COLLECTIONS-750) Add a Bloom filter Hasher implementation that minimizes information leakage
[ https://issues.apache.org/jira/browse/COLLECTIONS-750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Herbert resolved COLLECTIONS-750. -- Resolution: Done > Add a Bloom filter Hasher implementation that minimizes information leakage > --- > > Key: COLLECTIONS-750 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-750 > Project: Commons Collections > Issue Type: Improvement > Components: Collection >Affects Versions: 4.5 >Reporter: Claude Warren >Priority: Minor > Fix For: 4.5 > > > The current implementations of Hasher either leak the contents that are being > hashed or are fixed to a specific shape. This change is to add a Hasher that > only tracks the hashed values so that secure systems may use the Bloom filter > implementation without fear of information leakage beyond that of the hashed > values.
[jira] [Reopened] (COLLECTIONS-749) Better documentation needed for HashFunctionIdentity.Signedness
[ https://issues.apache.org/jira/browse/COLLECTIONS-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Herbert reopened COLLECTIONS-749: -- > Better documentation needed for HashFunctionIdentity.Signedness > --- > > Key: COLLECTIONS-749 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-749 > Project: Commons Collections > Issue Type: Improvement > Components: Collection >Affects Versions: 4.5 >Reporter: Claude Warren >Priority: Minor > Fix For: 4.5 > > Time Spent: 20m > Remaining Estimate: 0h
[jira] [Reopened] (COLLECTIONS-750) Add a Bloom filter Hasher implementation that minimizes information leakage
[ https://issues.apache.org/jira/browse/COLLECTIONS-750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Herbert reopened COLLECTIONS-750: -- > Add a Bloom filter Hasher implementation that minimizes information leakage > --- > > Key: COLLECTIONS-750 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-750 > Project: Commons Collections > Issue Type: Improvement > Components: Collection >Affects Versions: 4.5 >Reporter: Claude Warren >Priority: Minor > Fix For: 4.5
[jira] [Reopened] (COLLECTIONS-823) BloomFilter: Optimize ArrayCountingBloomFilter.ForEachBitMap
[ https://issues.apache.org/jira/browse/COLLECTIONS-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Herbert reopened COLLECTIONS-823: -- > BloomFilter: Optimize ArrayCountingBloomFilter.ForEachBitMap > > > Key: COLLECTIONS-823 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-823 > Project: Commons Collections > Issue Type: Improvement > Components: Collection >Affects Versions: 4.5 >Reporter: Claude Warren >Assignee: Claude Warren >Priority: Minor > Labels: bloom-filter > Fix For: 4.5 > > Time Spent: 10m > Remaining Estimate: 0h
>
> *[aherbert|https://github.com/aherbert]* [on 27 Feb|https://github.com/apache/commons-collections/pull/258#discussion_r812499923]
> This converts all the non zero indices to a bitmap long[] array. But to do so requires using the {{forEachIndex}} method with a conditional boolean check on each loop iteration. I wonder if this should be brought inline for efficiency:
> {noformat}
> @Override
> public boolean forEachBitMap(LongPredicate consumer) {
>     Objects.requireNonNull(consumer, "consumer");
>     long[] result = new long[BitMap.numberOfBitMaps(shape.getNumberOfBits())];
>     for (int i = 0; i < counts.length; i++) {
>         if (counts[i] != 0) {
>             // Avoids a second check on the predicate result in forEachIndex
>             BitMap.set(result, i);
>         }
>     }
>     return BitMapProducer.fromBitMapArray(result).forEachBitMap(consumer);
> }
> {noformat}
> Or better yet, avoid the {{long[]}} array:
> {noformat}
> int blocksm1 = BitMap.numberOfBitMaps(shape.getNumberOfBits()) - 1;
> int i = 0;
> long value;
> for (int j = 0; j < blocksm1; j++) {
>     value = 0;
>     for (int k = 0; k < Long.SIZE; k++) {
>         if (counts[i++] != 0) {
>             value |= BitMap.getLongBit(k);
>         }
>     }
>     if (!consumer.test(value)) {
>         return false;
>     }
> }
> // Final block
> value = 0;
> for (int k = 0; i < counts.length; k++) {
>     if (counts[i++] != 0) {
>         value |= BitMap.getLongBit(k);
>     }
> }
> return consumer.test(value);
> {noformat}
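The second suggestion (build each 64-bit word on the fly instead of allocating a long[]) can be tried in isolation with the sketch below. It is a standalone approximation, not the library method: the class name is hypothetical, the counts array is passed in as a parameter, and `1L << k` stands in for `BitMap.getLongBit(k)`. One deliberate difference: for an empty counts array this version never calls the consumer.

```java
import java.util.function.LongPredicate;

final class CountsToBitMapsSketch {

    /**
     * Packs the non-zero positions of counts into 64-bit words, passing each
     * word to the consumer as soon as it is complete. Returns false if the
     * consumer asks to stop early.
     */
    static boolean forEachBitMap(int[] counts, LongPredicate consumer) {
        int i = 0;
        while (i < counts.length) {
            long value = 0;
            // The last block may cover fewer than 64 counts.
            int end = Math.min(i + Long.SIZE, counts.length);
            for (int k = 0; i < end; k++, i++) {
                if (counts[i] != 0) {
                    value |= 1L << k;
                }
            }
            if (!consumer.test(value)) {
                return false;
            }
        }
        return true;
    }
}
```

The predicate result is checked once per word rather than once per index, which is the saving the comment is after.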
[jira] [Resolved] (COLLECTIONS-823) BloomFilter: Optimize ArrayCountingBloomFilter.ForEachBitMap
[ https://issues.apache.org/jira/browse/COLLECTIONS-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Herbert resolved COLLECTIONS-823. -- Resolution: Done > BloomFilter: Optimize ArrayCountingBloomFilter.ForEachBitMap > > > Key: COLLECTIONS-823 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-823 > Project: Commons Collections > Issue Type: Improvement > Components: Collection >Affects Versions: 4.5 >Reporter: Claude Warren >Assignee: Claude Warren >Priority: Minor > Labels: bloom-filter > Fix For: 4.5 > > Time Spent: 10m > Remaining Estimate: 0h
[GitHub] [commons-collections] aherbert commented on a diff in pull request #317: moved IndexFilter to its own file.
aherbert commented on code in PR #317:
URL: https://github.com/apache/commons-collections/pull/317#discussion_r914478848

## src/main/java/org/apache/commons/collections4/bloomfilter/IndexFilter.java: ##
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.collections4.bloomfilter;
+
+import java.util.function.IntPredicate;
+
+/**
+ * A convenience class for Hasher implementations to filter out duplicate indices.
+ *
+ * If the index is negative the behavior is not defined.
+ *
+ * This is conceptually a unique filter implemented as a {@code IntPredicate}.
+ * @since 4.5
+ */
+public final class IndexFilter implements IntPredicate {
+    private final IntPredicate tracker;
+    private final int size;
+    private final IntPredicate consumer;
+
+    /**
+     * Creates an instance optimized for the specified shape.
+     * @param shape The shape that is being generated.
+     * @param consumer The consumer to accept the values.
+     * @return an IndexFilter optimized for the specified shape.
+     */
+    public static IndexFilter create(Shape shape, IntPredicate consumer) {

Review Comment:
I do not think this needs to return an IndexFilter. It is more flexible for the implementation to return an `IntPredicate`:

```Java
public final class IndexFilter /* not required - implements IntPredicate */ {
    public static IntPredicate create(Shape shape, IntPredicate consumer) {
        return new IndexFilter(shape, consumer)::test;
    }
```

All the documentation from the `test` method can be moved to the static constructor method.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@commons.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
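The pattern in the review comment (a static factory that hands back only an `IntPredicate`, with the concrete class kept as an implementation detail) can be tried with a self-contained stand-in. Everything below is hypothetical: the class name, the use of `java.util.BitSet` as the tracker, and an `int numberOfBits` parameter in place of the library's `Shape`.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

final class UniqueIndexFilterSketch {
    private final BitSet seen;
    private final IntPredicate consumer;

    private UniqueIndexFilterSketch(int numberOfBits, IntPredicate consumer) {
        this.seen = new BitSet(numberOfBits);
        this.consumer = consumer;
    }

    /**
     * Returns a predicate that forwards each distinct index to the consumer
     * exactly once. Callers only ever see the IntPredicate type.
     */
    static IntPredicate create(int numberOfBits, IntPredicate consumer) {
        return new UniqueIndexFilterSketch(numberOfBits, consumer)::test;
    }

    // Private: reachable only through the method reference returned by create().
    private boolean test(int index) {
        if (seen.get(index)) {
            return true; // duplicate: skip it but keep processing
        }
        seen.set(index);
        return consumer.test(index);
    }
}
```

Because the return type is the functional interface, the factory is free to hand back a different implementation later (a lambda, another class) without changing callers.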
[jira] [Resolved] (COLLECTIONS-822) BloomFilter: change ArrayCountinBloomFilter constructor exception type
[ https://issues.apache.org/jira/browse/COLLECTIONS-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claude Warren resolved COLLECTIONS-822. --- Fix Version/s: 4.5 Resolution: Won't Fix Changing exception matches documented behaviour > BloomFilter: change ArrayCountinBloomFilter constructor exception type > -- > > Key: COLLECTIONS-822 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-822 > Project: Commons Collections > Issue Type: Improvement > Components: Collection >Affects Versions: 4.5 >Reporter: Claude Warren >Priority: Minor > Labels: bloom-filter > Fix For: 4.5 > > > [src/main/java/org/apache/commons/collections4/bloomfilter/ArrayCountingBloomFilter.java|https://github.com/apache/commons-collections/pull/258/files/c3d78d5a9c033e4ded1f95a3868395b71dbfcc12#diff-b4b8848c4ea950c78499756d5fcad26bda95cf076423283f8eb77d26838fcf95] > > > | try {| > | filter.add(BitCountProducer.from(hasher.uniqueIndices(shape)));| > | } catch (IndexOutOfBoundsException e) {| > | throw new IllegalArgumentException(| > > > > Member > h3. > !https://avatars.githubusercontent.com/u/886334?s=48=4|width=24,height=24! > *[aherbert|https://github.com/aherbert]* [on 27 > Feb|https://github.com/apache/commons-collections/pull/258#discussion_r813354186] > Why change the IOOB exception to an IAE? Neither are documented to be thrown > by either add or merge. So here you have inconsistent exceptions being thrown. > Note: If you rethrow the exception you should include the original exception > as the cause. > I would just leave this as an IOOB exception and add it to the method > javadocs that this will occur for invalid indices. > > h3. > !https://avatars.githubusercontent.com/u/89772101?s=48=4|width=24,height=24! > *[Claude-at-Instaclustr|https://github.com/Claude-at-Instaclustr]* [on 10 > Mar|https://github.com/apache/commons-collections/pull/258#discussion_r823925019] > I think this is a documentation issue. 
We have specific comments for mergeInPlace that state it throws an IllegalArgumentException on numbers out of range. merge() and mergeInPlace() need to be consistent across all the implementations.
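The reviewer's note about keeping the original exception as the cause is easy to show in isolation. The sketch below is a hypothetical helper, not the ArrayCountingBloomFilter code; it only demonstrates translating an IndexOutOfBoundsException into an IllegalArgumentException without losing the original stack trace.

```java
final class RethrowSketch {

    /**
     * Increments the count at index. An out-of-range index is reported as an
     * IllegalArgumentException whose cause is the original exception.
     */
    static void increment(int[] counts, int index) {
        try {
            counts[index]++;
        } catch (IndexOutOfBoundsException e) {
            // Passing e as the second argument preserves it as the cause.
            throw new IllegalArgumentException("Index out of range: " + index, e);
        }
    }
}
```

Whichever exception type is chosen, documenting it with a `@throws` tag on each method that can raise it keeps the behaviour consistent across implementations.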
[jira] [Closed] (COLLECTIONS-762) Add Simple Bloom filter with Generic type
[ https://issues.apache.org/jira/browse/COLLECTIONS-762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claude Warren closed COLLECTIONS-762. - Resolution: Fixed This issue was resolved with simplification of Bloom filter code > Add Simple Bloom filter with Generic type > - > > Key: COLLECTIONS-762 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-762 > Project: Commons Collections > Issue Type: Improvement > Components: Collection >Affects Versions: 4.4 >Reporter: Claude Warren >Priority: Minor > Fix For: 4.5 > > > Provide a class that implements a Bloom filter that accepts a generic object > as a parameter to the merge() and contains() methods. > Implementation should use a Function style implementation to convert from the > generic type to a Builder.
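A rough sketch of what such a class could look like, under loud assumptions: the issue asks for a Function that converts the generic type to a Builder, whereas this stand-in uses a `Function<T, int[]>` that yields bit indices directly, and a `java.util.BitSet` as storage. The class name and that simplification are illustrative only.

```java
import java.util.BitSet;
import java.util.function.Function;

final class GenericFilterSketch<T> {
    private final BitSet bits = new BitSet();
    // Stand-in for the "generic type to Builder" conversion named in the issue.
    private final Function<T, int[]> toIndices;

    GenericFilterSketch(Function<T, int[]> toIndices) {
        this.toIndices = toIndices;
    }

    /** Sets every bit that the item maps to. */
    void merge(T item) {
        for (int index : toIndices.apply(item)) {
            bits.set(index);
        }
    }

    /** Returns true only if every bit the item maps to is already set. */
    boolean contains(T item) {
        for (int index : toIndices.apply(item)) {
            if (!bits.get(index)) {
                return false;
            }
        }
        return true;
    }
}
```

As with any Bloom filter, `contains` can report false positives when different items map to overlapping bits; it never reports a false negative for a merged item.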
[jira] [Closed] (COLLECTIONS-750) Add a Bloom filter Hasher implementation that minimizes information leakage
[ https://issues.apache.org/jira/browse/COLLECTIONS-750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claude Warren closed COLLECTIONS-750. - Fix Version/s: 4.5 Resolution: Fixed This issue was resolved with simplification of Bloom filter code > Add a Bloom filter Hasher implementation that minimizes information leakage > --- > > Key: COLLECTIONS-750 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-750 > Project: Commons Collections > Issue Type: Improvement > Components: Collection >Affects Versions: 4.5 >Reporter: Claude Warren >Priority: Minor > Fix For: 4.5
[jira] [Updated] (COLLECTIONS-749) Better documentation needed for HashFunctionIdentity.Signedness
[ https://issues.apache.org/jira/browse/COLLECTIONS-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claude Warren updated COLLECTIONS-749: -- Fix Version/s: 4.5 > Better documentation needed for HashFunctionIdentity.Signedness > --- > > Key: COLLECTIONS-749 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-749 > Project: Commons Collections > Issue Type: Improvement > Components: Collection >Affects Versions: 4.5 >Reporter: Claude Warren >Priority: Minor > Fix For: 4.5 > > Time Spent: 20m > Remaining Estimate: 0h
[jira] [Closed] (COLLECTIONS-749) Better documentation needed for HashFunctionIdentity.Signedness
[ https://issues.apache.org/jira/browse/COLLECTIONS-749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Claude Warren closed COLLECTIONS-749. - Resolution: Fixed This issue was resolved with simplification of Bloom filter code > Better documentation needed for HashFunctionIdentity.Signedness > --- > > Key: COLLECTIONS-749 > URL: https://issues.apache.org/jira/browse/COLLECTIONS-749 > Project: Commons Collections > Issue Type: Improvement > Components: Collection >Affects Versions: 4.5 >Reporter: Claude Warren >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h