Re: [all] OSS Fuzz
+1 for OSS Fuzz.

Fabian also got in contact a few days earlier and asked me about using it with Commons Imaging. I told him it had to be discussed here first, but that I thought it could be useful (we are parsing several image file formats, so probably a few things could be improved).

As for the mailing list, for me it depends on the number of messages and of false positives. That is, if we get 50 e-mails in security@commons in one week, and it turns out only 1 is actually a security issue and the others are either normal bugs or not bugs at all, then eventually I think I'd just create a filter to move all the security@commons mail to a folder and have a look someday.

I think we don't have any idea how many e-mails we might get by enabling it for one or for a few components. So I'd be OK with

- sending e-mails to security@commons initially, but if it spams the list with non-security-related e-mails, then moving to a separate mailing list; OR
- creating the new mailing list (probably private too, until we filter the issues?) and using it for a few weeks/months. If the traffic is low, or most issues are really security related, then moving to security@commons if others agree.

Either way would be OK for me.

Cheers
Bruno

On Wednesday, 14 April 2021, 4:49:31 am NZST, Stefan Bodewig wrote:

> Hi all
>
> I want to pick up (and finish) the discussion that started in Compress[1].
>
> Short Recap: OSS Fuzz[2] runs fuzz testing for open source projects by invoking methods of our code with random data, looking for unexpected outcomes (undeclared exceptions or, worse, code that never returns because it is stuck in an infinite loop, for example).
>
> For Compress, Fabian (who started [1]) has already identified and reported several issues, one of which would have become a CVE if the code in question had been part of any release of Compress. In the past, other people have run different fuzzers and found "interesting" results in Compress as well. Compress may be especially vulnerable as it basically tries to make sense out of a bunch of user-supplied bytes - but the same is probably true for codec or imaging, for example.
>
> Fabian has offered to set up OSS Fuzz for Compress. Given that the issues OSS Fuzz detects may or may not be security sensitive, I don't feel it would be a good idea to have the tool send reports to a public mailing list. Therefore I propose to create another subscription-moderated list just for these kinds of reports. I'm afraid it could be too noisy for security@commons.
>
> Proposal
>
> Unless anybody objects before then, I will create such a list (I believe there is a self-service thingy for that, otherwise I'll ask the infra folks) on the coming Sunday. I'd add myself as a moderator but we will need more moderators. Also I'll gladly accept ideas for the name of the list.
>
> If there are objections against yet another mailing list I'll ask Fabian to set things up using a private mail alias. If you want to receive the messages as well, please tell me.
>
> Cheers
>
> Stefan
>
> [1] https://lists.apache.org/thread.html/rb34ea7d9272b8e600437ea705b13aba1bcc2f23ceb55880bce27e479%40%3Cdev.commons.apache.org%3E
> [2] https://google.github.io/oss-fuzz/

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
Re: [Vote] Create a "machine learning" component
On Tue, 13 Apr 2021 at 18:21, Avijit Basak wrote:
>
> Hi
>
> Please find my comments below.
>
> >> I don't follow the distinction "prod" vs "non-prod".
> -- Actually in Prod we really need a very high performing system. So use of implicit parallelism in Spark would help us to achieve it. But for other types of work like POC or R we may not need such performance.

Isn't a GA inherently parallel?
If so, why not take advantage of the concurrency tools provided by the JDK?

> >> the question was actually whether you are willing to modularize CM
> -- I am not much aware of other ML components in Commons. I would look into it.

I've mentioned them in earlier messages:
* Self-organizing feature map (artificial neural net)
* Clustering

The former is multi-threaded; the latter should be refactored to take advantage of multi-threading.

> >> You did not expand about the usability/performance (e.g. the issue of multi-threading)
> -- Are we planning to incorporate parallel GA?

Aren't you?

> Then multi-threading would be a more appropriate option.

IMHO, a necessary one.

> >> So, as a way forward, I would suggest that you create a project on GitHub (copying all the settings from a *Commons modular* component, such as "Commons Numbers")
> -- Could you kindly share the GitHub repository URL for any Commons modular component?

https://github.com/apache/commons-rng
https://github.com/apache/commons-numbers
https://github.com/apache/commons-geometry
https://github.com/apache/commons-statistics

> Thanks & Regards
> --Avijit Basak
>
> On Tue, 13 Apr 2021 at 18:29, Gilles Sadowski wrote:
>
> > Hello.
> >
> > On Mon, 12 Apr 2021 at 17:21, Avijit Basak wrote:
> > >
> > > Hi
> > >
> > > Sorry for the delayed response. Thanks for your patience. Please find my comments below:
> > >
> > > (1) Why not Spark? [At least post over there (?).]
> > > --We can move to Spark. But it will be very useful if the things can also run without Spark. The use of Spark would make more sense in a production environment. But the portability of the library will be more useful for the non-prod environment.
> >
> > I don't follow the distinction "prod" vs "non-prod".
> >
> > > Definitely, we can reach the Spark team and query.
> >
> > That would be a good idea...
> >
> > > (2) Further develop a monolithic CM? [Who will do it?]
> > > --I can help with the upgrade of the existing library related to GA functionality.
> >
> > Sure, but nobody is currently working on (2).
> >
> > > (3) Modularize CM? [Who will do it?]
> > > --I can help with the upgrade of the existing library related to GA functionality.
> >
> > I don't doubt it; but the question was actually whether you are willing to modularize CM (that is: in addition to, and before, contributing to the GA functionality).
> >
> > > (4) New component (with another name) with the proposed contents?
> > > --This is the best option if permitted.
> >
> > Currently, only the two of us are in favour of this alternative.
> >
> > Nobody, by their action, is really in favour of any of the other alternatives.
> > So, as a way forward, I would suggest that you create a project on GitHub (copying all the settings from a Commons modular component, such as "Commons Numbers"), to be eventually integrated here, once its potential has been demonstrated.
> >
> > > The code which I have written can be reused with minor modifications.
> > > So it won't take too much effort for this activity.
> >
> > You did not expand about the usability/performance (e.g. the issue of multi-threading)...
> >
> > Regards,
> > Gilles
> >
> > >> [...]
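The question in the message above, whether a GA can simply use the JDK's own concurrency tools, can be made concrete with a small sketch. It assumes that fitness evaluation is independent per individual and free of shared mutable state, which is what makes that step easy to parallelise. Everything named here (`ParallelFitnessSketch`, the bit-counting "OneMax" fitness, the population sizes) is hypothetical and is not taken from the proposed component or from Commons Math; it only shows one JDK-only option, a parallel stream, next to its sequential equivalent.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

/** Sketch: evaluating a GA population's fitness with plain JDK parallel streams. */
class ParallelFitnessSketch {

    /** Toy "OneMax" fitness (an assumption for illustration): number of set bits. */
    static int fitness(BitSet chromosome) {
        return chromosome.cardinality();
    }

    /** Builds a reproducible random population; size, gene count and seed are illustrative. */
    static List<BitSet> randomPopulation(int size, int genes, long seed) {
        Random random = new Random(seed);
        List<BitSet> population = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            BitSet chromosome = new BitSet(genes);
            for (int g = 0; g < genes; g++) {
                chromosome.set(g, random.nextBoolean());
            }
            population.add(chromosome);
        }
        return population;
    }

    /**
     * Evaluates every individual, optionally in parallel. The returned list keeps
     * the population's order because the source list is an ordered stream.
     */
    static List<Integer> evaluate(List<BitSet> population, boolean parallel) {
        return (parallel ? population.parallelStream() : population.stream())
                .map(ParallelFitnessSketch::fitness)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<BitSet> population = randomPopulation(200, 64, 1L);
        List<Integer> scores = evaluate(population, true);
        System.out.println("Evaluated " + scores.size() + " individuals in parallel");
    }
}
```

Selection, crossover and mutation are left out on purpose; whether they parallelise as easily depends on the design, and a real fitness function would have to be thread-safe for this approach to be valid.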
Re: [all] OSS Fuzz
Please don't use @security for automated emails; that ML IMO should be for humans. If you want to set up a new ML for bots that's fine, and we can direct GitHub's Dependabot emails there if GitHub allows for that.

Gary

On Tue, Apr 13, 2021, 12:57 Mark Thomas wrote:

> On 13/04/2021 17:49, Stefan Bodewig wrote:
> >
> > Fabian has offered to set up OSS Fuzz for Compress. Given that the issues OSS Fuzz detects may or may not be security sensitive, I don't feel it would be a good idea to have the tool send reports to a public mailing list. Therefore I propose to create another subscription moderated list just for these kinds of reports. I'm afraid it could be too noisy for security@commons.
>
> Following the "split by audience, not by topic" guideline, I'd suggest using security@commons.a.o rather than a separate list. Much, much bigger projects than Compress use OSS Fuzz and direct traffic to their security list where it seems to be manageable.
>
> Mark
Re: [all] OSS Fuzz
On 13/04/2021 17:49, Stefan Bodewig wrote:

> Fabian has offered to set up OSS Fuzz for Compress. Given that the issues OSS Fuzz detects may or may not be security sensitive, I don't feel it would be a good idea to have the tool send reports to a public mailing list. Therefore I propose to create another subscription moderated list just for these kinds of reports. I'm afraid it could be too noisy for security@commons.

Following the "split by audience, not by topic" guideline, I'd suggest using security@commons.a.o rather than a separate list. Much, much bigger projects than Compress use OSS Fuzz and direct traffic to their security list where it seems to be manageable.

Mark
[all] OSS Fuzz
Hi all

I want to pick up (and finish) the discussion that started in Compress[1].

Short Recap: OSS Fuzz[2] runs fuzz testing for open source projects by invoking methods of our code with random data, looking for unexpected outcomes (undeclared exceptions or, worse, code that never returns because it is stuck in an infinite loop, for example).

For Compress, Fabian (who started [1]) has already identified and reported several issues, one of which would have become a CVE if the code in question had been part of any release of Compress. In the past, other people have run different fuzzers and found "interesting" results in Compress as well. Compress may be especially vulnerable as it basically tries to make sense out of a bunch of user-supplied bytes - but the same is probably true for codec or imaging, for example.

Fabian has offered to set up OSS Fuzz for Compress. Given that the issues OSS Fuzz detects may or may not be security sensitive, I don't feel it would be a good idea to have the tool send reports to a public mailing list. Therefore I propose to create another subscription-moderated list just for these kinds of reports. I'm afraid it could be too noisy for security@commons.

Proposal

Unless anybody objects before then, I will create such a list (I believe there is a self-service thingy for that, otherwise I'll ask the infra folks) on the coming Sunday. I'd add myself as a moderator but we will need more moderators. Also I'll gladly accept ideas for the name of the list.

If there are objections against yet another mailing list I'll ask Fabian to set things up using a private mail alias. If you want to receive the messages as well, please tell me.

Cheers

Stefan

[1] https://lists.apache.org/thread.html/rb34ea7d9272b8e600437ea705b13aba1bcc2f23ceb55880bce27e479%40%3Cdev.commons.apache.org%3E
[2] https://google.github.io/oss-fuzz/
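To illustrate the kind of testing the recap above describes (random input, watching for undeclared exceptions or calls that never return), here is a minimal JDK-only sketch. It is a naive random loop, not how OSS Fuzz works internally, since OSS Fuzz relies on coverage-guided fuzzing engines. The parser `parseRecord`, its planted bug, the class name `FuzzSketch`, the seed and the timeout are all hypothetical and are not Commons Compress code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Naive random-input fuzz loop; illustrative only. */
class FuzzSketch {

    /**
     * Hypothetical parser under test: the first byte is a length, followed by
     * that many payload bytes. IOException is its only declared failure mode.
     */
    static byte[] parseRecord(byte[] input) throws IOException {
        if (input.length == 0) {
            throw new IOException("empty input");
        }
        int length = input[0]; // planted bug: bytes are signed, so this can be negative
        if (length > input.length - 1) {
            throw new IOException("truncated record");
        }
        byte[] payload = new byte[length]; // NegativeArraySizeException if length < 0
        System.arraycopy(input, 1, payload, 0, length);
        return payload;
    }

    /** Feeds random byte arrays to the parser and describes every unexpected outcome. */
    static List<String> fuzz(long seed, int iterations, long timeoutMillis) {
        Random random = new Random(seed);
        List<String> findings = new ArrayList<>();
        // Daemon worker thread, so a parser stuck in a loop cannot keep the JVM alive.
        ExecutorService worker = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "fuzz-worker");
            thread.setDaemon(true);
            return thread;
        });
        try {
            for (int i = 0; i < iterations; i++) {
                byte[] input = new byte[random.nextInt(16)];
                random.nextBytes(input);
                Callable<byte[]> task = () -> parseRecord(input);
                Future<byte[]> call = worker.submit(task);
                try {
                    call.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (ExecutionException e) {
                    // A declared IOException is the parser rejecting bad input as designed.
                    if (!(e.getCause() instanceof IOException)) {
                        findings.add(e.getCause().getClass().getSimpleName() + " on input #" + i);
                    }
                } catch (TimeoutException e) {
                    findings.add("possible infinite loop on input #" + i);
                    break; // the single worker may be stuck, so stop here
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        } finally {
            worker.shutdownNow();
        }
        return findings;
    }

    public static void main(String[] args) {
        List<String> findings = fuzz(42L, 1_000, 2_000L);
        System.out.println(findings.size() + " unexpected outcomes, first: "
                + (findings.isEmpty() ? "none" : findings.get(0)));
    }
}
```

Whether a given finding is security sensitive still needs human triage, which is the concern behind the question of where such reports should be sent.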
Re: [Vote] Create a "machine learning" component
Hi

Please find my comments below.

>> I don't follow the distinction "prod" vs "non-prod".
-- Actually in Prod we really need a very high performing system. So use of implicit parallelism in Spark would help us to achieve it. But for other types of work like POC or R we may not need such performance.

>> the question was actually whether you are willing to modularize CM
-- I am not much aware of other ML components in Commons. I would look into it.

>> You did not expand about the usability/performance (e.g. the issue of multi-threading)
-- Are we planning to incorporate parallel GA? Then multi-threading would be a more appropriate option.

>> So, as a way forward, I would suggest that you create a project on GitHub (copying all the settings from a *Commons modular* component, such as "Commons Numbers")
-- Could you kindly share the GitHub repository URL for any Commons modular component?

Thanks & Regards
--Avijit Basak

On Tue, 13 Apr 2021 at 18:29, Gilles Sadowski wrote:

> Hello.
>
> On Mon, 12 Apr 2021 at 17:21, Avijit Basak wrote:
> >
> > Hi
> >
> > Sorry for the delayed response. Thanks for your patience. Please find my comments below:
> >
> > (1) Why not Spark? [At least post over there (?).]
> > --We can move to Spark. But it will be very useful if the things can also run without Spark. The use of Spark would make more sense in a production environment. But the portability of the library will be more useful for the non-prod environment.
>
> I don't follow the distinction "prod" vs "non-prod".
>
> > Definitely, we can reach the Spark team and query.
>
> That would be a good idea...
>
> > (2) Further develop a monolithic CM? [Who will do it?]
> > --I can help with the upgrade of the existing library related to GA functionality.
>
> Sure, but nobody is currently working on (2).
>
> > (3) Modularize CM? [Who will do it?]
> > --I can help with the upgrade of the existing library related to GA functionality.
>
> I don't doubt it; but the question was actually whether you are willing to modularize CM (that is: in addition to, and before, contributing to the GA functionality).
>
> > (4) New component (with another name) with the proposed contents?
> > --This is the best option if permitted.
>
> Currently, only the two of us are in favour of this alternative.
>
> Nobody, by their action, is really in favour of any of the other alternatives.
> So, as a way forward, I would suggest that you create a project on GitHub (copying all the settings from a Commons modular component, such as "Commons Numbers"), to be eventually integrated here, once its potential has been demonstrated.
>
> > The code which I have written can be reused with minor modifications.
> > So it won't take too much effort for this activity.
>
> You did not expand about the usability/performance (e.g. the issue of multi-threading)...
>
> Regards,
> Gilles
>
> >> [...]

--
Avijit Basak
Re: [Vote] Create a "machine learning" component
Hello.

On Mon, 12 Apr 2021 at 17:21, Avijit Basak wrote:
>
> Hi
>
> Sorry for the delayed response. Thanks for your patience. Please find my comments below:
>
> (1) Why not Spark? [At least post over there (?).]
> --We can move to Spark. But it will be very useful if the things can also run without Spark. The use of Spark would make more sense in a production environment. But the portability of the library will be more useful for the non-prod environment.

I don't follow the distinction "prod" vs "non-prod".

> Definitely, we can reach the Spark team and query.

That would be a good idea...

> (2) Further develop a monolithic CM? [Who will do it?]
> --I can help with the upgrade of the existing library related to GA functionality.

Sure, but nobody is currently working on (2).

> (3) Modularize CM? [Who will do it?]
> --I can help with the upgrade of the existing library related to GA functionality.

I don't doubt it; but the question was actually whether you are willing to modularize CM (that is: in addition to, and before, contributing to the GA functionality).

> (4) New component (with another name) with the proposed contents?
> --This is the best option if permitted.

Currently, only the two of us are in favour of this alternative.

Nobody, by their action, is really in favour of any of the other alternatives.
So, as a way forward, I would suggest that you create a project on GitHub (copying all the settings from a Commons modular component, such as "Commons Numbers"), to be eventually integrated here, once its potential has been demonstrated.

> The code which I have written can be reused with minor modifications.
> So it won't take too much effort for this activity.

You did not expand about the usability/performance (e.g. the issue of multi-threading)...

Regards,
Gilles

>> [...]
Re: [lang] Failing test on Java 16-EA.
Hello Gary,

I had a look at this one and I was able to reproduce it. Based on my reading of the code and what it does, IMO, this is a JDK issue. Since this was previously raised and reported on this list here[1] and a JDK issue was created (https://bugs.openjdk.java.net/browse/JDK-8262108), I decided to reopen that issue and have included the necessary details of my investigation there.

[1] https://www.mail-archive.com/dev@commons.apache.org/msg70599.html

P.S.: I'm not subscribed to this commons dev mailing list and I just watch/reply from the Apache mailing list tools, so my responses might be delayed.

-Jaikiran

On 2021/03/28 17:17:13, Gary Gregory wrote:
> I'm still looking for help on getting LANG working on Java 16...
>
> Gary
>
> On Sat, Mar 20, 2021, 21:39 Gary Gregory wrote:
>
> > Now that Java 16 is out, we really need to look at this IMO but I would like help from the community.
> >
> > My initial guess that this is a JDK bug might be wrong and it could be an issue in our code.
> >
> > Gary
> >
> > On Tue, Feb 23, 2021, 22:13 Gary Gregory wrote:
> >
> >> Hi All:
> >>
> >> If you feel so inclined, I'd like help with FastDateParserTest.java#testParsesKnownJava16Ea25Failure().
> >>
> >> The test fails on Java 16 Early Access build 25 and above; I am now testing with build 36.
> >>
> >> I cannot tell if this is a bug in our code or in the underlying JRE.
> >>
> >> TY!
> >> Gary