date:20210413

Re: [all] OSS Fuzz

2021-04-13 Thread Bruno P. Kinoshita

+1 for oss fuzz. Fabian also got in contact a few days earlier, and asked me
about using it with Commons Imaging. I told him it had to be discussed here
first, but that I thought it could be useful (we are parsing several image file
formats, probably a few things could be improved).

As for the mailing list, for me it depends on the amount of messages and
false positives, i.e. if we get 50 e-mails in security@commons in one week, and
it turns out only 1 is actually a security issue, and the others are either
normal bugs or not bugs at all, then eventually I think I'd just create a filter
to move all the security@commons mail to a folder and have a look someday.

I think we don't have any idea how many e-mails we might get after enabling it
for one or for a few components. So I'd be OK with either

- sending e-mails to security@commons initially, but if it spams the list with
non-security related e-mails, then move to a separate mailing list; OR
- create the new mailing list (probably private too? until we filter the
issues?) and use it for a few weeks/months. If the traffic is low, or most
issues are really security related, then move to security@commons if others
agree

Either way would be OK for me.

Cheers
Bruno

On Wednesday, 14 April 2021, 4:49:31 am NZST, Stefan Bodewig
wrote:

Hi all

I want to pick up (and finish) the discussion that started in
Compress[1].

Short Recap:

OSS Fuzz[2] runs fuzz testing for open source projects by invoking
methods of our code with random data, looking for unexpected outcomes
(undeclared exceptions or, worse, code that never returns because it is
stuck in an infinite loop, for example).
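
To make the idea concrete, here is a minimal sketch of what "invoking methods
with random data" means. It is not how OSS Fuzz itself works (its engines are
coverage-guided and far more sophisticated), and the parser below is a
hypothetical stand-in, not code from Compress. The class name MiniFuzz and the
methods parse and fuzz are assumptions made up for this illustration.

```java
import java.io.IOException;
import java.util.Random;

public class MiniFuzz {

    /**
     * Hypothetical parser under test (not a Commons API): the first byte is a
     * length, followed by that many payload bytes.
     */
    static byte[] parse(byte[] input) throws IOException {
        if (input.length == 0) {
            throw new IOException("empty input");
        }
        int len = input[0] & 0xFF;
        if (input.length - 1 < len) {
            throw new IOException("truncated payload");
        }
        byte[] payload = new byte[len];
        System.arraycopy(input, 1, payload, 0, len);
        return payload;
    }

    /**
     * Feeds pseudo-random inputs to parse() and counts the undeclared
     * (runtime) exceptions, which are the "unexpected outcomes".
     */
    static int fuzz(long seed, int iterations) {
        Random random = new Random(seed);
        int unexpected = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] input = new byte[random.nextInt(64)];
            random.nextBytes(input);
            try {
                parse(input);
            } catch (IOException expected) {
                // Declared failure mode: malformed input was rejected cleanly.
            } catch (RuntimeException undeclared) {
                // Undeclared failure: this is what a fuzzer would report.
                unexpected++;
            }
        }
        return unexpected;
    }

    public static void main(String[] args) {
        System.out.println("undeclared failures: " + fuzz(42L, 10_000));
    }
}
```

A loop like this cannot detect the "never returns" case; a real harness would
also need a per-input timeout.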

For Compress Fabian (who started [1]) has already identified and
reported several issues, one of which would have become a CVE if the
code in question had been part of any release of Compress. In the past
other people have run different fuzzers and found "interesting" results
in Compress as well.

Compress may be especially vulnerable as it basically tries to make
sense out of a bunch of user supplied bytes - but the same is probably
true for codec or imaging for example.

Fabian has offered to set up OSS Fuzz for Compress. Given that the
issues OSS Fuzz detects may or may not be security sensitive, I don't
feel it would be a good idea to have the tool send reports to a public
mailing list. Therefore I propose to create another subscription
moderated list just for these kinds of reports. I'm afraid it could be
too noisy for security@commons.

Proposal

Unless anybody objects until then I will create such a list (I believe
there is a self-service thingy for that, otherwise I'll ask the infra
folks) on the coming Sunday. I'd add myself as a moderator but we will
need more moderators. Also I'll gladly accept ideas for the name of the
list.

If there are objections against yet another mailing list I'll ask Fabian
to set things up using a private mail alias. If you want to receive the
messages as well, please tell me.

Cheers

Stefan

[1]
https://lists.apache.org/thread.html/rb34ea7d9272b8e600437ea705b13aba1bcc2f23ceb55880bce27e479%40%3Cdev.commons.apache.org%3E

[2] https://google.github.io/oss-fuzz/

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Re: [Vote] Create a "machine learning" component

2021-04-13 Thread Gilles Sadowski

On Tue, 13 Apr 2021 at 18:21, Avijit Basak wrote:
>
> Hi
>
>   Please find my comments below.
>
> >> I don't follow the distinction "prod" vs "non-prod".
>  -- Actually in Prod we really need a very high performing system. So
> use of implicit parallelism in spark would help us to achieve it. But for
> other types of work like POC or R&D we may not need such performance.

Isn't a GA inherently parallel?
If so, why not take advantage of the concurrency tools provided by the JDK?
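
As a minimal sketch of that point, assuming fitness evaluation is the dominant
cost and that each chromosome can be scored independently: the JDK's parallel
streams are enough to spread the evaluation over the available cores. The class
ParallelFitness, the int[] chromosome encoding and the "one-max" fitness are
illustrative assumptions, not the proposed component's API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

public class ParallelFitness {

    /**
     * Scores every chromosome concurrently on the common fork-join pool.
     * The result keeps the order of the population list.
     */
    static double[] evaluate(List<int[]> population,
                             ToDoubleFunction<int[]> fitness) {
        return population.parallelStream()
                         .mapToDouble(fitness)
                         .toArray();
    }

    /** Toy fitness ("one-max"): the number of 1 bits in the chromosome. */
    static double oneMax(int[] chromosome) {
        return Arrays.stream(chromosome).sum();
    }

    public static void main(String[] args) {
        List<int[]> population = Arrays.asList(
            new int[] {1, 0, 1, 0},
            new int[] {1, 1, 1, 0},
            new int[] {0, 0, 0, 0});
        System.out.println(
            Arrays.toString(evaluate(population, ParallelFitness::oneMax)));
    }
}
```

The fitness function must be free of shared mutable state for this to be safe;
an ExecutorService would be the alternative when more control over the thread
pool is needed.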

> >> the question was actually whether you are willing to modularize CM
>  -- I am not much aware of other ml components in commons. I would look
> into it.

I've mentioned them in earlier messages:
 * Self-organizing feature map (artificial neural net)
 * Clustering

The former is multi-threaded; the latter should be refactored to
take advantage of multi-threading.

> >>You did not expand about the usability/performance (e.g. the issue of
> multi-threading)
>  -- Are we planning to incorporate parallel GA?

Aren't you?

> Then multi-threading
> would be a more appropriate option.

IMHO, a necessary one.

> >> So, as a way forward, I would suggest that you create a project on
> GitHub (copying all the settings from a *Commons modular* component, such as
> "Commons Numbers")
>  -- Could you kindly share the GitHub repository URL for any Commons
> modular component.

https://github.com/apache/commons-rng
https://github.com/apache/commons-numbers
https://github.com/apache/commons-geometry
https://github.com/apache/commons-statistics

>
> Thanks & Regards
> --Avijit Basak
>
>
> On Tue, 13 Apr 2021 at 18:29, Gilles Sadowski  wrote:
>
> > Hello.
> >
> > On Mon, 12 Apr 2021 at 17:21, Avijit Basak wrote:
> > >
> > > Hi
> > >
> > >  Sorry for the delayed response. Thanks for your patience. Please
> > > find my comments below:
> > >
> > >  (1) Why not Spark?  [At least post over there (?).]
> > >   --We can move to Spark. But it will be very much useful if the things
> > > can also run without Spark. The use of Spark would make more sense in a
> > > production environment. But the portability of the library will be more
> > > useful for the non-prod environment.
> >
> > I don't follow the distinction "prod" vs "non-prod".
> >
> > > Definitely, we can reach the Spark
> > > team and query.
> >
> > That would be a good idea...
> >
> > >  (2) Further develop a monolithic CM?  [Who will do it?]
> > >   --I can help with the upgrade of the existing library related to GA
> > > functionality.
> >
> > Sure, but nobody is currently working on (2).
> >
> > >  (3) Modularize CM? [Who will do it?]
> > >   --I can help with the upgrade of the existing library related to GA
> > > functionality.
> >
> > I don't doubt it; but the question was actually whether you are willing
> > to modularize CM (that is: in addition to, and before, contributing to
> > the GA functionality).
> >
> > >  (4) New component (with another name) with the proposed contents?
> > >--This is the best option if permitted.
> >
> > Currently, only the two of us are in favour of this alternative.
> >
> > Nobody, by their action, is really in favour of any of the other
> > alternatives.
> > So, as a way forward, I would suggest that you create a project on GitHub
> > (copying all the settings from a Commons modular component, such as
> > "Commons Numbers"), to be eventually integrated here, once its potential
> > has been demonstrated.
> >
> > >   The code which I have written can be reused with minor modifications.
> > > So it won't take too much effort for this activity.
> >
> > You did not expand about the usability/performance (e.g. the issue of
> > multi-threading)...
> >
> > Regards,
> > Gilles
> >
> > >> [...]
> >


Re: [all] OSS Fuzz

2021-04-13 Thread Gary Gregory

Please don't use @security for automated emails; that ML, IMO, should be for
humans.

If you want to set up a new ML for bots, that's fine; we can direct GitHub's
Dependabot emails there if GitHub allows for that.

Gary

On Tue, Apr 13, 2021, 12:57 Mark Thomas  wrote:

> On 13/04/2021 17:49, Stefan Bodewig wrote:
>
> 
>
> > Fabian has offered to set up OSS Fuzz for Compress. Given that the
> > issues OSS Fuzz detects may or may not be security sensitive, I don't
> > feel it would be a good idea to have the tool send reports to a public
> > mailing list. Therefore I propose to create another subscription
> > moderated list just for these kinds of reports. I'm afraid it could be
> > too noisy for security@commons.
>
> Following the "split by audience, not by topic" guideline, I'd suggest
> using security@commons.a.o rather than a separate list. Much, much
> bigger projects than Compress use OSS Fuzz and direct traffic to their
> security list where it seems to be manageable.
>
> Mark
>

Re: [all] OSS Fuzz

2021-04-13 Thread Mark Thomas


On 13/04/2021 17:49, Stefan Bodewig wrote:

> Fabian has offered to set up OSS Fuzz for Compress. Given that the
> issues OSS Fuzz detects may or may not be security sensitive, I don't
> feel it would be a good idea to have the tool send reports to a public
> mailing list. Therefore I propose to create another subscription
> moderated list just for these kinds of reports. I'm afraid it could be
> too noisy for security@commons.

Following the "split by audience, not by topic" guideline, I'd suggest 
using security@commons.a.o rather than a separate list. Much, much 
bigger projects than Compress use OSS Fuzz and direct traffic to their 
security list where it seems to be manageable.


Mark


[all] OSS Fuzz

2021-04-13 Thread Stefan Bodewig

Hi all

I want to pick up (and finish) the discussion that started in
Compress[1].

Short Recap:

Compress may be especially vulnerable as it basically tries to make
sense out of a bunch of user supplied bytes - but the same is probably
true for codec or imaging for example.

Proposal

If there are objections against yet another mailing list I'll ask Fabian
to set things up using a private mail alias. If you want to receive the
messages as well, please tell me.

Cheers

Stefan

[1]
https://lists.apache.org/thread.html/rb34ea7d9272b8e600437ea705b13aba1bcc2f23ceb55880bce27e479%40%3Cdev.commons.apache.org%3E

[2] https://google.github.io/oss-fuzz/


Re: [Vote] Create a "machine learning" component

2021-04-13 Thread Avijit Basak

Hi

  Please find my comments below.

>> I don't follow the distinction "prod" vs "non-prod".
 -- Actually in Prod we really need a very high performing system. So
use of implicit parallelism in spark would help us to achieve it. But for
other types of work like POC or R&D we may not need such performance.
>> the question was actually whether you are willing to modularize CM
 -- I am not much aware of other ml components in commons. I would look
into it.
>>You did not expand about the usability/performance (e.g. the issue of
multi-threading)
 -- Are we planning to incorporate parallel GA? Then multi-threading
would be a more appropriate option.
>> So, as a way forward, I would suggest that you create a project on
GitHub (copying all the settings from a *Commons modular* component, such as
"Commons Numbers")
 -- Could you kindly share the GitHub repository URL for any Commons
modular component.

Thanks & Regards
--Avijit Basak


On Tue, 13 Apr 2021 at 18:29, Gilles Sadowski  wrote:

> Hello.
>
> On Mon, 12 Apr 2021 at 17:21, Avijit Basak wrote:
> >
> > Hi
> >
> >  Sorry for the delayed response. Thanks for your patience. Please
> > find my comments below:
> >
> >  (1) Why not Spark?  [At least post over there (?).]
> >   --We can move to Spark. But it will be very much useful if the things
> > can also run without Spark. The use of Spark would make more sense in a
> > production environment. But the portability of the library will be more
> > useful for the non-prod environment.
>
> I don't follow the distinction "prod" vs "non-prod".
>
> > Definitely, we can reach the Spark
> > team and query.
>
> That would be a good idea...
>
> >  (2) Further develop a monolithic CM?  [Who will do it?]
> >   --I can help with the upgrade of the existing library related to GA
> > functionality.
>
> Sure, but nobody is currently working on (2).
>
> >  (3) Modularize CM? [Who will do it?]
> >   --I can help with the upgrade of the existing library related to GA
> > functionality.
>
> I don't doubt it; but the question was actually whether you are willing
> to modularize CM (that is: in addition to, and before, contributing to
> the GA functionality).
>
> >  (4) New component (with another name) with the proposed contents?
> >--This is the best option if permitted.
>
> Currently, only the two of us are in favour of this alternative.
>
> Nobody, by their action, is really in favour of any of the other
> alternatives.
> So, as a way forward, I would suggest that you create a project on GitHub
> (copying all the settings from a Commons modular component, such as
> "Commons Numbers"), to be eventually integrated here, once its potential
> has been demonstrated.
>
> >   The code which I have written can be reused with minor modifications.
> > So it won't take too much effort for this activity.
>
> You did not expand about the usability/performance (e.g. the issue of
> multi-threading)...
>
> Regards,
> Gilles
>
> >> [...]
>

-- 
Avijit Basak

Re: [Vote] Create a "machine learning" component

2021-04-13 Thread Gilles Sadowski

Hello.

On Mon, 12 Apr 2021 at 17:21, Avijit Basak wrote:
>
> Hi
>
>  Sorry for the delayed response. Thanks for your patience. Please
> find my comments below:
>
>  (1) Why not Spark?  [At least post over there (?).]
>   --We can move to Spark. But it will be very much useful if the things
> can also run without Spark. The use of Spark would make more sense in a
> production environment. But the portability of the library will be more
> useful for the non-prod environment.

I don't follow the distinction "prod" vs "non-prod".

> Definitely, we can reach the Spark
> team and query.

That would be a good idea...

>  (2) Further develop a monolithic CM?  [Who will do it?]
>--I can help with the upgrade of the existing library related to GA
> functionality.

Sure, but nobody is currently working on (2).

>  (3) Modularize CM? [Who will do it?]
>--I can help with the upgrade of the existing library related to GA
> functionality.

I don't doubt it; but the question was actually whether you are willing
to modularize CM (that is: in addition to, and before, contributing to
the GA functionality).

>  (4) New component (with another name) with the proposed contents?
>--This is the best option if permitted.

Currently, only the two of us are in favour of this alternative.

Nobody, by their action, is really in favour of any of the other alternatives.
So, as a way forward, I would suggest that you create a project on GitHub
(copying all the settings from a Commons modular component, such as
"Commons Numbers"), to be eventually integrated here, once its potential
has been demonstrated.

>   The code which I have written can be reused with minor modifications.
> So it won't take too much effort for this activity.

You did not expand about the usability/performance (e.g. the issue of
multi-threading)...

Regards,
Gilles

>> [...]


Re: [lang] Failing test on Java 16-EA.

2021-04-13 Thread Jaikiran Pai

Hello Gary,

I had a look at this one and I was able to reproduce this. Based on my reading 
of the code and what it does, IMO, this is a JDK issue. Since this was 
previously raised and reported in this list here[1] and a JDK issue was created 
https://bugs.openjdk.java.net/browse/JDK-8262108, I decided to reopen that 
issue and have included the necessary details of my investigation there.

[1] https://www.mail-archive.com/dev@commons.apache.org/msg70599.html

P.S: I'm not subscribed to this commons dev mailing list and I just watch/reply 
from the Apache mailing list tools, so my responses might be delayed.

-Jaikiran

On 2021/03/28 17:17:13, Gary Gregory  wrote: 
> I'm still looking for help on getting LANG working on Java 16...
> 
> Gary
> 
> On Sat, Mar 20, 2021, 21:39 Gary Gregory  wrote:
> 
> > Now that Java 16 is out, we really need to look at this IMO but I would
> > like help from the community.
> >
> > My initial guess that this is a JDK bug might be wrong and it could be an
> > issue in our code.
> >
> > Gary
> >
> > On Tue, Feb 23, 2021, 22:13 Gary Gregory  wrote:
> >
> >> Hi All:
> >>
> >> If you feel so inclined, I'd like help with
> >> FastDateParserTest.java#testParsesKnownJava16Ea25Failure().
> >>
> >> The test fails on Java 16 Early Access build 25 and above, I am now
> >> testing with build
> >> 36.
> >>
> >> I cannot tell if this is a bug in our code or in the underlying JRE.
> >>
> >> TY!
> >> Gary
> >>
> >
> 

