Thank you for the info.

On Tue, Jul 2, 2019, 7:00 PM Dave Fisher <dave2w...@comcast.net> wrote:
> Hi -
>
> To me you have two parts. One fits Apache and the other would need to be
> outside.
>
> (1) Open Source Software which is the library, service and CLI tools. This
> is something that an Apache Community could grow around and be governed in
> the Apache Way. This part can be incubated.
>
> (2) Open Data. Justin refers to Kibble and Pony Mail which are incubating
> projects around consuming Apache Community data mostly. I would point out
> that you could host the data portion of your community elsewhere by some
> community members or others outside of Apache PMCs. Here is a real example.
> Apache Tika, PDFBox and POI PMCs all share a set of regression test
> documents
> (https://openpreservation.org/blog/2016/10/04/apache-tikas-regression-corpus-tika-1302/)
> and a community member Dominik Stadler
> (https://github.com/centic9/CommonCrawlDocumentDownload) that are
> retrieved from Common Crawl (http://commoncrawl.org) which uses the AWS
> Public Dataset Program (https://aws.amazon.com/opendata/public-datasets/)
>
> Regards,
> Dave
>
> > On Jul 2, 2019, at 1:59 PM, Alejandro Caceres <acace...@hyperiongray.com> wrote:
> >
> > Hi Matt,
> >
> > Thanks for the response. You are sort of correct, I would say the end goal
> > is a service - an open source engine that is able to grab and ingest this
> > highly unstructured security information and turn it into something useful
> > - then provide that back to the user in a few different forms. One would be
> > a web services API for general use exposed to the Internet (a service, like
> > you said), and another would be a series of command line tools and
> > libraries that others can use to ingest this information easily. The third
> > goal would be: not only is the code open source, but all data used in the
> > application is available itself, so this could easily be used to run a
> > personal node of this information for an organization, scylla.sh is simply
> > my instance that I expose to the Internet at large for those that don't
> > want to run a "full node". If that is more palatable to the ASF I'm glad to
> > make that the focus. In other words: I'm not married to any model here.
> >
> > I knew coming in that it's a bit unconventional for Apache, but, I think,
> > it is a unique and powerful project that would increase engagement from the
> > infosec community in which I personally, as well as my R&D company have
> > some good visibility from. In other words, just testing the waters to see
> > how this is received by ASF :).
> >
> > Alex
> >
> > On Tue, Jul 2, 2019 at 3:44 PM Matt Sicker <boa...@gmail.com> wrote:
> >
> >> I'm a little unclear about the scope of the project here. This project
> >> looks more like a service, and I don't know of any ASF projects that
> >> exist to provide services outside the ASF.
> >>
> >> On Tue, 2 Jul 2019 at 14:28, Alejandro Caceres
> >> <acace...@hyperiongray.com> wrote:
> >>>
> >>> Hey Folks,
> >>>
> >>> I'm interested in submitting a project as a seedling and am looking exactly
> >>> where to start. The project is already off the ground, being used by many,
> >>> is stable, reasonably mature (it's in alpha release), open source, and
> >>> already Apache licensed. I've been looking at a lot of resources to how
> >>> best to submit this to Apache and from what I understand I need to:
> >>>
> >>> Find a "champion/mentor" for the project and a "sponsor" -> submit an
> >>> incubator application -> wait (or do i submit for a vote on general@?) ->
> >>> ... -> profit :)
> >>>
> >>> For a bit more context, my project is http://scylla.sh or
> >>> https://github.com/acaceres2176/scylla. This project aggregates and makes
> >>> searchable database leaks and other information security data that is easy
> >>> for attackers to find (they have blackhat and underground resources) but
> >>> difficult for security professionals trying to defend their network (they
> >>> cannot buy stolen data, are not plugged into the blackhat hacker community,
> >>> and frankly generally don't know "where to start"). The Scylla engine aims
> >>> to even the playing field by making this data available and completely free
> >>> for everyone. The feed is meant to power threat intelligence engines to aid
> >>> in the defense of both large corporate networks, but also be accessible to
> >>> an average user who wants to check what information of theirs has been
> >>> leaked. It's a passion project of mine and have been working on it for
> >>> several months already. We have several terabytes of data and good
> >>> attention from the infosec community.
> >>>
> >>> Anyway, sorry for the brain dump above, but I suppose I should mainly ask -
> >>> where do I go from here? Do I simply ask this mailing list if there is a
> >>> sponsor and champion willing to bring this in as a podling?
> >>>
> >>> Thanks!
> >>> Alex
> >>>
> >>> --
> >>> ___
> >>>
> >>> Alejandro Caceres
> >>> Hyperion Gray, LLC
> >>> Owner/CTO
> >>
> >> --
> >> Matt Sicker <boa...@gmail.com>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> >> For additional commands, e-mail: general-h...@incubator.apache.org
> >
> > --
> > ___
> >
> > Alejandro Caceres
> > Hyperion Gray, LLC
> > Owner/CTO
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@community.apache.org
> For additional commands, e-mail: dev-h...@community.apache.org