OK, then let's do it! As soon as we've agreed on a name, of course :)

Regards,
  Matthias

On Wednesday, 2012-09-26, Rahul wrote:
> Hi,
>
> I believe every project has a bunch of interesting users who can
> provide additional food for thought to others. Hadoop provides lots
> of random opportunities to people and the same should be possible
> with crunch. I would be delighted to see what people are able to
> pull off using the existing things. These contributions should be
> kept in crunch, as we are pretty young and at times we will go through
> various refactorings; keeping them in crunch will keep them up to
> date.
>
> And yes, +1 to the idea of keeping dependencies to crunch-core only.
>
> regards,
> rahul
> On 26-09-2012 04:32, Josh Wills wrote:
> >I like the idea of having a place in the project that showcases the
> >cool things that you can do with it-- something more advanced and
> >broadly applicable than the starter pipelines we have in
> >crunch-examples, the kind of stuff that you can't easily do using tools
> >like Hive and Pig.
> >
> >I also agree that we don't want to get into dependency creep, so I'd
> >be inclined to limit crunch-bytes (crunch-berries? crunch-bars?
> >crunch-abs?) to just those dependencies that are also in crunch-core.
> >I think the Bloom Filter stuff meets this criterion.
> >
> >The project is still young enough that our problem is much more likely
> >to be attracting new folks than it is to be getting overwhelmed with
> >random contributions, so my inclination is to be welcoming.
> >
> >On Tue, Sep 25, 2012 at 11:29 AM, Matthias Friedrich <[email protected]> wrote:
> >>Hi Rahul,
> >>
> >>I think it would be really great to have an ecosystem of
> >>micro-libraries around Crunch for all kinds of cool stuff that is
> >>relevant for smaller audiences, just like your Bloom filters.
> >>
> >>But since I expect most of this stuff to be so extremely special, it
> >>would in my opinion make more sense to put this into small, focused
> >>and independent projects that can be released separately from each
> >>other and don't need to go through Crunch's review process. It would
> >>make dependency management easier for users, too, in case a library
> >>needs additional dependencies.
> >>
> >>We could maintain a registry of these projects on Crunch's homepage
> >>so people can find them easily (I expect most of them would end up
> >>at GitHub because it's perfect for this kind of thing). If a project
> >>turns out to be interesting for a larger audience, we can still add it
> >>to Crunch core.
> >>
> >>Regards,
> >>  Matthias
> >>
> >>On Tuesday, 2012-09-25, Rahul wrote:
> >>>There can be interesting use-cases like BloomFilters which do not
> >>>have a place in the current set of Crunch modules. These functions
> >>>are kind of utility functions that can be used in Crunch. We need to
> >>>create a place where users can share such functions. In the earlier
> >>>discussion for BloomFilters we thought of something that is well
> >>>along the lines of PiggyBank. I had a look at the module, but in
> >>>Pig's structure the module is branched under the contrib module, as there
> >>>are other modules like penny for monitoring and zebra for storage.
> >>>
> >>>I have created a module named *crunch-bytes*, for issue
> >>>https://issues.apache.org/jira/browse/CRUNCH-75, which is a direct
> >>>sub-module in crunch-parent. I named it so because I felt it will
> >>>provide a space to have all those interesting data computations
> >>>that we cannot have in core.
> >>>
> >>>Please share your thoughts for the same.
> >>>
> >>>regards,
> >>>rahul
> >>>
