Thanks Andy - I agree with most of your opinions around continuing to build standard packages, but can you clarify what was offensive? There must be a misinterpretation somewhere.

1) To be clear, I am 100% behind supporting the standard hadoop build rpms that we have now. That's the core product and will be for the foreseeable future, absolutely!

2) The idea (and it's just an idea I want to throw out - to keep us on our toes) is that some folks may be interested in hacking around, in a separate branch, on some bleeding edge bigdata deployments which attempt to incorporate resource managers and containers as first-class citizens.

Again, this is all just ideas - not in any way meant to derail the packaging efforts - but rather just to gauge folks' interest level in the bleeding edge: docker, mesos, simplified processing stacks, and so on.

On Mon, Jun 15, 2015 at 12:39 PM, Andrew Purtell <apurt...@apache.org> wrote:

>
> gridgain or spark can do what 90% of the hadoop ecosystem already does,
> supporting streams, batch, sql all in one)
>
> If something like this becomes the official position of the Bigtop
> project, some day, then it will turn off people. I can see where you are
> coming from, I think. Correct me if I'm wrong: We have limited bandwidth,
> we should move away from Roman et al.'s vision of Bigtop as an inclusive
> distribution of big data packages, and instead become highly opinionated
> and tightly focused. If that's accurate, I can sum up my concern as
> follows: To the degree we become more opinionated, the less we may have to
> look at in terms of inclusion - both software and user communities. For
> example, I find the above quoted statement a bit offensive as a participant
> on not-Spark and not-Gridgain projects. I roll my eyes sometimes at the
> Docker over-hype. Is there still a place for me here?
>
>
>
> On Mon, Jun 15, 2015 at 9:22 AM, jay vyas <jayunit100.apa...@gmail.com>
> wrote:
>
>> Hi folks. Every few months, i try to reboot the conversation about the
>> next generation of bigtop.
>>
>> There are 3 things which i think we should consider: A backplane (rather
>> than deploy to machines), the meaning of the term "ecosystem" in a
>> post-spark in-memory apocalypse, and containerization.
>>
>> 1) BACKPLANE: The new trend is to have a backplane that provides
>> networking abstractions for you (mesos, kubernetes, yarn, and so on). Is
>> it time for us to pick a resource manager?
>>
>> 2) ECOSYSTEM?: Nowadays folks don't necessarily need the whole hadoop
>> ecosystem, and there is a huge shift to in-memory, monolithic stacks
>> happening (i.e. gridgain or spark can do what 90% of the hadoop ecosystem
>> already does, supporting streams, batch, sql all in one).
>>
>> 3) CONTAINERS: we are doing a great job w/ docker in our build infra.
>> Is it time to start experimenting with running docker tarballs?
>>
>> Combining 1+2+3 - i could see a useful bigdata upstream distro which (1)
>> just installed an HCFS implementation (gluster, HDFS, ...) alongside, say,
>> (2) mesos as a backplane for the tooling for [[ hbase + spark + ignite ]]
>> --- and then (3) do the integration testing of available mesos-framework
>> plugins for ignite and spark underneath. If other folks are interested,
>> maybe we could create the "1x" or "in-memory" branch to start hacking on it
>> sometime? Maybe even bring the flink guys in as well, as they are
>> interested in bigtop packaging.
>>
>>
>>
>> --
>> jay vyas
>>
>
>
>
> --
> Best regards,
>
> - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>

--
jay vyas