Hi!
On Sat, Dec 6, 2014 at 3:23 PM, jay vyas <[email protected]> wrote:
> hi bigtop !
>
> I thought id start a thread a few vaguely related thoughts i have around
> next couple iterations of bigtop.
In general, I see two major ways for something like
Bigtop to evolve:
#1 remain a 'box of LEGO bricks' with very little opinion on
how these pieces need to be integrated
#2 start driving opinionated use cases for particular kinds of
big data workloads
#1 is sort of what all of the Linux distros have been doing for
most of their existence. #2 is close to what CentOS
is doing with SIGs.
Honestly, given the size of our community so far and a total
lack of corporate backing (with the small exception of Cloudera
still paying for our EC2 time), I think #1 is all we can do. I'd
love to be wrong, though.
> 1) Hive: How will bigtop to evolve to support it, now that it is much more
> than a mapreduce query wrapper?
I think Hive will remain a big part of Hadoop workloads for the foreseeable
future. What I'd love to see more of is rationalizing how things like
HCatalog, etc. need to be deployed.
> 2) I wonder wether we should confirm cassandra interoperability of spark in
> bigtop distros,
Only if there's significant interest from the Cassandra community, and even
then my biggest fear is that with Cassandra we're totally changing the
requirements for the underlying storage subsystem (nothing wrong with
that, it's just that in the Hadoop ecosystem everything assumes very HDFS'ish
requirements for the scale-out storage).
> 4) in general, i think bigtop can move in one of 3 directions.
>
> EXPAND ? : Expanding to include new components, with just basic interop,
> and let folks evolve their own stacks on top of bigtop on their own.
>
> CONTRACT+FOCUS ? Contracting to focus on a lean set of core components,
> with super high quality.
>
> STAY THE COURSE ? Staying the same ~ a packaging platform for just
> hadoop's direct ecosystem.
>
> I am intrigued by the idea of A and B both have clear benefits and costs...
> would like to see the opinions of folks --- do we lean in one direction or
> another? What is the criteria for adding a new feature, package, stack to
> bigtop?
>
> ... Or maybe im just overthinking it and should be spending this time
> testing spark for 0.9 release....
I'd love to know what others think, but for 0.9 I'd rather stay the course.
Thanks,
Roman.
P.S. There are also market forces at play that may fundamentally change
the focus of what we're all working on in the next year or so.