Re: [DISCUSS] MiniAccumuloCluster goals and approach

Bill Havanki Fri, 28 Mar 2014 10:48:32 -0700

I've been watching the conversation on the side, but I wanted to mention
that it seems the focus isn't so much on "mini" clusters anymore. You're
thinking of programmatic cluster management, whether one node or many. The
idea of a basic cluster management interface, with MAC as an
implementation, is promising. A package name of just "cluster" could work.
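
To make that concrete, here is a minimal sketch in Java of what such an
interface might look like. Every name in it (ClusterSketch, ManagedCluster,
AbstractManagedCluster, LocalPseudoCluster, newLocalCluster) is hypothetical
and is not taken from the Accumulo codebase or from the commit linked below.
The local class only flips a flag where a MAC-style implementation would
create directories and launch processes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class ClusterSketch {

  /** Hypothetical management interface; not an existing Accumulo API. */
  interface ManagedCluster {
    void start();
    void stop();
    boolean isRunning();
    int getNumTservers();
    Map<String, String> getSiteConfig();
  }

  /** Settings any implementation would likely share (tserver count, site config). */
  abstract static class AbstractManagedCluster implements ManagedCluster {
    private final int numTservers;
    private final Map<String, String> siteConfig;

    AbstractManagedCluster(int numTservers, Map<String, String> siteConfig) {
      if (numTservers < 1) {
        throw new IllegalArgumentException("numTservers must be at least 1");
      }
      this.numTservers = numTservers;
      // Defensive copy so later changes by the caller do not leak in.
      this.siteConfig = Collections.unmodifiableMap(new HashMap<>(siteConfig));
    }

    @Override public int getNumTservers() { return numTservers; }
    @Override public Map<String, String> getSiteConfig() { return siteConfig; }
  }

  /** Stand-in for a local, MAC-style implementation (assumption: no real processes). */
  static final class LocalPseudoCluster extends AbstractManagedCluster {
    private boolean running;

    LocalPseudoCluster(int numTservers, Map<String, String> siteConfig) {
      super(numTservers, siteConfig);
    }

    @Override public void start() { running = true; }   // real code would spawn processes
    @Override public void stop() { running = false; }   // real code would tear them down
    @Override public boolean isRunning() { return running; }
  }

  /** Factory method, so callers never name the implementing class. */
  static ManagedCluster newLocalCluster(int numTservers, Map<String, String> siteConfig) {
    return new LocalPseudoCluster(numTservers, siteConfig);
  }

  public static void main(String[] args) {
    ManagedCluster cluster =
        newLocalCluster(2, Collections.singletonMap("example.key", "example.value"));
    cluster.start();
    System.out.println("running=" + cluster.isRunning()
        + ", tservers=" + cluster.getNumTservers());
    cluster.stop();
  }
}
```

A Yarn-backed implementation could extend the same abstract base and be
returned from a second factory method, leaving test code that only sees
ManagedCluster unchanged.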


Carry on :)

Bill H


On Fri, Mar 28, 2014 at 12:39 AM, Sean Busbey <[email protected]> wrote:

> If you decide to go the mapred/mapreduce way, you could go with the package
> name "mini".
>
> Alternatively, we can do a multi-stage changeover:
>
> 1)  1.6.x:  introduce TestAccumuloCluster interface, @deprecate
> MiniAccumuloCluster class and make it implement TestAccumuloCluster
>
> 2) 1.6 + major: change MiniAccumuloCluster to an interface that extends
> TestAccumuloCluster, @deprecate TestAccumuloCluster
>
> 3) 1.6 + 2 major: remove TestAccumuloCluster
>
> Or just go with TestAccumuloCluster as the interface, have
> MiniAccumuloCluster as the local pseudo distributed implementation, and
> then call your new one something like YarnAccumuloCluster.
>
> In that case we could use the deprecation cycle to move the MAC class out
> of the public api.
>
>
> On Thu, Mar 27, 2014 at 6:48 PM, Josh Elser <[email protected]> wrote:
>
> > Thoughts on if this would be an acceptable change for 1.6.0 to alleviate
> > future cruft?
> >
> > Suggestions on the new package and/or class name would be greatly
> > appreciated over "NewMiniAccumuloC*".
> >
> >
> > On 3/26/14, 3:37 PM, Josh Elser wrote:
> >
> >> Those who are interested: check out
> >> https://github.com/joshelser/accumulo/commit/
> >> 9f63cf32559ab514a69ff2c6b02acef9c9cbb4e8
> >>
> >>
> >> tl;dr I could create some real interfaces for the cluster and config,
> >> which are "hidden" under the covers by the 1.4 and 1.5
> >> MiniAccumuloCluster and MiniAccumuloConfig classes. This de-couples the
> >> default implementation, gives us the ability to hide "implementation
> >> details" if wanted, and moves us towards some factory methods instead of
> >> calling a class directly.
> >>
> >> Thoughts?
> >>
> >> On 3/26/14, 1:21 PM, Josh Elser wrote:
> >>
> >>> Yes, very much experimental at this point.
> >>>
> >>> What I'm most concerned about is having reasonable hooks up front, not
> >>> trying to make an implementation for inclusion in 1.6.0.
> >>>
> >>> Regarding additions, the implementation already contains most things I
> >>> would want to expose. I haven't come up with anything that would be
> >>> generally returned through the "API" rather than through this proposed
> >>> implementation (e.g. YARN connection information).
> >>>
> >>> On 3/26/14, 11:57 AM, Keith Turner wrote:
> >>>
> >>>> What you are trying to do sounds interesting.  It also sounds
> >>>> experimental and in the early stages.  Is there anything specific
> >>>> you think should be done for 1.6.0 with regard to the MAC API?
> >>>>
> >>>>
> >>>> On Wed, Mar 26, 2014 at 2:26 PM, Josh Elser <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> On 3/26/14, 11:13 AM, Keith Turner wrote:
> >>>>>
> >>>>>> On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser <[email protected]>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> On 3/26/14, 10:57 AM, Keith Turner wrote:
> >>>>>>>
> >>>>>>>> Can you give an example of what you are thinking of? I don't
> >>>>>>>> understand your viewpoint either.
> >>>>>>>
> >>>>>>> Sure. One limitation of MAC, in general as a testing harness, is
> >>>>>>> that it doesn't adequately exercise multi-node implementations.
> >>>>>>> You can run multiple tservers, but they are all on the same host,
> >>>>>>> which limits the validity of a "robust" test. Addressing this is
> >>>>>>> my immediate goal.
> >>>>>>>
> >>>>>>> Multi-node deployments are possible using something like Mesos or
> >>>>>>> Yarn. Given that there is already functioning support to deploy
> >>>>>>> Accumulo on Yarn, Yarn is what I targeted.
> >>>>>>>
> >>>>>>> My goal is to be able to run all of our AbstractMacIT
> >>>>>>> implementations against "real" hardware without changing a single
> >>>>>>> line of test code (ok - maybe a line or two to do injection of the
> >>>>>>> MAC implementation). The point is, I believe there could be a huge
> >>>>>>> testing gain from being able to write tests which leverage Yarn,
> >>>>>>> have the same programmatic configuration API from MAC, and provide
> >>>>>>> near "real" Accumulo semantics.
> >>>>>>
> >>>>>> Ok, so you want MAC to be an interface so that you can provide a
> >>>>>> completely different implementation?
> >>>>>
> >>>>> Correct. Some things would serve well in a common abstract base (e.g.
> >>>>> numTservers, siteXml configuration), but all the nonsense about
> >>>>> creating directory structures and managing Processes is
> >>>>> implementation specific.
> >>>>>
> >>>>> Perhaps I could create a new interface that the current
> >>>>> implementation implements which still provides the same semantics
> >>>>> from 1.4 and 1.5. Let me see if I can mock up what I'm thinking --
> >>>>> that will probably be easier than me trying to write it out.
> >>>>>
> >>>>>
> >>>>
>



-- 
// Bill Havanki
// Solutions Architect, Cloudera Govt Solutions
// 443.686.9283
