MAC (MiniAccumuloCluster), in its usual form, is probably not something we'd want to include in this proposed tarball. The reasoning is that MAC (and its related classes) isn't something people would need on their "Hadoop cluster" to talk to Accumulo. It can simply be obtained via Maven.

However, if you're referring to MAC more as the generic "AccumuloCluster" interface (an attempt to make running tests against either MAC or a real Accumulo cluster, via StandaloneAccumuloCluster, transparent), then I could see us including a JAR that contains the necessary classes (on top of accumulo-client.jar) for users to run code seamlessly against either a traditional MAC or the StandaloneAccumuloCluster.

On 1/5/18 4:22 PM, Michael Wall wrote:
I like the idea of a client jar that has fewer dependencies.  Josh, where
are you thinking the MiniAccumuloCluster fits in here?

On Fri, Jan 5, 2018 at 3:57 PM Christopher <ctubb...@apache.org> wrote:

On Fri, Jan 5, 2018 at 10:30 AM Keith Turner <ke...@deenlo.com> wrote:

On Thu, Jan 4, 2018 at 7:43 PM, Christopher <ctubb...@apache.org> wrote:
tl;dr: I would prefer not to add another tarball as part of our "official"

I am not opposed to replacing the current single tarball with client
and server tarballs.   What I find appealing about this is the
client tarball having fewer deps.

However I think a lot of thought should be put into the scripts if
this is done.  For example the client tar and server tar should
probably not both have accumulo commands that do different things.


Agreed on Keith's point about the scripts and it requiring some
consideration.


releases, but I'd be in favor of blog instructions, a script, or a build
profile, which users could read/execute/activate to create a
client-centric package.

I've long believed that supporting different downstream packaging
scenarios should be prioritized over upstream binary packaging. I have
argued in

This "downstream" packaging could also be done within the Apache Accumulo
project, like accumulo-docker.  Creating other packaging
projects within Accumulo is something to consider.


+1; When I say "downstream", it's a role, not an entity. The point is that
it's a distinct activity. accumulo-docker is a perfect example of a
"downstream packaging" project maintained by the upstream community. I find
it frustrating sometimes when supporting users that they can't tell the
difference between what is "Accumulo" and what is "this specific
packaging/configuration/deployment of Accumulo", because we don't make
those lines clear. I think we can draw these lines a bit more clearly.


favor of removing our current tarball entirely, while supporting efforts to

Apache Accumulo needs some sort of tarball that makes it easy to run
the code on a cluster, otherwise how can we test Accumulo on a cluster
for releases?


A binary tarball may be the best for this, but it's little more than the
jars in Maven Central and a few text files. It could be trivially replaced
with a simple script and manifest; it could also be replaced with an RPM, a
docker image, or any number of things. A tarball is just one type of
packaging for Accumulo's binaries.
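
To make the "simple script and manifest" idea concrete, here is a minimal sketch in Python. It assumes a plain-text manifest of Maven coordinates and the standard Maven 2 repository layout used by Maven Central. The manifest contents (artifact list and version) and the function names are illustrative assumptions only, not a proposed release format.

```python
"""Sketch: resolve a manifest of Maven coordinates to Maven Central jar URLs."""

MAVEN_CENTRAL = "https://repo1.maven.org/maven2"

# Illustrative manifest only; the artifact list and version are assumptions.
MANIFEST = """
# groupId:artifactId:version
org.apache.accumulo:accumulo-core:1.8.1
org.apache.accumulo:accumulo-start:1.8.1
"""


def jar_url(coordinate, repo=MAVEN_CENTRAL):
    """Map 'groupId:artifactId:version' to its jar URL in a Maven 2 layout."""
    group, artifact, version = coordinate.split(":")
    group_path = group.replace(".", "/")
    return f"{repo}/{group_path}/{artifact}/{version}/{artifact}-{version}.jar"


def parse_manifest(text):
    """Return the coordinates listed, skipping blank lines and '#' comments."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def manifest_urls(text):
    """Resolve every coordinate in the manifest to a download URL."""
    return [jar_url(coordinate) for coordinate in parse_manifest(text)]


if __name__ == "__main__":
    # Print the URLs; a real script would download them into a lib/ directory.
    for url in manifest_urls(MANIFEST):
        print(url)
```

A real replacement for the tarball would also need to fetch the jars, verify checksums or signatures, and lay down the bin and conf files; none of that is shown here.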

In any case, I wasn't talking about removing the ability to produce a
binary tarball from source. Only removing it from our release artifacts and
downloads. It is not a popular opinion, but I still think it's reasonable,
with both pros and cons.


enable downstream packaging by modularizing the server code, supporting a
client-API jar (future work), and decoupling code from launch scripts. I
think we should continue to do these kinds of improvements to support
different packaging scenarios downstream, but I'd prefer to avoid
additional "official" binary releases.

I agree. I think if the Accumulo Java code made fewer assumptions about
its runtime environment, it would result in code that is easier to maintain
and package for different environments.

In Fluo we have recently done a lot of work in order to support
Docker, Mesos, and Kubernetes.  This work has really cleaned up the
core Fluo code making it easier to run in any environment.

I suspect pulling the Accumulo tarball out of the main repo and into a
separate git repo may help highlight some of the assumptions
Accumulo Java code makes about the environment.


This is basically what the assemble module is now. It's why I moved the bin
and conf directories into it, and have made its dependencies optional so
they wouldn't be resolved transitively, and why I made the assembly plugin
gather up the libs instead of the dependency plugin which used to drop them
in a lib directory at the root of the source checkout. This module is the
"downstream packaging" for the current "all-in-one" binary tarball package.


I think these clean up issues are related to what Josh is suggesting,
but are not prerequisites.  So it makes sense to discuss them at this
point, but I don't think they should block work on two tarballs if
that seems like a good idea.


Agreed. That discussion can be deferred. Much depends on how it is to be
split up.



Rather than provide additional packages, I'd prefer to work with
downstream to make the source more "packagable" to suit the needs of these
downstream vendor/community packagers. One way we can do that here is by
either documenting what would be needed in a client-centric package, or by
providing a script or build profile to create it from source, so that your
$dayjob or any other downstream packager doesn't have to figure that out
from scratch.

On Thu, Jan 4, 2018 at 7:17 PM Josh Elser <josh.el...@gmail.com> wrote:

Hi,

$dayjob presented me with a request to break up the current tarball into
two: one suitable for "users" and another for the Accumulo services. The
ultimate goal is to make upgrade scenarios a bit easier by having client
and server centric packaging.

The "client" tarball would be something suitable for most users
providing the ability to do things like:

* Launch a java app against Accumulo
* Launch a MapReduce job against Accumulo
* Launch the Accumulo shell

Essentially, the client tarball is just a pared down version of our
"current" tarball and the server-tarball is likely equivalent to our
"current" tarball (given that we have little code which would be
considered client-only).
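
As a rough sketch of the "pared down" relationship described above, the Python snippet below tags each jar with a role and derives the two tarball listings from it. Which jars count as "client" here is purely an assumption for illustration, not a project decision, and the function names are hypothetical.

```python
# Hypothetical classification of jars in the current tarball. The role
# assigned to each jar is an assumption made for illustration only.
CURRENT_TARBALL = {
    "accumulo-core": "client",
    "accumulo-start": "client",
    "accumulo-shell": "client",
    "accumulo-tserver": "server",
    "accumulo-master": "server",
    "accumulo-gc": "server",
}


def client_contents(tarball):
    """Jars needed to launch apps, MapReduce jobs, or the shell."""
    return sorted(name for name, role in tarball.items() if role == "client")


def server_contents(tarball):
    """The server tarball is equivalent to the current one: every jar."""
    return sorted(tarball)


if __name__ == "__main__":
    print("client:", client_contents(CURRENT_TARBALL))
    print("server:", server_contents(CURRENT_TARBALL))
```

The point of the sketch is only that the client listing is a strict subset of the server listing, which is why the server tarball ends up equivalent to the current one.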

Obviously, there are many ways to go about this. If there is buy-in from
other folks, adding some new assembly descriptors and making it a part
of the Maven build (perhaps, optionally generated) would be the easiest
in terms of maintenance. However, I don't want to push for that if it's
just going to be ignored by folks. I'll be creating something to support
this one way or another.

Any thoughts/opinions? Would this have any value to other folks?

- Josh
