I think it would depend on how much other "stuff" has to come in to support
the *Clusters. I assumed it would be a bit, but, if it's not, I have no
objections to a single jar.
On 1/5/18 4:38 PM, Michael Wall wrote:
Yeah, I was thinking more like your second paragraph. Thinking I would use
the proposed client jar to develop against the MiniAccumuloCluster
(typically the StandaloneMiniAccumuloCluster for me) and then deploy that
code to run against a real cluster. I would like to flesh that use case out a
little more. Do you think it has to be another jar on top of the client
jar?
On Fri, Jan 5, 2018 at 4:31 PM Josh Elser <josh.el...@gmail.com> wrote:
MAC, in its common state, is probably not something we'd want to include
in this proposed tarball. The reasoning being that MAC (and related
classes) aren't something that people would need on your "Hadoop
Cluster" to talk to Accumulo. It's something that can just be obtained
via Maven.
However, if you're more referring to MAC as the generic
"AccumuloCluster" interface (an attempt to make running tests against
MAC and a real Accumulo cluster transparent --
StandaloneAccumuloCluster), then I could see some JAR that we'd include
which would contain the necessary classes (on top of
accumulo-client.jar) for users to run code seamlessly against a
traditional MAC or the StandaloneAccumuloCluster.
On 1/5/18 4:22 PM, Michael Wall wrote:
I like the idea of a client jar that has fewer dependencies. Josh, where
are you thinking the MiniAccumuloCluster fits in here?
On Fri, Jan 5, 2018 at 3:57 PM Christopher <ctubb...@apache.org> wrote:
On Fri, Jan 5, 2018 at 10:30 AM Keith Turner <ke...@deenlo.com> wrote:
On Thu, Jan 4, 2018 at 7:43 PM, Christopher <ctubb...@apache.org> wrote:
tl;dr: I would prefer not to add another tarball as part of our "official"
I am not opposed to replacing the current single tarball with client
and server tarballs. What I find appealing about this is if the
client tarball has fewer deps.
However, I think a lot of thought should be put into the scripts if
this is done. For example, the client tar and server tar should
probably not both have accumulo commands that do different things.
Agreed on Keith's point about the scripts and it requiring some
consideration.
releases, but I'd be in favor of blog instructions, a script, or a build
profile, which users could read/execute/activate to create a
client-centric package.
I've long believed that supporting different downstream packaging
scenarios should be prioritized over upstream binary packaging. I have
argued in
This "downstream" packaging could also be done within the Apache Accumulo
project, like accumulo-docker. Creating other packaging
projects within Accumulo is something to consider.
+1; When I say "downstream", it's a role, not an entity. The point is
that it's a distinct activity. accumulo-docker is a perfect example of a
"downstream packaging" project maintained by the upstream community. I
find it frustrating sometimes when supporting users that they can't tell
the difference between what is "Accumulo" and what is "this specific
packaging/configuration/deployment of Accumulo", because we don't make
those lines clear. I think we can draw these lines a bit more clearly.
favor of removing our current tarball entirely, while supporting efforts to
Apache Accumulo needs some sort of tarball that makes it easy to run
the code on a cluster, otherwise how can we test Accumulo on a cluster
for releases?
A binary tarball may be the best for this, but it's little more than the
jars in Maven Central and a few text files. It could be trivially
replaced with a simple script and manifest; it could also be replaced
with an RPM, a docker image, or any number of things. A tarball is just
one type of packaging for Accumulo's binaries.
In any case, I wasn't talking about removing the ability to produce a
binary tarball from source. Only removing it from our release artifacts
and downloads. It is not a popular opinion, but I still think it's
reasonable, with both pros and cons.
enable downstream packaging by modularizing the server code,
supporting a client-API jar (future work), and decoupling code from
launch scripts. I think we should continue to do these kinds of
improvements to support different packaging scenarios downstream, but
I'd prefer to avoid additional "official" binary releases.
I agree. I think if the Accumulo Java code made fewer assumptions about
its runtime env, it would result in code that is easier to maintain and
package for different environments.
In Fluo we have recently done a lot of work in order to support
Docker, Mesos, and Kubernetes. This work has really cleaned up the
core Fluo code making it easier to run in any environment.
I suspect pulling the Accumulo tarball into a separate git repo and
out of the main repo may help highlight some of the assumptions the
Accumulo Java code makes about the environment.
This is basically what the assemble module is now. It's why I moved the
bin and conf directories into it, and have made its dependencies
optional so they wouldn't be resolved transitively, and why I made the
assembly plugin gather up the libs instead of the dependency plugin,
which used to drop them in a lib directory at the root of the source
checkout. This module is the "downstream packaging" for the current
"all-in-one" binary tarball package.
I think these clean up issues are related to what Josh is suggesting,
but are not prerequisites. So it makes sense to discuss them at this
point, but I don't think they should block work on two tarballs if
that seems like a good idea.
Agreed. That discussion can be deferred. Much depends on how it is to be
split up.
Rather than provide additional packages, I'd prefer to work with
downstream to make the source more "packagable" to suit the needs of
these downstream vendor/community packagers. One way we can do that here
is by either documenting what would be needed in a client-centric
package, or by providing a script or build profile to create it from
source, so that your $dayjob or any other downstream packager doesn't
have to figure that out from scratch.
On Thu, Jan 4, 2018 at 7:17 PM Josh Elser <josh.el...@gmail.com> wrote:
Hi,
$dayjob presented me with a request to break up the current tarball
into two: one suitable for "users" and another for the Accumulo
services. The ultimate goal is to make upgrade scenarios a bit easier by
having client and server centric packaging.
The "client" tarball would be something suitable for most users
providing the ability to do things like:
* Launch a java app against Accumulo
* Launch a MapReduce job against Accumulo
* Launch the Accumulo shell
Essentially, the client tarball is just a pared down version of our
"current" tarball and the server-tarball is likely equivalent to our
"current" tarball (given that we have little code which would be
considered client-only).
Obviously, there are many ways to go about this. If there is buy-in
from other folks, adding some new assembly descriptors and making it a
part of the Maven build (perhaps, optionally generated) would be the
easiest in terms of maintenance. However, I don't want to push for that
if it's just going to be ignored by folks. I'll be creating something to
support this one way or another.
Any thoughts/opinions? Would this have any value to other folks?
- Josh