Nice stuff, Stack!
Two quick questions:
First, on provenance: this codebase primarily came from Mike Wingert on
https://issues.apache.org/jira/browse/HBASE-15320? Just saw that the
commit came from your email addr -- wasn't sure if Mike was still
involved (or you took it to completion).
Second, I assume this new Git repo has all of the normal email-hooks set
up. Do you know where they are being sent (dev, commit, or issues)? I'm
also assuming that this is a Gitbox repo -- are we OK with pull-requests
to this repo (as well as operator-tools) as long as a Jira issue is
still created?
- Josh
On 10/31/18 6:43 PM, Stack wrote:
To tie off this thread, this nice feature was just pushed to
hbase-connectors. See
https://github.com/apache/hbase-connectors/tree/master/kafka for how-to.
Review and commentary welcome.
Thanks,
S
On Fri, Aug 3, 2018 at 6:32 AM Hbase Janitor <[email protected]> wrote:
I opened HBASE-21002 to start the scripts and assembly.
Mike
On Thu, Aug 2, 2018, 19:29 Stack <[email protected]> wrote:
Up in https://issues.apache.org/jira/browse/HBASE-20934 I created an
hbase-connectors repo. I put some form on it using the v19 patch from
HBASE-15320 "HBase connector for Kafka Connect". It builds and tests
pass. Here are some remaining TODOs:
* Figure out how to do start scripts: e.g. we need to start up the kafka
proxy. It wants some hbase jars, conf dir, and others on the CLASSPATH
(Depend on an HBASE_HOME and then source bin/hbase?)
* Can any of the connectors make-do with the shaded client?
* Make connectors standalone or have them share conf, bin, etc?
* Need to do an assembly. Not done.
* Move over REST and thrift next. Mapreduce after?
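For illustration only, the HBASE_HOME idea in the first TODO could be sketched as below. This is not what the repo's scripts do; the function name `build_classpath` and the `extra_jars` parameter are hypothetical, and it assumes the usual install layout where HBASE_HOME contains a conf/ directory and a lib/ directory of jars.

```python
import os
from pathlib import Path


def build_classpath(hbase_home, extra_jars=()):
    """Assemble a CLASSPATH string for a connector process.

    Assumption: hbase_home has the conventional conf/ and lib/ layout.
    The conf dir goes first so site configuration is picked up, then
    every jar directly under lib/, then any caller-supplied jars.
    """
    home = Path(hbase_home)
    entries = [str(home / "conf")]
    entries += sorted(str(jar) for jar in (home / "lib").glob("*.jar"))
    entries += [str(jar) for jar in extra_jars]
    # os.pathsep is ":" on Unix-like systems and ";" on Windows.
    return os.pathsep.join(entries)
```

A start script would export the result as CLASSPATH before launching the proxy; sourcing bin/hbase, as the TODO suggests, is the other route.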
The poms could do w/ a review. Hacked them over from hbase-thirdparty.
File issues and apply patches up in JIRA if you're up for any of the above.
Thanks,
S
On Wed, Jul 25, 2018 at 10:46 PM Stack <[email protected]> wrote:
On Tue, Jul 24, 2018 at 10:01 PM Misty Linville <[email protected]>
wrote:
I like the idea of a separate connectors repo/release vehicle, but I'm a
little concerned about the need to release all together to update just
one of the connectors. How would that work? What kind of compatibility
guarantees are we signing up for?
I hate responses that begin "Good question" -- so fawning -- but, ahem,
good question Misty (in the literal, not flattering, sense).
I think hbase-connectors will be like hbase-thirdparty. The latter
includes netty, pb, guava and a few other bits and pieces so yeah,
sometimes a netty upgrade or an improvement on our patch to pb will
require us to release everything even though we are fixing one lib
only. Usually, if bothering to make a release, we'll check for fixes or
updates we can do in the other bundled components.
On the rate of releases, I foresee a flurry of activity around launch as
we fill in missing bits and address critical bug fixes, but then it will
settle down to be boring, with just the occasional update. Thrift and
REST have been stable for a good while now (not saying this is a good
thing). Our Sean just suggested moving mapreduce to connectors too -- an
interesting idea -- and this has also been stable (at least until
recently with the shading work). We should talk about the Spark connector
when it comes time. It might not be as stable as the others.
On the compatibility guarantees, we'll semver it so if there is an
incompatible change in a connector or if the connectors have to change to
match a new version of hbase, we'll make sure the hbase-connectors version
number is changed appropriately. On the backend, what Mike says;
connectors use HBase Public APIs (else they can't be moved to the
hbase-connectors repo).
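As a rough illustration of the semver rule described above (not anything the project ships), the sketch below bumps a MAJOR.MINOR.PATCH version according to the kind of change. The `Version` class, the change labels, and the version numbers in the usage note are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    """Hypothetical MAJOR.MINOR.PATCH version, for illustration only."""
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text):
        # Expects exactly three dot-separated integers, e.g. "1.0.3".
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, change):
        # Incompatible changes raise the major number and reset the rest;
        # compatible features raise minor; bug fixes raise patch.
        if change == "incompatible":
            return Version(self.major + 1, 0, 0)
        if change == "feature":
            return Version(self.major, self.minor + 1, 0)
        if change == "fix":
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown change type: {change!r}")

    def __str__(self):
        return f"{self.major}.{self.minor}.{self.patch}"
```

Under this rule, `Version.parse("1.0.3").bump("incompatible")` gives a version that prints as `2.0.0`, which is the signal a connector user would look for before upgrading.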
S
On Tue, Jul 24, 2018, 9:41 PM Stack <[email protected]> wrote:
Grand. I filed https://issues.apache.org/jira/browse/HBASE-20934.
Let me have a go at making the easy one work first (the kafka proxy).
Let's see how it goes. I'll report back here.
S
On Tue, Jul 24, 2018 at 2:43 PM Sean Busbey <[email protected]>
wrote:
Key functionality for the project's adoption should be in the project.
Please do not suggest we donate things to Bahir.
I apologize if this is brisk. I have had previous negative experiences
with folks that span our communities trying to move work I spent a lot
of time contributing to within HBase over to Bahir in an attempt to
bypass an agreed upon standard of quality.
On Tue, Jul 24, 2018 at 3:38 PM, Artem Ervits <[email protected]> wrote:
Why not just donate the connector to http://bahir.apache.org/?
On Tue, Jul 24, 2018, 12:51 PM Lars Francke <[email protected]> wrote:
I'd love to have the Kafka Connector included.
@Mike thanks so much for the contribution (and your planned ones)
I'm +1 on adding it to the core but I'm also +1 on having a separate
repository under Apache governance
On Tue, Jul 24, 2018 at 6:01 PM, Josh Elser <[email protected]> wrote:
+1 to the great point by Duo about use of non-IA.Public classes
+1 for Apache for the governance (although, I wouldn't care if we use
GitHub PRs to try to encourage more folks to contribute), a repo with
the theme of "connectors" (to include Thrift, REST, and the like).
Spark too -- I think we had suggested that prior, but it could be a
mental invention of mine..
On 7/24/18 10:16 AM, Hbase Janitor wrote:
Hi everyone,
I'm the author of the patch. A separate repo for all the connectors is
a great idea! I can make whatever changes necessary to the patch to
help.
I have several other integration type projects like this planned.
Mike
On Tue, Jul 24, 2018, 00:03 Mike Drob <[email protected]> wrote:
I would be ok with all of the connectors in a single repo. Doing a repo
per connector seems like a large amount of overhead work.
On Mon, Jul 23, 2018, 9:12 PM Clay B. <[email protected]> wrote:
[Non-binding]
I am all for the Kafka Connect(er) as indeed it makes HBase "more
relevant" and generates buzz to help me sell HBase adoption in my
endeavors.
Also, I would like to see a connectors repo a lot as I would expect it
can make the HBase source and releases more obvious in what is changing.
Not to distract from Kafka, but Spark has in the past been a hang-up and
seems a good fit in such a repo too; as such, I would prefer Apache over
GitHub.
-Clay
On Mon, 23 Jul 2018, Andrew Purtell wrote:
Would we make a new repo called hbase-connectors and move REST, thrift,
and this new patch there?
I like this idea. We are already releasing hbase-thirdparty like this.
On Mon, Jul 23, 2018 at 5:47 PM Stack <[email protected]> wrote:
(Thanks for the good discussion)
Where do we think 'outside of HBase' would be?
Github seems too 'remote' from project and from Apache? Would we make a
new repo called hbase-connectors and move REST, thrift, and this new
patch there?
Thanks,
S
On Mon, Jul 23, 2018 at 3:50 PM Josh Elser <[email protected]> wrote:
I'm -0 for including this into the main hbase tree. I feel like we've
made a bit of progress in cleaning up our core, and this strikes me as a
step in the wrong direction.
At the same time, the integration seems nice enough (for the same
reasons Andrew points out). Is there a reason this couldn't exist
outside of HBase (at the ASF or otherwise)? Given a quick glance at the
patch, it would be quite trivial to keep separate (just requires some
heavier scripting to get it off the ground that the HBase scripts do
setup for). I feel like that will decrease our debt while we see if
people start using it. Our API should be more than stable enough to
prevent any worry about drift happening from core to this project.
On 7/23/18 6:35 PM, Stack wrote:
We have a very nice contrib sitting up in HBASE-15320 which via a proxy
-- so minimal dependencies -- adds source and sink for Kafka Connect. It
is nicely contained inside two new hbase-kafka-* modules.
We good w/ taking on this new feature?
It looks good to me. Check it out up on HBASE-15320. I was going to
commit to tip of branch-2 so it'd show up in hbase-2.2.x unless you all
want some backporting action going on.
S
--
Best regards,
Andrew
Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
- A23, Crosstalk