[
https://issues.apache.org/jira/browse/HBASE-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842342#action_12842342
]
Kay Kay commented on HBASE-2170:
--------------------------------
| But from a deployment & operation perspective, it is a nightmare.
Why? It is a nightmare now. If a client of HBase does some M-R jobs, inserts,
or scans, we have to add hbase-*.jar and everything in lib/*.jar as well,
because the distribution is monolithic. Currently we just add hbase-0.xx.jar
and keep adding other jars as required until there is no
ClassNotFoundException. Making that distinction is really not the job of the
client developer but that of the system publisher.
I guess that holds true for any client/server system: publish a lightweight
client, so that server development can stay focused and follow its own release
cycle, depending on the need.
| Can somebody please enumerate which jars are currently required by a hbase
client application and also enumerate which jars will be needed after this
patch?
The point is to shift the responsibility from the hbase client user to the
hbase maintainers, who decide which dependencies need to go with the client, so
the client user does not need to run the *add a jar from lib until no
ClassNotFoundException occurs* algorithm. Off the top of my head, I can think
of log4j / zk / thrift / rest that would come in here.
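To illustrate the trial-and-error loop, here is a minimal sketch in Java of the
check a client developer effectively performs by hand today. ClasspathProbe is
a hypothetical helper, not part of HBase, and the class names in main are only
assumed examples of what a client might need; the real list is exactly what
the maintainers would have to publish.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of HBase): reports classes missing from the classpath. */
final class ClasspathProbe {

    /** Returns, in input order, the class names that cannot be loaded. */
    static List<String> missing(List<String> classNames) {
        List<String> notFound = new ArrayList<>();
        ClassLoader loader = ClasspathProbe.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=false: only check presence, do not run static initializers.
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                notFound.add(name);
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        // Assumed, illustrative class names only; not an authoritative dependency list.
        List<String> required = Arrays.asList(
                "org.apache.hadoop.hbase.client.HTable",
                "org.apache.zookeeper.ZooKeeper",
                "org.apache.log4j.Logger");
        for (String name : missing(required)) {
            System.out.println("missing from classpath: " + name);
        }
    }
}
```

Each reported name sends the developer back to lib/ to hunt for the jar that
provides it, which is the step a published client distribution would remove.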
| Also, can somebody who is using hbase vouch for the fact that the splitting
of jars is helpful?
To begin with, *we do*. We have an HBase farm set up, holding the data set,
with data being inserted from the outside world. Independent developers who
build on top of HBase (on top of the ramp-up curve they have to climb to get on
to the platform) are plain confused by the list of jars and by having to add
them to the client namespace. It does not make sense to add, say, hdfs.jar to
the client lib of hbase if all they want to do is some scans / M-R on the hbase
data, as that information is immaterial to them. We have plans to start using
mahout and build algorithms on top of the platform, and it makes no sense
whatsoever to expose the hidden dependencies of hbase to the hbase client
users, which would discourage them entirely.
| The analogy I draw is that the hadoop libraries are the same for the hadoop
clients as well as servers. I have found it to be a great help for operation
purposes: there is less chance of mixing up different incompatible versions.
But the hbase client is not meant to be on the same machine as a server? From
an operational perspective, you will have a different set of jars for the
client and the server. The common jar would not be released explicitly but
would be part of the client and the server as appropriate.
| in a world of copying jars around, a single jar reduces the mess-up
possibility significantly. In an automated ops-world this is somewhat less
important, I hope.
The client is susceptible to fewer changes and the server to a lot more. If
there were 2 jars, client and server, would that not make it clearer to the
developer and to ops which jar needs to be replaced and which machines are
affected?
> hbase lightweight client library as a distribution
> ---------------------------------------------------
>
> Key: HBASE-2170
> URL: https://issues.apache.org/jira/browse/HBASE-2170
> Project: Hadoop HBase
> Issue Type: Wish
> Reporter: Kay Kay
>
> As a wish - it would be nice to have a hbase client library (subset of the
> current hbase distribution) that needs to be present at the hbase client
> level to interact with the master/region servers.
> From an app integration perspective, users of hbase can just link against
> the client library as opposed to the entire distribution.