I am sorry if my question came across as aggressive.
My system will have an HBase cluster on EC2, which will serve a number
of clients on EC2 as well.
If I am developing my client application using the Java API, I need to
set the configuration files to point to the HBase cluster. How would I
be able to connect to HBase from my clients using the Java API, if
there is another layer hiding the cluster?
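
For reference, here is a minimal sketch of what the client side might need, under stated assumptions. The HBase Java client locates the cluster through its ZooKeeper quorum, which is normally set with the hbase.zookeeper.quorum property in an hbase-site.xml on the client's classpath. Clients running inside EC2 can point at the cluster's internal hostnames, provided the security group admits them. The class name HBaseClientSiteXml and the hostnames below are hypothetical, and the code only renders the configuration file; it does not talk to HBase.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Sketch: renders a client-side hbase-site.xml that points an HBase
 * client at a cluster's ZooKeeper quorum. Standard library only.
 */
final class HBaseClientSiteXml {

    /** Collects the properties a client needs to find the cluster via ZooKeeper. */
    static Map<String, String> clientProperties(String zkQuorum, int zkClientPort) {
        Map<String, String> props = new LinkedHashMap<>();
        // Comma-separated list of ZooKeeper hosts used by the HBase cluster.
        props.put("hbase.zookeeper.quorum", zkQuorum);
        // Port the ZooKeeper ensemble listens on (2181 is the usual default).
        props.put("hbase.zookeeper.property.clientPort", Integer.toString(zkClientPort));
        return props;
    }

    /** Renders the properties in the Hadoop-style configuration XML format. */
    static String toSiteXml(Map<String, String> props) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\"?>\n<configuration>\n");
        for (Map.Entry<String, String> e : props.entrySet()) {
            xml.append("  <property>\n")
               .append("    <name>").append(escape(e.getKey())).append("</name>\n")
               .append("    <value>").append(escape(e.getValue())).append("</value>\n")
               .append("  </property>\n");
        }
        return xml.append("</configuration>\n").toString();
    }

    /** Escapes the characters that are unsafe inside XML element text. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // Hypothetical EC2-internal hostnames; substitute the cluster's own.
        String quorum = "ip-10-0-0-11.ec2.internal,ip-10-0-0-12.ec2.internal";
        System.out.print(toSiteXml(clientProperties(quorum, 2181)));
    }
}
```

The client also talks to the master and region servers directly, so those hosts must be reachable from the client machines too, not only the ZooKeeper nodes.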
Yes, I am new to EC2 and HBase, so I may be asking basic questions
that no expert wants to answer, in which case no transfer of
knowledge would occur.
Many thanks.
Quoting Andrew Purtell <[email protected]>:
From: acc2
Subject: Re: How do I setup authentication/permissions for an hbase
database?
Having no security is a big issue for me, since I am using
HBase on EC2.
No matter what, you are not going to want to let the world connect
to your database directly. That's simply very poor system
architecture. Would you set up a MySQL or Oracle database on EC2 and
open the database service port to the world? No, you would not. The
database would be behind an application layer, and would be
firewalled off from the world. Your PHP front end or whatever would
be interacting with the database, not users directly.
Knowing the internal IP of the HBase master is the only
thing a hacker needs to bring my database down.
Given how EC2 security groups work, this would only happen if you do
not know what you are doing.
Your answers are telling me not to commit my designs to
HBase and have another system to fall back to.
[...]
However I believe that security should have been the first
priority in the development process. It just makes sense to
me.
Your statements are telling me you are unfamiliar not only with
HBase and Hadoop, which is quite understandable, but also with system
architecture and operations as regards EC2. Your point is well taken
but it would be more meaningful if you were better informed before
making it.
As I said before, Hadoop was originally designed for single tenant
operation in a walled garden, as a grid computing system: for example,
firewalled away with the rest of your back end systems. This is
hardly an unreasonable design and does not demonstrate negligence in
any way. Since HBase is a client of Hadoop services and Hadoop did
not have any notion of strong authentication or access control until
this year, any prior consideration for secure access in the HBase
API would have been pointless.
This is probably unintentional on your part, but I find it ironic.
There are now security features available in Hadoop, and HBase,
unlike most of "NoSQL", is working on similarly adding strong
authentication and access control in the database -- rather than
expecting it to be handled in another layer -- and right at this
precise time someone shows up to knock us for having "no security".
- Andy