On Sat, Oct 2, 2010 at 2:30 AM, <[email protected]> wrote:

> Thank you Andrew and all who replied,
>
> It is good to know what's available and what's not, so that I can plan the
> way my application works.
>
> Having no security is a big issue for me, since I am using HBase on EC2.
>
> Knowing the internal IP of the HBase master is the only thing a hacker
> needs to bring my database down.
>
> In fact I could write a script now, to go and create a table in any HBase
> running out there on EC2. Of course, I don't have the motivation or time to
> do that, but others might do.
>
>
Security is best practiced in layers, and regardless of where you run HBase
(in EC2 or an internal network), appropriate firewall rules should be used
to limit access to your cluster, the same as any other database.  This is
how Hadoop and HBase are typically secured at the moment.

This is precisely what EC2 security groups are for, and if you use the
hbase-ec2 scripts your cluster will not be accessible from other nodes
inside EC2 or the outside world.  In fact, the more common problem with
running in EC2 is not having sufficient access from the outside.  If you
have any questions specific to EC2, feel free to ask.  There are a number of
folks on the list with experience who may be able to help out.
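
As a rough way to confirm that the firewall or security group rules are
doing their job, you can probe the cluster's ports from a machine that
should not have access. The sketch below is only an illustration: the
function names are made up for this example, and the port numbers in the
comment (60000 for the master, 60020 for region servers, 2181 for
ZooKeeper) are assumed defaults, so check your own hbase-site.xml for the
values actually in use.

```python
import socket


def port_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host, ports, timeout=2.0):
    """Map each port to whether it accepted a TCP connection from here."""
    return {port: port_is_reachable(host, port, timeout) for port in ports}


# Assumed default ports; adjust to match your cluster configuration:
#   check_ports("<master address>", [60000, 60020, 2181])
```

Run from a host outside the security group, every entry should come back
False. Run from a node inside the cluster, the ports for the services on
that host should come back True. A successful TCP connect only shows that
the port is open, not that the service behind it is healthy.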


> Your answers are telling me not to commit my designs to HBase and have
> another system to fall back to. Or maybe just learn how to build an
> application around HBase, while the latter is being
> developed/improved/patched up.
>
> I understand that I should not expect to have all features I would like
> available in HBase, not least because it is provided free of charge and
> there are a number of committed, good people trying to make everyone happy.
>
> However I believe that security should have been the first priority in the
> development process. It just makes sense to me.
>
>
Since Hadoop did not have a usable version of secure file access until
within the past few months, any previous implementation of security in HBase
would have been meaningless.  It's not very useful to be able to prevent
client access within the HBase APIs when it's trivial to impersonate a user
to Hadoop and read or modify the data files directly.

Fortunately, a lot of work has gone into producing a version of Hadoop with
strong authentication based on Kerberos and secured RPC via SASL.  This
gives us a good foundation for building security into HBase.  But regardless
you'll still want to lock down network access to a cluster via a firewall.
Having one in place does not obviate the need for the other.

I actually think the picture for HBase security is pretty good at the
moment.  There is a lot of progress being made and if anything I think we're
comparatively ahead of many similar projects in that regard.

Gary
