On Sat, Oct 2, 2010 at 2:30 AM, <[email protected]> wrote:

> Thank you Andrew and all who replied,
>
> It is good to know what's available and what's not, so that I can plan the
> way my application works.
>
> Having no security is a big issue for me, since I am using Hbase on EC2.
>
> Knowing the internal IP of the Hbase master is the only thing a hacker
> needs to bring my database down.
>
> In fact I could write a script now, to go and create a table in any Hbase
> running out there on ec2. Of course, I don't have the motivation or time
> to do that, but others might do.

Security is best practiced in layers, and regardless of where you run HBase (in EC2 or an internal network), appropriate firewall rules should be used to limit access to your cluster, the same as any other database. This is how Hadoop and HBase are typically secured at the moment.
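To make the firewall point concrete, here is a minimal sketch in Python of the default-deny allowlist logic that such rules express. The network range, the port numbers, and the names `ALLOWED` and `is_allowed` are all placeholders chosen for illustration; they are not taken from any real cluster or from the hbase-ec2 scripts.

```python
import ipaddress

# Hypothetical allowlist of (source network, destination port) pairs.
# Both the CIDR range and the ports are illustrative assumptions.
ALLOWED = [
    (ipaddress.ip_network("10.0.1.0/24"), 60000),
    (ipaddress.ip_network("10.0.1.0/24"), 2181),
]


def is_allowed(source_ip: str, port: int) -> bool:
    """Return True only if some rule matches; everything else is denied."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in network and port == allowed_port
               for network, allowed_port in ALLOWED)


if __name__ == "__main__":
    print(is_allowed("10.0.1.5", 60000))  # source inside the allowed range
    print(is_allowed("10.0.2.5", 60000))  # source outside the allowed range
```

With rules like these, knowing the master's internal IP is not enough: a connection from any source outside the allowed range is rejected before it reaches the service. In a real deployment this filtering is enforced at the network layer rather than in application code.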
This is precisely what EC2 security groups are for, and if you use the hbase-ec2 scripts your cluster will not be accessible from other nodes inside EC2 or from the outside world. In fact, the more common problem with running in EC2 is not having sufficient access from the outside.

If you have any questions specific to EC2, feel free to ask. There are a number of folks on the list with experience who may be able to help out.

> Your answers are telling me not to commit my designs to Hbase and have
> another system to fall back to. Or maybe just learn how to build an
> application around Hbase, while the latter is being
> developed/improved/patched up.
>
> I understand that I should not expect to have all features I would like
> available in Hbase, not least because it is provided free of charge and
> there is a number of committed, good people trying to make everyone happy.
>
> However I believe that security should have been the first priority in the
> development process. It just makes sense to me.

Since Hadoop did not have a usable version of secure file access until within the past few months, any previous implementation of security in HBase would have been meaningless. It's not very useful to be able to prevent client access within the HBase APIs when it's trivial to impersonate a user to Hadoop and read or modify the data files directly.

Fortunately, a lot of work has gone into producing a version of Hadoop with strong authentication based on Kerberos and secured RPC via SASL. This gives us a good foundation for building security into HBase. But regardless, you'll still want to lock down network access to a cluster via a firewall. Having one in place does not obviate the need for the other.

I actually think the picture for HBase security is pretty good at the moment. There is a lot of progress being made, and if anything I think we're comparatively ahead of many similar projects in that regard.

Gary
