From the user's end, I think it's better to tell them that they don't have the proper authorization (when they actually don't) rather than returning an empty result. I am in favor of a setting that restores early access denial.
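To make the difference concrete, here is a rough sketch of the two behaviors as I understand them from this thread. It is plain Python, not HBase code: `Table`, `early_out`, `get`, and `AccessDeniedError` are hypothetical names I made up for illustration, and the real AccessController logic is certainly more involved.

```python
from dataclasses import dataclass, field


class AccessDeniedError(Exception):
    """Raised when the caller is rejected before the store is read."""


@dataclass
class Table:
    # Hypothetical model of a table; none of these names are HBase APIs.
    readers: set = field(default_factory=set)   # users with table/CF-level READ
    cells: dict = field(default_factory=dict)   # row key -> (value, users granted by a cell ACL)
    early_out: bool = True                      # the per-table setting under discussion


def get(table, user, row):
    has_table_read = user in table.readers
    if table.early_out and not has_table_read:
        # 0.94-style behavior: deny up front, never touch the store.
        raise AccessDeniedError(f"{user} has no READ permission on this table")
    cell = table.cells.get(row)
    if cell is None:
        return None
    value, cell_readers = cell
    if has_table_read or user in cell_readers:
        return value
    # 0.98-style behavior: cells the user may not see are simply invisible.
    return None
```

With `early_out=True` a user without table-level READ gets an exception even if a cell ACL names them, which is the trade-off Andrew describes below; with `early_out=False` that user sees only the cells granted to them and an empty result otherwise.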
On Apr 24, 2014, at 10:49 AM, Andrew Purtell <apurt...@apache.org> wrote:

> Thanks. The perspective is valuable.
>
> Unfortunately we had to commit these changes to get them reviewed. But we've flagged HFileV3 as experimental through the 0.98 cycle in public comments about 0.98 (blog posts, presentations), and these features all depend on HFileV3, so I think that allows us some freedom of movement. Someone speak up if you disagree.
>
>> from the "outsider" perspective, I would have guessed that the table-level ACLs would have two different permissions: READ (default VISIBLE) and READ (default INVISIBLE). If a user has the former, then they can see all cells that aren't explicitly made invisible to them, and if the user has the latter, they can't see any cells unless made explicitly visible.
>
> There is a per-operation attribute that lets the querying application flip between those two modes of evaluating cell permissions on a request by request basis. The motivation was performance optimization, avoiding a scan over the store where possible, and flexibility. Based on what Enis, Vandana, and you are saying, that flexibility is a net negative. I will file a JIRA for a per-table setting that restores early-out access denial if the user has no access at the table or CF level since this looks like 3 votes in favor.
>
>> This will also come into play once we do a better job of safeguarding META. AFAIK today a user can scan META and see the row keys for region boundaries regardless of their access to those tables.
>
> That's a requirement of BigTable fundamentally.
>
> Cell ACLs let you grant arbitrary permissions to users on the cluster. How would a user find those cells for which they are authorized if whole sections of the table, within which those cells granting access may be found, are invisible to their client library?
>
> It's an interesting idea but a lot of thought and work without a real need for it. Don't encode sensitive information in row keys. That's been a consideration for HBase schema design forever.
>
> On Thu, Apr 24, 2014 at 10:21 AM, Todd Lipcon <t...@cloudera.com> wrote:
>
>> On Thu, Apr 24, 2014 at 10:13 AM, Andrew Purtell <apurt...@apache.org> wrote:
>>
>>>> Does this leave us open to leaking row existence due to timing differences?
>>>
>>> I think I have to answer yes because we've never considered a defense against this kind of attack against HBase data sources ever. As you say it would depend on schema design. Do you think defending against timing attacks is something HBase should do?
>>
>> In certain cases... (see below)
>>
>>> Is this a feature offered by MySQL or Postgres or commercial RDBMSes?
>>
>> AFAIK most commercial databases don't offer the "hidden visibility" access control differently than "access denied". That is to say, you may deny access to a table, but in that case you get an error with any access to the table rather than an empty result.
>>
>> In our case we're probably leaking table size as well -- a scan with no key range attached should take time proportional to the amount of data in the table, even if you have no access. In commercial DBs I would be surprised if a user can get these types of estimates for a table they're disallowed from.
>>
>>> Or perhaps your point is more that the original behavior of the AccessController is better because the number of users able to perform this kind of attack would be limited to explicit grants at the table or CF level.
>>
>> Right. If I have a multitenant system and I deny you access, I wouldn't expect you to be able to perform these kinds of attacks.
>>
>> I'm a bit of an outsider (haven't followed the implementation of the security features or why the design choices were made), but from the "outsider" perspective, I would have guessed that the table-level ACLs would have two different permissions: READ (default VISIBLE) and READ (default INVISIBLE). If a user has the former, then they can see all cells that aren't explicitly made invisible to them, and if the user has the latter, they can't see any cells unless made explicitly visible. But if they have neither type of READ permission on the table level, then they shouldn't be able to access the table at all.
>>
>> This will also come into play once we do a better job of safeguarding META. AFAIK today a user can scan META and see the row keys for region boundaries regardless of their access to those tables. This seems like the kind of thing that you'd need to allow for a user who has READ (even if they have default invisible), but you wouldn't want to allow for an arbitrary user on the cluster.
>>
>> -Todd
>>
>>> On Thu, Apr 24, 2014 at 10:04 AM, Todd Lipcon <t...@cloudera.com> wrote:
>>>
>>>> Does this leave us open to leaking row existence due to timing differences?
>>>>
>>>> For example, imagine you had a table where I happened to know (eg from reading your design docs on the wiki) that the key is made up of social security numbers. If I wanted to come up with a list of valid SSNs, I could issue GETs against your table. If I issue a GET for an invalid SSN, the response will come back on average quite a bit faster than if I issued a GET for a valid SSN (since the invalid SSN would be filtered by blooms where the valid one would not).
>>>>
>>>> -Todd
>>>>
>>>> On Thu, Apr 24, 2014 at 9:49 AM, Andrew Purtell <apurt...@apache.org> wrote:
>>>>
>>>>> This is an intended change that was done as part of introducing cell ACLs.
>>>>> Otherwise we can't support use cases where the user has no authorization on the table or CF level but cell ACLs grant exceptional access. It also brings the AccessController behavior in line with the new VisibilityController - cells which the user is not authorized to see are invisible in both settings.
>>>>>
>>>>> Enis recently brought up the same issue, let me copy that here:
>>>>>
>>>>> >>>
>>>>>
>>>>> Subject: Get / Scan without table ACLs no longer throws AccessDeniedException
>>>>>
>>>>> I was a bit surprised to find out about the case where there is a behavioral change in trying to read from tables for which the user does not have table/cf level permission.
>>>>> [...]
>>>>> Also this behavioral change is applicable to the audit log as well, we no longer mark the access granted / denied requests for gets and scans in the audit log which is concerning.
>>>>>
>>>>> From the last paragraph in https://blogs.apache.org/hbase/entry/hbase_cell_security, Andrew states that there are two modes now, check cell first or not (Query.setACLStrategy()).
>>>>>
>>>>> However, my understanding was that the default behavior should check table first, and then not do the scan at all if that is denied. From the code TableAuthManager.authorize(), it does not look to be the case. My questions are:
>>>>> 1) This is a behavioral change, and changes the default behavior as well regardless of whether cell level security is used or not. Should we revert back to the original behavior?
>>>>> 2) Even if we do not revert, should we record get / scans in the audit log ?
>>>>> 3) Are we targeting two use cases (a) user does not have table level auth, but selectively has cell level access, and (b) user does have table level auth but selectively does NOT have cell level access?
>>>>> For these two use cases, should the strategy be a table level property rather than a per-op property ?
>>>>>
>>>>> <<<
>>>>>
>>>>> To which I replied:
>>>>>
>>>>> >>>
>>>>>
>>>>> The answer is #3.
>>>>>
>>>>> It could be made a table level property.
>>>>>
>>>>>> Also this behavioral change is applicable to the audit log as well, we no longer mark the access granted / denied requests for gets and scans in the audit log which is concerning.
>>>>>
>>>>> This is some kind of logic bug or oversight, please file a jira.
>>>>>
>>>>> <<<
>>>>>
>>>>> So if the consensus is this is too surprising or unwanted, then we can without much difficulty make the new behavior configurable on a per table basis and have the default be the new behavior, with a release note and paragraph in the security guide explaining how to reintroduce the old behavior. I think that covers the bases.
>>>>>
>>>>> On Thu, Apr 24, 2014 at 12:35 AM, Vandana Ayyalasomayajula <avand...@yahoo-inc.com> wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> We have seen a behavior change in the manner AccessController blocks unauthorized users between 0.94 and 0.98. In 0.98, if an unauthorized user tries to perform GET or SCAN, empty results are returned, whereas the same operations in 0.94 used to throw access denied exceptions.
>>>>>>
>>>>>> Is this a behavior change or a bug in 0.98 ? It would be of great help if someone could point me to any jira which has discussions related to these changes.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Vandana
>>>>>
>>>>> --
>>>>> Best regards,
>>>>>
>>>>> - Andy
>>>>>
>>>>> Problems worthy of attack prove their worth by hitting back.
>>>>> - Piet Hein (via Tom White)
>>>>
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera