Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-16 Thread Josh Elser
A nice round number to track this work: https://issues.apache.org/jira/browse/ACCUMULO-4500 Josh Elser wrote: Thanks for the reply, Mike. Mike Drob wrote: Hiding this behind the SystemPermission.SYSTEM permission might be sufficient. Superb. Personally, I wouldn't want to piggy-back on

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-14 Thread Josh Elser
Thanks for the reply, Mike. Mike Drob wrote: Hiding this behind the SystemPermission.SYSTEM permission might be sufficient. Superb. Personally, I wouldn't want to piggy-back on SYSTEM.SYSTEM (because that permission implies a lot of other things too), but that's an implementation detail we

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-14 Thread Mike Drob
Hiding this behind the SystemPermission.SYSTEM permission might be sufficient. In a situation where Accumulo data is on an encrypted volume, or the rfiles themselves are encrypted, then a root user wouldn't be able to read the rfiles to generate the histograms. This matches my initial mental

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-14 Thread Josh Elser
Ping Marc/Mike D. Josh Elser wrote: Thanks, Marc. Follow-on question(s) for you: Do you think _any_ such approach should never be pursued by Accumulo (reading into your other replies about doing it outside of Accumulo)? Are the permissions that we have in place not sufficient to protect such

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-12 Thread Christopher
I think SystemPermission.SYSTEM permission should probably be required for any public API retrieving this data. It is, after all, code run on servers, generating data directly from the RFiles. This would also imply that caution is needed if we were to cache the data in, say, the metadata table.

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-12 Thread Josh Elser
I was envisioning public API protected by a system permission (implying some Thrift RPC as well) if that is an important distinction for those with concerns. I am hoping to get more info from Mike/Marc about why they feel this is insufficient WRT Accumulo's security model. Keith Turner wrote:

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-12 Thread Keith Turner
We did discuss making this info available through the public API (and adding thrift calls to gather it). We discussed the possibility of adding a new permission. On Wed, Oct 12, 2016 at 2:35 PM, ivan bella wrote: > I do not see how this invalidates any security of the

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-12 Thread ivan bella
I do not see how this invalidates any security of the system unless you are summarizing these counters and making them available through a thrift or other call; don't do that unless other security is put in place. To get a summary I would think you would have to use a separate utility to

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-12 Thread Keith Turner
On Wed, Oct 12, 2016 at 12:56 AM, Christopher wrote: > Keith, Russ, myself (and possibly others) were discussing this at the > hackathon after the Accumulo Summit, and I think our consensus was > basically this: > > We need a generic pluggable mechanism for injecting

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-12 Thread Keith Turner
On Wed, Oct 12, 2016 at 10:40 AM, ivan bella wrote: > Yes the "owners" could create a visibility counting mechanism separately, > however if we make this RFile metadata a part of the system then we increase > the "ease of use". Unfortunately, system designers rarely think

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-12 Thread Marc P.
My point for discussing implementation outside of Accumulo is because I think it does invalidate a core tenet On Wed, Oct 12, 2016, 12:26 PM Josh Elser wrote: > Again, can we please bring this discussion back from discussions of > implementations to security? > > Does the

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-12 Thread Josh Elser
Again, can we please bring this discussion back from discussions of implementations to security? Does the fact that you three were discussing implementations imply that you do not think this invalidates one of the core tenets (security first) of Accumulo? Christopher wrote: Keith, Russ,

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-12 Thread Marc P.
Beyond adding a tool on the side. It doesn't fit in metadata as that requires aggregated reads vs table aggregates data. On Wed, Oct 12, 2016, 11:02 AM Marc P. wrote: > How does it increase ease of use? > > On Wed, Oct 12, 2016, 10:34 AM ivan bella

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-12 Thread Marc P.
How does it increase ease of use? On Wed, Oct 12, 2016, 10:34 AM ivan bella wrote: > Yes the "owners" could create a visibility counting mechanism separately, > however if we make this RFile metadata a part of the system then we > increase the "ease of use". Unfortunately,

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-12 Thread ivan bella
Yes the "owners" could create a visibility counting mechanism separately, however if we make this RFile metadata a part of the system then we increase the "ease of use". Unfortunately, system designers rarely think about the metadata they need from their system up front. That being said, if

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-12 Thread Marc P.
What prevents the owners of the system from doing this in their own table? Keeping track of that information is a use case of Accumulo. I think this may be an example of external code that the user must install. Placing the onus on the consumer mitigates concern that Mike "Mike" Drob and others

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-11 Thread Christopher
Keith, Russ, myself (and possibly others) were discussing this at the hackathon after the Accumulo Summit, and I think our consensus was basically this: We need a generic pluggable mechanism for injecting arbitrary user counters into the RFiles. We can then use these counters in custom
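
A rough sketch of what a "pluggable counter" could look like, in plain Java. Nothing here is an Accumulo API: EntryCounter, VisibilityCounter and summarize are hypothetical names invented for illustration, and entries are modeled as simple (row, visibility) string pairs rather than real Key/Value objects. The idea is only that the file writer hands every entry to a user-supplied counter, and a visibility histogram is one such counter.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these types exist in Accumulo.
final class VisibilityCounterSketch {

    /** A pluggable counter: observes each entry written to a file and exposes named counts. */
    interface EntryCounter {
        void observe(String row, String visibility);

        Map<String, Long> counts();
    }

    /** One possible plugin: counts entries per visibility expression. */
    static final class VisibilityCounter implements EntryCounter {
        private final Map<String, Long> counts = new TreeMap<>();

        @Override
        public void observe(String row, String visibility) {
            // Increment the count for this visibility, starting at 1 if unseen.
            counts.merge(visibility, 1L, Long::sum);
        }

        @Override
        public Map<String, Long> counts() {
            return counts;
        }
    }

    /** Stands in for writing a file: each (row, visibility) pair is passed to the counter once. */
    static Map<String, Long> summarize(List<String[]> entries, EntryCounter counter) {
        for (String[] entry : entries) {
            counter.observe(entry[0], entry[1]);
        }
        return counter.counts();
    }

    public static void main(String[] args) {
        List<String[]> entries = List.of(
            new String[] {"row1", "A&B"},
            new String[] {"row2", "A&B"},
            new String[] {"row3", "C"});
        System.out.println(summarize(entries, new VisibilityCounter()));
    }
}
```

Keeping the counter behind an interface is what would let other summaries (not only visibilities) reuse the same hook; whether and how the result is exposed is the separate security question raised in this thread.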

RE: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-11 Thread Josh Elser
Trivially. We could do something more intelligent like also cache it in metadata (updating with compactions). Don't read too much into the implementation at this point; it was just the first idea I had about how we could do it :). I'm more concerned with the idea and its security implications

RE: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-11 Thread dlmarion
So, to get the set of visibilities used in a table, we would have to open all of the rfiles? > -Original Message- > From: Dylan Hutchison [mailto:dhutc...@cs.washington.edu] > Sent: Tuesday, October 11, 2016 3:43 PM > To: Accumulo Dev List > Subject: Re: [DISCUSS] Would a visibility
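
To make the question concrete: if each file carried its own visibility counts, a table-level answer would be the merge of the per-file maps. The sketch below assumes exactly that and nothing more. TableHistogramSketch and merge are hypothetical names, the per-file maps are hard-coded rather than read from real RFiles, and the totals would count whatever entries the files contain, which may include versions that a later compaction would drop.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; does not read real RFiles or call any Accumulo API.
final class TableHistogramSketch {

    /** Sums per-file visibility counts into one table-level histogram. */
    static Map<String, Long> merge(List<Map<String, Long>> perFileCounts) {
        Map<String, Long> total = new TreeMap<>();
        for (Map<String, Long> fileCounts : perFileCounts) {
            fileCounts.forEach((visibility, n) -> total.merge(visibility, n, Long::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> fileA = Map.of("A&B", 2L, "C", 1L);
        Map<String, Long> fileB = Map.of("A&B", 5L, "D", 3L);
        // The key set of the merged map is the set of visibilities seen across both files.
        System.out.println(merge(List.of(fileA, fileB)));
    }
}
```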

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-11 Thread Russ Weeks
> I've always been under the impression that Accumulo was not supposed to > confirm the existence of data that a user did not have permission to read. OK, that makes sense, I can see the need for that. But if we follow this path of keeping the summary data structure in the RFile header (footer?)

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-11 Thread Sean Busbey
I think a new permission would cover the concern about leaking meta-information. Even if only the administrative user could see the histogram (since they can see all data), that'd be a gain. -- Sean Busbey On Oct 11, 2016 16:33, "Mike Drob" wrote: > I've always been under the

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-11 Thread Mike Drob
I've always been under the impression that Accumulo was not supposed to confirm the existence of data that a user did not have permission to read. On Tue, Oct 11, 2016, 2:20 PM Josh Elser wrote: > Today at Accumulo Summit, our own Russ Weeks gave a talk. One topic he >

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-11 Thread Josh Elser
Hah, funny you mention custom RFile index. I think Adam Fuchs had proposed a similar idea before (probably years ago now) :) re: the monitor, I was more thinking that it would just be an API call to access it. I had not thought about automatically displaying it on the monitor (but it is an

Re: [DISCUSS] Would a visibility histogram on a table be harmful?

2016-10-11 Thread Dylan Hutchison
Interesting idea. It begs the question: should we allow any custom index at the RFile level? If RFile indexes were user-extensible, then a visibility index would be something any developer could write. That said, we can still include such an index as an example, and if we did it could be used