Thanks for sharing that additional information. I think that your approach
is consistent with what I understood you were trying to explain earlier so
that's good to know. You're essentially encoding ACL into the graph itself
(i.e. adding a property key to a graph element) as in:

    AccessControl ac = new SimpleAccessControl.SimpleAccessControlBuilder()
        .allowNationality(new NationalityAttribute("NZ"))
        .allowOrg(new OrgAttribute("RUPTURE_FARMS"))
        .andGroup(new GroupAttribute("APPLE_PICKER"))
        .orGroup(new GroupAttribute("PEAR_EATER"))
        .build();
    g.addV("Person").property("access", ac).next();

and then testing "access" in Gremlin traversals which you could do manually
as part of the basic writing of Gremlin or by way of a custom
TraversalStrategy which could take the Principal and inject the appropriate
has() filters to validate. The nice thing about this approach is that it
will work for any graph that can support a serialized AccessControl, but it
comes at the expense of muddying the schema with an "access" property
everywhere. We have a similar pattern that muddies the schema with
PartitionStrategy (putting a "_partition" property key all over the place)
and one might argue that this feature you're suggesting is quite similar to
that. PartitionStrategy is actually a form of SubgraphStrategy and I think
that the amount of extension on that pattern that we should maintain in
TinkerPop should probably end there (for the reasons I've mentioned before,
but also because AccessControl won't be serializable to every graph
database, so we immediately run into generalization issues, at least in the
manner that you've implemented this so far).
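
To make the per-element check concrete, here is a minimal sketch in plain
Java of the test that a strategy-injected filter would have to perform. It is
not the AccessControl class from your fork: AccessRule, its allOf/anyOf
fields, permits() and setOf() are hypothetical names, and the rule is reduced
to "all of these attributes plus at least one of those", using the
ONCOLOGY/SPECIALIST example from your message.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for an access control value stored on a graph element.
final class AccessRule {
    private final Set<String> allOf; // the user must hold every one of these
    private final Set<String> anyOf; // the user must hold at least one of these

    AccessRule(Set<String> allOf, Set<String> anyOf) {
        this.allOf = new HashSet<>(allOf);
        this.anyOf = new HashSet<>(anyOf);
    }

    // True when the user's attributes satisfy the rule.
    // Assumption: an empty anyOf set means there is no either/or requirement.
    boolean permits(Set<String> userAttributes) {
        boolean hasAll = userAttributes.containsAll(allOf);
        boolean hasAny = anyOf.isEmpty() || !Collections.disjoint(anyOf, userAttributes);
        return hasAll && hasAny;
    }

    // Small helper so the example does not depend on Set.of (Java 9+).
    static Set<String> setOf(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        AccessRule rule = new AccessRule(
                setOf("ONCOLOGY", "SPECIALIST"),
                setOf("FINANCE_APPROVER", "FINANCE_MANAGER"));
        System.out.println(rule.permits(setOf("ONCOLOGY", "SPECIALIST", "FINANCE_MANAGER")));
        System.out.println(rule.permits(setOf("ONCOLOGY", "FINANCE_MANAGER")));
    }
}
```

A check written this way is arbitrary Java evaluated per element, so I would
expect it to be applied in memory rather than answered from an index, which
is the performance concern I keep coming back to.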
> We are building a component that communicates with a JanusGraph server
> on behalf of a user. In this case, the component manages all user aspects,
> such as authenticating them, obtaining their privileges from an external
> source, constructing the Gremlin query based on their query, getting the
> data they want and passing it back.
I don't know the nature of your "component", but I think if you're looking
for a JanusGraph specific solution you might just look at extending Gremlin
Server a bit. Sounds like you just need to write a custom netty handler or
two which you will inject after the authentication handler (you would
probably write a custom Authenticator implementation as well) but before
the Gremlin processing. You would then take the authenticated user and grab
their privileges to create the Principal. You would then configure the "g"
for that user with your AclTraversalStrategy (probably a direct copy of the
PartitionStrategy code) that takes the Principal. When an authenticated
user sends their query to the server everything will be rigged up to do all
the ACL work.
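
The order of operations I'm describing can be sketched without any Gremlin
Server or netty types. Everything here is a hypothetical stand-in:
AclPipelineSketch, resolvePrincipal(), filterFor(), visible() and the
map-based "element" are illustrative only, the privilege lookup is passed in
as a function, and hiding elements that carry no "access" property is an
assumption rather than anything decided in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

final class AclPipelineSketch {

    // Steps 1 and 2: once the user is authenticated, look up their privileges
    // in an external source (modelled here as a plain function).
    static Set<String> resolvePrincipal(String authenticatedUser,
                                        Function<String, Set<String>> privilegeSource) {
        Set<String> privileges = privilegeSource.apply(authenticatedUser);
        return privileges == null ? new HashSet<>() : new HashSet<>(privileges);
    }

    // Step 3: build the per-user filter that a strategy would apply to every
    // element read. An element is modelled as a property map whose "access"
    // value is the set of attributes a reader must hold.
    static Predicate<Map<String, Object>> filterFor(Set<String> principal) {
        return element -> {
            Object access = element.get("access");
            if (!(access instanceof Set)) {
                return false; // assumption: an element without an ACL is hidden
            }
            return principal.containsAll((Set<?>) access);
        };
    }

    // Step 4: every query for that user only sees what the filter lets through.
    static List<Map<String, Object>> visible(List<Map<String, Object>> elements,
                                             Set<String> principal) {
        Predicate<Map<String, Object>> filter = filterFor(principal);
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> element : elements) {
            if (filter.test(element)) {
                result.add(element);
            }
        }
        return result;
    }

    // Test helper: an element with a name and, optionally, required attributes.
    static Map<String, Object> element(String name, String... required) {
        Map<String, Object> e = new HashMap<>();
        e.put("name", name);
        if (required.length > 0) {
            e.put("access", new HashSet<>(Arrays.asList(required)));
        }
        return e;
    }

    public static void main(String[] args) {
        Set<String> principal = resolvePrincipal(
                "alice", user -> new HashSet<>(Arrays.asList("APPLE_PICKER")));
        List<Map<String, Object>> all = Arrays.asList(
                element("orchard", "APPLE_PICKER"),
                element("vault", "FINANCE_MANAGER"));
        System.out.println(visible(all, principal));
    }
}
```

In Gremlin Server the first two steps would live in your Authenticator and
custom handler, and the third would be the strategy attached to that user's
"g".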
I'd be curious how you find this functionality performs when you get it
running at scale. Again, I worry about you falling into in-memory filtering
traps that can't be avoided with JanusGraph indices. It's a shame there
aren't more graphs with this kind of functionality built in. DS Graph is
the only one I know that has really nice "row level access control". Not
sure if Pieter Martin is following this thread but I wonder if sqlg gets
this sort of functionality for free with RDBMSs that support this
functionality - that would be cool. Anyone know any other graphs that have
this sort of security baked into it?
On Wed, Jan 8, 2020 at 3:03 AM Mike Lee <[email protected]> wrote:
> The access control property is essentially only a collection of rules. As
> an example from a specific use case I have, users have groups or attributes
> assigned to them in an external system. For particular elements of data
> that are stored in the graph, we want to be able to say something like "to
> access this element, a user must have both the ONCOLOGY and SPECIALIST
> attributes, and at least one of the FINANCE_APPROVER or FINANCE_MANAGER
> attributes". This access control property is then checked against an
> Object representing the user's attributes and the application determines
> whether or not the user can perform some action. I had envisioned that, in
> setting up the application, the administrator could configure the specific
> strategy for how this access control check is resolved.
>
> I have the code in Github in a fork from JanusGraph and here is a link to
> the datatype that I stored on the graph to do the filtering (
> https://github.com/mikelee2082/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/core/attribute/Access.java).
> There are unit tests where I used the SubgraphStrategy in
> https://github.com/mikelee2082/janusgraph/blob/master/janusgraph-test/src/test/java/org/janusgraph/core/attribute/AccessAttributeTest.java
> .
>
> I am a very inexperienced developer and I have not yet thought about how
> it could be done in a way that would be language agnostic. The way I have
> done it at the moment depends on a client being able to create and
> serialize a Java object to store on the graph.
>
> At the moment, we are fairly constrained in our thinking by our use case.
> We are building a component that communicates with a JanusGraph server on
> behalf of a user. In this case, the component manages all user aspects,
> such as authenticating them, obtaining their privileges from an external
> source, constructing the Gremlin query based on their query, getting the
> data they want and passing it back. This seems like a lot of effort, but it
> is an essential requirement that we only provide users with the level of
> detail that they are authorized to read. So in our case, the users don't
> create a TraversalSource - we do that for them. Because of this, we have
> the option of constructing a TraversalSource that is constrained to a
> subgraph. Ideally, we would prefer to have users communicate with the
> server directly.
>
> Again, thanks for taking the time to consider my idea. I hope I have
> answered your questions. My solution right now is fairly primitive, so if
> anyone has any ideas for how it might be done differently, in a way that
> works for larger graphs just as well as small, I'd be really interested in
> giving it a go.
>
> On 2020/01/03 11:44:09, Stephen Mallette <[email protected]> wrote:
> > Thanks for posting your idea. As others on the list may not have seen my
> > post elsewhere, I'll just quickly repeat an approximation of what I wrote.
> > Basically, I think that users would like a feature like this, but I
> > wondered if it were something best left to graph providers to implement
> > native to their systems as an unoptimized implementation may not perform
> > well or behave with limited functionality. So, with that in mind, here's
> > some further thoughts/questions:
> >
> > 1. Could you say some more about how the "access control property" is
> > defined? How would you envision such a thing to be generalized across all
> > graphs providers?
> > 2. Could you share some sample code for how you define your "custom
> > predicate" and what the TraversalStrategy does with that (basically, please
> > show how all that wraps up with Gremlin)?
> > 3. Please keep in mind that any solution here must be portable across all
> > programming languages - will users be able to define the required objects
> > in python, javascript, etc?
> > 4. As I think about how users initialize a TraversalSource, I can't help
> > thinking that implementing this feature as a TraversalStrategy places it at
> > the wrong level of abstraction. The notion of a "user" who has access
> > rights is bound to the RemoteConnection (Cluster/Client). It is through
> > that method that the graph is aware of who the user is and from that
> > initial authenticating handshake can govern the data that the user sees.
> > While that thinking applies to remote graphs, it might also apply to
> > embedded graphs as well where the "user" is supplied by way of the
> > Configuration object given to the Graph instance where subsequent
> > TraversalSource constructs would inherit from that.
> >
> > On Thu, Jan 2, 2020 at 6:55 AM Mike Lee <[email protected]> wrote:
> >
> > > Hello
> > >
> > > Apologies if this is the incorrect forum - I was pointed here from another
> > > mailing list.
> > >
> > > I had an idea for an access control scheme that could be applied to
> > > vertices, edges or vertex properties and would allow a server to check
> > > whether a user has permission to retrieve or traverse that particular graph
> > > element. The access control property would be a set of rules outlining the
> > > attributes, and the combination of those attributes, that establish whether
> > > or not a user has sufficient privileges for that graph element. I have
> > > experimented with attempting to use the existing Gremlin language to do
> > > this, but I have so far been unable to achieve the level of fine-grained
> > > access control that I believe would be useful in a variety of situations. I
> > > have tested this with a prototype that uses a Java object and a custom
> > > predicate that tests a user's profile against the access control, and then
> > > used a traversal strategy to constrain a query to those elements which pass
> > > the test.
> > >
> > > I was curious as to whether people would see such a feature as something
> > > that could be part of Gremlin, or whether it would be better left to
> > > specific implementations of TinkerPop.
> > >
> > > Thanks for your consideration.
> > >
> >
>