The advice is great, thank you. I agree that what TinkerPop provides at the moment is sufficient for prototyping the capability. I've trialled identifying the user on their connection to the server and creating a representation of their privileges with a Netty handler. The plan at the moment is to look at how it performs across graphs of varying sizes and complexity, to see whether it is a suitable way to proceed or whether a different approach could achieve the goal.
Thanks again for taking the time to reply. I'll take any subsequent discussion back to the JanusGraph forum, as it is probably more suitable for my current approach.

On 2020/01/09 19:19:20, Stephen Mallette <[email protected]> wrote:
> Thanks for sharing that additional information. I think that your approach
> is consistent with what I understood you were trying to explain earlier so
> that's good to know. You're essentially encoding ACL into the graph itself
> (i.e. adding a property key to a graph element) as in:
>
> AccessControl ac = new SimpleAccessControl.SimpleAccessControlBuilder()
>     .allowNationality(new NationalityAttribute("NZ"))
>     .allowOrg(new OrgAttribute("RUPTURE_FARMS"))
>     .andGroup(new GroupAttribute("APPLE_PICKER"))
>     .orGroup(new GroupAttribute("PEAR_EATER"))
>     .build();
> g.addV("Person").property("access", ac).next();
>
> and then testing "access" in Gremlin traversals which you could do manually
> as part of the basic writing of Gremlin or by way of a custom
> TraversalStrategy which could take the Principal and inject the appropriate
> has() filters to validate. The nice thing about this approach is that it
> will work for any graph that can support a serialized AccessControl, but it
> comes at the expense of muddying the schema with an "access" property
> everywhere. We have a similar pattern that muddies the schema with
> PartitionStrategy (putting a "_partition" property key all over the place)
> and one might argue that this feature you're suggesting is quite similar to
> that. PartitionStrategy is actually a form of SubgraphStrategy and I think
> that in my mind the amount of extension on that pattern that we should
> maintain in TinkerPop should probably end there (for the reasons I've
> mentioned before, but also because AccessControl won't be serializable to
> every graph database so we immediately run into generalization issues... at
> least in the manner that you've implemented this so far).
>
> > We are building a component that communicates with a JanusGraph server
> > on behalf of a user. In this case, the component manages all user aspects,
> > such as authenticating them, obtaining their privileges from an external
> > source, constructing the Gremlin query based on their query, getting the
> > data they want and passing it back.
>
> I don't know the nature of your "component", but I think if you're looking
> for a JanusGraph specific solution you might just look at extending Gremlin
> Server a bit. Sounds like you just need to write a custom netty handler or
> two which you will inject after the authentication handler (you would
> probably write a custom Authenticator implementation as well) but before
> the Gremlin processing. You would then take the authenticated user and grab
> their privileges to create the Principal. You would then configure the "g"
> for that user with your AclTraversalStrategy (probably a direct copy of the
> PartitionStrategy code) that takes the Principal. When an authenticated
> user sends their query to the server everything will be rigged up to do all
> the ACL work.
>
> I'd be curious how you find this functionality performs when you get it
> running at scale. Again, I worry for you falling into in-memory filtering
> traps that can't be avoided with JanusGraph indices. It's a shame there
> aren't more graphs with this kind of functionality built in. DS Graph is
> the only one I know that has really nice "row level access control". Not
> sure if Pieter Martin is following this thread but I wonder if sqlg gets
> this sort of functionality for free with RDBMSs that support this
> functionality - that would be cool. Anyone know any other graphs that have
> this sort of security baked into it?
>
> On Wed, Jan 8, 2020 at 3:03 AM Mike Lee <[email protected]> wrote:
>
> > The access control property is essentially only a collection of rules.
> > As an example from a specific use case I have, users have groups or
> > attributes assigned to them in an external system. For particular elements
> > of data that are stored in the graph, we want to be able to say something
> > like "to access this element, a user must have both the ONCOLOGY and
> > SPECIALIST attributes, and at least one of the FINANCE_APPROVER or
> > FINANCE_MANAGER attributes". This access control property is then checked
> > against an Object representing the user's attributes and the application
> > determines whether or not the user can perform some action. I had
> > envisioned that, in setting up the application, the administrator could
> > configure the specific strategy for how this access control check is
> > resolved.
> >
> > I have the code in GitHub in a fork from JanusGraph and here is a link to
> > the datatype that I stored on the graph to do the filtering (
> > https://github.com/mikelee2082/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/core/attribute/Access.java).
> > There are unit tests where I used the SubgraphStrategy in
> > https://github.com/mikelee2082/janusgraph/blob/master/janusgraph-test/src/test/java/org/janusgraph/core/attribute/AccessAttributeTest.java
> >
> > I am a very inexperienced developer and I have not yet thought about how
> > it could be done in a way that would be language agnostic. The way I have
> > done it at the moment depends on a client being able to create and
> > serialize a Java object to store on the graph.
> >
> > At the moment, we are fairly constrained in our thinking by our use case.
> > We are building a component that communicates with a JanusGraph server on
> > behalf of a user. In this case, the component manages all user aspects,
> > such as authenticating them, obtaining their privileges from an external
> > source, constructing the Gremlin query based on their query, getting the
> > data they want and passing it back.
> > This seems like a lot of effort, but it is an essential requirement that
> > we only provide users with the level of detail that they are authorized to
> > read. So in our case, the users don't create a TraversalSource - we do that
> > for them. Because of this, we have the option of constructing a
> > TraversalSource that is constrained to a subgraph. Ideally, we would prefer
> > to have users communicate with the server directly.
> >
> > Again, thanks for taking the time to consider my idea. I hope I have
> > answered your questions. My solution right now is fairly primitive, so if
> > anyone has any ideas for how it might be done differently, in a way that
> > works for larger graphs just as well as small, I'd be really interested in
> > giving it a go.
> >
> > On 2020/01/03 11:44:09, Stephen Mallette <[email protected]> wrote:
> > > Thanks for posting your idea. As others on the list may not have seen my
> > > post elsewhere, I'll just quickly repeat an approximation of what I
> > > wrote. Basically, I think that users would like a feature like this, but
> > > I wondered if it were something best left to graph providers to implement
> > > native to their systems as an unoptimized implementation may not perform
> > > well or behave with limited functionality. So, with that in mind, here's
> > > some further thoughts/questions:
> > >
> > > 1. Could you say some more about how the "access control property" is
> > > defined? How would you envision such a thing to be generalized across all
> > > graph providers?
> > > 2. Could you share some sample code for how you define your "custom
> > > predicate" and what the TraversalStrategy does with that (basically,
> > > please show how all that wraps up with Gremlin)?
> > > 3. Please keep in mind that any solution here must be portable across all
> > > programming languages - will users be able to define the required objects
> > > in python, javascript, etc?
> > > 4. As I think about how users initialize a TraversalSource, I can't help
> > > thinking that implementing this feature as a TraversalStrategy places it
> > > at the wrong level of abstraction. The notion of a "user" who has access
> > > rights is bound to the RemoteConnection (Cluster/Client). It is through
> > > that method that the graph is aware of who the user is and from that
> > > initial authenticating handshake can govern the data that the user sees.
> > > While that thinking applies to remote graphs, it might also apply to
> > > embedded graphs as well where the "user" is supplied by way of the
> > > Configuration object given to the Graph instance where subsequent
> > > TraversalSource constructs would inherit from that.
> > >
> > > On Thu, Jan 2, 2020 at 6:55 AM Mike Lee <[email protected]> wrote:
> > >
> > > > Hello
> > > >
> > > > Apologies if this is the incorrect forum - I was pointed here from
> > > > another mailing list.
> > > >
> > > > I had an idea for an access control scheme that could be applied to
> > > > vertices, edges or vertex properties and would allow a server to check
> > > > whether a user has permission to retrieve or traverse that particular
> > > > graph element. The access control property would be a set of rules
> > > > outlining the attributes, and the combination of those attributes, that
> > > > establish whether or not a user has sufficient privileges for that
> > > > graph element. I have experimented with attempting to use the existing
> > > > Gremlin language to do this, but I have so far been unable to achieve
> > > > the level of fine-grained access control that I believe would be useful
> > > > in a variety of situations.
> > > > I have tested this with a prototype that uses a Java object and a
> > > > custom predicate that tests a user's profile against the access
> > > > control, and then used a traversal strategy to constrain a query to
> > > > those elements which pass the test.
> > > >
> > > > I was curious as to whether people would see such a feature as
> > > > something that could be part of Gremlin, or whether it would be better
> > > > left to specific implementations of TinkerPop.
> > > >
> > > > Thanks for your consideration.
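
For readers following the thread, here is a rough sketch of the kind of rule check discussed above: a user must hold every attribute in one set and at least one attribute from another (for example both ONCOLOGY and SPECIALIST, plus one of FINANCE_APPROVER or FINANCE_MANAGER). It uses only the JDK. The names AccessRule, GuardedElement and visibleIds are hypothetical and chosen for illustration; this is not the Access datatype from the linked fork and not a TinkerPop or JanusGraph API. It also filters in memory, which is exactly the pattern the thread warns may not scale on large graphs.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/**
 * Hypothetical rule: the user must hold every attribute in allOf and at
 * least one attribute in anyOf (an empty anyOf adds no requirement).
 */
final class AccessRule implements Predicate<Set<String>> {
    private final Set<String> allOf;
    private final Set<String> anyOf;

    AccessRule(Set<String> allOf, Set<String> anyOf) {
        this.allOf = Set.copyOf(allOf);
        this.anyOf = Set.copyOf(anyOf);
    }

    @Override
    public boolean test(Set<String> userAttributes) {
        boolean hasAll = userAttributes.containsAll(allOf);
        boolean hasAny = anyOf.isEmpty()
                || anyOf.stream().anyMatch(userAttributes::contains);
        return hasAll && hasAny;
    }
}

/** Hypothetical stand-in for a graph element that carries an "access" rule. */
final class GuardedElement {
    private final String id;
    private final AccessRule access;

    GuardedElement(String id, AccessRule access) {
        this.id = id;
        this.access = access;
    }

    String id() { return id; }
    AccessRule access() { return access; }
}

final class AccessFilterDemo {
    /** Returns the ids of the elements whose rule the user satisfies. */
    static List<String> visibleIds(List<GuardedElement> elements,
                                   Set<String> userAttributes) {
        return elements.stream()
                .filter(e -> e.access().test(userAttributes))
                .map(GuardedElement::id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        AccessRule finance = new AccessRule(
                Set.of("ONCOLOGY", "SPECIALIST"),
                Set.of("FINANCE_APPROVER", "FINANCE_MANAGER"));
        List<GuardedElement> elements = List.of(
                new GuardedElement("v1", finance),
                new GuardedElement("v2",
                        new AccessRule(Set.of("RADIOLOGY"), Set.of())));
        Set<String> user = Set.of("ONCOLOGY", "SPECIALIST", "FINANCE_MANAGER");
        System.out.println(visibleIds(elements, user)); // prints [v1]
    }
}
```

In a real deployment the check would presumably need to be pushed into something the graph can index (for instance plain property values matched by has() filters) rather than evaluated per element as done here; that trade-off is the open question in the thread.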
