[
https://issues.apache.org/jira/browse/ATLAS-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149119#comment-16149119
]
Graham Wallis commented on ATLAS-2092:
--------------------------------------
Thanks Nigel,
I agree that it would be better to avoid undesirable workarounds and yes, we
could ship a patched version of the graph code - in fact we do that for other
reasons already.
However, I am now somewhat less hopeful than I was that we can fix this in the
graph layer alone. I tried to prototype a fix by testing with a locally patched
JanusGraph, in which I introduced a synchronized block around the
test-and-create of the edge label and its schema vertex. Unfortunately it
didn't work, and I was silly to think that it would. I had expected the schema
vertex to become immediately visible (e.g. in the schema cache) to the other
threads, but that wasn't very sensible because the system is transactional: we
wouldn't expect the edge label schema vertex, or the edge itself, to be visible
until the transaction is committed. So I think the 'fix' now looks more
elaborate. It either needs to apply heavy synchronization (which I don't think
would be desirable), or it needs to split the label creation and the edge
creation into separate transactions, the former of which would be synchronized.
The obvious failure condition (committing the creation of a label but rolling
back the creation of the edge) would be benign: it would just leave an unused
edge label schema vertex. This split-and-sync approach would probably be needed
anywhere that we create a schema vertex, so it could be a little pervasive.
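To make the split-and-sync idea concrete, here is a minimal sketch in plain Java. It is a toy model, not JanusGraph or Atlas code: the class SplitLabelCreationSketch, its Tx type and all method names are hypothetical, and the only behaviour it assumes is the one described above, namely that a schema vertex created inside a transaction is invisible to other transactions until commit.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a transactional schema store. Hypothetical; not JanusGraph or Atlas code. */
final class SplitLabelCreationSketch {

    /** Committed edge label schema vertices, one entry per vertex (duplicates are possible). */
    private final List<String> committed = new ArrayList<>();
    /** Guards the short "create label" transactions used by the split approach. */
    private final Object schemaLock = new Object();

    /** A transaction buffers its writes; nothing is visible to others until commit(). */
    final class Tx {
        private final List<String> pendingLabels = new ArrayList<>();

        /** Test-and-create: sees only committed labels plus this transaction's own writes. */
        void ensureLabel(String name) {
            boolean visible;
            synchronized (committed) {
                visible = committed.contains(name);
            }
            if (!visible && !pendingLabels.contains(name)) {
                pendingLabels.add(name);
            }
        }

        /** Creating an edge implicitly creates its label if the label is not visible. */
        void addEdge(String label) {
            ensureLabel(label);
        }

        void commit() {
            synchronized (committed) {
                committed.addAll(pendingLabels);
            }
            pendingLabels.clear();
        }
    }

    Tx begin() {
        return new Tx();
    }

    /** Split approach: commit the label in its own short transaction, under a lock. */
    void ensureLabelCommitted(String name) {
        synchronized (schemaLock) {
            Tx schemaTx = begin();
            schemaTx.ensureLabel(name);
            schemaTx.commit(); // the label becomes visible before the lock is released
        }
    }

    long schemaVertexCount(String name) {
        synchronized (committed) {
            return committed.stream().filter(name::equals).count();
        }
    }

    public static void main(String[] args) {
        // Single transaction per update: both see "no label yet", both create one.
        SplitLabelCreationSketch single = new SplitLabelCreationSketch();
        SplitLabelCreationSketch.Tx a = single.begin();
        SplitLabelCreationSketch.Tx b = single.begin();
        a.addEdge("owns");
        b.addEdge("owns");
        a.commit();
        b.commit();
        System.out.println("single tx: " + single.schemaVertexCount("owns"));

        // Split approach: the label is committed first, then the edges are created.
        SplitLabelCreationSketch split = new SplitLabelCreationSketch();
        split.ensureLabelCommitted("owns");
        split.ensureLabelCommitted("owns");
        SplitLabelCreationSketch.Tx c = split.begin();
        SplitLabelCreationSketch.Tx d = split.begin();
        c.addEdge("owns");
        d.addEdge("owns");
        c.commit();
        d.commit();
        System.out.println("split tx: " + split.schemaVertexCount("owns"));
    }
}
```

In this model, synchronizing ensureLabel alone would not help the first case, because the second transaction still cannot see the first one's uncommitted label. The lock only becomes effective once the commit happens inside it, which is what the split approach does. If the later edge transaction rolls back, the only residue is the already committed label.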
There may be some lower-hanging fruit, such as ensuring that we never ask the
real graph vertex to perform a getEdges with a label, but instead always query
via the GraphHelper. This would not avoid the duplication issue, but it would
mean you get back correct answers to queries. We could (I think) perform such a
redirect in the implementations of AtlasVertex. I spent a little while this
afternoon scanning for places where we ask for edges by label, and it looks
like it could be a manageable change. The above applies only to edge labels, so
I am also looking at the other roles of schema vertices, i.e. property keys and
vertex labels.
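A rough illustration of why such a redirect could return correct answers even when duplicate schema vertices exist. This is again a toy model in plain Java with hypothetical names (EdgesByLabelSketch, nativeGetEdges, edgesByLabelName). It assumes that a by-label query on the real vertex resolves the label name to a single schema vertex and matches on that, while a helper-style query scans the vertex's edges and compares label names. Neither assumption is checked here against the JanusGraph or GraphHelper source.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of by-label edge lookup with duplicated schema vertices. Hypothetical names. */
final class EdgesByLabelSketch {

    /** An edge remembers both its label name and the schema vertex it was created against. */
    static final class Edge {
        final String labelName;
        final long labelSchemaId;

        Edge(String labelName, long labelSchemaId) {
            this.labelName = labelName;
            this.labelSchemaId = labelSchemaId;
        }
    }

    static final class Vertex {
        final List<Edge> edges = new ArrayList<>();
    }

    /** Assumed native behaviour: match only edges bound to the one resolved schema vertex. */
    static List<Edge> nativeGetEdges(Vertex v, long resolvedSchemaId) {
        List<Edge> result = new ArrayList<>();
        for (Edge e : v.edges) {
            if (e.labelSchemaId == resolvedSchemaId) {
                result.add(e);
            }
        }
        return result;
    }

    /** Helper-style lookup: scan every edge and compare by label name. */
    static List<Edge> edgesByLabelName(Vertex v, String labelName) {
        List<Edge> result = new ArrayList<>();
        for (Edge e : v.edges) {
            if (e.labelName.equals(labelName)) {
                result.add(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Vertex v = new Vertex();
        v.edges.add(new Edge("owns", 1L));  // created against the first "owns" schema vertex
        v.edges.add(new Edge("owns", 2L));  // created against a duplicate "owns" schema vertex
        v.edges.add(new Edge("likes", 3L));

        System.out.println("native:  " + nativeGetEdges(v, 1L).size());
        System.out.println("by name: " + edgesByLabelName(v, "owns").size());
    }
}
```

The name-based scan trades the label index for a walk over all of the vertex's edges, so it would cost more on high-degree vertices. It also leaves the duplicate schema vertices in place, which matches the point above that this is a mitigation for queries rather than a fix for the race.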
> Failures following concurrent updates
> -------------------------------------
>
> Key: ATLAS-2092
> URL: https://issues.apache.org/jira/browse/ATLAS-2092
> Project: Atlas
> Issue Type: Bug
> Components: atlas-core
> Reporter: Graham Wallis
> Attachments: Investigations and findings relating to concurrent
> updates in Atlas.pdf
>
>
> There is a race condition that causes duplication of schema vertices as a
> result of concurrent graph updates. This in turn leads to failure of queries
> that specify a type such as an edge label used in an attribute that
> references another entity. This problem is known to affect Atlas entity refs
> – which create graph edges that use edge label schema vertices. It is likely
> that it also affects other types in Atlas.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)