Hey all,
I'm using OrientDB as a graph database and performance has mostly been
solid. That said, I've been running into some performance issues that seem
to come from the Tinkerpop GremlinGroovy module, which I'm using with
Tinkerpop Frames to issue Gremlin queries on specific classes.
For example, retrieving local friends for a given player can be achieved
either this way:
@GremlinGroovy("it.out('PresentIn').in('PresentIn').retain(it.both('FriendOf').toList())")
def getLocalFriends : java.lang.Iterable[GraphPlayerV]
or this way:
val localFriends = session.graph.asInstanceOf[OrientTransactionalGraph].command(
new OCommandSQL("""SELECT
EXPAND(gremlin('current.out("PresentIn").in("PresentIn").retain(current.both("FriendOf").toList())'))
FROM Player WHERE id = """ + id)
).execute().asInstanceOf[java.lang.Iterable[Vertex]].map(v =>
session.manager.frame(v, classOf[GraphPlayerV]))
Strangely, the former lags massively at scale (to the point of being
effectively unusable), whereas the latter does not. Does anybody have any
clues as to why that might be?
My suspicion so far is that the GremlinGroovy module has to compile the
Gremlin code first, and that it somehow does not cache the result in a
thread-safe manner for future re-use. Either that or I'm missing something.
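To be clear about what I mean by thread-safe caching, here is a minimal
plain-Java sketch. It is not the GremlinGroovy implementation; the class
name CompiledScriptCache and the "compiler" function are hypothetical
stand-ins for whatever the module does to turn script text into something
executable.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: caches the result of an expensive "compile" step,
// keyed by the script text. Not taken from Tinkerpop Frames or OrientDB.
final class CompiledScriptCache<T> {
    private final ConcurrentHashMap<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> compiler;
    private final AtomicInteger compileCount = new AtomicInteger();

    CompiledScriptCache(Function<String, T> compiler) {
        this.compiler = compiler;
    }

    // Returns the cached result for this script, compiling it on first use.
    T get(String script) {
        // computeIfAbsent applies the function at most once per key,
        // even when several threads ask for the same script concurrently.
        return cache.computeIfAbsent(script, s -> {
            compileCount.incrementAndGet();
            return compiler.apply(s);
        });
    }

    // How many times the compile step actually ran.
    int compileCount() {
        return compileCount.get();
    }
}
```

If something like this were in place, repeated calls to the same annotated
method would only pay the compile cost once per distinct script.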
As for the OrientDB query, I'm guessing it does not get compiled on the
client at all, but rather on the remote OrientDB server.
That raises the question: do OrientDB commands run against the local cache
at all? The docs say:
When the client application asks for a record OrientDB checks:
- if a *transaction* has begun then it searches inside the transaction for
changed records and returns it if found
- if the *Local cache* is enabled and contains the requested record then
return it
- otherwise, at this point the record is not in cache, then asks for it to
the *Storage* (disk, memory)
http://orientdb.com/docs/2.1/Caching.html
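To check that I'm reading that lookup order correctly, here is how I
understand it as a plain-Java sketch. Everything in it (the class
RecordLookupSketch, the three maps, the flags) is made up for illustration
and is not OrientDB code; records are just strings keyed by record id.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical model of the documented lookup order:
// transaction changes first, then the local cache, then storage.
final class RecordLookupSketch {
    final Map<String, String> txChanges = new HashMap<>();
    final Map<String, String> localCache = new HashMap<>();
    final Map<String, String> storage = new HashMap<>();
    boolean txActive = false;
    boolean localCacheEnabled = true;

    Optional<String> load(String rid) {
        // 1. A record changed inside the running transaction wins.
        if (txActive && txChanges.containsKey(rid)) {
            return Optional.of(txChanges.get(rid));
        }
        // 2. Otherwise use the local cache, if it is enabled and has the record.
        if (localCacheEnabled && localCache.containsKey(rid)) {
            return Optional.of(localCache.get(rid));
        }
        // 3. Otherwise fall through to storage (disk, memory).
        return Optional.ofNullable(storage.get(rid));
    }
}
```

My question below is essentially whether a command goes through steps 1
and 2 at all, or skips straight to step 3 on the server.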
According to the above, the command should run first against whatever is
cached locally and take into account what has been mutated by the
transaction. However, this is the definition of the command method used
above:
/**
 * Executes commands against the graph. Commands are executed outside transaction.
 *
 * @param iCommand
 *          Command request between SQL, GREMLIN and SCRIPT commands
 */
public OCommandRequest command(final OCommandRequest iCommand) {
  makeActive();
  return new OrientGraphCommand(this, getRawGraph().command(iCommand));
}
Note the description in the comments: "Commands are executed outside
transaction." That makes sense, but do they take into account whatever has
been mutated up to that point in the transaction? How could they, though,
if the command includes Gremlin that must be compiled on the remote server?
My experiments suggest that they do not, and that they run purely against
what lives on the server. In that case, what is the proper and performant
way to execute a graph query that takes into account both transaction
mutations and the local cache?
It would be great to get some clarity on this issue!
Main questions:
- Does db.command run against the client graph, taking into account
transaction mutations and the local cache, or is it sent straight to the
server?
- Does Gremlin code in an OrientDB SQL query get compiled on the client
or on the server only?
- Why is there such a performance difference between querying with
db.command vs. Tinkerpop Frames with the GremlinGroovy module?