Hi Tobias,
let me shed some light on this:
The UPDATE operation can be used on a single document:
  UPDATE "docKey" WITH {...} IN coll
or on multiple documents:
  FOR doc IN coll UPDATE doc WITH { ... } IN coll
Note, however, that each document is only *updated once*.
Your traversal returns 0..n documents (movies), and you iterate over their
genre attributes.
The UPDATE is in the FOR loop body of that iteration, which means it is
executed for every element in the genre array - which can be 0..n times per
document (movie).
In addition, a graph traversal can return the same movie document multiple
times (depending on your edges).
If the same document is updated multiple times, some updates may appear to
be ignored. This happens because the original document is read from a cache
rather than the already modified version.
There seem to be situations, however, in which multiple updates do succeed.
Maybe someone from the core team can comment on what the intended behavior
is.
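To make the problem concrete, here is a rough sketch of the pattern described above. It is only a guess at the general shape, not the original query: the collection `users`, the graph `ratedGraph` and the written attribute are assumptions borrowed from the test scenario below.

```aql
// Hypothetical reconstruction of the problematic pattern (not the original query)
FOR u IN users
  FOR m IN OUTBOUND u GRAPH 'ratedGraph'
    FOR g IN m.genre
      // Runs once per genre of every movie, so the same user document
      // is targeted by UPDATE many times within a single query.
      UPDATE u WITH { genreStats: { [g]: 1 } } IN users
```

Because the UPDATE sits in the innermost loop, the number of writes per user depends on how many movies and genres are reachable, which is exactly the situation where later updates may not see earlier ones.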
What works in my test scenario is the following query:
FOR u IN users
  LET genreStats = MERGE(
    // get all movies a user is linked to
    FOR m IN OUTBOUND u GRAPH 'ratedGraph'
      // ignore duplicate movies
      OPTIONS {uniqueVertices: 'global', bfs: true}
      FOR g IN m.genre
        // group by genre
        COLLECT genre = g WITH COUNT INTO count
        // return one object per genre with key=genre and value=count
        // (merged into a single object by the MERGE function in the 2nd line)
        RETURN {[genre]: count}
  )
  // don't update user documents which are not linked to any movie
  FILTER LENGTH(genreStats)
  // write stats into user document as attribute "genreStats"
  UPDATE u WITH {genreStats} IN users
  // optionally return updated user document
  RETURN NEW
This calculates and writes the genre stats for all users. Note that the
graph traversal sits inside a subquery to keep the aggregation per user.
You could also write the stats directly into the user document like UPDATE
MERGE(genreStats, u), but I find it much cleaner to use a nested object:
if you want to remove the stats later, you don't need to know the names of
all the genres (assuming the user documents contain additional attributes
which are not genre stats).
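As an illustration of why the nested object is convenient, here is a minimal sketch of removing the stats again. It assumes the stats were stored under the attribute `genreStats` in the `users` collection, as in the query above, and relies on the `keepNull` option of UPDATE.

```aql
// Remove the nested stats object from every user document.
// With keepNull: false, attributes set to null are deleted instead of stored.
FOR u IN users
  FILTER HAS(u, 'genreStats')            // skip users without stats
  UPDATE u WITH { genreStats: null } IN users
    OPTIONS { keepNull: false }
```

With the stats merged directly into the top level of the user document, the same cleanup would need a list of every genre name.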
A remark about graphs and modifications:
Graph traversals may return vertices from different collections, but UPDATE
/ REPLACE / REMOVE can only be carried out against one pre-defined
collection (UPDATE ... IN collectionName).
It's probably a good idea to add FILTER
IS_SAME_COLLECTION("collectionName", vertex) to the traversal.
Otherwise you might accidentally try to modify a document that lives in
another collection (error/warning), or accidentally change the wrong
document because it happens to have the same _key as a document from
another collection.
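A minimal sketch of such a guarded modification could look like this. The start vertex `users/alice`, the vertex collection `movies` and the attribute `flagged` are made-up names for illustration only; `ratedGraph` is the graph from the query above.

```aql
// Hypothetical example: flag all movies linked to one user
FOR v IN 1..1 OUTBOUND 'users/alice' GRAPH 'ratedGraph'
  // visit every vertex at most once, so no document is updated twice
  OPTIONS { uniqueVertices: 'global', bfs: true }
  // only continue with vertices stored in the collection we modify
  FILTER IS_SAME_COLLECTION('movies', v)
  UPDATE v WITH { flagged: true } IN movies
```

Vertices from any other collection are dropped by the FILTER before they reach the UPDATE.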
Best, Simran