Hi Tobias,

Let me shed some light on this:

The UPDATE operation can be used on a single document like
UPDATE "docKey" WITH {...} IN coll

or on multiple like so:
FOR doc IN coll UPDATE doc WITH { ... } IN coll

It is important to note, however, that each document is only *updated once*.

Your traversal returns 0..n documents (movies), and you iterate over their 
genre attributes.
The UPDATE is in the FOR loop body of that iteration, which means it is 
executed for every element in the genre array - which can be 0..n times per 
document (movie).
In addition, the same movie document can be returned by a graph traversal 
multiple times, depending on your edges.

If the same document is updated multiple times, some updates may appear to 
be ignored. This happens because the original documents are read from a 
cache instead of the already modified versions.
There seem to be situations, however, in which multiple updates do succeed. 
Maybe someone from the core team can comment on what the intended behavior 
is.
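
One way to sidestep the question is to make sure that every document 
reaches the UPDATE at most once, for example by collecting the distinct 
keys first. A minimal, untested sketch: the collection name movies, the 
start vertex users/123 and the attribute touched are placeholders I made up 
for illustration, they are not taken from your query.

```aql
// Assumptions: "movies", "users/123" and "touched" are placeholders.
LET movieKeys = (
    FOR m IN OUTBOUND 'users/123' GRAPH 'ratedGraph'
        // each key is returned once, even if several edges lead to the movie
        RETURN DISTINCT m._key
)
FOR key IN movieKeys
    // every movie document is now updated exactly once
    UPDATE key WITH { touched: true } IN movies
```

The sub-query is fully evaluated before the outer loop starts, so the 
UPDATE never sees the same key twice.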

What works in my test scenario is the following query:

FOR u IN users
    LET genreStats = MERGE(
        // get all movies a user is linked to
        FOR m IN OUTBOUND u GRAPH 'ratedGraph'
            // ignore duplicate movies
            OPTIONS {uniqueVertices: 'global', bfs: true}
            FOR g IN m.genre
                // group by genre
                COLLECT genre = g WITH COUNT INTO count
                // return one object per genre with key=genre and value=count
                // (merged into a single object by the MERGE function above)
                RETURN {[genre]: count}
    )
    // don't update user documents which are not linked to any movie
    FILTER LENGTH(genreStats)
    // write stats into user document as attribute "genreStats"
    UPDATE u WITH {genreStats} IN users
    // optionally return updated user document
    RETURN NEW

This calculates and writes the genre stats for all users. Note that the 
graph traversal is in the scope of a sub-query to keep the aggregation part 
per-user.
You could also write the stats directly into the user document, e.g. with 
UPDATE MERGE(genreStats, u) IN users, but I find it much cleaner to use a 
nested object: if you want to remove the stats later, you don't need to 
know the names of all genres (assuming the user documents contain 
additional attributes which are not genre stats).
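
To illustrate the point about removal: with the nested object, dropping 
the stats again could look like the sketch below. It relies on the keepNull 
option of UPDATE and assumes the attribute is called genreStats as in the 
query above. I have not run it against your data.

```aql
// Removes the nested "genreStats" attribute from every user document.
// With keepNull: false, attributes set to null are removed, not stored.
FOR u IN users
    UPDATE u WITH { genreStats: null } IN users
    OPTIONS { keepNull: false }
```

With the flat layout you would have to list every genre name in the 
update object instead.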

A remark about graphs and modifications:
Graph traversals may return vertices from different collections, but UPDATE 
/ REPLACE / REMOVE can only be carried out against one pre-defined 
collection (UPDATE ... IN collectionName).
It's probably a good idea to add FILTER 
IS_SAME_COLLECTION("collectionName", vertex) to the traversal.
Otherwise you might accidentally try to modify a document that is in 
another collection (error/warning), or accidentally change the wrong 
document that happens to have the same _key as a document from another 
collection.
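
A rough sketch of what such a guarded modification could look like. The 
collection name movies, the start vertex users/123 and the attribute 
checked are placeholders of mine, so adjust them to your data model:

```aql
// Assumptions: "movies", "users/123" and "checked" are placeholders.
FOR v IN OUTBOUND 'users/123' GRAPH 'ratedGraph'
    // visit every vertex only once
    OPTIONS { uniqueVertices: 'global', bfs: true }
    // skip vertices that are stored in any other collection
    FILTER IS_SAME_COLLECTION('movies', v)
    UPDATE v WITH { checked: true } IN movies
```

Only vertices from the movies collection pass the FILTER, so the UPDATE 
cannot hit a document with the same _key in a different collection.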

Best, Simran
