Aggregates in MatchStep?

Marko Rodriguez Tue, 23 Jun 2015 07:10:12 -0700

Hi,

Here is a basic recommendation algorithm that we currently cannot express in 
match().


g.V(1).as('a').out('likes').aggregate('x'). // what does v[1] like
    in('likes').where(neq('a')).            // who else likes those things (excluding v[1])
    out('likes').where(not(within('x'))).   // what do those people like that v[1] doesn't already like
        groupCount()                        // count the number of times each liked thing is seen
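
For readers who want to trace the traversal by hand, here is a minimal sketch of the same four steps in plain Python over a dictionary. It is not Gremlin: the `likes` data and the `recommend` function are made up for illustration, and the set `x` is filled up front on the assumption that aggregate('x') collects everything before the traversal moves on.

```python
from collections import Counter

# Hypothetical toy data: person id -> set of things that person likes.
likes = {
    1: {"a", "b"},
    2: {"a", "c"},
    3: {"b", "c", "d"},
    4: {"e"},
}

def recommend(likes, start):
    """Count things liked by people who share a like with `start`,
    skipping anything `start` already likes."""
    x = set(likes[start])                          # aggregate('x')
    counts = Counter()
    for item in likes[start]:                      # out('likes')
        for other, other_likes in likes.items():   # in('likes')
            if other == start or item not in other_likes:
                continue                           # where(neq('a'))
            for candidate in other_likes:          # out('likes')
                if candidate not in x:             # where(not(within('x')))
                    counts[candidate] += 1         # groupCount()
    return dict(counts)

result = recommend(likes, 1)
```

With this data, person 2 shares "a" and person 3 shares "b" with person 1, so "c" is counted twice and "d" once.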

Why can't we do this in match()? Because we cannot aggregate (well, we can, 
but bear with me). The best we can do is:

g.V(1).match('a',
    as('a').out('likes').as('b'),
    as('b').in('likes').as('c'),
    as('c').out('likes').as('d'),
    where('d',neq('b')),
    where('a',neq('c'))).
        select('d').groupCount()

So what's the problem? The problem is that where('d',neq('b')) only makes sure 
that the current 'd' is not equal to the current 'b'. It does not check 'd' 
against the aggregate of all b's. How do we solve this? Here is how we would 
"manually" solve it using Kuppitz' latest idea of adding a clear()-step, where 
clear('x') => sideEffect{it.sideEffects('x').clear()}:

g.V(1).match('a',
    as('a').clear('{b}').out('likes').aggregate('{b}').as('b'),
    as('b').in('likes').as('c'),
    as('c').out('likes').as('d'),
    where('d',not(within('{b}'))),
    where('a',neq('c'))).
        select('d').groupCount()
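
To make the difference between the two where() clauses concrete, here is a small plain-Python sketch (not Gremlin). The data and the function names `match_bindings`, `per_binding`, and `against_aggregate` are hypothetical. The first filter compares 'd' only with the 'b' of the same binding, the second compares it with the set of all b's.

```python
from collections import Counter

# Hypothetical toy data: person 2 likes both things that person 1 likes.
likes = {
    1: {"a", "b"},
    2: {"a", "b", "c"},
}

def match_bindings(likes, start):
    """Yield (b, c, d) for a -likes-> b <-likes- c -likes-> d with c != a."""
    for b in likes[start]:
        for c, c_likes in likes.items():
            if c != start and b in c_likes:
                for d in c_likes:
                    yield b, c, d

def per_binding(likes, start):
    # where('d', neq('b')): d is compared with the current b only.
    return dict(Counter(d for b, c, d in match_bindings(likes, start) if d != b))

def against_aggregate(likes, start):
    # where('d', not(within('{b}'))): d is compared with every b.
    all_b = set(likes[start])
    return dict(Counter(d for b, c, d in match_bindings(likes, start)
                        if d not in all_b))
```

On this data the per-binding filter still recommends "a" and "b", which person 1 already likes, because each of them slips through when the current 'b' is the other one. The aggregate filter leaves only "c".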

However, suppose that we make "{ }" (set) and "[ ]" (list) special label 
characters, so that the following compiles to the above:

g.V(1).match('a',
    as('a').out('likes').as('{b}'),
    as('b').in('likes').as('c'),
    as('c').out('likes').as('d'),
    where('d',not(within('{b}'))),
    where('a',neq('c'))).
        select('d').groupCount()

In essence, if an end variable is labeled with "{ }" that means aggregate to 
set ("[ ]" could mean aggregate to list). The variable name inside the "{ }" is 
then the "drain" of the CollectingBarrierStep -- i.e., the single object 
emission post aggregation. Next, the as("{b}") syntax need not be limited to 
match(). In fact, the first query of this email could be written as:

g.V(1).as('a').out('likes').as('{x}').
    in('likes').where(neq('a')).
    out('likes').where(not(within('{x}'))).
        groupCount()

Where does the clear() go in the above? It goes after the first StartStep 
encountered moving "left." Thus, the previous compiles to the following:

g.V(1).as('a').clear('{x}').out('likes').aggregate('{x}').as('x').
    in('likes').where(neq('a')).
    out('likes').where(not(within('{x}'))).
        groupCount()
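
The rewrite above can be sketched as a small transformation over a list of steps. This is plain Python, not TinkerPop code: the (name, argument) tuple representation and the `desugar` function are assumptions made for illustration, and it only covers the linear case, where the start step is the first entry in the list.

```python
import re

# Matches "{name}" (set) or "[name]" (list); mismatched brackets are rejected.
LABEL = re.compile(r"^(?:\{(\w+)\}|\[(\w+)\])$")

def desugar(steps):
    """Rewrite as('{x}') into aggregate('{x}').as('x') and put clear('{x}')
    right after the start step and any labels attached to it."""
    out, clears = [], []
    for name, arg in steps:
        m = LABEL.match(arg) if name == "as" and isinstance(arg, str) else None
        if m:
            clears.append(("clear", arg))
            out.append(("aggregate", arg))
            out.append(("as", m.group(1) or m.group(2)))
        else:
            out.append((name, arg))
    pos = 1  # index just past the start step
    while pos < len(out) and out[pos][0] == "as":
        pos += 1  # keep labels such as as('a') attached to the start step
    return out[:pos] + clears + out[pos:]

# The as('{x}') form of the first query, one tuple per step.
steps = [
    ("V", 1), ("as", "a"), ("out", "likes"), ("as", "{x}"),
    ("in", "likes"), ("where", "neq('a')"),
    ("out", "likes"), ("where", "not(within('{x}'))"),
    ("groupCount", None),
]
compiled = desugar(steps)
```

Here `compiled` starts with V, as('a'), clear('{x}'), out('likes'), aggregate('{x}'), as('x'), which is the same order as the compiled traversal. Inside match(), where each pattern has its own start, the placement would have to be worked out per pattern, and this sketch does not attempt that.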

Cool?

Take care,
Marko.

http://markorodriguez.com
