Hi Simran, Thank's a lot for answering my questions. I'm sorry I'have been slow to answer back.
There was a copy/paste mistake in the Gist. That's why there was an undefined gfi_s. The Gist has been update to reflect better the query. https://gist.github.com/Waateur/6c8c0b2c40d0dfa08cecfe780275cd9f The collect bellow usually return under 5 results COLLECT code=b.code, category_id=b.category_id INTO code_group You are right about this filter FILTER LENGTH(a) == 1 is used to remove b documents that have no match in a. About the data distribution this is what i have in mind A contain 31,025,557 docs and B 20,785,230 doc. 90% of A is link to one or more doc in B 60% of A is link to more than one B. ( mostly 3 and almost never over 5 ) 20% of B is link to more than A. ( usually 2 or 3 ) These kind of duplicate are differentiated by the delivery string which is a "date of upload" information. As for example of documents, they were added to the gist. I don't know if any other information can be useful. Best, Killian Le lun. 12 nov. 2018 à 15:03, Simran Brucherseifer <sim...@arangodb.com> a écrit : > Hi Killian, > > it would be helpful to see your exact index definitions, some example > documents and to know a bit about the data distribution. > > How many documents end up in one group here on average? > > COLLECT code=b.code, category_id=b.category_id INTO code_group > > > What do you actually want to return here? "gfi_s" is not defined in the > Gist: > > FOR a_tmp IN A > ... > RETURN gfi_s > > > What is this for? The sub-query is limited to one result anyway. Is this > for the case of no match or a missing attribute (gfi_s, see above)? > > FILTER LENGTH(a) == 1 > > Best, Simran > > -- > You received this message because you are subscribed to the Google Groups > "ArangoDB" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to arangodb+unsubscr...@googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -- Killian Janod Datascientist @ iSmart / Kware killian.ja...@kware.fr killian.ja...@ismart.fr <killian.ja...@kware.fr> +33 (0) 6 61 33 34 76 -- You received this message because you are subscribed to the Google Groups "ArangoDB" group. To unsubscribe from this group and stop receiving emails from it, send an email to arangodb+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.