Good day everyone, I am looking for a good way to do the following:
I have dataset A and dataset B, and for each element in dataset A I would like to filter dataset B and obtain the size of the result. In short:

*for each element a in A -> B.filter( _ < a.propertyx).count*

Currently I am doing a cross of dataset A and B, making tuples, so that I can then filter all the tuples where field2 < field1.propertyx, group by field1.id, and get the sizes of the groups.

However, this is not working out in practice. When the datasets get larger, some Tasks hang on the CHAIN Cross -> Filter, probably because there is insufficient memory for the cross to be completed?

Does anyone have a suggestion on how I could make this work, especially with datasets that are larger than the memory available to a separate Task?

Thank you in advance for your time :-)

Kind regards,
Pieter Hameete
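
P.S. To make the intended result concrete, here is a minimal sketch in plain Python (not the Flink API) of the computation described above, once written directly and once as cross -> filter -> group by id -> group size. The toy data, the dictionary layout, and the names `count_direct` and `count_via_cross` are made up for illustration only.

```python
from collections import Counter
from itertools import product

# Hypothetical toy data: each element of A has an id and a propertyx value.
A = [
    {"id": "a1", "propertyx": 5},
    {"id": "a2", "propertyx": 2},
    {"id": "a3", "propertyx": 0},
]
B = [1, 2, 3, 4, 7]


def count_direct(A, B):
    """For each a in A, count the elements b of B with b < a.propertyx."""
    return {a["id"]: sum(1 for b in B if b < a["propertyx"]) for a in A}


def count_via_cross(A, B):
    """Same idea expressed as cross -> filter -> group by id -> group size."""
    crossed = product(A, B)  # every (a, b) pair: len(A) * len(B) tuples
    kept = ((a, b) for a, b in crossed if b < a["propertyx"])
    return dict(Counter(a["id"] for a, _ in kept))


direct = count_direct(A, B)
via_cross = count_via_cross(A, B)
print(direct)
print(via_cross)
```

In this sketch the cross produces len(A) * len(B) tuples before the filter runs, which is the step that seems to be the problem at scale. Note also that in the cross-based version an element of A with no matching element of B (here `a3`) does not appear in the result at all, whereas the direct version reports a count of 0 for it.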