Hi,

I'd like to filter a ListArray, based on whether a particular value is
present in each list. Is there a better approach than the one described
below? Particularly, are there any existing compute functions that I could
use instead?

Here's a concrete example, with rows consisting of variable-length lists of
strings:
["a", "b", "x"]
["c", "d"]
["e", "x", "a"]
["c"]
["d", "e"]

If the element to search for is "x", only the first and third rows would be
retained after filtering:
["a", "b", "x"]
["e", "x", "a"]

To implement this, the following should work, but is there a better way?

(1) Run the "equal" compute function on the values of the list:
[false, false, true, false, false, false, true, false, false, false, false]

(2) Linearly scan the result of (1) in lockstep with the list's offsets, to
keep track of which rows matched:
[true, false, true, false, false]

(3) Expand the result of (2) by the list lengths:
[true, true, true, false, false, true, true, true, false, false, false]

(4) Use the "filter" compute function (using the result from (3)) to copy
only the matching values.
["a", "b", "x", "e", "x", "a"]

(5) Using the result of (2), sum up lengths to compute new offsets:
[0, 3, 6]
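
For reference, steps (1) to (5) can be sketched in plain Python as below. This
is only an illustration of the logic, not Arrow code: it assumes the list
array is represented as a flat list of values plus a list of offsets, and the
function name filter_lists_containing is made up for this example.

```python
def filter_lists_containing(values, offsets, needle):
    """Keep only the rows (lists) that contain `needle`.

    `values` is the flat child data, `offsets` has one more entry than
    there are rows, and row i spans values[offsets[i]:offsets[i + 1]].
    """
    bounds = list(zip(offsets, offsets[1:]))

    # (1) Element-wise equality on the flat values.
    equal = [v == needle for v in values]

    # (2) One flag per row: did any element in the row match?
    row_match = [any(equal[start:end]) for start, end in bounds]

    # (3) Expand the per-row flags by the row lengths.
    expanded = []
    for matched, (start, end) in zip(row_match, bounds):
        expanded.extend([matched] * (end - start))

    # (4) Filter the flat values with the expanded mask.
    new_values = [v for v, keep in zip(values, expanded) if keep]

    # (5) Accumulate the lengths of the matching rows into new offsets.
    new_offsets = [0]
    for matched, (start, end) in zip(row_match, bounds):
        if matched:
            new_offsets.append(new_offsets[-1] + (end - start))

    return new_values, new_offsets


values = ["a", "b", "x", "c", "d", "e", "x", "a", "c", "d", "e"]
offsets = [0, 3, 5, 8, 9, 11]
new_values, new_offsets = filter_lists_containing(values, offsets, "x")
```

With the example rows above, new_values and new_offsets come out as the
results shown in (4) and (5).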

(2), (3), and (5) are of course not difficult to implement, but is there
maybe a trick to use existing compute functions instead? Particularly for
non-C++ implementations, that could make a big performance difference.

Cheers,
Leo
