Hi Raphael,

Thanks for the info. Yes, makes sense now and the example in the PR is clear.

I had a question about the separation design however, doesn't that mean the 
Array is iterated through twice? - Once to determine what to filter out, and 
then a second time to actually perform the filtering, would this not be 
inefficient? How come there's no option to filter out on the first iteration 
rather than a 2-step process?

Regards,

Olo


________________________________
From: Raphael Taylor-Davies <[email protected]>
Sent: 04 November 2022 02:05
To: [email protected] <[email protected]>
Subject: Re: [RUST] Compute kernel filtering


Hi Olo,


Typically you will evaluate some predicate in order to obtain a BooleanArray, 
potentially using one of the 
comparison<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Fkernels%2Fcomparison%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=If5U9ltVdnMcKDb5NKuCy9bcjk%2Bg1nWJc3Mpfyg5cbw%3D&reserved=0>
 kernels, and then pass this to the filter kernel to filter the actual values. 
This separation not only allows both kernels to be implemented more 
efficiently, but allows filtering an array with a predicate evaluated on a 
different array.


I've created this 
PR<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Farrow-rs%2Fpull%2F3014&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=E%2BP4SYzTWqvs0eXpJxt3i1XV3xxfJhaDiT%2BesLgX%2Bmc%3D&reserved=0>
 to add an example doctest of this. Please let me know if anything isn't clear


Kind Regards,


Raphael Taylor-Davies


On 04/11/2022 06:48, Olo Sawyerr wrote:
Hi there,

Hope you're well.

I have a question about the 
arrow<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=ADzef7HqEmsOGDjDRBv0ZAhD86nHO068oE%2FewsxWWPY%3D&reserved=0>::compute<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=DTVQVNFxVjdkNjhNrybcV%2FulmoLGGe96ejViJAbDnUk%3D&reserved=0>::kernels<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Fkernels%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=OzchweJlbpEyBqy%2F5NBuvkWWpDqKQ8c0e8CN3JuorrU%3D&reserved=0>::filter<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Fkernels%2Ffilter%2Ffn.filter.html%23&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=htFFG42e9sRpHAKYR0FXb8e2KE2%2BKzwz6bjA4R4DGRU%3D&reserved=0>::filter<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Fkernels%2Ffilter%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=9SpH0ldv1w3vLe5qQk3mgx2f52wu8J%2Bl7Im%2FU68o8Lg%3D&reserved=0>()
 function. The function takes a predicate which is a BooleanArray to do the 
filtering. How should this be used in practice? One usually doesn't know which 
item should be filtered out or not in advance before you actually do the 
filter, so won't know what the BooleanArray should contain.

I would have thought that the predicate would be a function instead to test if 
each item should be filtered out or not.

How is 
arrow<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=ADzef7HqEmsOGDjDRBv0ZAhD86nHO068oE%2FewsxWWPY%3D&reserved=0>::compute<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=DTVQVNFxVjdkNjhNrybcV%2FulmoLGGe96ejViJAbDnUk%3D&reserved=0>::kernels<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Fkernels%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=OzchweJlbpEyBqy%2F5NBuvkWWpDqKQ8c0e8CN3JuorrU%3D&reserved=0>::filter<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Fkernels%2Ffilter%2Ffn.filter.html%23&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=htFFG42e9sRpHAKYR0FXb8e2KE2%2BKzwz6bjA4R4DGRU%3D&reserved=0>::filter()
 meant to be used in practice?

Thanks.

Reply via email to