Thanks, Ted! Assuming this works, I can probably move sorting out of my fast path. My pressing need would then be to slice pre-sorted record batches using binary search.
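
To make the slicing need concrete, here is a rough sketch of what I mean. It uses a plain Python list as a stand-in for the sorted key column of a batch, and `find_slice` is a hypothetical helper name, not an existing pyarrow function. The (offset, length) pair it returns has the shape that `RecordBatch.slice(offset, length)` accepts, so the batch itself would not need to be copied.

```python
from bisect import bisect_left, bisect_right


def find_slice(sorted_keys, lo_key, hi_key):
    """Return (offset, length) of the rows whose key lies in [lo_key, hi_key].

    `sorted_keys` must already be sorted in ascending order; this is an
    assumption of the sketch and is not checked here.
    """
    start = bisect_left(sorted_keys, lo_key)    # first index with key >= lo_key
    stop = bisect_right(sorted_keys, hi_key)    # first index with key > hi_key
    return start, max(stop - start, 0)          # clamp in case lo_key > hi_key


if __name__ == "__main__":
    keys = [1, 3, 3, 5, 8, 8, 9]                # stand-in for a sorted key column
    offset, length = find_slice(keys, 3, 8)
    print(offset, length)
    # With pyarrow one could then (hypothetically) call batch.slice(offset, length).
```

Both lookups are O(log n), so this stays cheap even when the fast path handles large batches.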

On Mon, May 13, 2019 at 1:39 PM Ted Gooch <[email protected]> wrote:

> At least for the filtering part, isn't it already possible via gandiva
> filters[1]? I had a similar question about pushing record-level filtering
> into the parquet reader.
>
> [1] https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_gandiva.py#L86-L100
>
> On Mon, May 13, 2019 at 8:51 AM Wes McKinney <[email protected]> wrote:
>
> > https://issues.apache.org/jira/browse/ARROW-1558
> >
> > On Mon, May 13, 2019 at 10:47 AM Micah Kornfield <[email protected]> wrote:
> >
> > > There are also some open JIRA issues for sorting in
> > > cpp/src/arrow/compute [1][2]. I couldn't find one for filtering but I'm
> > > surprised one doesn't exist.
> > >
> > > [1] https://issues.apache.org/jira/browse/ARROW-4631
> > > [2] https://issues.apache.org/jira/browse/ARROW-1566
> > >
> > > On Mon, May 13, 2019 at 8:36 AM Wes McKinney <[email protected]> wrote:
> > >
> > > > hi John -- I'd recommend implementing these capabilities as Kernel
> > > > functions under cpp/src/arrow/compute, then they can be exposed in
> > > > Python easily.
> > > >
> > > > - Wes
> > > >
> > > > On Mon, May 13, 2019 at 9:01 AM John Muehlhausen <[email protected]> wrote:
> > > >
> > > > > Does pyarrow currently support filter/sort/search without conversion to
> > > > > pandas? I don’t see anything but want to be sure. Sorry if I overlooked
> > > > > it.
> > > > >
> > > > > Specific needs:
> > > > >
> > > > > 1- filter an arrow record batch and sort the results into a new batch
> > > > > 2- find slice locations for a sorted batch using binary search
> > > > >
> > > > > If I wanted to contribute this functionality to pyarrow, how would I plug
> > > > > in to that effort?
> > > > >
> > > > > Thanks,
> > > > > John
