This is an automated email from the ASF dual-hosted git repository.

thisisnic pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-cookbook.git


The following commit(s) were added to refs/heads/main by this push:
     new 82b37fb  ARROW-13751: Recipe for searching for values matching a predicate (#79)
82b37fb is described below

commit 82b37fbe91e2c098d460ebec3e02faac7d0b7c42
Author: Alessandro Molina <[email protected]>
AuthorDate: Tue Oct 5 11:31:32 2021 +0200

    ARROW-13751: Recipe for searching for values matching a predicate (#79)
    
    * Recipe for searching for values matching a predicate
    
    * Apply suggestions from code review
    
    Co-authored-by: Weston Pace <[email protected]>
    
    * Wrong heading
    
    Co-authored-by: Weston Pace <[email protected]>
---
 python/source/data.rst | 56 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 55 insertions(+), 1 deletion(-)

diff --git a/python/source/data.rst b/python/source/data.rst
index bdcf648..1d86b56 100644
--- a/python/source/data.rst
+++ b/python/source/data.rst
@@ -180,4 +180,58 @@ We can combine them into a single table using :func:`pyarrow.concat_tables`:
   the result will be a table with multiple chunks, each pointing to the original
   data that has been appended. Under some conditions, Arrow might have to
   cast data from one type to another (if `promote=True`).  In such cases the data
-  will need to be copied and an extra cost will occur.
\ No newline at end of file
+  will need to be copied and an extra cost will occur.
+
+Searching for values matching a predicate in Arrays
+===================================================
+
+If you need to find the values in an Arrow array that match a predicate,
+the :mod:`pyarrow.compute` module provides several functions
+that can help.
+
+For example, given an array with numbers from 0 to 9, if we
+want to look only for those greater than 5 we could use the
+:func:`pyarrow.compute.greater` function and get back a boolean
+mask that marks the elements that fit our predicate:
+
+.. testcode::
+
+  import pyarrow as pa
+  import pyarrow.compute as pc
+
+  arr = pa.array(range(10))
+  gtfive = pc.greater(arr, 5)
+
+  print(gtfive.to_string())
+
+.. testoutput::
+
+  [
+    false,
+    false,
+    false,
+    false,
+    false,
+    false,
+    true,
+    true,
+    true,
+    true
+  ]
+
+We can then pass that mask to :func:`pyarrow.compute.filter` to get
+only the entries that match our predicate:
+
+.. testcode::
+
+  filtered_array = pc.filter(arr, gtfive)
+  print(filtered_array)
+
+.. testoutput::
+
+  [
+    6,
+    7,
+    8,
+    9
+  ]
\ No newline at end of file
