This is an automated email from the ASF dual-hosted git repository.
thisisnic pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-cookbook.git
The following commit(s) were added to refs/heads/main by this push:
new 20993f2 ARROW-13753: Filtering Arrays for values matching a mask
filter (#80)
20993f2 is described below
commit 20993f24798349dda68f80f35d98f8ccc101c1ee
Author: Alessandro Molina <[email protected]>
AuthorDate: Fri Oct 8 10:57:50 2021 +0200
ARROW-13753: Filtering Arrays for values matching a mask filter (#80)
* filtering arrays recipe
* Wrong heading
* Apply suggestions from code review
Co-authored-by: Weston Pace <[email protected]>
---
python/source/data.rst | 55 +++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 54 insertions(+), 1 deletion(-)
diff --git a/python/source/data.rst b/python/source/data.rst
index 1d86b56..8fb339c 100644
--- a/python/source/data.rst
+++ b/python/source/data.rst
@@ -234,4 +234,57 @@ that match our predicate
7,
8,
9
- ]
\ No newline at end of file
+ ]
+
+Filtering Arrays using a mask
+=============================
+
+When you search for values in an array, the result is often a
+boolean mask: an array of the same length whose ``True`` entries
+mark the positions where the search matched.
+
+For example, in an array of four items, we might have a mask that
+matches only the first and the last items:
+
+.. testcode::
+
+    import pyarrow as pa
+
+    array = pa.array([1, 2, 3, 4])
+    mask = pa.array([True, False, False, True])
+
+We can then filter the array according to the mask using
+:meth:`pyarrow.Array.filter` to get back a new array with
+only the values matching the mask:
+
+.. testcode::
+
+    filtered_array = array.filter(mask)
+    print(filtered_array)
+
+.. testoutput::
+
+    [
+      1,
+      4
+    ]
+
+Many comparison and matching functions in :mod:`pyarrow.compute`
+return a boolean mask, so you can use their output to filter your
+arrays for the values that the function matched.
+
+For example, we might filter our array for the values equal to ``2``
+using :func:`pyarrow.compute.equal`:
+
+.. testcode::
+
+    import pyarrow.compute as pc
+
+    filtered_array = array.filter(pc.equal(array, 2))
+    print(filtered_array)
+
+.. testoutput::
+
+    [
+      2
+    ]