mortewle commented on issue #45161:
URL: https://github.com/apache/arrow/issues/45161#issuecomment-2609354342

   This seems to be an issue with parentheses. The error does not occur when
each element of the filter expression is parenthesised.
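
A possible explanation, assuming the failing filter was the same expression written without parentheses (the original error is not quoted here): in Python, `&` binds more tightly than `>=`, so the unparenthesised form is grouped as `(~... & field("a")) >= 3` rather than as two conditions joined by `&`. The sketch below only parses the two spellings with the standard `ast` module; `field` is a placeholder name inside the strings and nothing is evaluated against pyarrow.

```python
import ast

# Two spellings of the same filter. "field" is only a placeholder name here;
# the strings are parsed, never executed.
unparenthesised = '~field("c").is_null() & field("a") >= 3'
parenthesised = '(~field("c").is_null()) & (field("a") >= 3)'


def top_level_operator(source: str) -> str:
    """Return the name of the operator at the root of the parsed expression."""
    node = ast.parse(source, mode="eval").body
    if isinstance(node, ast.Compare):
        return type(node.ops[0]).__name__
    if isinstance(node, ast.BinOp):
        return type(node.op).__name__
    return type(node).__name__


# Without parentheses the comparison is applied last, to the result of "&".
print(top_level_operator(unparenthesised))  # GtE
# With parentheses "&" is applied last, joining two boolean conditions.
print(top_level_operator(parenthesised))  # BitAnd
```

If that reading is right, the unparenthesised filter asks for a bitwise AND between a boolean expression and a numeric column, which would explain why explicit parentheses avoid the error.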
   
   ```python
   import numpy as np
   import polars as pl
   import pyarrow.dataset as ds
   from pyarrow.parquet import ParquetDataset
   
   n = 1_000_000
   rng = np.random.default_rng(seed=42)
   
   data = pl.DataFrame(
       {
           "a": rng.uniform(low=0, high=2, size=n),
           "b": rng.choice(["a", "b"], n),
           "c": rng.normal(size=n),
       }
   )
   
   data.write_parquet("data.parquet", row_group_size=500_000)
   
   df = pl.from_arrow(
       ParquetDataset(
           ["data.parquet"],
           filters=(~ds.field("c").is_null()) & (ds.field("a") >= 3),
       ).read(columns=["b"])
   )
   print(df)
   ```

