To check for null, you can use the `is_null` function (its inverse, `is_valid`, keeps only the non-null rows):

```
import pyarrow as pa
import pyarrow.compute as pc
import pyarrow.dataset as ds

tab = pa.Table.from_pydict({"x": [1, 2, 3, None], "y": ["a", "b", "c", "d"]})
filtered = ds.dataset(tab).to_table(filter=pc.is_null(pc.field("x")))
print(filtered)
```

Does that help?

On Mon, Oct 24, 2022 at 8:02 AM 1057445597 <[email protected]> wrote:

> My filter expression:
> expression->ToString() gives this result: (predict_model != null[string])
> This is how I built the expression:
>
> auto null_expr = arrow::compute::Expression(MakeNullScalar(arrow::utf8()));
> call(not_equal(field_ref("predict_model"), null_expr))
>
> I then use this expression to filter, but end up with an empty batch:
>
> if (!dataset()->filter_.empty()) {
>   auto scanner_builder =
>       arrow::dataset::ScannerBuilder::FromRecordBatchReader(batch_reader);
>   scanner_builder->Filter(dataset()->filter_expr_);
>   auto scanner_result = scanner_builder->Finish();
>   if (!scanner_result.ok()) {
>     res = errors::Internal(scanner_result.status().ToString());
>     break;
>   }
>   auto scanner = scanner_result.ValueOrDie();
>   auto batch_reader_result = scanner->ToRecordBatchReader();
>   if (!batch_reader_result.ok()) {
>     res = errors::Internal(batch_reader_result.status().ToString());
>     break;
>   }
>   batch_reader = batch_reader_result.ValueOrDie();
> }
>
> arrow_status = batch_reader->ReadNext(&batch);
>
> batch == nullptr
>
> Is there any other way to filter out things that are not null?
>
>
>
> ------------------------------
> 1057445597
> [email protected]
