Hi Gary,

I believe there are `is_null` and `is_valid` functions, and I would expect that those are better to use for filtering on missing values than `==`. Try those out and let us know.
Neal

On Fri, Sep 4, 2020 at 6:31 AM Gary Clark <[email protected]> wrote:

> Hi,
>
> I'm currently reading my table in as such:
>
> ```
> filters = [
>     ('column', '=', 'null')
> ]
>
> df= pq.read_table('./joins/parquet/', filters=filters)
>
> print(df.shape)
> ```
>
> This gives me 0 rows even though I know there are thousands of nulls in my
> data. If I read the data like this, I can see all the nulls
>
> ```
> df= pq.read_table('./joins/parquet/')
> print(df.column( 'column').null_count)
> ```
>
> Is there something wrong with my filter? Or has this not been implemented?
>
> --
> Gary Clark
> *Data Scientist & Data Engineer*
> *B.S. Mechanical Engineering, Howard University '13*
> +1 (717) 798-6916
> [email protected]
