Re: how to find NaN values of each row of spark dataframe to decide whether the row is dropped or not
Also take a look at this API: https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrameNaFunctions

On Mon, Sep 26, 2016 at 1:09 AM, Bedrytski Aliaksandr wrote:
> Hi Muhammet,
>
> Python also supports SQL queries: http://spark.apache.org/docs/latest/sql-programming-guide.html#running-sql-queries-programmatically
>
> Regards,
> --
> Bedrytski Aliaksandr
> sp...@bedryt.ski
Re: how to find NaN values of each row of spark dataframe to decide whether the row is dropped or not
Hi Muhammet,

Python also supports SQL queries: http://spark.apache.org/docs/latest/sql-programming-guide.html#running-sql-queries-programmatically

Regards,
--
Bedrytski Aliaksandr
sp...@bedryt.ski

On Mon, Sep 26, 2016, at 10:01, muhammet pakyürek wrote:
> But my request is related to Python, because I have designed a preprocessing step for data which looks for rows containing NaN values: if the number of NaNs in a row is above the threshold, the row is deleted; otherwise the NaNs are filled with a predicted value. Therefore I need a Python version of this process.
>
> *From:* Bedrytski Aliaksandr
> *Sent:* Monday, September 26, 2016 7:53 AM
> *To:* muhammet pakyürek
> *Cc:* user@spark.apache.org
> *Subject:* Re: how to find NaN values of each row of spark dataframe to decide whether the row is dropped or not
Re: how to find NaN values of each row of spark dataframe to decide whether the row is dropped or not
Hi Muhammet,

have you tried to use SQL queries?

> spark.sql("""
>     SELECT
>         field1,
>         field2,
>         field3
>     FROM table1
>     WHERE
>         field1 != 'NaN' AND
>         field2 != 'NaN' AND
>         field3 != 'NaN'
> """)

This query filters out the rows containing NaN for a table with 3 columns.

Regards,
--
Bedrytski Aliaksandr
sp...@bedryt.ski

On Mon, Sep 26, 2016, at 09:30, muhammet pakyürek wrote:
> Is there any way to do this directly? If not, is there any way to do this indirectly using other data structures of Spark?
how to find NaN values of each row of spark dataframe to decide whether the row is dropped or not
Is there any way to do this directly? If not, is there any way to do this indirectly using other data structures of Spark?