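Another solution could be to use a left-semi join:

    keys = sqlContext.createDataFrame([(k,) for k in d.keys()], ["k"])
    DF2 = DF1.join(keys, DF1.a == keys.k, "leftsemi")

Here d stands for your Python dict. Note that the join condition needs == rather than =, and the keys are wrapped in one-element tuples so that createDataFrame gets rows with a named column k.
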
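For reference, the udf() approach mentioned in the quoted thread below might look roughly like the following. This is only a sketch, not the colleague's actual code: the helper names has_key and filter_by_key are hypothetical, and it assumes the column holding the dict is a map column called "params", as in the original question. It tests for a key, mirroring 'name' in DF1.params; to test for a value you would check params.values() instead.

```python
def has_key(params, key):
    """Return True if the dict-valued cell `params` contains `key`."""
    # A null map column arrives in the UDF as None, so guard against it.
    return params is not None and key in params


def filter_by_key(df, column, key):
    """Keep the rows of `df` whose map column `column` contains `key`."""
    # Imported lazily so has_key can be exercised without a Spark install.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import BooleanType

    # Wrap the plain Python predicate as a boolean-returning UDF.
    contains_key = udf(lambda params: has_key(params, key), BooleanType())
    return df.filter(contains_key(df[column]))


# Usage, assuming DF1 has a map column named "params":
# DF2 = filter_by_key(DF1, "params", "name")
```

The UDF is needed because 'name' in DF1.params is evaluated by Python on the driver against a Column object, not per row, which is what triggers the ValueError.
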
On Wed, Feb 24, 2016 at 2:14 AM, Franc Carter <franc.car...@gmail.com> wrote:
>
> A colleague found out how to do this; the approach was to use a udf()
>
> cheers
>
> On 21 February 2016 at 22:41, Franc Carter <franc.car...@gmail.com> wrote:
>>
>> I have a DataFrame that has a Python dict() as one of the columns. I'd
>> like to filter the DataFrame for those Rows where the dict() contains a
>> specific value, e.g. something like this:
>>
>> DF2 = DF1.filter('name' in DF1.params)
>>
>> but that gives me this error:
>>
>> ValueError: Cannot convert column into bool: please use '&' for 'and', '|'
>> for 'or', '~' for 'not' when building DataFrame boolean expressions.
>>
>> How do I express this correctly?
>>
>> thanks
>>
>> --
>> Franc
>
> --
> Franc