Another solution could be using left-semi join:

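# `dict` stands for the Python dict whose keys should be kept
keys = sqlContext.createDataFrame([(k,) for k in dict.keys()], ["k"])  # one-column DataFrame of keys
DF2 = DF1.join(keys, DF1.a == keys.k, "leftsemi")  # '==' builds the join condition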
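
For comparison, the udf() approach mentioned in the quoted message below might look roughly like the following. This is only a sketch, not necessarily what was actually used: the helper names has_key and filter_by_key are made up for illustration, and it assumes the params column is a map type that reaches Python as a dict.

```python
def has_key(params, key="name"):
    """Return True when the dict-valued cell contains `key`.

    A missing (None) cell is treated as "no match".
    """
    return params is not None and key in params


def filter_by_key(df, column="params", key="name"):
    """Keep only the rows of `df` whose dict column contains `key`.

    Hypothetical helper; the Spark imports are local so that has_key()
    can be exercised without a Spark installation.
    """
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import BooleanType

    # Wrap the plain Python predicate as a column expression returning a boolean.
    contains_key = udf(lambda params: has_key(params, key), BooleanType())
    return df.filter(contains_key(col(column)))
```

With that in place, DF2 = filter_by_key(DF1, "params", "name") would replace the failing filter call. The Python `in` operator cannot be applied to a Column directly, which is why the membership test has to run inside the udf.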

On Wed, Feb 24, 2016 at 2:14 AM, Franc Carter <franc.car...@gmail.com> wrote:
>
> A colleague found out how to do this; the approach was to use a udf()
>
> cheers
>
> On 21 February 2016 at 22:41, Franc Carter <franc.car...@gmail.com> wrote:
>>
>>
>> I have a DataFrame that has a Python dict() as one of the columns. I'd
>> like to filter the DataFrame for those Rows where the dict() contains a
>> specific value, e.g. something like this:
>>
>>     DF2 = DF1.filter('name' in DF1.params)
>>
>> but that gives me this error
>>
>> ValueError: Cannot convert column into bool: please use '&' for 'and', '|'
>> for 'or', '~' for 'not' when building DataFrame boolean expressions.
>>
>> How do I express this correctly?
>>
>> thanks
>>
>> --
>> Franc
>
>
>
>
> --
> Franc
