I'm thinking there must be a typo somewhere else as this works for me on
Spark 1.4:

Seq(("1231234", 1)).toDF("barcode", "items").registerTempTable("goods")

sql("SELECT barcode, IF(items IS NULL, 0, items) FROM goods").collect()

res1: Array[org.apache.spark.sql.Row] = Array([1231234,1])

I'll also note that you are essentially doing a coalesce here (i.e.
coalesce(items,
0))

Spark 1.5 improved error message here a bunch, you might try upgrading to
see what is wrong.

On Thu, Oct 8, 2015 at 7:28 PM, Maheshakya Wijewardena <mahesha...@wso2.com>
wrote:

> Spark version: 1.4.1
> The schema is "barcode STRING, items INT"
>
> On Thu, Oct 8, 2015 at 10:48 PM, Michael Armbrust <mich...@databricks.com>
> wrote:
>
>> Hmm, that looks like it should work to me.  What version of Spark?  What
>> is the schema of goods?
>>
>> On Thu, Oct 8, 2015 at 6:13 AM, Maheshakya Wijewardena <
>> mahesha...@wso2.com> wrote:
>>
>>> Hi,
>>>
>>> Suppose there is data frame called goods with columns "barcode" and
>>> "items". Some of the values in the column "items" can be null.
>>>
>>> I want to the barcode and the respective items from the table adhering
>>> the following rules:
>>>
>>>    - If "items" is null -> output 0
>>>    - else -> output "items" ( the actual value in the column)
>>>
>>> I would write a query like:
>>>
>>> *SELECT barcode, IF(items is null, 0, items) FROM goods*
>>>
>>> But this query fails with the error:
>>>
>>> *unresolved operator 'Project [if (IS NULL items#1) 0 else items#1 AS
>>> c0#132]; *
>>>
>>> It seems I can only use numerical values inside this IF statement, but
>>> when a column name is used, it fails.
>>>
>>> Is there any workaround to do this?
>>>
>>> Best regards.
>>> --
>>> Pruthuvi Maheshakya Wijewardena
>>> Software Engineer
>>> WSO2 : http://wso2.com/
>>> Email: mahesha...@wso2.com
>>> Mobile: +94711228855
>>>
>>>
>>>
>>
>
>
> --
> Pruthuvi Maheshakya Wijewardena
> Software Engineer
> WSO2 : http://wso2.com/
> Email: mahesha...@wso2.com
> Mobile: +94711228855
>
>
>

Reply via email to