I'm thinking there must be a typo somewhere else as this works for me on Spark 1.4:
Seq(("1231234", 1)).toDF("barcode", "items").registerTempTable("goods") sql("SELECT barcode, IF(items IS NULL, 0, items) FROM goods").collect() res1: Array[org.apache.spark.sql.Row] = Array([1231234,1]) I'll also note that you are essentially doing a coalesce here (i.e. coalesce(items, 0)) Spark 1.5 improved error message here a bunch, you might try upgrading to see what is wrong. On Thu, Oct 8, 2015 at 7:28 PM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote: > Spark version: 1.4.1 > The schema is "barcode STRING, items INT" > > On Thu, Oct 8, 2015 at 10:48 PM, Michael Armbrust <mich...@databricks.com> > wrote: > >> Hmm, that looks like it should work to me. What version of Spark? What >> is the schema of goods? >> >> On Thu, Oct 8, 2015 at 6:13 AM, Maheshakya Wijewardena < >> mahesha...@wso2.com> wrote: >> >>> Hi, >>> >>> Suppose there is data frame called goods with columns "barcode" and >>> "items". Some of the values in the column "items" can be null. >>> >>> I want to the barcode and the respective items from the table adhering >>> the following rules: >>> >>> - If "items" is null -> output 0 >>> - else -> output "items" ( the actual value in the column) >>> >>> I would write a query like: >>> >>> *SELECT barcode, IF(items is null, 0, items) FROM goods* >>> >>> But this query fails with the error: >>> >>> *unresolved operator 'Project [if (IS NULL items#1) 0 else items#1 AS >>> c0#132]; * >>> >>> It seems I can only use numerical values inside this IF statement, but >>> when a column name is used, it fails. >>> >>> Is there any workaround to do this? >>> >>> Best regards. >>> -- >>> Pruthuvi Maheshakya Wijewardena >>> Software Engineer >>> WSO2 : http://wso2.com/ >>> Email: mahesha...@wso2.com >>> Mobile: +94711228855 >>> >>> >>> >> > > > -- > Pruthuvi Maheshakya Wijewardena > Software Engineer > WSO2 : http://wso2.com/ > Email: mahesha...@wso2.com > Mobile: +94711228855 > > >