Re: Using a variable (a column name) in an IF statement in Spark SQL
I'm thinking there must be a typo somewhere else, as this works for me on Spark 1.4:

    Seq(("1231234", 1)).toDF("barcode", "items").registerTempTable("goods")
    sql("SELECT barcode, IF(items IS NULL, 0, items) FROM goods").collect()
    res1: Array[org.apache.spark.sql.Row] = Array([1231234,1])

I'll also note that you are essentially doing a coalesce here (i.e. coalesce(items, 0)).

Spark 1.5 improved the error messages here a bunch, so you might try upgrading to see what is wrong.

On Thu, Oct 8, 2015 at 7:28 PM, Maheshakya Wijewardena wrote:
> Spark version: 1.4.1
> The schema is "barcode STRING, items INT"
>
> On Thu, Oct 8, 2015 at 10:48 PM, Michael Armbrust wrote:
>
>> Hmm, that looks like it should work to me. What version of Spark? What
>> is the schema of goods?
>>
>> On Thu, Oct 8, 2015 at 6:13 AM, Maheshakya Wijewardena <mahesha...@wso2.com> wrote:
>>
>>> Hi,
>>>
>>> Suppose there is a data frame called goods with columns "barcode" and
>>> "items". Some of the values in the column "items" can be null.
>>>
>>> I want to get the barcode and the respective items from the table,
>>> adhering to the following rules:
>>>
>>>    - If "items" is null -> output 0
>>>    - else -> output "items" (the actual value in the column)
>>>
>>> I would write a query like:
>>>
>>>     SELECT barcode, IF(items is null, 0, items) FROM goods
>>>
>>> But this query fails with the error:
>>>
>>>     unresolved operator 'Project [if (IS NULL items#1) 0 else items#1 AS c0#132];
>>>
>>> It seems I can only use numerical values inside this IF statement, but
>>> when a column name is used, it fails.
>>>
>>> Is there any workaround to do this?
>>>
>>> Best regards.
>>> --
>>> Pruthuvi Maheshakya Wijewardena
>>> Software Engineer
>>> WSO2 : http://wso2.com/
>>> Email: mahesha...@wso2.com
>>> Mobile: +94711228855
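
For readers following the coalesce remark in the reply above, here is a minimal sketch in plain Scala (not Spark code) of why IF(items IS NULL, 0, items) and coalesce(items, 0) give the same result. It assumes Option[Int] as a stand-in for a nullable INT column, with None playing the role of NULL. The helper names and the second barcode are hypothetical and exist only for illustration.

```scala
// Hypothetical model: Option[Int] stands in for a nullable INT column value,
// and None plays the role of SQL NULL. This does not call any Spark API.

// Mirrors IF(items IS NULL, 0, items)
def ifNullThenZero(items: Option[Int]): Int =
  if (items.isEmpty) 0 else items.get

// Mirrors coalesce(items, 0): take the first non-null argument
def coalesceWithZero(items: Option[Int]): Int =
  items.getOrElse(0)

// Sample rows as (barcode, items); the second row is made up to show the NULL case.
val goods: Seq[(String, Option[Int])] =
  Seq(("1231234", Some(1)), ("9999999", None))

// Apply each formulation to every row.
val viaIf = goods.map { case (barcode, items) => (barcode, ifNullThenZero(items)) }
val viaCoalesce = goods.map { case (barcode, items) => (barcode, coalesceWithZero(items)) }
```

Both formulations produce the same rows here, which is the sense in which the IF expression is "essentially a coalesce". Whether the two behave identically inside a given Spark release is a separate question that this model does not answer.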
Re: Using a variable (a column name) in an IF statement in Spark SQL
Hmm, that looks like it should work to me. What version of Spark? What is the schema of goods?
Re: Using a variable (a column name) in an IF statement in Spark SQL
Spark version: 1.4.1
The schema is "barcode STRING, items INT"

--
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 : http://wso2.com/
Email: mahesha...@wso2.com
Mobile: +94711228855