[
https://issues.apache.org/jira/browse/SPARK-26777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yuri Budilov reopened SPARK-26777:
----------------------------------
please see my reply
> SQL worked in 2.3.2 and fails in 2.4.0
> --------------------------------------
>
> Key: SPARK-26777
> URL: https://issues.apache.org/jira/browse/SPARK-26777
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.4.0
> Reporter: Yuri Budilov
> Priority: Major
>
> Following SQL worked in Spark 2.3.2 and now fails on 2.4.0 (AWS EMR Spark)
> ------------ PySpark call below:
> spark.sql("select partition_year_utc,partition_month_utc,partition_day_utc \
> from datalake_reporting.copy_of_leads_notification \
> where partition_year_utc = (select max(partition_year_utc) from
> datalake_reporting.copy_of_leads_notification) \
> and partition_month_utc = \
> (select max(partition_month_utc) from
> datalake_reporting.copy_of_leads_notification as m \
> where \
> m.partition_year_utc = (select max(partition_year_utc) from
> datalake_reporting.copy_of_leads_notification)) \
> and partition_day_utc = (select max(d.partition_day_utc) from
> datalake_reporting.copy_of_leads_notification as d \
> where d.partition_month_utc = \
> (select max(m1.partition_month_utc) from
> datalake_reporting.copy_of_leads_notification as m1 \
> where m1.partition_year_utc = \
> (select max(y.partition_year_utc) from
> datalake_reporting.copy_of_leads_notification as y) \
> ) \
> ) \
> order by 1 desc, 2 desc, 3 desc limit 1 ").show(1,False)
> Error: (no need for data, this is syntax).
> py4j.protocol.Py4JJavaError: An error occurred while calling o1326.showString.
> : java.lang.UnsupportedOperationException: Cannot evaluate expression:
> scalar-subquery#4495 []
>
> Note: all 3 columns in query are Partitioned columns - see bottom of the
> schema)
>
> Hive EMR AWS Schema is:
>
> CREATE EXTERNAL TABLE `copy_of_leads_notification`(
> `message.environment.siteorigin` string, `dcpheader.dcploaddateutc` string,
> `message.id` int, `source.properties._country` string, `message.created`
> string, `dcpheader.generatedmessageid` string, `message.tags` bigint,
> `source.properties._enqueuedtimeutc` string, `source.properties._leadtype`
> string, `message.itemid` string, `message.prospect.postcode` string,
> `message.prospect.email` string, `message.referenceid` string,
> `message.item.year` string, `message.identifier` string,
> `dcpheader.dcploadmonthutc` string, `message.processed` string,
> `source.properties._tenant` string, `message.item.price` string,
> `message.subscription.confirmresponse` boolean, `message.itemtype` string,
> `message.prospect.lastname` string, `message.subscription.insurancequote`
> boolean, `source.exchangename` string,
> `message.prospect.identificationnumbers` bigint,
> `message.environment.ipaddress` string, `dcpheader.dcploaddayutc` string,
> `source.properties._itemtype` string, `source.properties._requesttype`
> string, `message.item.make` string, `message.prospect.firstname` string,
> `message.subscription.survey` boolean, `message.prospect.homephone` string,
> `message.extendedproperties` bigint, `message.subscription.financequote`
> boolean, `message.uniqueidentifier` string, `source.properties._id` string,
> `dcpheader.sourcemessageguid` string, `message.requesttype` string,
> `source.routingkey` string, `message.service` string, `message.item.model`
> string, `message.environment.pagesource` string, `source.source` string,
> `message.sellerid` string, `partition_date_utc` string,
> `message.selleridentifier` string, `message.subscription.newsletter` boolean,
> `dcpheader.dcploadyearutc` string, `message.leadtype` string,
> `message.history` bigint, `message.callconnect.calloutcome` string,
> `message.callconnect.datecreatedutc` string,
> `message.callconnect.callrecordingurl` string,
> `message.callconnect.transferoutcome` string,
> `message.callconnect.hiderecording` boolean,
> `message.callconnect.callstartutc` string, `message.callconnect.code` string,
> `message.callconnect.callduration` string, `message.fraudnetinfo` string,
> `message.callconnect.answernumber` string, `message.environment.sourcedevice`
> string, `message.comments` string, `message.fraudinfo.servervariables`
> bigint, `message.callconnect.servicenumber` string,
> `message.callconnect.callid` string, `message.callconnect.voicemailurl`
> string, `message.item.stocknumber` string,
> `message.callconnect.answerduration` string, `message.callconnect.callendutc`
> string, `message.item.series` string, `message.item.detailsurl` string,
> `message.item.pricetype` string, `message.item.description` string,
> `message.item.colour` string, `message.item.badge` string,
> `message.item.odometer` string, `message.environment.requestheader` string,
> `message.item.registrationnumber` string, `message.item.bodytype` string,
> `message.item.fueltype` string, `message.item.redbookcode` string,
> `message.item.spotid` string, `message.item.id` string,
> `message.item.transmission` string, `message.item.vin` string,
> `message.item.enginedescription` string, `message.prospect.mobilephone`
> string, `message.prospect.membertrackingid` string,
> `message.environment.username` string, `message.prospect.workphone` string,
> `message.environment.servername` string, `message.environment.sessionid`
> string, `message.tradein.type` string, `message.tradein.model` string,
> `message.tradein.year` string, `message.tradein.make` string,
> `message.tradein.kms` string, `message.fraudinfo.servertimestamp` string,
> `message.prospect.suburb` string, `message.callconnect.username` string,
> `message.callconnect.password` string, `message.status` string,
> `message.tradein.colour` string, `message.prospect.address` string,
> `message.prospect.state` string, `message.tradein.detailsurl` string,
> `message.prospect.faxnumber` string, `message.prospect.companyname` string,
> `message.prospect.title` string, `message.callconnect.calloutcometext`
> string, `message.prospect.preferredcontacttime` string,
> `message.fraudinfo.devicedata` string,
> `message.prospect.preferredcontactmethod` string,
> `message.assignment.assigned` string, `message.assignment.email` string,
> `message.assignment.name` string, `message.country` string,
> `message.financepackage.id` string, `message.financepackage.version` string)
> PARTITIONED BY ( `partition_year_utc` string, `partition_month_utc` string,
> `partition_day_utc` string, `job_run_guid` string)
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> LOCATION 's3://datalake/yurib_test/leads_notification'
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]