providing a list parameter for sum function

2020-12-20 Thread bigel_p
Hi, I'm using the Spark DataFrame API. I'm trying to give sum() a list parameter containing column names as strings. When I put the column names directly into the function, the script works; when I try to provide them to the function as a parameter of type list, I get the error: "
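
The error text is cut off in this snippet, so the actual cause is unknown. One common reason for this symptom, assuming the poster is using PySpark, is that GroupedData.sum is declared as sum(*cols): it expects each column name as a separate argument, not a single list. The sketch below shows the pattern in plain Python. sum_columns is a hypothetical stand-in for such a varargs function, not a Spark API, and the column names are made up for illustration.

```python
def sum_columns(*cols):
    """Hypothetical stand-in for a varargs API such as sum(*cols).

    Each positional argument must be a column name string.
    """
    for c in cols:
        if not isinstance(c, str):
            raise TypeError(
                f"expected a column name string, got {type(c).__name__}"
            )
    # Return a description of the aggregations that would be requested.
    return [f"sum({c})" for c in cols]


columns = ["price", "quantity"]  # illustrative column names

# Names written out directly: works.
direct = sum_columns("price", "quantity")

# Passing the list itself hands the function ONE argument (a list): fails.
try:
    sum_columns(columns)
    list_failed = False
except TypeError:
    list_failed = True

# Unpacking the list with * spreads it into separate arguments: works.
unpacked = sum_columns(*columns)
```

If the same applies to the real call, the equivalent would be something like df.groupBy(...).sum(*columns) in PySpark, or sum(columns: _*) in Scala, where the method takes String* varargs.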

Avoid re-creating __spark_conf__5678XXXX.zip in /tmp for each application submit and copy under application specific .staging directory

2020-12-20 Thread Bhatta, Ramesha
Hello, I have created an enhancement request with the details at https://issues.apache.org/jira/browse/SPARK-33864. Kindly check and vote for this enhancement and/or provide your insights and suggestions. Regards, -Ramesh

S3-SQS vs Auto Loader With Apache Spark Structured Streaming

2020-12-20 Thread Rachana Srivastava
Problem Statement: I want to read files from S3 and write files to S3 using Spark Structured Streaming. I looked at the reference architecture recommended by the Spark team, which recommends S3 -> SNS -> SQS using the S3-SQS file source. Question: - S3-SQS file source: Is S3-SQS file source

Re: Spark 3 + Delta 0.7.0 Hive Metastore Integration Question

2020-12-20 Thread Jay
I think I found the issue: Hive metastore 2.3.6 doesn't have the necessary support. After upgrading to Hive 3.1.2, I was able to run the select query. On Sun, 20 Dec 2020 at 12:00, Jay wrote: > Thanks Matt. > > I have set the two configs in my sparkConfig as below > val spark = >