Re: Spark 3 + Delta 0.7.0 Hive Metastore Integration Question

2020-12-19 Thread Jay
Thanks Matt. I have set the two configs in my Spark config as below:

val spark = SparkSession.builder()
  .appName("QuickstartSQL")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog",
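[The archive preview cuts the builder chain off mid-call. A minimal completion sketch, assuming the catalog value Matt suggests in his reply below and a standard getOrCreate() terminator:]

import org.apache.spark.sql.SparkSession

// Sketch only: the second config value is taken from Matt's reply below;
// getOrCreate() is assumed as the usual end of the builder chain.
val spark = SparkSession.builder()
  .appName("QuickstartSQL")
  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
  .config("spark.sql.catalog.spark_catalog",
    "org.apache.spark.sql.delta.catalog.DeltaCatalog")
  .getOrCreate()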

Re: Spark 3 + Delta 0.7.0 Hive Metastore Integration Question

2020-12-19 Thread Matt Proetsch
Hi Jay, some things to check: Do you have the following set in your Spark SQL config?

  spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension
  spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog

Is the JAR for the package delta-core_2.12:0.7.0 available on
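[For the JAR check, one common way to make the package available is to launch with its Maven coordinates; a sketch, with the two configs passed on the command line as well:]

spark-shell \
  --packages io.delta:delta-core_2.12:0.7.0 \
  --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension \
  --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog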

Spark 3.0.1 fails to insert into Hive Parquet table but Spark 2.11.12 used to work

2020-12-19 Thread Mich Talebzadeh
Hi, I upgraded Spark from 2.11.12 to 3.0.1; Hive version is 3.1.1 and Hadoop version is 3.1.1. The following used to work with Spark 2.11.12:

scala> sqltext = s"""
     | INSERT INTO TABLE ${fullyQualifiedTableName}
     | SELECT
     | ID
     | , CLUSTERED
     | ,
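[For readers following along, a minimal self-contained sketch of the pattern that fails; the table name, the rest of the SELECT list, and the source table are truncated in the archive, so the names here are hypothetical:]

// Hypothetical names: the real table and remaining columns are not shown.
val fullyQualifiedTableName = "test.parquet_table"
var sqltext = ""
sqltext = s"""
  INSERT INTO TABLE ${fullyQualifiedTableName}
  SELECT
    ID
  , CLUSTERED
  FROM source_staging_table
"""
spark.sql(sqltext)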

Spark 3 + Delta 0.7.0 Hive Metastore Integration Question

2020-12-19 Thread Jay
Hi All - I have currently set up a Spark 3.0.1 cluster with Delta version 0.7.0, which is connected to an external Hive metastore. I run the below set of commands:

val tableName = "tblname_2"
spark.sql(s"CREATE TABLE $tableName(col1 INTEGER) USING delta options(path='GCS_PATH')")

20/12/19
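[A minimal sketch of the setup described, with a hypothetical GCS path standing in for the elided GCS_PATH, plus a DESCRIBE to confirm the table was registered in the metastore:]

// Hypothetical path; the original message elides the actual GCS location.
val tableName = "tblname_2"
spark.sql(s"CREATE TABLE $tableName(col1 INTEGER) USING delta " +
  "OPTIONS (path='gs://my-bucket/delta/tblname_2')")

// Verify the table is visible through the external Hive metastore.
spark.sql(s"DESCRIBE EXTENDED $tableName").show(truncate = false)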

Re: Convert Seq[Any] to Seq[String]

2020-12-19 Thread Roland Johann
Your code looks overly complicated and the relevant parts are missing. If possible, please post the complete snippet, including the retrieval/type of rows, so we get the complete picture and can try to help. As a first simplification you can just convert aMap to Seq[(String, (String, String))] and
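[A minimal sketch of both conversions, under the assumption, since the original snippet is not shown, that aMap is a Map[String, (String, String)]:]

// Assumed shape of aMap; the thread does not show its actual definition.
val aMap: Map[String, (String, String)] = Map("key" -> ("a", "b"))
val pairs: Seq[(String, (String, String))] = aMap.toSeq

// The subject-line conversion: Seq[Any] to Seq[String].
val anySeq: Seq[Any] = Seq(1, "two", 3.0)
val strSeq: Seq[String] = anySeq.map(_.toString)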

Re: Re: Is Spark SQL able to auto update partition stats like hive by setting hive.stats.autogather=true

2020-12-19 Thread Mich Talebzadeh
OK, if it is not working then you need to find a workaround to update the stats beforehand:

if (spark.sql(f"""SHOW TABLES IN {v.DB} like '{v.tableName}'""").count() == 1):
    spark.sql(f"""ANALYZE TABLE {v.fullyQualifiedTableName} compute statistics""")
rows = spark.sql(f"""SELECT COUNT(1) FROM
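[The same workaround as a Scala sketch (the original is PySpark); the database and table names here are hypothetical stand-ins for the v.DB and v.tableName fields:]

// Hypothetical names replacing the v.* fields from the PySpark original.
val db = "mydb"
val table = "mytable"
val fqtn = s"$db.$table"

if (spark.sql(s"SHOW TABLES IN $db LIKE '$table'").count() == 1) {
  spark.sql(s"ANALYZE TABLE $fqtn COMPUTE STATISTICS")
  val rows = spark.sql(s"SELECT COUNT(1) FROM $fqtn").first().getLong(0)
  println(s"$fqtn has $rows rows")
}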