date:20210425

Re: Is a Hive installation necessary for Spark SQL?

2021-04-25 Thread Mich Talebzadeh

Hi, I don't know much about Delta, but regarding your statements below:

    df.createOrReplaceTempView("myTable")
    res = spark.sql("select * from myTable")

The so-called TempView is a reference to a hash table in memory. That is, you are mapping your dataframe df to a hash table in memory and it is transient,

Re: Is a Hive installation necessary for Spark SQL?

2021-04-25 Thread chia kang ren

Does it make sense to keep a Hive installation when your parquet files come with a transactional metadata layer like Delta Lake / Apache Iceberg? My understanding from this: https://github.com/delta-io/delta/issues/85 is that Hive is no longer necessary in a Spark cluster other than discovering