Re: Can Spark Catalog Perform Multimodal Database Query Analysis

2024-05-24 Thread Mich Talebzadeh
Something like this in Python

from pyspark import SparkConf
from pyspark.sql import SparkSession

# Configure the Spark session with the Hive metastore URIs
spark_conf = SparkConf() \
  .setAppName("SparkCatalogMultipleSources") \
  .set("hive.metastore.uris",
       "thrift://hive1-metastore:9080,thrift://hive2-metastore:9080")

# JDBC URLs for the two Hive clusters and MySQL (defined here for reference)
jdbc_urls = ["jdbc:hive2://hive1-jdbc:1",
             "jdbc:hive2://hive2-jdbc:1"]
mysql_jdbc_url = "jdbc:mysql://mysql-host:3306/mysql_database"

spark = SparkSession.builder \
  .config(conf=spark_conf) \
  .enableHiveSupport() \
  .getOrCreate()

# Accessing tables from Hive1, Hive2, and MySQL
spark.sql("SELECT * FROM hive1.table1").show()
spark.sql("SELECT * FROM hive2.table2").show()
spark.sql("SELECT * FROM mysql.table1").show()

# Optional: Create temporary views for easier joining (if needed)
spark.sql("CREATE TEMPORARY VIEW hive1_table1 AS "
          "SELECT * FROM hive1.table1")
spark.sql("CREATE TEMPORARY VIEW hive2_table2 AS "
          "SELECT * FROM hive2.table2")
spark.sql("CREATE TEMPORARY VIEW mysql_table1 AS "
          "SELECT * FROM mysql.table1")
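
For the names above to resolve, each source has to be registered with Spark first. The following is only a rough sketch of one way to do that for the MySQL side, assuming a recent Spark 3.x release (which ships a JDBC table catalog) and a MySQL JDBC driver jar on the classpath. The catalog name "mysql", the driver class, the helper names and the join column "id" are assumptions for illustration, not something taken from this thread. As far as I know a single Spark session talks to one Hive metastore, so the second Hive cluster is not covered here and would need separate handling.

```python
# Sketch only: catalog name, driver class, helper names and the join column
# "id" are placeholders/assumptions; a MySQL JDBC driver jar must be available.

JDBC_CATALOG_CLASS = (
    "org.apache.spark.sql.execution.datasources.v2.jdbc.JDBCTableCatalog"
)


def jdbc_catalog_conf(name, url, driver):
    """Return the Spark settings that register a JDBC source as a catalog."""
    prefix = f"spark.sql.catalog.{name}"
    return {
        prefix: JDBC_CATALOG_CLASS,
        f"{prefix}.url": url,
        f"{prefix}.driver": driver,
    }


def build_session(extra_conf):
    """Create a Hive-enabled session with the extra catalog settings applied."""
    from pyspark.sql import SparkSession  # imported lazily; needs pyspark

    builder = SparkSession.builder.appName("SparkCatalogMultipleSources")
    for key, value in extra_conf.items():
        builder = builder.config(key, value)
    return builder.enableHiveSupport().getOrCreate()


mysql_conf = jdbc_catalog_conf(
    "mysql",
    "jdbc:mysql://mysql-host:3306/mysql_database",
    "com.mysql.cj.jdbc.Driver",
)

# Tables in the JDBC catalog are addressed as <catalog>.<database>.<table>;
# hive1.table1 is resolved through the session (Hive) catalog.
join_sql = """
SELECT *
FROM hive1.table1 AS h
JOIN mysql.mysql_database.table1 AS m
  ON h.id = m.id
"""

# Usage (requires a working Spark installation):
#   spark = build_session(mysql_conf)
#   spark.sql(join_sql).show()
```

Keeping the settings in a plain dict makes it easy to print and check them before starting a session, and the same helper can be called again for any other JDBC source.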

HTH

Mich Talebzadeh,
Technologist | Architect | Data Engineer  | Generative AI | FinCrime
London
United Kingdom


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, "one test result is worth one-thousand
expert opinions" (Wernher von Braun).


On Fri, 24 May 2024 at 09:41, 志阳 <308027...@qq.com.invalid> wrote:

> I have two clusters, hive1 and hive2, as well as a MySQL database. Can I
> register all of them with Spark Catalog, or can I only use one catalog at
> a time? Can tables from multiple catalogs be joined across databases?
> select * from
>  hive1.table1 join hive2.table2 join mysql.table1
> where 
>
> --
> 志阳
> 308027...@qq.com
>
> 
>
>


Can Spark Catalog Perform Multimodal Database Query Analysis

2024-05-24 Thread 志阳
I have two clusters, hive1 and hive2, as well as a MySQL database. Can I
register all of them with Spark Catalog, or can I only use one catalog at
a time? Can tables from multiple catalogs be joined across databases?
select * from
hive1.table1 join hive2.table2 join mysql.table1
where 





308027...@qq.com