(Moving this over to the user list as that's the appropriate list for this question)

Do you get an error? We can't help you with just "it didn't work" :)

I'd suggest that you try to narrow down the scope of the problem: is it unique to Hive external tables? Can you use a different Hive StorageHandler successfully (e.g. the HBaseStorageHandler)?
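
For example (an untested sketch; the table, column family, and names below are placeholders), you could create an HBase-backed table from beeline and then check whether the same Spark session that fails on the Phoenix-backed table can read it:

    from pyspark.sql import SparkSession

    # First, from beeline (not Spark; Spark's SQL parser generally rejects
    # STORED BY), create an HBase-backed probe table, e.g.:
    #
    #   CREATE EXTERNAL TABLE default.hbase_probe (key STRING, val STRING)
    #   STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    #   WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:val')
    #   TBLPROPERTIES ('hbase.table.name' = 'hbase_probe');
    #
    # Then see whether Spark can read it through the Hive metastore.
    spark = (SparkSession.builder
             .appName("storagehandler-probe")
             .enableHiveSupport()
             .getOrCreate())
    spark.sql("SELECT COUNT(*) FROM default.hbase_probe").show()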

Finally, as you're using HDP, please also consider using their customer support.

On 7/11/19 2:18 AM, 马士成 wrote:
Hello All,

The Apache Phoenix homepage lists two additional integrations: Apache Spark Integration and the Phoenix Storage Handler for Apache Hive.

Following that guidance, I can query a Phoenix table from the Beeline CLI, and I can load a Phoenix table as a DataFrame using Spark SQL.
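
For reference, the Spark side works for me along these lines (the table name and zkUrl are placeholders for my real values):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("phoenix-read").getOrCreate()

    # Load a Phoenix table directly through the phoenix-spark connector.
    df = (spark.read
          .format("org.apache.phoenix.spark")
          .option("table", "PART_DEVICE")
          .option("zkUrl", "zk-host:2181")
          .load())
    df.printSchema()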

So my question is: does Phoenix support querying a Hive external table mapped from Phoenix via Spark SQL?
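
By "mapped from Phoenix" I mean a Hive external table created with the PhoenixStorageHandler, roughly like this (the columns and connection properties are placeholders for my real ones; I ran the DDL through beeline):

    import subprocess

    # DDL for the Hive external table that wraps the Phoenix table.
    # Column names and the ZooKeeper settings here are placeholders.
    ddl = """
    CREATE EXTERNAL TABLE ajmide_dw.part_device (
      device_id STRING,
      part_dt STRING
    )
    STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler'
    TBLPROPERTIES (
      'phoenix.table.name' = 'PART_DEVICE',
      'phoenix.zookeeper.quorum' = 'zk-host',
      'phoenix.zookeeper.client.port' = '2181',
      'phoenix.zookeeper.znode.parent' = '/hbase-unsecure',
      'phoenix.rowkeys' = 'device_id'
    )
    """

    # Execute the DDL via the Beeline CLI (the JDBC URL is a placeholder).
    subprocess.run(["beeline", "-u", "jdbc:hive2://hive-host:10000", "-e", ddl],
                   check=True)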

I am working on HDP 3.0 (Phoenix 5.0, HBase 2.0, Hive 3.1.0, Spark 2.3.1) and am facing the issue described in the subject.

I tried to solve this problem but failed. I found some similar questions online, but the answers didn't work for me.

My submit command:

spark-submit --jars \
/usr/hdp/current/phoenix-client/lib/phoenix-hive-5.0.0.3.0.0.0-1634.jar,\
/usr/hdp/current/phoenix-client/lib/hadoop-mapreduce-client-core.jar,\
/usr/hdp/current/phoenix-client/lib/phoenix-core-5.0.0.3.0.0.0-1634.jar,\
/usr/hdp/current/phoenix-client/lib/phoenix-spark-5.0.0.3.0.0.0-1634.jar,\
/usr/hdp/current/hive-client/lib/hive-metastore-3.1.0.3.0.0.0-1634.jar,\
/usr/hdp/current/hive-client/lib/hive-common-3.1.0.3.0.0.0-1634.jar,\
/usr/hdp/current/hive-client/lib/hbase-client-2.0.0.3.0.0.0-1634.jar,\
/usr/hdp/current/hive-client/lib/hbase-mapreduce-2.0.0.3.0.0.0-1634.jar,\
/usr/hdp/current/hive-client/lib/hive-serde-3.1.0.3.0.0.0-1634.jar,\
/usr/hdp/current/hive-client/lib/hive-shims-3.1.0.3.0.0.0-1634.jar \
test3.py

Log attached; demo code below:

    from pyspark.sql import SparkSession

    if __name__ == '__main__':
        # Spark session with Hive support, so spark.sql() goes through
        # the Hive metastore.
        spark = SparkSession.builder \
            .appName("test") \
            .enableHiveSupport() \
            .getOrCreate()

        # Query the Hive external table that is mapped to Phoenix.
        df = spark.sql("select count(*) from ajmide_dw.part_device")
        df.show()

Similar Issues:

https://community.hortonworks.com/questions/140097/facing-issue-from-spark-sql.html

https://stackoverflow.com/questions/51501044/unable-to-access-hive-external-tables-from-spark-shell

Any comment or suggestion is appreciated!

Thanks,

Shi-Cheng, Ma
