Might sound silly, but are you using a Hive context? What errors do the Hive query results return?

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

As for the second part of your question: you are creating a temp view and then creating another table from that temp view. It doesn't look like you are reading the table from the Spark or Hive warehouse.

The following works fine for me, although I was using Spark Thrift to communicate with my directory of choice.

    from pyspark import SparkContext
    from pyspark.sql import SparkSession, Row, types
    from pyspark.sql.types import *
    from pyspark.sql import functions as f
    from decimal import Decimal
    from datetime import datetime

    # instantiate our SparkSession and context
    spark = SparkSession.builder.enableHiveSupport().getOrCreate()
    sc = spark.sparkContext

    # Generating customer ORC table files
    # load raw data as an RDD
    customer_data = sc.textFile("/data/tpch/customer.tbl")

    # split each line of the RDD on the pipe delimiter
    customer_split = customer_data.map(lambda l: l.split("|"))

    # map the split data to Rows; this is where we specify column names and types
    # default type is string (UTF-8)
    # there are issues with converting string to date and these issues have been addressed
    # in those tables with dates: see notes below
    # int() replaces the Python 2-only long() so the code also runs on Python 3
    customer_row = customer_split.map(
        lambda r: Row(
            custkey=int(r[0]),
            name=r[1],
            address=r[2],
            nationkey=int(r[3]),
            phone=r[4],
            acctbal=Decimal(r[5]),
            mktsegment=r[6],
            comment=r[7]
        ))

    # we can have Spark infer the schema, or apply a strict schema and state whether or not we allow null values
    # in this case we don't want null values for keys, and we want explicit data types to support the TPCH tables/data model
    # note: DecimalType() defaults to DecimalType(10, 0); set precision and scale explicitly if needed
    customer_schema = types.StructType([
        types.StructField('custkey', types.LongType(), False),
        types.StructField('name', types.StringType()),
        types.StructField('address', types.StringType()),
        types.StructField('nationkey', types.LongType(), False),
        types.StructField('phone', types.StringType()),
        types.StructField('acctbal', types.DecimalType()),
        types.StructField('mktsegment', types.StringType()),
        types.StructField('comment', types.StringType())])

    # we create a dataframe object by calling the createDataFrame method on our SparkSession
    # here we pass it two arguments: the RDD of rows and the schema
    customer_df = spark.createDataFrame(customer_row, customer_schema)

    # we can now write an ORC file from the dataframe object we created
    customer_df.write.orc("/data/tpch/customer.orc")

    # read that same file back, but into a separate dataframe object
    customer_df_orc = spark.read.orc("/data/tpch/customer.orc")

    # register the newly created dataframe as a temp view for QA purposes
    customer_df_orc.createOrReplaceTempView("customer")

    # use the SparkSession sql method to issue SQL statements against the temp view
    spark.sql("SELECT * FROM customer LIMIT 10").show()

From: "☼ R Nair (रविशंकर नायर)" <ravishankar.n...@gmail.com>
Date: Friday, February 9, 2018 at 7:03 AM
To: "user @spark/'user @spark'/spark users/user@spark" <user@spark.apache.org>
Subject: Re: Spark Dataframe and HIVE

An update (sorry, I missed this earlier): when I do

    passion_df.createOrReplaceTempView("sampleview")
    spark.sql("create table sampletable as select * from sampleview")

I can now see the table and can query it as well. So why does this work from Spark while the other method discussed below does not?

Thanks

On Fri, Feb 9, 2018 at 9:49 AM, ☼ R Nair (रविशंकर नायर) <ravishankar.n...@gmail.com> wrote:

All,

I have been on this issue for three days straight without getting any clue.

Environment: Spark 2.2.x, all configurations are correct. hive-site.xml is in Spark's conf.

1) Step 1: I created a data frame DF1 by reading a CSV file.
2) Did manipulations on DF1. The resulting frame is passion_df.
3) passion_df.write.format("orc").saveAsTable("sampledb.passion")
4) The metastore shows the Hive table; when I do "show tables" in Hive, I can see the table name.
5) I can't select in Hive, though I can select from Spark with spark.sql("select * from sampledb.passion").

What's going on here? Please help. Why am I not seeing the data from the Hive prompt? The "describe formatted" command on the table in Hive shows the data is in the default warehouse location (/user/hive/warehouse), since I set it.

I am not getting a definite answer anywhere. Many suggestions and answers are given on Stack Overflow et al., but nothing really works. So I am asking the experts here for some light on this. Thanks.

Best,
Ravion
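
For anyone who wants to try the row-mapping step from the reply at the top of this thread without a cluster, here is a minimal sketch in plain Python. It is not the code from the reply: the `Customer` namedtuple is a hypothetical stand-in for `pyspark.sql.Row`, `parse_customer` is a made-up helper name, and the sample line is invented data in the same pipe-delimited layout. It only illustrates the split-and-convert logic, and it uses `int()` so that it runs on Python 3.

```python
from collections import namedtuple
from decimal import Decimal

# Hypothetical stand-in for pyspark.sql.Row so the sketch runs without Spark.
Customer = namedtuple(
    "Customer",
    ["custkey", "name", "address", "nationkey",
     "phone", "acctbal", "mktsegment", "comment"],
)


def parse_customer(line):
    """Split one pipe-delimited line and convert the typed columns."""
    r = line.split("|")
    return Customer(
        custkey=int(r[0]),       # key column, integer
        name=r[1],
        address=r[2],
        nationkey=int(r[3]),     # key column, integer
        phone=r[4],
        acctbal=Decimal(r[5]),   # exact decimal, not float
        mktsegment=r[6],
        comment=r[7],
    )


# Invented sample line; the trailing "|" yields an extra empty field that is ignored.
sample = "1|Customer#000000001|17 Example St|15|25-989-741-2988|711.56|BUILDING|sample comment|"
row = parse_customer(sample)
print(row.custkey, row.acctbal, row.mktsegment)
```

In the Spark version the same function body would sit inside the `map` call on the RDD; keeping it as a named function makes the conversions easy to check on a single line before running the full job.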