[ 
https://issues.apache.org/jira/browse/SPARK-19713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15928734#comment-15928734
 ] 

Balaram R Gadiraju commented on SPARK-19713:
--------------------------------------------

The issue is not limited to Spark: once the folder is created and Spark 
exits with an error, we cannot create or drop the table even in Hive, 
because Hive needs to create the folder itself in order to create the table.

1. Hive cannot create the table, because the folder already exists.
2. Hive cannot drop the table, because Spark never updated the Hive 
metastore (there is no table in Hive to drop).

This leaves the folder locked until you run "hdfs dfs -rm -r 
/data/hive/databases/testdb.db/brokentable".
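
The cleanup described above can be sketched programmatically. This is a hypothetical example, not part of the report: it assumes a Spark 1.6-era HiveContext bound to sqlContext, the warehouse path from the report, and that the orphaned state is "folder exists in HDFS, table absent from the metastore".

```scala
// Hypothetical cleanup sketch: after a failed saveAsTable, the HDFS
// folder may exist while the Hive metastore has no matching table.
// Deleting the orphaned folder unblocks a later CREATE TABLE.
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val tablePath = new Path("/data/hive/databases/testdb.db/brokentable")
val fs = FileSystem.get(new Configuration())

// The table is orphaned if the folder exists but the metastore
// has no entry for it in the target database.
val tableInMetastore = sqlContext.tableNames("testdb").contains("brokentable")
if (fs.exists(tablePath) && !tableInMetastore) {
  fs.delete(tablePath, true) // recursive, same effect as `hdfs dfs -rm -r`
}
```

This mirrors the manual "hdfs dfs -rm -r" step, but only removes the folder when the metastore confirms no table actually owns it.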

Does everyone think this is not an issue?

> saveAsTable
> -----------
>
>                 Key: SPARK-19713
>                 URL: https://issues.apache.org/jira/browse/SPARK-19713
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.1
>            Reporter: Balaram R Gadiraju
>
> Hi,
> I just observed an issue with dataframe.saveAsTable("table") (in older 
> versions) and dataframe.write.saveAsTable("table") (in newer versions).
> When calling "df3.saveAsTable("brokentable")" from Scala code, Spark 
> creates a folder in HDFS but does not tell the Hive metastore that it 
> plans to create the table. So if anything goes wrong in between, the 
> folder still exists and Hive is not aware of its creation. This blocks 
> users from creating the table "brokentable", since the folder already 
> exists; the folder can be removed with "hadoop fs -rmr 
> /data/hive/databases/testdb.db/brokentable". Below is a workaround that 
> will let you continue development work.
> Current code:
> val df3 = sqlContext.sql("select * from testtable")
> df3.saveAsTable("brokentable")
> THE WORKAROUND:
> Registering the DataFrame as a temporary table and then using a SQL 
> command to load the data avoids the issue. Example:
> sqlContext.sql("select * from testtable").registerTempTable("df3")
> sqlContext.sql("CREATE TABLE brokentable AS SELECT * FROM df3")



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
