Github user dilipbiswal commented on the pull request:
https://github.com/apache/spark/pull/9588#issuecomment-157972900
@marmbrus
Hi Michael,
I have been studying the code and have a few initial questions.
Looking at the syntax of "create temporary table" for Hive:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/Drop/TruncateTable
Here is what is supported today, per my understanding:
- create temporary table
- create temporary table as select
- create temporary table like existing_table

In the first two cases, quite a few clauses are supported, such as "clustered by", "skewed by", "row format", "stored as", and "location". Per the doc, a temp table cannot be partitioned.
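For reference, here is how I read those three forms, written out as DDL strings in a small Scala sketch (the table and column names like src, id, name are made up; I am not claiming Spark accepts these statements today):

```scala
// The three Hive "CREATE TEMPORARY TABLE" forms from the wiki page above,
// written out as DDL strings. Table and column names are made up for illustration.

// Plain form: explicit schema plus storage clauses ("row format", "stored as", etc.).
val createTempTable =
  """CREATE TEMPORARY TABLE tmp_events (id INT, name STRING)
    |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    |STORED AS TEXTFILE""".stripMargin

// CTAS form: the schema comes from the SELECT; per the doc, no PARTITIONED BY here.
val createTempTableAsSelect =
  "CREATE TEMPORARY TABLE tmp_recent AS SELECT id, name FROM src WHERE id > 10"

// LIKE form: copies the definition of an existing table.
val createTempTableLike =
  "CREATE TEMPORARY TABLE tmp_copy LIKE src"
```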
Questions:
1. Can you please briefly explain the semantics of a "Spark SQL temporary table"? Is it the same as what we get when we call dataframe.registerTempTable()?
2. In Hive, after creating a temporary table, we can use INSERT ... SELECT to populate data. How would we do that when we map it to Spark SQL?
3. Michael, I have made some changes in HiveQl.scala:nodeToPlan() where I am intercepting the create temp call and would like to create the correct logical plan, and I need your input.
   - Is this the correct place to intercept?
   - Can you suggest what you think the logical plan should look like?
   - I tried to insert into a table registered as a temp table and got an exception: "org.apache.spark.sql.AnalysisException: Inserting into an RDD-based table is not allowed." (a small sketch that reproduces this is below)
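To make the last two points concrete, here is a minimal sketch of what I tried (Spark 1.5-style API; the app name, sample data, and table name are made up for illustration):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object TempTableSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("temp-table-sketch").setMaster("local[*]"))
    val sqlContext = new HiveContext(sc)
    import sqlContext.implicits._

    // registerTempTable registers the DataFrame's plan under a name in the catalog;
    // it is metadata only, no data is materialized or written anywhere.
    val df = sc.parallelize(Seq((1, "a"), (2, "b"))).toDF("id", "name")
    df.registerTempTable("tmp_events")

    // Querying the registered temp table works as expected.
    sqlContext.sql("SELECT * FROM tmp_events").show()

    // This is where I hit the exception quoted above:
    // org.apache.spark.sql.AnalysisException: Inserting into an RDD-based table is not allowed.
    sqlContext.sql("INSERT INTO TABLE tmp_events SELECT id + 10, name FROM tmp_events")

    sc.stop()
  }
}
```

So as far as I can tell, a table registered this way is query-only today, which is part of why I am asking what the logical plan for inserting into a Spark SQL temporary table should look like.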
Please let me know your thoughts on this.