[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82659075
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala ---
@@ -292,6 +291,21 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {
   }
 
   /**
+   * Drops the global temporary view with the given view name in the catalog.
+   * If the view has been cached/persisted before, it's also unpersisted.
+   *
+   * @param viewName the name of the view to be dropped.
+   * @group ddl_ops
+   * @since 2.1.0
+   */
+  override def dropGlobalTempView(viewName: String): Boolean = {
--- End diff --

I think we should just drop the "_global_temp." prefix (or whatever the configured value is) if the user specifies it.
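A minimal sketch of the name handling this comment suggests, in Python with hypothetical names (the default global-temp database name is assumed to be `global_temp`): strip the database qualifier before the lookup so qualified and unqualified names behave alike.

```python
GLOBAL_TEMP_DB = "global_temp"  # assumed value of the global-temp database config

def normalize_view_name(view_name: str, global_temp_db: str = GLOBAL_TEMP_DB) -> str:
    """Drop the 'global_temp.' qualifier if the caller passed a qualified name,
    so dropGlobalTempView('global_temp.v') and dropGlobalTempView('v') behave alike.
    Hypothetical helper for illustration, not Spark's actual implementation."""
    prefix = global_temp_db + "."
    if view_name.startswith(prefix):
        return view_name[len(prefix):]
    return view_name

print(normalize_view_name("global_temp.view1"))  # view1
print(normalize_view_name("view1"))              # view1
```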



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82658771
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2433,31 +2433,65 @@ class Dataset[T] private[sql](
   }
 
   /**
-   * Creates a temporary view using the given name. The lifetime of this
+   * Creates a local temporary view using the given name. The lifetime of this
    * temporary view is tied to the [[SparkSession]] that was used to create this Dataset.
    *
+   * Local temporary view is session-scoped. Its lifetime is the lifetime of the session that
+   * created it, i.e. it will be automatically dropped when the session terminates. It's not
+   * tied to any databases, i.e. we can't use `db1.view1` to reference a local temporary view.
+   *
    * @throws AnalysisException if the view name already exists
    *
    * @group basic
    * @since 2.0.0
    */
   @throws[AnalysisException]
   def createTempView(viewName: String): Unit = withPlan {
-    createViewCommand(viewName, replace = false)
+    createTempViewCommand(viewName, replace = false, global = false)
   }
 
   /**
-   * Creates a temporary view using the given name. The lifetime of this
+   * Creates a local temporary view using the given name. The lifetime of this
    * temporary view is tied to the [[SparkSession]] that was used to create this Dataset.
    *
    * @group basic
    * @since 2.0.0
    */
   def createOrReplaceTempView(viewName: String): Unit = withPlan {
-    createViewCommand(viewName, replace = true)
+    createTempViewCommand(viewName, replace = true, global = false)
   }
 
-  private def createViewCommand(viewName: String, replace: Boolean): CreateViewCommand = {
+  /**
+   * Creates a global temporary view using the given name. The lifetime of this
+   * temporary view is tied to this Spark application.
+   *
+   * Global temporary view is cross-session. Its lifetime is the lifetime of the Spark application,
+   * i.e. it will be automatically dropped when the application terminates. It's tied to a system
+   * preserved database `_global_temp`, and we must use the qualified name to refer a global temp
+   * view, e.g. `SELECT * FROM _global_temp.view1`.
+   *
+   * @throws TempTableAlreadyExistsException if the view name already exists
+   *
+   * @group basic
+   * @since 2.1.0
+   */
+  @throws[AnalysisException]
+  def createGlobalTempView(viewName: String): Unit = withPlan {
+    createTempViewCommand(viewName, replace = false, global = true)
+  }
+
+  private def createTempViewCommand(
+      viewName: String,
+      replace: Boolean,
+      global: Boolean): CreateViewCommand = {
+    val viewType = if (global) {
--- End diff --

shorten this to
```
val viewType = if (global) GlobalTempView else LocalTempView
```





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82658710
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2433,31 +2433,65 @@ class Dataset[T] private[sql](
   }
 
   /**
-   * Creates a temporary view using the given name. The lifetime of this
+   * Creates a local temporary view using the given name. The lifetime of this
    * temporary view is tied to the [[SparkSession]] that was used to create this Dataset.
    *
+   * Local temporary view is session-scoped. Its lifetime is the lifetime of the session that
+   * created it, i.e. it will be automatically dropped when the session terminates. It's not
+   * tied to any databases, i.e. we can't use `db1.view1` to reference a local temporary view.
+   *
    * @throws AnalysisException if the view name already exists
    *
    * @group basic
    * @since 2.0.0
    */
   @throws[AnalysisException]
   def createTempView(viewName: String): Unit = withPlan {
-    createViewCommand(viewName, replace = false)
+    createTempViewCommand(viewName, replace = false, global = false)
   }
 
   /**
-   * Creates a temporary view using the given name. The lifetime of this
+   * Creates a local temporary view using the given name. The lifetime of this
    * temporary view is tied to the [[SparkSession]] that was used to create this Dataset.
    *
    * @group basic
    * @since 2.0.0
    */
   def createOrReplaceTempView(viewName: String): Unit = withPlan {
-    createViewCommand(viewName, replace = true)
+    createTempViewCommand(viewName, replace = true, global = false)
   }
 
-  private def createViewCommand(viewName: String, replace: Boolean): CreateViewCommand = {
+  /**
+   * Creates a global temporary view using the given name. The lifetime of this
+   * temporary view is tied to this Spark application.
+   *
+   * Global temporary view is cross-session. Its lifetime is the lifetime of the Spark application,
+   * i.e. it will be automatically dropped when the application terminates. It's tied to a system
+   * preserved database `_global_temp`, and we must use the qualified name to refer a global temp
+   * view, e.g. `SELECT * FROM _global_temp.view1`.
+   *
+   * @throws TempTableAlreadyExistsException if the view name already exists
--- End diff --

change this to AnalysisException





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-10 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82658617
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/catalog/Catalog.scala ---
@@ -262,15 +262,33 @@ abstract class Catalog {
       options: Map[String, String]): DataFrame
 
   /**
-   * Drops the temporary view with the given view name in the catalog.
+   * Drops the local temporary view with the given view name in the catalog.
    * If the view has been cached before, then it will also be uncached.
    *
+   * Local temporary view is session-scoped. Its lifetime is the lifetime of the session that
+   * created it, i.e. it will be automatically dropped when the session terminates. It's not
+   * tied to any databases, i.e. we can't use `db1.view1` to reference a local temporary view.
+   *
    * @param viewName the name of the view to be dropped.
    * @since 2.0.0
    */
   def dropTempView(viewName: String): Unit
 
   /**
+   * Drops the global temporary view with the given view name in the catalog.
+   * If the view has been cached before, then it will also be uncached.
+   *
+   * Global temporary view is cross-session. Its lifetime is the lifetime of the Spark application,
+   * i.e. it will be automatically dropped when the application terminates. It's tied to a system
+   * preserved database `_global_temp`, and we must use the qualified name to refer a global temp
+   * view, e.g. `SELECT * FROM _global_temp.view1`.
+   *
+   * @param viewName the name of the view to be dropped.
+   * @since 2.1.0
+   */
+  def dropGlobalTempView(viewName: String): Boolean
--- End diff --

need to explain the return type





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/14897





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-09 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82538273
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/GlobalTempViewSuite.scala ---
@@ -0,0 +1,168 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.{AnalysisException, QueryTest, Row}
+import org.apache.spark.sql.catalog.Table
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types.StructType
+
+class GlobalTempViewSuite extends QueryTest with SharedSQLContext {
+  import testImplicits._
+
+  override protected def beforeAll(): Unit = {
+    super.beforeAll()
+    globalTempDB = spark.sharedState.globalTempViewManager.database
+  }
+
+  private var globalTempDB: String = _
+
+  test("basic semantic") {
+    sql("CREATE GLOBAL TEMP VIEW src AS SELECT 1, 'a'")
+
+    // If there is no database in the table name, we should try local temp views first; if not
+    // found, try tables/views in the current database, which is "default" in this case. So we
+    // expect NoSuchTableException here.
+    intercept[NoSuchTableException](spark.table("src"))
+
+    // Use the qualified name to refer to the global temp view explicitly.
+    checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+    // A table name without a database will never refer to a global temp view.
+    intercept[NoSuchTableException](sql("DROP VIEW src"))
+
+    sql(s"DROP VIEW $globalTempDB.src")
+    // The global temp view should be dropped successfully.
+    intercept[NoSuchTableException](spark.table(s"$globalTempDB.src"))
+
+    // We can also use the Dataset API to create a global temp view.
+    Seq(1 -> "a").toDF("i", "j").createGlobalTempView("src")
+    checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+    // Use the qualified name to rename a global temp view.
+    sql(s"ALTER VIEW $globalTempDB.src RENAME TO src2")
--- End diff --

I see. Thanks for the explanation!





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-09 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82536096
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala ---
@@ -94,6 +69,47 @@ private[sql] class SharedState(val sparkContext: SparkContext) extends Logging {
   }
 
   /**
+   * Class for caching query results reused in future executions.
+   */
+  val cacheManager: CacheManager = new CacheManager
+
+  /**
+   * A listener for SQL-specific [[org.apache.spark.scheduler.SparkListenerEvent]]s.
+   */
+  val listener: SQLListener = createListenerAndUI(sparkContext)
+
+  /**
+   * A catalog that interacts with external systems.
+   */
+  val externalCatalog: ExternalCatalog =
+    SharedState.reflect[ExternalCatalog, SparkConf, Configuration](
+      SharedState.externalCatalogClassName(sparkContext.conf),
+      sparkContext.conf,
+      sparkContext.hadoopConfiguration)
+
+  /**
+   * A manager for global temporary views.
+   */
+  val globalTempViewManager = {
+    // The system-preserved database should not exist in the metastore. However it's hard to
+    // guarantee that for every session, because case-sensitivity differs. Here we always
+    // lowercase it to make our life easier.
+    val globalTempDB = sparkContext.conf.get(GLOBAL_TEMP_DATABASE).toLowerCase
+    if (externalCatalog.databaseExists(globalTempDB)) {
+      throw new SparkException(
+        s"$globalTempDB is a system preserved database, please rename your existing database " +
+          "to resolve the name conflict and launch your Spark application again.")
--- End diff --

Oh no. I think it is fine to hide that conf. But if the user hits this exception, it seems better to let them know there is another workaround (renaming the existing db may not be easy).





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-09 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82536038
  
--- Diff: python/pyspark/sql/catalog.py ---
@@ -181,6 +181,22 @@ def dropTempView(self, viewName):
         """
         self._jcatalog.dropTempView(viewName)
 
+    @since(2.1)
+    def dropGlobalTempView(self, viewName):
+        """Drops the global temporary view with the given view name in the catalog.
+        If the view has been cached before, then it will also be uncached.
+
+        >>> spark.createDataFrame([(1, 1)]).createGlobalTempView("my_table")
+        >>> spark.table("global_temp.my_table").collect()
+        [Row(_1=1, _2=1)]
+        >>> spark.catalog.dropGlobalTempView("my_table")
+        >>> spark.table("global_temp.my_table")  # doctest: +IGNORE_EXCEPTION_DETAIL
+        Traceback (most recent call last):
+        ...
+        AnalysisException: ...
+        """
--- End diff --

It will be pretty confusing, because this test case will appear in the Python docs (and users will see an example that throws exceptions).





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82517570
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/GlobalTempViewSuite.scala ---
@@ -0,0 +1,168 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.{AnalysisException, QueryTest, Row}
+import org.apache.spark.sql.catalog.Table
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types.StructType
+
+class GlobalTempViewSuite extends QueryTest with SharedSQLContext {
+  import testImplicits._
+
+  override protected def beforeAll(): Unit = {
+    super.beforeAll()
+    globalTempDB = spark.sharedState.globalTempViewManager.database
+  }
+
+  private var globalTempDB: String = _
+
+  test("basic semantic") {
+    sql("CREATE GLOBAL TEMP VIEW src AS SELECT 1, 'a'")
+
+    // If there is no database in the table name, we should try local temp views first; if not
+    // found, try tables/views in the current database, which is "default" in this case. So we
+    // expect NoSuchTableException here.
+    intercept[NoSuchTableException](spark.table("src"))
+
+    // Use the qualified name to refer to the global temp view explicitly.
+    checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+    // A table name without a database will never refer to a global temp view.
+    intercept[NoSuchTableException](sql("DROP VIEW src"))
+
+    sql(s"DROP VIEW $globalTempDB.src")
+    // The global temp view should be dropped successfully.
+    intercept[NoSuchTableException](spark.table(s"$globalTempDB.src"))
+
+    // We can also use the Dataset API to create a global temp view.
+    Seq(1 -> "a").toDF("i", "j").createGlobalTempView("src")
+    checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+    // Use the qualified name to rename a global temp view.
+    sql(s"ALTER VIEW $globalTempDB.src RENAME TO src2")
--- End diff --

For `ALTER VIEW/TABLE ... RENAME TO ...`, we forbid users from specifying a database for the destination table name, because:

1. If the table/view belongs to a database, it doesn't make sense to specify a database for the destination name, because we can't change the database of a table/view via `RENAME TO`.
2. If the table/view has no database (a local temp view), then we can't specify a database for the destination name either.

Global temp views also follow these rules, so we should not allow `ALTER VIEW $globalTempDB.src RENAME TO $globalTempDB.src2`.
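The rule described above can be sketched as a small validation step (Python, hypothetical helper names; not Spark's actual parser code): a rename destination is rejected as soon as it carries a database qualifier.

```python
def validate_rename_destination(dest: str) -> None:
    """Reject a database-qualified destination name in ALTER ... RENAME TO.
    Illustrative sketch of the rule in the comment above; a rename can never
    move a table/view to another database, so the qualifier is meaningless."""
    if "." in dest:
        raise ValueError(
            f"RENAME TO must not specify a database: got '{dest}'")

validate_rename_destination("src2")  # accepted: unqualified destination
try:
    validate_rename_destination("global_temp.src2")  # rejected: qualified
except ValueError as e:
    print(e)
```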





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82517462
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala ---
@@ -94,6 +69,47 @@ private[sql] class SharedState(val sparkContext: SparkContext) extends Logging {
   }
 
   /**
+   * Class for caching query results reused in future executions.
+   */
+  val cacheManager: CacheManager = new CacheManager
+
+  /**
+   * A listener for SQL-specific [[org.apache.spark.scheduler.SparkListenerEvent]]s.
+   */
+  val listener: SQLListener = createListenerAndUI(sparkContext)
+
+  /**
+   * A catalog that interacts with external systems.
+   */
+  val externalCatalog: ExternalCatalog =
+    SharedState.reflect[ExternalCatalog, SparkConf, Configuration](
+      SharedState.externalCatalogClassName(sparkContext.conf),
+      sparkContext.conf,
+      sparkContext.hadoopConfiguration)
+
+  /**
+   * A manager for global temporary views.
+   */
+  val globalTempViewManager = {
+    // The system-preserved database should not exist in the metastore. However it's hard to
+    // guarantee that for every session, because case-sensitivity differs. Here we always
+    // lowercase it to make our life easier.
+    val globalTempDB = sparkContext.conf.get(GLOBAL_TEMP_DATABASE).toLowerCase
+    if (externalCatalog.databaseExists(globalTempDB)) {
+      throw new SparkException(
+        s"$globalTempDB is a system preserved database, please rename your existing database " +
+          "to resolve the name conflict and launch your Spark application again.")
--- End diff --

In the offline discussion, we decided to hide this internal config from users. Did I miss something?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82517434
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala ---
@@ -37,39 +37,14 @@ import org.apache.spark.util.{MutableURLClassLoader, Utils}
  */
 private[sql] class SharedState(val sparkContext: SparkContext) extends Logging {
 
-  /**
-   * Class for caching query results reused in future executions.
-   */
-  val cacheManager: CacheManager = new CacheManager
-
-  /**
-   * A listener for SQL-specific [[org.apache.spark.scheduler.SparkListenerEvent]]s.
-   */
-  val listener: SQLListener = createListenerAndUI(sparkContext)
-
+  // Load hive-site.xml into hadoopConf and determine the warehouse path we want to use, based on
+  // the config from both hive and Spark SQL. Finally set the warehouse config value to sparkConf.
   {
     val configFile = Utils.getContextOrSparkClassLoader.getResource("hive-site.xml")
     if (configFile != null) {
       sparkContext.hadoopConfiguration.addResource(configFile)
     }
-  }
-
-  /**
-   * A catalog that interacts with external systems.
-   */
-  lazy val externalCatalog: ExternalCatalog =
-    SharedState.reflect[ExternalCatalog, SparkConf, Configuration](
-      SharedState.externalCatalogClassName(sparkContext.conf),
-      sparkContext.conf,
-      sparkContext.hadoopConfiguration)
-
-  /**
-   * A classloader used to load all user-added jar.
-   */
-  val jarClassLoader = new NonClosableMutableURLClassLoader(
-    org.apache.spark.util.Utils.getContextOrSparkClassLoader)
 
-  {
     // Set the Hive metastore warehouse path to the one we use
     val tempConf = new SQLConf
--- End diff --

I double checked: this block only uses `sparkContext`, which is a constructor parameter.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82517346
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -380,6 +380,7 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder {
       tableIdent = visitTableIdentifier(ctx.tableIdentifier()),
       userSpecifiedSchema = Option(ctx.colTypeList()).map(createStructType),
       replace = ctx.REPLACE != null,
+      global = ctx.GLOBAL != null,
--- End diff --

already added: https://github.com/apache/spark/pull/14897/files#diff-03fd8e826da7af5c83b1d4e2282e8030R92





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82517333
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala ---
@@ -125,6 +124,9 @@ class QueryExecution(val sparkSession: SparkSession, val logical: LogicalPlan) {
           .mkString("\t")
       }
     }
+    // SHOW TABLES in Hive only outputs table names, while ours outputs database, table name, isTemp.
+    case command: ExecutedCommandExec if command.cmd.isInstanceOf[ShowTablesCommand] =>
+      command.executeCollect().map(_.getString(1))
--- End diff --

Because we changed the output of `ShowTablesCommand`: the `table` column is now the second column.
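The column-shape change can be sketched in Python with made-up rows standing in for the new `ShowTablesCommand` output (an assumption for illustration): Hive-compatible `SHOW TABLES` output keeps only the table names, which must now be read from index 1.

```python
# Hypothetical rows mimicking the new ShowTablesCommand output:
# (database, tableName, isTemporary)
rows = [
    ("default", "people", False),
    ("global_temp", "src", True),
]

# Hive's SHOW TABLES prints only table names, so we project the
# second column (index 1) of each row, as in `_.getString(1)` above.
table_names = [row[1] for row in rows]
print(table_names)  # ['people', 'src']
```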





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82517028
  
--- Diff: python/pyspark/sql/catalog.py ---
@@ -181,6 +181,22 @@ def dropTempView(self, viewName):
         """
         self._jcatalog.dropTempView(viewName)
 
+    @since(2.1)
+    def dropGlobalTempView(self, viewName):
+        """Drops the global temporary view with the given view name in the catalog.
+        If the view has been cached before, then it will also be uncached.
+
+        >>> spark.createDataFrame([(1, 1)]).createGlobalTempView("my_table")
+        >>> spark.table("global_temp.my_table").collect()
+        [Row(_1=1, _2=1)]
+        >>> spark.catalog.dropGlobalTempView("my_table")
+        >>> spark.table("global_temp.my_table")  # doctest: +IGNORE_EXCEPTION_DETAIL
+        Traceback (most recent call last):
+        ...
+        AnalysisException: ...
+        """
--- End diff --

I followed the method doc in `dropTempView`, is it bad?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-07 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82492217
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -453,7 +534,11 @@ class SessionCatalog(
       val db = formatDatabaseName(name.database.getOrElse(currentDb))
       val table = formatTableName(name.table)
       val relationAlias = alias.getOrElse(table)
-      if (name.database.isDefined || !tempTables.contains(table)) {
+      if (db == globalTempViewManager.database) {
+        globalTempViewManager.get(table).map { viewDef =>
+          SubqueryAlias(relationAlias, viewDef, Some(name))
+        }.getOrElse(throw new NoSuchTableException(db, table))
--- End diff --

In the `else` branch, we are just doing `SubqueryAlias(relationAlias, tempTables(table), Option(name))`. Later we should also make this branch and that branch consistent (by throwing NoSuchTableException).
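The resolution order discussed in this thread can be sketched in Python (hypothetical names, not Spark's API): a name qualified with the global-temp database hits the global-temp manager; an unqualified name tries local temp views first, then the current database; both misses raise the same error, matching the consistency suggestion above.

```python
GLOBAL_TEMP_DB = "global_temp"  # assumed global-temp database name

def lookup_relation(name, local_temp, global_temp, catalog_tables, current_db="default"):
    """Sketch of the lookup order: global-temp qualified names go to the
    global temp manager; unqualified names try local temp views, then the
    current database. All misses raise the same LookupError."""
    db, _, table = name.rpartition(".")
    if db == GLOBAL_TEMP_DB:
        if table in global_temp:
            return global_temp[table]
        raise LookupError(f"Table or view not found: {name}")
    if not db and table in local_temp:
        return local_temp[table]
    key = (db or current_db, table)
    if key in catalog_tables:
        return catalog_tables[key]
    raise LookupError(f"Table or view not found: {name}")
```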





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-07 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493489
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala ---
@@ -37,39 +37,14 @@ import org.apache.spark.util.{MutableURLClassLoader, 
Utils}
  */
 private[sql] class SharedState(val sparkContext: SparkContext) extends 
Logging {
 
-  /**
-   * Class for caching query results reused in future executions.
-   */
-  val cacheManager: CacheManager = new CacheManager
-
-  /**
-   * A listener for SQL-specific [[org.apache.spark.scheduler.SparkListenerEvent]]s.
-   */
-  val listener: SQLListener = createListenerAndUI(sparkContext)
-
+  // Load hive-site.xml into hadoopConf and determine the warehouse path we want to use,
+  // based on the config from both Hive and Spark SQL. Finally, set the warehouse config
+  // value to sparkConf.
   {
 val configFile = Utils.getContextOrSparkClassLoader.getResource("hive-site.xml")
 if (configFile != null) {
   sparkContext.hadoopConfiguration.addResource(configFile)
 }
-  }
-
-  /**
-   * A catalog that interacts with external systems.
-   */
-  lazy val externalCatalog: ExternalCatalog =
-SharedState.reflect[ExternalCatalog, SparkConf, Configuration](
-  SharedState.externalCatalogClassName(sparkContext.conf),
-  sparkContext.conf,
-  sparkContext.hadoopConfiguration)
-
-  /**
-   * A classloader used to load all user-added jar.
-   */
-  val jarClassLoader = new NonClosableMutableURLClassLoader(
-org.apache.spark.util.Utils.getContextOrSparkClassLoader)
 
-  {
 // Set the Hive metastore warehouse path to the one we use
 val tempConf = new SQLConf
--- End diff --

If this block does not use any `val` defined in this class, it is fine to 
move it. 





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-07 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493266
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2433,31 +2433,65 @@ class Dataset[T] private[sql](
   }
 
   /**
-   * Creates a temporary view using the given name. The lifetime of this
+   * Creates a local temporary view using the given name. The lifetime of this
* temporary view is tied to the [[SparkSession]] that was used to create this Dataset.
*
+   * Local temporary view is session-scoped. Its lifetime is the lifetime of the session that
+   * created it, i.e. it will be automatically dropped when the session terminates. It's not
+   * tied to any databases, i.e. we can't use `db1.view1` to reference a local temporary view.
+   *
* @throws AnalysisException if the view name already exists
*
* @group basic
* @since 2.0.0
*/
   @throws[AnalysisException]
   def createTempView(viewName: String): Unit = withPlan {
-createViewCommand(viewName, replace = false)
+createTempViewCommand(viewName, replace = false, global = false)
   }
 
+
+
   /**
-   * Creates a temporary view using the given name. The lifetime of this
+   * Creates a local temporary view using the given name. The lifetime of this
* temporary view is tied to the [[SparkSession]] that was used to create this Dataset.
*
* @group basic
* @since 2.0.0
*/
   def createOrReplaceTempView(viewName: String): Unit = withPlan {
-createViewCommand(viewName, replace = true)
+createTempViewCommand(viewName, replace = true, global = false)
   }
 
-  private def createViewCommand(viewName: String, replace: Boolean): CreateViewCommand = {
+  /**
+   * Creates a global temporary view using the given name. The lifetime of this
+   * temporary view is tied to this Spark application.
+   *
+   * Global temporary view is cross-session. Its lifetime is the lifetime of the Spark application,
+   * i.e. it will be automatically dropped when the application terminates. It's tied to a system
+   * preserved database `_global_temp`, and we must use the qualified name to refer a global temp
+   * view, e.g. `SELECT * FROM _global_temp.view1`.
+   *
+   * @throws TempTableAlreadyExistsException if the view name already exists
+   *
+   * @group basic
+   * @since 2.1.0
+   */
+  @throws[AnalysisException]
+  def createGlobalTempView(viewName: String): Unit = withPlan {
+createTempViewCommand(viewName, replace = false, global = true)
--- End diff --

What is the behavior of `createTempView("a.b")`?
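The question hinges on how a dotted name like `"a.b"` is interpreted. One plausible handling, purely illustrative and not Spark's actual behavior, parses the name and rejects database-qualified names for temp views outright (all names below are hypothetical):

```scala
// Purely illustrative sketch of the ambiguity behind the question above:
// should "a.b" be treated as database "a" plus view "b", or rejected?
// One possible (hypothetical) validation rejects qualified temp view names.
object ViewNameCheck {
  case class ParsedName(database: Option[String], table: String)

  def parse(viewName: String): ParsedName = viewName.split('.') match {
    case Array(t)     => ParsedName(None, t)
    case Array(db, t) => ParsedName(Some(db), t)
    case _            => throw new IllegalArgumentException(s"Malformed name: $viewName")
  }

  // Temp views are not tied to a database, so a qualified name is rejected.
  def requireUnqualified(viewName: String): String = {
    val parsed = parse(viewName)
    parsed.database.foreach { db =>
      throw new IllegalArgumentException(
        s"Temp view name cannot be qualified with a database ('$db' in '$viewName')")
    }
    parsed.table
  }
}
```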





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-07 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493366
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -183,17 +183,19 @@ case class DropTableCommand(
 
   override def run(sparkSession: SparkSession): Seq[Row] = {
 val catalog = sparkSession.sessionState.catalog
-// If the command DROP VIEW is to drop a table or DROP TABLE is to drop a view
-// issue an exception.
-catalog.getTableMetadataOption(tableName).map(_.tableType match {
-  case CatalogTableType.VIEW if !isView =>
-throw new AnalysisException(
-  "Cannot drop a view with DROP TABLE. Please use DROP VIEW instead")
-  case o if o != CatalogTableType.VIEW && isView =>
-throw new AnalysisException(
-  s"Cannot drop a table with DROP VIEW. Please use DROP TABLE instead")
-  case _ =>
-})
+if (tableName.database.forall(catalog.databaseExists) && catalog.tableExists(tableName)) {
--- End diff --

What will happen if the db is global_temp?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-07 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82279784
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/GlobalTempViewManager.scala ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.catalog
+
+import javax.annotation.concurrent.GuardedBy
+
+import scala.collection.mutable
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.analysis.TempTableAlreadyExistsException
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.catalyst.util.StringUtils
+
+
+/**
+ * A thread-safe manager for global temporary views, providing atomic operations to manage them,
+ * e.g. create, update, remove, etc.
+ *
+ * Note that the view name is always case-sensitive here; callers are responsible for formatting
+ * the view name w.r.t. the case-sensitivity config.
+ *
+ * @param database The system preserved virtual database that keeps all the global temporary views.
+ */
+class GlobalTempViewManager(val database: String) {
+
+  /** List of view definitions, mapping from view name to logical plan. */
+  @GuardedBy("this")
+  private val viewDefinitions = new mutable.HashMap[String, LogicalPlan]
+
+  /**
+   * Returns the global view definition which matches the given name, or None if not found.
+   */
+  def get(name: String): Option[LogicalPlan] = synchronized {
+viewDefinitions.get(name)
+  }
+
+  /**
+   * Creates a global temp view, or issues an exception if the view already exists and
+   * `overrideIfExists` is false.
+   */
+  def create(
+  name: String,
+  viewDefinition: LogicalPlan,
+  overrideIfExists: Boolean): Unit = synchronized {
+if (!overrideIfExists && viewDefinitions.contains(name)) {
+  throw new TempTableAlreadyExistsException(name)
+}
+viewDefinitions.put(name, viewDefinition)
+  }
+
+  /**
+   * Updates the global temp view if it exists, returns true if updated, false otherwise.
+   */
+  def update(
+  name: String,
+  viewDefinition: LogicalPlan): Boolean = synchronized {
+if (viewDefinitions.contains(name)) {
+  viewDefinitions.put(name, viewDefinition)
+  true
+} else {
+  false
+}
+  }
+
+  /**
+   * Removes the global temp view if it exists, returns true if removed, false otherwise.
+   */
+  def remove(name: String): Boolean = synchronized {
+viewDefinitions.remove(name).isDefined
+  }
+
+  /**
+   * Renames the global temp view if the source view exists and the destination view does not,
+   * or issues an exception if the source view exists but the destination view already exists.
+   * Returns true if renamed, false otherwise.
+   */
+  def rename(oldName: String, newName: String): Boolean = synchronized {
+if (viewDefinitions.contains(oldName)) {
+  if (viewDefinitions.contains(newName)) {
+throw new AnalysisException(
+  s"rename temporary view from '$oldName' to '$newName': destination view already exists")
+  }
+
+  val viewDefinition = viewDefinitions(oldName)
+  viewDefinitions.remove(oldName)
+  viewDefinitions.put(newName, viewDefinition)
+  true
+} else {
+  false
+}
--- End diff --

What is the reason that a failed rename has two different behaviors (when the 
source does not exist we return false, but when the destination already exists 
we throw an error)?
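The asymmetry being questioned can be reproduced with a minimal, self-contained model of the `rename` semantics shown in the diff (a hypothetical `ToyViewRegistry`, illustrative only):

```scala
import scala.collection.mutable

// Minimal model of the rename semantics questioned above: a missing source
// is a soft failure (returns false), while an existing destination is a hard
// failure (throws). Illustrative only; not Spark's GlobalTempViewManager.
class ToyViewRegistry {
  private val views = mutable.HashMap.empty[String, String]

  def put(name: String, plan: String): Unit = views(name) = plan

  def rename(oldName: String, newName: String): Boolean =
    if (!views.contains(oldName)) {
      false                                   // source missing: soft failure
    } else if (views.contains(newName)) {
      throw new IllegalStateException(        // destination taken: hard failure
        s"destination view '$newName' already exists")
    } else {
      views(newName) = views.remove(oldName).get
      true
    }
}
```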



[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-07 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82299965
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -188,6 +196,11 @@ class SessionCatalog(
 
   def setCurrentDatabase(db: String): Unit = {
 val dbName = formatDatabaseName(db)
+if (dbName == globalTempViewManager.database) {
+  throw new AnalysisException(
+s"${globalTempViewManager.database} is a system preserved database, " +
+  "you cannot use it as current database.")
--- End diff --

It seems useful to let users know how to access temp views under this 
namespace.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-07 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493491
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala ---
@@ -37,39 +37,14 @@ import org.apache.spark.util.{MutableURLClassLoader, Utils}
  */
 private[sql] class SharedState(val sparkContext: SparkContext) extends Logging {
 
-  /**
-   * Class for caching query results reused in future executions.
-   */
-  val cacheManager: CacheManager = new CacheManager
-
-  /**
-   * A listener for SQL-specific [[org.apache.spark.scheduler.SparkListenerEvent]]s.
-   */
-  val listener: SQLListener = createListenerAndUI(sparkContext)
-
+  // Load hive-site.xml into hadoopConf and determine the warehouse path we want to use,
+  // based on the config from both Hive and Spark SQL. Finally, set the warehouse config
+  // value to sparkConf.
   {
 val configFile = Utils.getContextOrSparkClassLoader.getResource("hive-site.xml")
 if (configFile != null) {
   sparkContext.hadoopConfiguration.addResource(configFile)
 }
-  }
-
-  /**
-   * A catalog that interacts with external systems.
-   */
-  lazy val externalCatalog: ExternalCatalog =
-SharedState.reflect[ExternalCatalog, SparkConf, Configuration](
-  SharedState.externalCatalogClassName(sparkContext.conf),
-  sparkContext.conf,
-  sparkContext.hadoopConfiguration)
-
-  /**
-   * A classloader used to load all user-added jar.
-   */
-  val jarClassLoader = new NonClosableMutableURLClassLoader(
-org.apache.spark.util.Utils.getContextOrSparkClassLoader)
 
-  {
 // Set the Hive metastore warehouse path to the one we use
 val tempConf = new SQLConf
--- End diff --

Can you double check?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-07 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493318
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
@@ -380,6 +380,7 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder {
   tableIdent = visitTableIdentifier(ctx.tableIdentifier()),
   userSpecifiedSchema = 
Option(ctx.colTypeList()).map(createStructType),
   replace = ctx.REPLACE != null,
+  global = ctx.GLOBAL != null,
--- End diff --

Can we add a test for this?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-07 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493591
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala ---
@@ -94,6 +69,47 @@ private[sql] class SharedState(val sparkContext: SparkContext) extends Logging {
   }
 
   /**
+   * Class for caching query results reused in future executions.
+   */
+  val cacheManager: CacheManager = new CacheManager
+
+  /**
+   * A listener for SQL-specific [[org.apache.spark.scheduler.SparkListenerEvent]]s.
+   */
+  val listener: SQLListener = createListenerAndUI(sparkContext)
+
+  /**
+   * A catalog that interacts with external systems.
+   */
+  val externalCatalog: ExternalCatalog =
+SharedState.reflect[ExternalCatalog, SparkConf, Configuration](
+  SharedState.externalCatalogClassName(sparkContext.conf),
+  sparkContext.conf,
+  sparkContext.hadoopConfiguration)
+
+  /**
+   * A manager for global temporary views.
+   */
+  val globalTempViewManager = {
+// The system preserved database should not exist in the metastore. However it's hard
+// to guarantee it for every session, because case-sensitivity differs. Here we always
+// lowercase it to make our life easier.
+val globalTempDB = sparkContext.conf.get(GLOBAL_TEMP_DATABASE).toLowerCase
+if (externalCatalog.databaseExists(globalTempDB)) {
+  throw new SparkException(
+s"$globalTempDB is a system preserved database, please rename your existing database " +
+  "to resolve the name conflict and launch your Spark application again.")
--- End diff --

Should we also mention this conf?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-07 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493274
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala ---
@@ -125,6 +124,9 @@ class QueryExecution(val sparkSession: SparkSession, val logical: LogicalPlan) {
   .mkString("\t")
 }
   }
+// SHOW TABLES in Hive only outputs table names, while ours outputs database, table name, isTemp.
+case command: ExecutedCommandExec if command.cmd.isInstanceOf[ShowTablesCommand] =>
+  command.executeCollect().map(_.getString(1))
--- End diff --

Why do we only hit this in this PR?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-07 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493549
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/GlobalTempViewSuite.scala ---
@@ -0,0 +1,168 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.{AnalysisException, QueryTest, Row}
+import org.apache.spark.sql.catalog.Table
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types.StructType
+
+class GlobalTempViewSuite extends QueryTest with SharedSQLContext {
+  import testImplicits._
+
+  override protected def beforeAll(): Unit = {
+super.beforeAll()
+globalTempDB = spark.sharedState.globalTempViewManager.database
+  }
+
+  private var globalTempDB: String = _
+
+  test("basic semantic") {
+sql("CREATE GLOBAL TEMP VIEW src AS SELECT 1, 'a'")
+
+// If there is no database in the table name, we should try the local temp view first; if
+// not found, try the table/view in the current database, which is "default" in this case.
+// So we expect NoSuchTableException here.
+intercept[NoSuchTableException](spark.table("src"))
+
+// Use qualified name to refer to the global temp view explicitly.
+checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+// Table name without database will never refer to a global temp view.
+intercept[NoSuchTableException](sql("DROP VIEW src"))
+
+sql(s"DROP VIEW $globalTempDB.src")
+// The global temp view should be dropped successfully.
+intercept[NoSuchTableException](spark.table(s"$globalTempDB.src"))
+
+// We can also use Dataset API to create global temp view
+Seq(1 -> "a").toDF("i", "j").createGlobalTempView("src")
+checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+// Use qualified name to rename a global temp view.
+sql(s"ALTER VIEW $globalTempDB.src RENAME TO src2")
--- End diff --

Should we do `sql(s"ALTER VIEW $globalTempDB.src RENAME TO 
$globalTempDB.src2")`, since we always require operating on global temp views 
with qualified identifiers?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-07 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82272178
  
--- Diff: python/pyspark/sql/catalog.py ---
@@ -181,6 +181,22 @@ def dropTempView(self, viewName):
 """
 self._jcatalog.dropTempView(viewName)
 
+@since(2.1)
+def dropGlobalTempView(self, viewName):
+"""Drops the global temporary view with the given view name in the 
catalog.
+If the view has been cached before, then it will also be uncached.
+
+>>> spark.createDataFrame([(1, 
1)]).createGlobalTempView("my_table")
+>>> spark.table("global_temp.my_table").collect()
+[Row(_1=1, _2=1)]
+>>> spark.catalog.dropGlobalTempView("my_table")
+>>> spark.table("global_temp.my_table") # doctest: 
+IGNORE_EXCEPTION_DETAIL
+Traceback (most recent call last):
+...
+AnalysisException: ...
+"""
--- End diff --

I think this failure case will end up in the Python doc. Can we move this test 
to the `tests.py` file (it is fine to do it in a follow-up PR)?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-07 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493562
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/GlobalTempViewSuite.scala ---
@@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.{AnalysisException, QueryTest, Row}
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types.StructType
+
+class GlobalTempViewSuite extends QueryTest with SharedSQLContext {
+  import testImplicits._
+
+  override protected def beforeAll(): Unit = {
+super.beforeAll()
+globalTempDB = spark.sharedState.globalTempDB
+  }
+
+  private var globalTempDB: String = _
+
+  test("basic semantic") {
+sql("CREATE GLOBAL TEMP VIEW src AS SELECT 1, 'a'")
+
+// If there is no database in the table name, we should try the local temp view first; if
+// not found, try the table/view in the current database, which is "default" in this case.
+// So we expect NoSuchTableException here.
+intercept[NoSuchTableException](spark.table("src"))
+
+// Use qualified name to refer to the global temp view explicitly.
+checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+// Table name without database will never refer to a global temp view.
+intercept[NoSuchTableException](sql("DROP VIEW src"))
+
+sql(s"DROP VIEW $globalTempDB.src")
+// The global temp view should be dropped successfully.
+intercept[NoSuchTableException](spark.table(s"$globalTempDB.src"))
+
+// We can also use Dataset API to create global temp view
+Seq(1 -> "a").toDF("i", "j").createGlobalTempView("src")
+checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+// Use qualified name to rename a global temp view.
+sql(s"ALTER VIEW $globalTempDB.src RENAME TO src2")
+intercept[NoSuchTableException](spark.table(s"$globalTempDB.src"))
+checkAnswer(spark.table(s"$globalTempDB.src2"), Row(1, "a"))
+
+// Use qualified name to alter a global temp view.
+sql(s"ALTER VIEW $globalTempDB.src2 AS SELECT 2, 'b'")
+checkAnswer(spark.table(s"$globalTempDB.src2"), Row(2, "b"))
+
+// We can also use Catalog API to drop global temp view
+spark.catalog.dropGlobalTempView("src2")
+intercept[NoSuchTableException](spark.table(s"$globalTempDB.src2"))
+  }
+
+  test("global temp view database should be preserved") {
+val e = intercept[AnalysisException](sql(s"CREATE DATABASE $globalTempDB"))
+assert(e.message.contains("system preserved database"))
+
+val e2 = intercept[AnalysisException](sql(s"USE $globalTempDB"))
+assert(e2.message.contains("system preserved database"))
+  }
+
+  test("CREATE TABLE LIKE should work for global temp view") {
+try {
+  sql("CREATE GLOBAL TEMP VIEW src AS SELECT 1 AS a, '2' AS b")
+  sql(s"CREATE TABLE cloned LIKE ${globalTempDB}.src")
+  val tableMeta = spark.sessionState.catalog.getTableMetadata(TableIdentifier("cloned"))
+  assert(tableMeta.schema == new StructType().add("a", "int", false).add("b", "string", false))
+} finally {
+  spark.catalog.dropGlobalTempView("src")
+  sql("DROP TABLE default.cloned")
+}
+  }
+
+  test("list global temp views") {
+try {
+  sql("CREATE GLOBAL TEMP VIEW v1 AS SELECT 3, 4")
+  sql("CREATE TEMP VIEW v2 AS SELECT 1, 2")
+
+  checkAnswer(sql(s"SHOW TABLES IN $globalTempDB"),
+Row(globalTempDB, "v1", true) ::
+Row("", "v2", true) :: Nil)
+
+  assert(spark.catalog.listTables(globalTempDB).collect().toSeq.map(_.name) == Seq("v1", "v2"))
+} finally {
+   

[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-07 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r82493357
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala ---
@@ -183,17 +183,19 @@ case class DropTableCommand(
 
   override def run(sparkSession: SparkSession): Seq[Row] = {
 val catalog = sparkSession.sessionState.catalog
-// If the command DROP VIEW is to drop a table or DROP TABLE is to drop a view
-// issue an exception.
-catalog.getTableMetadataOption(tableName).map(_.tableType match {
-  case CatalogTableType.VIEW if !isView =>
-throw new AnalysisException(
-  "Cannot drop a view with DROP TABLE. Please use DROP VIEW instead")
-  case o if o != CatalogTableType.VIEW && isView =>
-throw new AnalysisException(
-  s"Cannot drop a table with DROP VIEW. Please use DROP TABLE instead")
-  case _ =>
-})
+if (tableName.database.forall(catalog.databaseExists) && catalog.tableExists(tableName)) {
--- End diff --

I feel `forall` is not easy to understand when you are using it with an Option.
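The readability concern comes from `Option#forall` being vacuously true for `None`: `tableName.database.forall(catalog.databaseExists)` reads as "the database is either unspecified or exists", which is easy to misread. A quick self-contained demonstration (the `exists` predicate below is a stand-in):

```scala
// Demonstrates why `Option#forall` can be hard to read: it is vacuously true
// for None, so `db.forall(exists)` means "unspecified OR existing".
object OptionForallDemo {
  // Stand-in predicate for catalog.databaseExists.
  val exists: String => Boolean = Set("default", "global_temp")

  // An arguably clearer spelling of the same check:
  def dbOk(db: Option[String]): Boolean = db match {
    case None    => true          // no database specified: vacuously OK
    case Some(d) => exists(d)     // database specified: must exist
  }
}
```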







[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-10-01 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r81463019
  
--- Diff: docs/sql-programming-guide.md ---
@@ -220,6 +220,40 @@ The `sql` function enables applications to run SQL 
queries programmatically and
 
 
 
+## Global Temporary View
+
+Temporary views in Spark SQL are session-scoped and will disappear if the session that
+creates them terminates. If you want a temporary view that is shared among all sessions
+and kept alive until the Spark application terminates, you can create a global temporary
+view. Global temporary views are tied to a system preserved database `global_temp`, and
+we must use the qualified name to refer to them, e.g. `SELECT * FROM global_temp.view1`.
+
+
+
+{% include_example global_temp_view scala/org/apache/spark/examples/sql/SparkSQLExample.scala %}
+
+
+
+{% include_example global_temp_view java/org/apache/spark/examples/sql/JavaSparkSQLExample.java %}
+
+
+
+{% include_example global_temp_view python/sql/basic.py %}
+
+
+
+
+{% highlight sql %}
+
+CREATE GLOBAL TEMPORARY VIEW temp_view AS SELECT a + 1, b * 2 FROM tbl
+
+SELECT * FROM global_temp.temp_view
+
+{% endhighlight %}
+
+
--- End diff --

We need one more 
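As background for the guide text in this diff, the scoping difference can be modeled with a tiny self-contained sketch (hypothetical names, not Spark's API): local temp views live in a per-session map, while global temp views live in one application-wide map reachable only via the reserved `global_temp` database.

```scala
import scala.collection.mutable

// Hypothetical model of the scoping rules described in the guide diff above:
// local temp views are per-session; global temp views are application-wide
// and reachable only via the reserved "global_temp" database.
object TempViewScopes {
  val globalTemp = mutable.Map.empty[String, String]          // app-wide

  class Session {
    private val localTemp = mutable.Map.empty[String, String] // per-session

    def createTempView(name: String, plan: String): Unit = localTemp(name) = plan
    def createGlobalTempView(name: String, plan: String): Unit = globalTemp(name) = plan

    // Unqualified names see only this session's views; "global_temp.x" sees the shared ones.
    def resolve(name: String): Option[String] = name.split('.') match {
      case Array("global_temp", t) => globalTemp.get(t)
      case Array(t)                => localTemp.get(t)
      case _                       => None
    }
  }
}
```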





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-30 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r81330387
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala ---
@@ -95,20 +95,20 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {
*/
   @throws[AnalysisException]("database does not exist")
   override def listTables(dbName: String): Dataset[Table] = {
-requireDatabaseExists(dbName)
 val tables = sessionCatalog.listTables(dbName).map { tableIdent =>
-  makeTable(tableIdent, tableIdent.database.isEmpty)
+  makeTable(tableIdent)
 }
 CatalogImpl.makeDataset(tables, sparkSession)
   }
 
-  private def makeTable(tableIdent: TableIdentifier, isTemp: Boolean): 
Table = {
-val metadata = if (isTemp) None else 
Some(sessionCatalog.getTableMetadata(tableIdent))
+  private def makeTable(tableIdent: TableIdentifier): Table = {
--- End diff --

Changes to this method are necessary, or `findTable` for global temp views 
can't show the database.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-29 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r81281510
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2433,31 +2433,65 @@ class Dataset[T] private[sql](
   }
 
   /**
-   * Creates a temporary view using the given name. The lifetime of this
+   * Creates a local temporary view using the given name. The lifetime of 
this
* temporary view is tied to the [[SparkSession]] that was used to 
create this Dataset.
*
+   * Local temporary view is session-scoped. Its lifetime is the lifetime 
of the session that
+   * created it, i.e. it will be automatically dropped when the session 
terminates. It's not
+   * tied to any databases, i.e. we can't use `db1.view1` to reference a 
local temporary view.
+   *
* @throws AnalysisException if the view name already exists
*
* @group basic
* @since 2.0.0
*/
   @throws[AnalysisException]
   def createTempView(viewName: String): Unit = withPlan {
-createViewCommand(viewName, replace = false)
+createTempViewCommand(viewName, replace = false, global = false)
   }
 
+
+
   /**
-   * Creates a temporary view using the given name. The lifetime of this
+   * Creates a local temporary view using the given name. The lifetime of 
this
* temporary view is tied to the [[SparkSession]] that was used to 
create this Dataset.
*
* @group basic
* @since 2.0.0
*/
   def createOrReplaceTempView(viewName: String): Unit = withPlan {
-createViewCommand(viewName, replace = true)
+createTempViewCommand(viewName, replace = true, global = false)
   }
 
-  private def createViewCommand(viewName: String, replace: Boolean): 
CreateViewCommand = {
+  /**
+   * Creates a global temporary view using the given name. The lifetime of 
this
+   * temporary view is tied to this Spark application.
+   *
+   * Global temporary view is cross-session. Its lifetime is the lifetime 
of the Spark application,
+   * i.e. it will be automatically dropped when the application 
terminates. It's tied to a system
+   * preserved database `_global_temp`, and we must use the qualified name 
to refer to a global temp
+   * view, e.g. `SELECT * FROM _global_temp.view1`.
+   *
+   * @throws TempTableAlreadyExistsException if the view name already 
exists
+   *
+   * @group basic
+   * @since 2.1.0
+   */
+  @throws[AnalysisException]
+  def createGlobalTempView(viewName: String): Unit = withPlan {
+createTempViewCommand(viewName, replace = false, global = true)
--- End diff --

```Scala
Seq(1 -> "a").toDF("i", "j").createGlobalTempView(s"$globalTempDB.src")
```
We will get the following error:
```
org.apache.spark.sql.AnalysisException: It is not allowed to add database 
prefix `global_temp` for the TEMPORARY view name.;
```

I am wondering if we should also support it for the global temp view?
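The restriction behind that error can be sketched as a small validation step (a Python model for illustration; `AnalysisError` and `parse_view_name` are hypothetical names, not Spark's API):

```python
class AnalysisError(Exception):
    pass

def parse_view_name(name: str) -> str:
    """Reject `db.view`; temp view names must be single-part identifiers."""
    parts = name.split(".")
    if len(parts) > 1:
        raise AnalysisError(
            f"It is not allowed to add database prefix `{parts[0]}` "
            "for the TEMPORARY view name.")
    return parts[0]

assert parse_view_name("src") == "src"
failed = False
try:
    parse_view_name("global_temp.src")   # the case from the snippet above
except AnalysisError as e:
    failed = "database prefix" in str(e)
assert failed
```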





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-29 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r81261514
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -472,27 +557,48 @@ class SessionCatalog(
* explicitly specified.
*/
   def isTemporaryTable(name: TableIdentifier): Boolean = synchronized {
-name.database.isEmpty && 
tempTables.contains(formatTableName(name.table))
+val table = formatTableName(name.table)
+if (name.database.isEmpty) {
+  tempTables.contains(table)
+} else if (formatDatabaseName(name.database.get) == 
globalTempViewManager.database) {
+  globalTempViewManager.get(table).isDefined
+} else {
+  false
+}
   }
 
   /**
-   * List all tables in the specified database, including temporary tables.
+   * List all tables in the specified database, including local temporary 
tables.
+   *
+   * Note that, if the specified database is global temporary view 
database, we will list global
+   * temporary views.
*/
   def listTables(db: String): Seq[TableIdentifier] = listTables(db, "*")
 
   /**
-   * List all matching tables in the specified database, including 
temporary tables.
+   * List all matching tables in the specified database, including local 
temporary tables.
+   *
+   * Note that, if the specified database is global temporary view 
database, we will list global
+   * temporary views.
*/
   def listTables(db: String, pattern: String): Seq[TableIdentifier] = {
 val dbName = formatDatabaseName(db)
-requireDbExists(dbName)
-val dbTables =
-  externalCatalog.listTables(dbName, pattern).map { t => 
TableIdentifier(t, Some(dbName)) }
-synchronized {
-  val _tempTables = StringUtils.filterPattern(tempTables.keys.toSeq, 
pattern)
-.map { t => TableIdentifier(t) }
-  dbTables ++ _tempTables
+val dbTables = if (dbName == globalTempViewManager.database) {
+  globalTempViewManager.listViewNames(pattern).map { name =>
+TableIdentifier(name, Some(globalTempViewManager.database))
+  }
+} else {
+  requireDbExists(dbName)
+  externalCatalog.listTables(dbName, pattern).map { name =>
+TableIdentifier(name, Some(dbName))
+  }
+}
+val localTempViews = synchronized {
+  StringUtils.filterPattern(tempTables.keys.toSeq, pattern).map { name 
=>
+TableIdentifier(name)
+  }
 }
+dbTables ++ localTempViews
--- End diff --

because the `++` operation doesn't need to be done inside the `synchronized` 
block
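The point about the lock scope can be illustrated with a minimal model (Python stand-in; `Catalog` here is a toy, not Spark's `SessionCatalog`): the lock is held only long enough to snapshot the mutable temp-view map, and the concatenation happens outside it.

```python
import fnmatch
import threading

class Catalog:
    def __init__(self):
        self._lock = threading.Lock()
        self._temp_views = {"tmp1": None, "tmp2": None}  # shared mutable state
        self._db_tables = {"default": ["t1", "t2"]}

    def list_tables(self, db, pattern="*"):
        # External-catalog listing needs no lock.
        db_tables = [(db, t) for t in self._db_tables.get(db, [])
                     if fnmatch.fnmatch(t, pattern)]
        # Lock only while snapshotting the temp-view names.
        with self._lock:
            names = list(self._temp_views)
        local = [(None, t) for t in names if fnmatch.fnmatch(t, pattern)]
        return db_tables + local  # the merge (Scala's `++`) is outside the lock

c = Catalog()
assert c.list_tables("default") == [("default", "t1"), ("default", "t2"),
                                    (None, "tmp1"), (None, "tmp2")]
```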





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-29 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r81114242
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
@@ -517,7 +517,8 @@ case class ShowTablesCommand(
 
   // The result of SHOW TABLES has two columns, tableName and isTemporary.
--- End diff --

Seems the result has three columns now...





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-29 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r81112745
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -472,27 +557,48 @@ class SessionCatalog(
* explicitly specified.
*/
   def isTemporaryTable(name: TableIdentifier): Boolean = synchronized {
-name.database.isEmpty && 
tempTables.contains(formatTableName(name.table))
+val table = formatTableName(name.table)
+if (name.database.isEmpty) {
+  tempTables.contains(table)
+} else if (formatDatabaseName(name.database.get) == 
globalTempViewManager.database) {
+  globalTempViewManager.get(table).isDefined
+} else {
+  false
+}
   }
 
   /**
-   * List all tables in the specified database, including temporary tables.
+   * List all tables in the specified database, including local temporary 
tables.
+   *
+   * Note that, if the specified database is global temporary view 
database, we will list global
+   * temporary views.
*/
   def listTables(db: String): Seq[TableIdentifier] = listTables(db, "*")
 
   /**
-   * List all matching tables in the specified database, including 
temporary tables.
+   * List all matching tables in the specified database, including local 
temporary tables.
+   *
+   * Note that, if the specified database is global temporary view 
database, we will list global
+   * temporary views.
*/
   def listTables(db: String, pattern: String): Seq[TableIdentifier] = {
 val dbName = formatDatabaseName(db)
-requireDbExists(dbName)
-val dbTables =
-  externalCatalog.listTables(dbName, pattern).map { t => 
TableIdentifier(t, Some(dbName)) }
-synchronized {
-  val _tempTables = StringUtils.filterPattern(tempTables.keys.toSeq, 
pattern)
-.map { t => TableIdentifier(t) }
-  dbTables ++ _tempTables
+val dbTables = if (dbName == globalTempViewManager.database) {
+  globalTempViewManager.listViewNames(pattern).map { name =>
+TableIdentifier(name, Some(globalTempViewManager.database))
+  }
+} else {
+  requireDbExists(dbName)
+  externalCatalog.listTables(dbName, pattern).map { name =>
+TableIdentifier(name, Some(dbName))
+  }
+}
+val localTempViews = synchronized {
+  StringUtils.filterPattern(tempTables.keys.toSeq, pattern).map { name 
=>
+TableIdentifier(name)
+  }
 }
+dbTables ++ localTempViews
--- End diff --

Why we should move this line out of the `synchronized` block?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r81044097
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2433,31 +2433,65 @@ class Dataset[T] private[sql](
   }
 
   /**
-   * Creates a temporary view using the given name. The lifetime of this
+   * Creates a local temporary view using the given name. The lifetime of 
this
* temporary view is tied to the [[SparkSession]] that was used to 
create this Dataset.
*
+   * Local temporary view is session-scoped. Its lifetime is the lifetime 
of the session that
+   * created it, i.e. it will be automatically dropped when the session 
terminates. It's not
+   * tied to any databases, i.e. we can't use `db1.view1` to reference a 
local temporary view.
+   *
* @throws AnalysisException if the view name already exists
*
* @group basic
* @since 2.0.0
*/
   @throws[AnalysisException]
   def createTempView(viewName: String): Unit = withPlan {
-createViewCommand(viewName, replace = false)
+createTempViewCommand(viewName, replace = false, global = false)
   }
 
+
+
   /**
-   * Creates a temporary view using the given name. The lifetime of this
+   * Creates a local temporary view using the given name. The lifetime of 
this
* temporary view is tied to the [[SparkSession]] that was used to 
create this Dataset.
*
* @group basic
* @since 2.0.0
*/
   def createOrReplaceTempView(viewName: String): Unit = withPlan {
-createViewCommand(viewName, replace = true)
+createTempViewCommand(viewName, replace = true, global = false)
   }
 
-  private def createViewCommand(viewName: String, replace: Boolean): 
CreateViewCommand = {
+  /**
+   * Creates a global temporary view using the given name. The lifetime of 
this
+   * temporary view is tied to this Spark application.
+   *
+   * Global temporary view is cross-session. Its lifetime is the lifetime 
of the Spark application,
+   * i.e. it will be automatically dropped when the application 
terminates. It's tied to a system
+   * preserved database `_global_temp`, and we must use the qualified name 
to refer to a global temp
+   * view, e.g. `SELECT * FROM _global_temp.view1`.
+   *
+   * @throws TempTableAlreadyExistsException if the view name already 
exists
+   *
+   * @group basic
+   * @since 2.1.0
+   */
+  @throws[AnalysisException]
+  def createGlobalTempView(viewName: String): Unit = withPlan {
+createTempViewCommand(viewName, replace = false, global = true)
--- End diff --

No, it's the same as `createTempView`: the parameter string is the literal 
view name, so your code should be `spark.table(s"$globalTempDB.src")`.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r81012097
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -472,27 +557,48 @@ class SessionCatalog(
* explicitly specified.
*/
   def isTemporaryTable(name: TableIdentifier): Boolean = synchronized {
-name.database.isEmpty && 
tempTables.contains(formatTableName(name.table))
+val table = formatTableName(name.table)
+if (name.database.isEmpty) {
+  tempTables.contains(table)
+} else if (formatDatabaseName(name.database.get) == 
globalTempViewManager.database) {
+  globalTempViewManager.get(table).isDefined
+} else {
+  false
+}
   }
 
   /**
-   * List all tables in the specified database, including temporary tables.
+   * List all tables in the specified database, including local temporary 
tables.
+   *
+   * Note that, if the specified database is global temporary view 
database, we will list global
+   * temporary views.
*/
   def listTables(db: String): Seq[TableIdentifier] = listTables(db, "*")
 
   /**
-   * List all matching tables in the specified database, including 
temporary tables.
+   * List all matching tables in the specified database, including local 
temporary tables.
+   *
+   * Note that, if the specified database is global temporary view 
database, we will list global
+   * temporary views.
*/
   def listTables(db: String, pattern: String): Seq[TableIdentifier] = {
 val dbName = formatDatabaseName(db)
-requireDbExists(dbName)
-val dbTables =
-  externalCatalog.listTables(dbName, pattern).map { t => 
TableIdentifier(t, Some(dbName)) }
-synchronized {
-  val _tempTables = StringUtils.filterPattern(tempTables.keys.toSeq, 
pattern)
-.map { t => TableIdentifier(t) }
-  dbTables ++ _tempTables
+val dbTables = if (dbName == globalTempViewManager.database) {
+  globalTempViewManager.listViewNames(pattern).map { name =>
+TableIdentifier(name, Some(globalTempViewManager.database))
+  }
+} else {
+  requireDbExists(dbName)
+  externalCatalog.listTables(dbName, pattern).map { name =>
+TableIdentifier(name, Some(dbName))
+  }
+}
+val localTempViews = synchronized {
+  StringUtils.filterPattern(tempTables.keys.toSeq, pattern).map { name 
=>
+TableIdentifier(name)
+  }
 }
+dbTables ++ localTempViews
--- End diff --

Now, `SHOW TABLES` includes the local temporary views when users try to 
list tables in a database, because these views are visible to users. Thus, we 
should also include globalTempViews, right? 
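The three-branch resolution in the quoted `isTemporaryTable` can be sketched as a small model (Python for illustration; the helper names are hypothetical):

```python
GLOBAL_TEMP_DB = "global_temp"

def is_temporary_table(db, table, local_views, global_views):
    """No database -> local temp view; global_temp -> global view; else no."""
    table = table.lower()
    if db is None:
        return table in local_views
    if db.lower() == GLOBAL_TEMP_DB:
        return table in global_views
    return False

assert is_temporary_table(None, "v1", {"v1"}, set())
assert is_temporary_table("global_temp", "g1", set(), {"g1"})
# A real database prefix never resolves to a temp view.
assert not is_temporary_table("default", "v1", {"v1"}, {"v1"})
```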





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80988644
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/GlobalTempViewSuite.scala
 ---
@@ -0,0 +1,123 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.{AnalysisException, QueryTest, Row}
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types.StructType
+
+class GlobalTempViewSuite extends QueryTest with SharedSQLContext {
+  import testImplicits._
+
+  override protected def beforeAll(): Unit = {
+super.beforeAll()
+globalTempDB = spark.sharedState.globalTempViewManager.database
+  }
+
+  private var globalTempDB: String = _
+
+  test("basic semantic") {
+sql("CREATE GLOBAL TEMP VIEW src AS SELECT 1, 'a'")
+
+// If there is no database in table name, we should try local temp 
view first, if not found,
+// try table/view in current database, which is "default" in this 
case. So we expect
+// NoSuchTableException here.
+intercept[NoSuchTableException](spark.table("src"))
+
+// Use qualified name to refer to the global temp view explicitly.
+checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+// Table name without database will never refer to a global temp view.
+intercept[NoSuchTableException](sql("DROP VIEW src"))
+
+sql(s"DROP VIEW $globalTempDB.src")
+// The global temp view should be dropped successfully.
+intercept[NoSuchTableException](spark.table(s"$globalTempDB.src"))
+
+// We can also use Dataset API to create global temp view
+Seq(1 -> "a").toDF("i", "j").createGlobalTempView("src")
+checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+// Use qualified name to rename a global temp view.
+sql(s"ALTER VIEW $globalTempDB.src RENAME TO src2")
+intercept[NoSuchTableException](spark.table(s"$globalTempDB.src"))
+checkAnswer(spark.table(s"$globalTempDB.src2"), Row(1, "a"))
+
+// Use qualified name to alter a global temp view.
+sql(s"ALTER VIEW $globalTempDB.src2 AS SELECT 2, 'b'")
+checkAnswer(spark.table(s"$globalTempDB.src2"), Row(2, "b"))
+
+// We can also use Catalog API to drop global temp view
+spark.catalog.dropGlobalTempView("src2")
+intercept[NoSuchTableException](spark.table(s"$globalTempDB.src2"))
+  }
+
+  test("global temp view database should be preserved") {
+val e = intercept[AnalysisException](sql(s"CREATE DATABASE 
$globalTempDB"))
+assert(e.message.contains("system preserved database"))
+
+val e2 = intercept[AnalysisException](sql(s"USE $globalTempDB"))
+assert(e2.message.contains("system preserved database"))
+  }
+
+  test("CREATE TABLE LIKE should work for global temp view") {
+try {
+  sql("CREATE GLOBAL TEMP VIEW src AS SELECT 1 AS a, '2' AS b")
+  sql(s"CREATE TABLE cloned LIKE ${globalTempDB}.src")
+  val tableMeta = 
spark.sessionState.catalog.getTableMetadata(TableIdentifier("cloned"))
+  assert(tableMeta.schema == new StructType().add("a", "int", 
false).add("b", "string", false))
+} finally {
+  spark.catalog.dropGlobalTempView("src")
+  sql("DROP TABLE default.cloned")
+}
+  }
+
+  test("list global temp views") {
+try {
+  sql("CREATE GLOBAL TEMP VIEW v1 AS SELECT 3, 4")
+  sql("CREATE TEMP VIEW v2 AS SELECT 1, 2")
+
+  checkAnswer(sql(s"SHOW TABLES IN $globalTempDB"),
+Row(globalTempDB, "v1", true) ::
+Row("", "v2", true) :: Nil)
+
+  
assert(spark.catalog.listTables(globalTempDB).collect().toSeq.map(_.name) == 
Seq("v1", "v2"))
+ 

[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80985641
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
@@ -2433,31 +2433,65 @@ class Dataset[T] private[sql](
   }
 
   /**
-   * Creates a temporary view using the given name. The lifetime of this
+   * Creates a local temporary view using the given name. The lifetime of 
this
* temporary view is tied to the [[SparkSession]] that was used to 
create this Dataset.
*
+   * Local temporary view is session-scoped. Its lifetime is the lifetime 
of the session that
+   * created it, i.e. it will be automatically dropped when the session 
terminates. It's not
+   * tied to any databases, i.e. we can't use `db1.view1` to reference a 
local temporary view.
+   *
* @throws AnalysisException if the view name already exists
*
* @group basic
* @since 2.0.0
*/
   @throws[AnalysisException]
   def createTempView(viewName: String): Unit = withPlan {
-createViewCommand(viewName, replace = false)
+createTempViewCommand(viewName, replace = false, global = false)
   }
 
+
+
   /**
-   * Creates a temporary view using the given name. The lifetime of this
+   * Creates a local temporary view using the given name. The lifetime of 
this
* temporary view is tied to the [[SparkSession]] that was used to 
create this Dataset.
*
* @group basic
* @since 2.0.0
*/
   def createOrReplaceTempView(viewName: String): Unit = withPlan {
-createViewCommand(viewName, replace = true)
+createTempViewCommand(viewName, replace = true, global = false)
   }
 
-  private def createViewCommand(viewName: String, replace: Boolean): 
CreateViewCommand = {
+  /**
+   * Creates a global temporary view using the given name. The lifetime of 
this
+   * temporary view is tied to this Spark application.
+   *
+   * Global temporary view is cross-session. Its lifetime is the lifetime 
of the Spark application,
+   * i.e. it will be automatically dropped when the application 
terminates. It's tied to a system
+   * preserved database `_global_temp`, and we must use the qualified name 
to refer to a global temp
+   * view, e.g. `SELECT * FROM _global_temp.view1`.
+   *
+   * @throws TempTableAlreadyExistsException if the view name already 
exists
+   *
+   * @group basic
+   * @since 2.1.0
+   */
+  @throws[AnalysisException]
+  def createGlobalTempView(viewName: String): Unit = withPlan {
+createTempViewCommand(viewName, replace = false, global = true)
--- End diff --

```Scala
Seq(1 -> "a").toDF("i", "j").createGlobalTempView(s"$globalTempDB.src")
checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
```

This will fail. Do we allow users to specify the global temp database name?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-27 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80846005
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala ---
@@ -37,39 +37,14 @@ import org.apache.spark.util.{MutableURLClassLoader, 
Utils}
  */
 private[sql] class SharedState(val sparkContext: SparkContext) extends 
Logging {
 
-  /**
-   * Class for caching query results reused in future executions.
-   */
-  val cacheManager: CacheManager = new CacheManager
-
-  /**
-   * A listener for SQL-specific 
[[org.apache.spark.scheduler.SparkListenerEvent]]s.
-   */
-  val listener: SQLListener = createListenerAndUI(sparkContext)
-
+  // Load hive-site.xml into hadoopConf and determine the warehouse path 
we want to use, based on
+  // the config from both hive and Spark SQL. Finally set the warehouse 
config value to sparkConf.
   {
 val configFile = 
Utils.getContextOrSparkClassLoader.getResource("hive-site.xml")
 if (configFile != null) {
   sparkContext.hadoopConfiguration.addResource(configFile)
 }
-  }
-
-  /**
-   * A catalog that interacts with external systems.
-   */
-  lazy val externalCatalog: ExternalCatalog =
-SharedState.reflect[ExternalCatalog, SparkConf, Configuration](
-  SharedState.externalCatalogClassName(sparkContext.conf),
-  sparkContext.conf,
-  sparkContext.hadoopConfiguration)
-
-  /**
-   * A classloader used to load all user-added jar.
-   */
-  val jarClassLoader = new NonClosableMutableURLClassLoader(
-org.apache.spark.util.Utils.getContextOrSparkClassLoader)
 
-  {
 // Set the Hive metastore warehouse path to the one we use
 val tempConf = new SQLConf
--- End diff --

It's really bad that we put this initialization code in the middle of the 
class. It sets the warehouse path on sparkConf, which means it changes the 
state of the instance. Any fields that are declared before this initialization 
code and need to read that state, e.g. `externalCatalog`, must be `lazy val`.

I think it's more reasonable and less error-prone to put this initialization 
code at the beginning of the class.
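The initialization-order hazard being described can be demonstrated with a toy example (Python stand-in for Scala's eager-field-vs-`lazy val` distinction; the class names are illustrative, not Spark's):

```python
class SharedStateEager:
    """An eagerly initialized field reads config set later in the body."""
    def __init__(self):
        self.catalog = self._make_catalog()  # runs before config is set
        self.warehouse = "/tmp/warehouse"    # config set *after* the field

    def _make_catalog(self):
        return getattr(self, "warehouse", None)

class SharedStateLazy:
    """A lazy field is resolved on first access, after config is set."""
    def __init__(self):
        self._catalog = None
        self.warehouse = "/tmp/warehouse"

    @property
    def catalog(self):
        if self._catalog is None:
            self._catalog = self.warehouse
        return self._catalog

assert SharedStateEager().catalog is None            # broken: saw unset config
assert SharedStateLazy().catalog == "/tmp/warehouse" # lazy sees the config
```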





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80407704
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -60,20 +90,21 @@ case class CreateViewCommand(
 child: LogicalPlan,
 allowExisting: Boolean,
 replace: Boolean,
-isTemporary: Boolean)
+viewType: ViewType)
   extends RunnableCommand {
 
   override protected def innerChildren: Seq[QueryPlan[_]] = Seq(child)
 
-  if (!isTemporary) {
-require(originalText.isDefined,
-  "The table to created with CREATE VIEW must have 'originalText'.")
+  if (viewType == PermanentView) {
+require(originalText.isDefined, "'originalText' must be provided to 
create permanent view")
   }
 
   if (allowExisting && replace) {
 throw new AnalysisException("CREATE VIEW with both IF NOT EXISTS and 
REPLACE is not allowed.")
   }
 
+  private def isTemporary = viewType == LocalTempView || viewType == 
GlobalTempView
+
--- End diff --

it's only used here, maybe it's OK?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80403654
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -393,21 +459,25 @@ class SessionCatalog(
*/
   def renameTable(oldName: TableIdentifier, newName: String): Unit = 
synchronized {
 val db = formatDatabaseName(oldName.database.getOrElse(currentDb))
-requireDbExists(db)
 val oldTableName = formatTableName(oldName.table)
 val newTableName = formatTableName(newName)
-if (oldName.database.isDefined || !tempTables.contains(oldTableName)) {
-  requireTableExists(TableIdentifier(oldTableName, Some(db)))
-  requireTableNotExists(TableIdentifier(newTableName, Some(db)))
-  externalCatalog.renameTable(db, oldTableName, newTableName)
+if (db == globalTempDB) {
+  globalTempViews.rename(oldTableName, newTableName)
--- End diff --

We do support it; see 
https://github.com/apache/spark/pull/14897/files#diff-b3f9800839b9b9a1df9da9cbfc01adf8L410
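The rename path being referenced can be sketched as a small registry model (Python for illustration; `GlobalTempViews` and `ViewAlreadyExists` are hypothetical names, not Spark's classes):

```python
class ViewAlreadyExists(Exception):
    pass

class GlobalTempViews:
    """Toy registry: rename moves the plan to a new key, refusing conflicts."""
    def __init__(self):
        self._views = {}

    def create(self, name, plan):
        self._views[name] = plan

    def rename(self, old, new):
        if new in self._views:
            raise ViewAlreadyExists(new)
        if old in self._views:
            self._views[new] = self._views.pop(old)

    def get(self, name):
        return self._views.get(name)

g = GlobalTempViews()
g.create("src", "SELECT 1")
g.rename("src", "src2")
assert g.get("src") is None and g.get("src2") == "SELECT 1"
```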





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80403612
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -371,16 +431,24 @@ class SessionCatalog(
*/
   def getTempViewOrPermanentTableMetadata(name: TableIdentifier): 
CatalogTable = synchronized {
--- End diff --

Yeah, I'd like to rename them to `getPersistedTableMetadataOption`, but 
probably not in this PR.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80403509
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -142,8 +149,12 @@ class SessionCatalog(
   // 

 
   def createDatabase(dbDefinition: CatalogDatabase, ignoreIfExists: 
Boolean): Unit = {
-val qualifiedPath = 
makeQualifiedPath(dbDefinition.locationUri).toString
 val dbName = formatDatabaseName(dbDefinition.name)
+if (dbName == globalTempDB) {
--- End diff --

See
https://github.com/apache/spark/pull/14897/files#diff-42e78d37f5dcb2a1576f83b53bbf4b55R40

`globalTempDB` is always lower-cased, so here we respect the case-sensitivity
config; e.g. `gLobAl_TEmp.viewName` can also refer to a global temp view in a
case-insensitive context.
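A minimal sketch of the resolution behavior described above (a toy simplification, not Spark's actual `SessionCatalog` code): the global temp database name is stored lower-cased, and under a case-insensitive config the incoming database name is lower-cased before comparison.

```scala
// Toy sketch: `globalTempDB` is stored lower-cased; under a case-insensitive
// config, incoming names are lower-cased before comparison, so "gLobAl_TEmp"
// still resolves to the global temp database.
object NameResolution {
  val globalTempDB: String = "global_temp"

  def formatDatabaseName(name: String, caseSensitive: Boolean): String =
    if (caseSensitive) name else name.toLowerCase

  def refersToGlobalTempDB(db: String, caseSensitive: Boolean): Boolean =
    formatDatabaseName(db, caseSensitive) == globalTempDB
}
```

Under this model, `gLobAl_TEmp.viewName` resolves to the global temp database only when the session is case-insensitive.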





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80403416
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala ---
@@ -37,6 +37,20 @@ import org.apache.spark.util.{MutableURLClassLoader, 
Utils}
  */
 private[sql] class SharedState(val sparkContext: SparkContext) extends 
Logging {
 
+  // System preserved database should not exist in metastore. However it's hard to guarantee it
+  // for every session, because case-sensitivity differs. Here we always lowercase it to make our
+  // life easier.
+  val globalTempDB = sparkContext.conf.get(GLOBAL_TEMP_DATABASE).toLowerCase
--- End diff --

Yeah, we do:
https://github.com/apache/spark/pull/14897/files#diff-42e78d37f5dcb2a1576f83b53bbf4b55R46





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80403386
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -222,8 +265,8 @@ case class AlterViewAsCommand(
 qe.assertAnalyzed()
 val analyzedPlan = qe.analyzed
 
-if (session.sessionState.catalog.isTemporaryTable(name)) {
-  session.sessionState.catalog.createTempView(name.table, analyzedPlan, overrideIfExists = true)
+if (session.sessionState.catalog.alterTempViewDefinition(name, analyzedPlan)) {
+  // a local/global temp view has been altered, we are done.
--- End diff --

The previous approach was not atomic, and here we would need a
`createLocalOrGlobalTempView` if we followed the existing style, so I went
ahead and made it an atomic operation: `alterTempViewDefinition`.
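As a rough model of what "atomic" buys here (a toy Scala sketch, not the real `SessionCatalog`): the existence check and the write happen under a single lock, so a view cannot be dropped by another thread between the check and the update.

```scala
import scala.collection.mutable

// Toy model of an atomic alterTempViewDefinition: check-and-update under one
// lock, trying the local map first and then the global one. Returns false
// when no temp view with that name exists.
class TempViewCatalog {
  private val localViews  = mutable.HashMap[String, String]() // name -> plan
  private val globalViews = mutable.HashMap[String, String]()

  def createTempView(name: String, plan: String): Unit =
    synchronized { localViews(name) = plan }

  def createGlobalTempView(name: String, plan: String): Unit =
    synchronized { globalViews(name) = plan }

  def alterTempViewDefinition(name: String, newPlan: String): Boolean = synchronized {
    if (localViews.contains(name)) { localViews(name) = newPlan; true }
    else if (globalViews.contains(name)) { globalViews(name) = newPlan; true }
    else false
  }
}
```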





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80402604
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/GlobalTempViewManager.scala
 ---
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.catalog
+
+import javax.annotation.concurrent.GuardedBy
+
+import scala.collection.mutable
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.analysis.TempTableAlreadyExistsException
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.catalyst.util.StringUtils
+
+
+/**
+ * A thread-safe manager for global temporary views, providing atomic 
operations to manage them,
+ * e.g. create, update, remove, etc.
+ *
+ * Note that, the view name is always case-sensitive here, callers are 
responsible to format the
+ * view name w.r.t. case-sensitive config.
+ */
+class GlobalTempViewManager {
+
+  /** List of view definitions, mapping from view name to logical plan. */
+  @GuardedBy("this")
+  private val viewDefinitions = new mutable.HashMap[String, LogicalPlan]
+
+  def get(name: String): Option[LogicalPlan] = synchronized {
+viewDefinitions.get(name)
+  }
+
+  def create(
+  name: String,
+  viewDefinition: LogicalPlan,
+  overrideIfExists: Boolean): Unit = synchronized {
+if (!overrideIfExists && viewDefinitions.contains(name)) {
+  throw new TempTableAlreadyExistsException(name)
+}
+viewDefinitions.put(name, viewDefinition)
+  }
+
+  def update(
+  name: String,
+  viewDefinition: LogicalPlan): Boolean = synchronized {
+// Only update it when a view with the given name exists.
+if (viewDefinitions.contains(name)) {
+  viewDefinitions.put(name, viewDefinition)
+  true
+} else {
+  false
+}
+  }
--- End diff --

CREATE VIEW and ALTER VIEW.

We can have a single API for them, but need to introduce a write mode: 
errorIfExists, overrideIfExists, updateOnlyIfExists.
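The single-API alternative mentioned above could look roughly like the following (names such as `ViewWriteMode` and `put` are hypothetical, not from the PR): CREATE VIEW maps to `ErrorIfExists`, CREATE OR REPLACE VIEW to `OverrideIfExists`, and ALTER VIEW to `UpdateOnlyIfExists`.

```scala
import scala.collection.mutable

// Hypothetical single write API driven by a mode, replacing the separate
// create/update methods on the view manager.
object ViewWriteMode extends Enumeration {
  val ErrorIfExists, OverrideIfExists, UpdateOnlyIfExists = Value
}

class ViewRegistry {
  private val views = mutable.HashMap[String, String]() // view name -> definition

  // Returns true if the definition was written.
  def put(name: String, definition: String, mode: ViewWriteMode.Value): Boolean = synchronized {
    val exists = views.contains(name)
    mode match {
      case ViewWriteMode.ErrorIfExists if exists =>
        throw new IllegalStateException(s"Temp view '$name' already exists")
      case ViewWriteMode.UpdateOnlyIfExists if !exists =>
        false // ALTER VIEW on a missing view writes nothing
      case _ =>
        views(name) = definition
        true
    }
  }
}
```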






[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80396050
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/internal/SharedState.scala ---
@@ -37,6 +37,20 @@ import org.apache.spark.util.{MutableURLClassLoader, 
Utils}
  */
 private[sql] class SharedState(val sparkContext: SparkContext) extends 
Logging {
 
+  // System preserved database should not exist in metastore. However it's hard to guarantee it
+  // for every session, because case-sensitivity differs. Here we always lowercase it to make our
+  // life easier.
+  val globalTempDB = sparkContext.conf.get(GLOBAL_TEMP_DATABASE).toLowerCase
--- End diff --

By the way, do we check if there is an existing db having the same name as
`GLOBAL_TEMP_DATABASE`?
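A minimal sketch of the kind of bootstrap check being asked about (illustrative only; the signature and error message are assumptions, not the PR's code): fail fast if the metastore already contains a database with the reserved name.

```scala
// Illustrative check: reject startup when a metastore database collides with
// the reserved global-temp database name. `existingDatabases` stands in for
// a metastore listing; the comparison ignores case, matching the lower-cased
// reserved name.
def checkGlobalTempDB(globalTempDB: String, existingDatabases: Seq[String]): Unit = {
  if (existingDatabases.exists(_.equalsIgnoreCase(globalTempDB))) {
    throw new IllegalStateException(
      s"'$globalTempDB' is a system preserved database; please rename the " +
        "existing database or configure a different global temp database name.")
  }
}
```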





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80395764
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -453,7 +532,11 @@ class SessionCatalog(
   val db = formatDatabaseName(name.database.getOrElse(currentDb))
   val table = formatTableName(name.table)
   val relationAlias = alias.getOrElse(table)
-  if (name.database.isDefined || !tempTables.contains(table)) {
+  if (db == globalTempDB) {
+globalTempViews.get(table).map { viewDef =>
+  SubqueryAlias(relationAlias, viewDef, Some(name))
+}.getOrElse(throw new NoSuchTableException(db, table))
+  } else if (name.database.isDefined || !tempTables.contains(table)) {
--- End diff --

+1





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80396017
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala ---
@@ -277,7 +275,7 @@ class CatalogImpl(sparkSession: SparkSession) extends 
Catalog {
   }
 
   /**
-   * Drops the temporary view with the given view name in the catalog.
+   * Drops the local temporary view with the given view name in the catalog.
--- End diff --

(It is probably good to explain the meaning of "local" in the doc for
creating a temp view.)
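The local vs. global distinction discussed here can be modeled with plain Scala (a toy sketch, not Spark's actual classes): local temp views live in per-session state, while global temp views live in state shared by all sessions, so they outlive any single session.

```scala
import scala.collection.mutable

// Toy model: each Session has its own local view map, but all sessions share
// one global view map. Lookup tries local views first, then global ones.
class SharedViewState {
  val globalViews = mutable.HashMap[String, String]() // name -> plan
}

class Session(shared: SharedViewState) {
  private val localViews = mutable.HashMap[String, String]()

  def createTempView(name: String, plan: String): Unit = localViews(name) = plan
  def createGlobalTempView(name: String, plan: String): Unit = shared.globalViews(name) = plan
  def lookup(name: String): Option[String] =
    localViews.get(name).orElse(shared.globalViews.get(name))
}
```

In this model a view created with `createTempView` is invisible to other sessions, while one created with `createGlobalTempView` is visible everywhere.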





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80395565
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -142,8 +149,12 @@ class SessionCatalog(
   // 

 
   def createDatabase(dbDefinition: CatalogDatabase, ignoreIfExists: 
Boolean): Unit = {
-val qualifiedPath = 
makeQualifiedPath(dbDefinition.locationUri).toString
 val dbName = formatDatabaseName(dbDefinition.name)
+if (dbName == globalTempDB) {
--- End diff --

Yeah, I have the same question.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80395798
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -222,8 +265,8 @@ case class AlterViewAsCommand(
 qe.assertAnalyzed()
 val analyzedPlan = qe.analyzed
 
-if (session.sessionState.catalog.isTemporaryTable(name)) {
-  session.sessionState.catalog.createTempView(name.table, analyzedPlan, overrideIfExists = true)
+if (session.sessionState.catalog.alterTempViewDefinition(name, analyzedPlan)) {
+  // a local/global temp view has been altered, we are done.
--- End diff --

Is this change needed?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80395752
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -393,21 +459,25 @@ class SessionCatalog(
*/
   def renameTable(oldName: TableIdentifier, newName: String): Unit = 
synchronized {
 val db = formatDatabaseName(oldName.database.getOrElse(currentDb))
-requireDbExists(db)
 val oldTableName = formatTableName(oldName.table)
 val newTableName = formatTableName(newName)
-if (oldName.database.isDefined || !tempTables.contains(oldTableName)) {
-  requireTableExists(TableIdentifier(oldTableName, Some(db)))
-  requireTableNotExists(TableIdentifier(newTableName, Some(db)))
-  externalCatalog.renameTable(db, oldTableName, newTableName)
+if (db == globalTempDB) {
+  globalTempViews.rename(oldTableName, newTableName)
--- End diff --

Do we support rename for local temp view?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80395517
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/GlobalTempViewManager.scala
 ---
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.catalog
+
+import javax.annotation.concurrent.GuardedBy
+
+import scala.collection.mutable
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.analysis.TempTableAlreadyExistsException
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.catalyst.util.StringUtils
+
+
+/**
+ * A thread-safe manager for global temporary views, providing atomic 
operations to manage them,
+ * e.g. create, update, remove, etc.
+ *
+ * Note that, the view name is always case-sensitive here, callers are 
responsible to format the
+ * view name w.r.t. case-sensitive config.
+ */
+class GlobalTempViewManager {
--- End diff --

Seems methods in this class need docs.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80395993
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala ---
@@ -277,7 +275,7 @@ class CatalogImpl(sparkSession: SparkSession) extends 
Catalog {
   }
 
   /**
-   * Drops the temporary view with the given view name in the catalog.
+   * Drops the local temporary view with the given view name in the catalog.
--- End diff --

Seems we can also explain what local means. 





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80395571
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -47,6 +50,8 @@ object SessionCatalog {
  */
 class SessionCatalog(
 externalCatalog: ExternalCatalog,
+globalTempDB: String,
+globalTempViews: GlobalTempViewManager,
--- End diff --

Same question





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80395605
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -329,33 +343,77 @@ class SessionCatalog(
   // --
 
   /**
-   * Create a temporary table.
+   * Create a local temporary view.
*/
   def createTempView(
   name: String,
-  tableDefinition: LogicalPlan,
+  viewDefinition: LogicalPlan,
   overrideIfExists: Boolean): Unit = synchronized {
-val table = formatTableName(name)
-if (tempTables.contains(table) && !overrideIfExists) {
+val viewName = formatTableName(name)
+if (tempTables.contains(viewName) && !overrideIfExists) {
   throw new TempTableAlreadyExistsException(name)
--- End diff --

Maybe it is good to avoid having this kind of change in this PR? We can
submit a follow-up one to make the variable names consistent.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80395754
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -393,21 +459,25 @@ class SessionCatalog(
*/
   def renameTable(oldName: TableIdentifier, newName: String): Unit = 
synchronized {
 val db = formatDatabaseName(oldName.database.getOrElse(currentDb))
-requireDbExists(db)
 val oldTableName = formatTableName(oldName.table)
 val newTableName = formatTableName(newName)
-if (oldName.database.isDefined || !tempTables.contains(oldTableName)) {
-  requireTableExists(TableIdentifier(oldTableName, Some(db)))
-  requireTableNotExists(TableIdentifier(newTableName, Some(db)))
-  externalCatalog.renameTable(db, oldTableName, newTableName)
+if (db == globalTempDB) {
+  globalTempViews.rename(oldTableName, newTableName)
--- End diff --

If not, it seems we do not need to support rename for global temp views either?
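The body of `GlobalTempViewManager.rename` is not quoted in this thread, so the following is only a guess at its shape, in the same style as the quoted `create`/`update` methods: an atomic move of the definition that fails if the target name is already taken.

```scala
import scala.collection.mutable

// Hedged sketch of a thread-safe rename for a global temp view manager
// (hypothetical; not the PR's actual implementation). Returns false when no
// view with the old name exists, throws when the new name is taken.
class GlobalTempViews {
  private val viewDefinitions = mutable.HashMap[String, String]() // name -> plan

  def create(name: String, plan: String): Unit = synchronized { viewDefinitions(name) = plan }
  def get(name: String): Option[String] = synchronized { viewDefinitions.get(name) }

  def rename(oldName: String, newName: String): Boolean = synchronized {
    if (viewDefinitions.contains(oldName)) {
      if (viewDefinitions.contains(newName)) {
        throw new IllegalStateException(s"View '$newName' already exists")
      }
      viewDefinitions.put(newName, viewDefinitions.remove(oldName).get)
      true
    } else false
  }
}
```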





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80396066
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/GlobalTempViewSuite.scala
 ---
@@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+import org.apache.spark.sql.{AnalysisException, QueryTest, Row}
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
+import org.apache.spark.sql.test.SharedSQLContext
+import org.apache.spark.sql.types.StructType
+
+class GlobalTempViewSuite extends QueryTest with SharedSQLContext {
+  import testImplicits._
+
+  override protected def beforeAll(): Unit = {
+super.beforeAll()
+globalTempDB = spark.sharedState.globalTempDB
+  }
+
+  private var globalTempDB: String = _
+
+  test("basic semantic") {
+sql("CREATE GLOBAL TEMP VIEW src AS SELECT 1, 'a'")
+
+// If there is no database in table name, we should try local temp view first, if not found,
+// try table/view in current database, which is "default" in this case. So we expect
+// NoSuchTableException here.
+intercept[NoSuchTableException](spark.table("src"))
+
+// Use qualified name to refer to the global temp view explicitly.
+checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+// Table name without database will never refer to a global temp view.
+intercept[NoSuchTableException](sql("DROP VIEW src"))
+
+sql(s"DROP VIEW $globalTempDB.src")
+// The global temp view should be dropped successfully.
+intercept[NoSuchTableException](spark.table(s"$globalTempDB.src"))
+
+// We can also use Dataset API to create global temp view
+Seq(1 -> "a").toDF("i", "j").createGlobalTempView("src")
+checkAnswer(spark.table(s"$globalTempDB.src"), Row(1, "a"))
+
+// Use qualified name to rename a global temp view.
+sql(s"ALTER VIEW $globalTempDB.src RENAME TO src2")
+intercept[NoSuchTableException](spark.table(s"$globalTempDB.src"))
+checkAnswer(spark.table(s"$globalTempDB.src2"), Row(1, "a"))
+
+// Use qualified name to alter a global temp view.
+sql(s"ALTER VIEW $globalTempDB.src2 AS SELECT 2, 'b'")
+checkAnswer(spark.table(s"$globalTempDB.src2"), Row(2, "b"))
+
+// We can also use Catalog API to drop global temp view
+spark.catalog.dropGlobalTempView("src2")
+intercept[NoSuchTableException](spark.table(s"$globalTempDB.src2"))
+  }
+
+  test("global temp view database should be preserved") {
+val e = intercept[AnalysisException](sql(s"CREATE DATABASE $globalTempDB"))
+assert(e.message.contains("system preserved database"))
+
+val e2 = intercept[AnalysisException](sql(s"USE $globalTempDB"))
+assert(e2.message.contains("system preserved database"))
+  }
+
+  test("CREATE TABLE LIKE should work for global temp view") {
+try {
+  sql("CREATE GLOBAL TEMP VIEW src AS SELECT 1 AS a, '2' AS b")
+  sql(s"CREATE TABLE cloned LIKE ${globalTempDB}.src")
+  val tableMeta = spark.sessionState.catalog.getTableMetadata(TableIdentifier("cloned"))
+  assert(tableMeta.schema == new StructType().add("a", "int", false).add("b", "string", false))
+} finally {
+  spark.catalog.dropGlobalTempView("src")
+  sql("DROP TABLE default.cloned")
+}
+  }
+
+  test("list global temp views") {
+try {
+  sql("CREATE GLOBAL TEMP VIEW v1 AS SELECT 3, 4")
+  sql("CREATE TEMP VIEW v2 AS SELECT 1, 2")
+
+  checkAnswer(sql(s"SHOW TABLES IN $globalTempDB"),
+Row(globalTempDB, "v1", true) ::
+Row("", "v2", true) :: Nil)
+
+  assert(spark.catalog.listTables(globalTempDB).collect().toSeq.map(_.name) == Seq("v1", "v2"))
+} finally {
+   

[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80395510
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/GlobalTempViewManager.scala
 ---
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.catalog
+
+import javax.annotation.concurrent.GuardedBy
+
+import scala.collection.mutable
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.analysis.TempTableAlreadyExistsException
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.catalyst.util.StringUtils
+
+
+/**
+ * A thread-safe manager for global temporary views, providing atomic 
operations to manage them,
+ * e.g. create, update, remove, etc.
+ *
+ * Note that, the view name is always case-sensitive here, callers are 
responsible to format the
+ * view name w.r.t. case-sensitive config.
+ */
+class GlobalTempViewManager {
+
+  /** List of view definitions, mapping from view name to logical plan. */
+  @GuardedBy("this")
+  private val viewDefinitions = new mutable.HashMap[String, LogicalPlan]
+
+  def get(name: String): Option[LogicalPlan] = synchronized {
+viewDefinitions.get(name)
+  }
+
+  def create(
+  name: String,
+  viewDefinition: LogicalPlan,
+  overrideIfExists: Boolean): Unit = synchronized {
+if (!overrideIfExists && viewDefinitions.contains(name)) {
+  throw new TempTableAlreadyExistsException(name)
+}
+viewDefinitions.put(name, viewDefinition)
+  }
+
+  def update(
+  name: String,
+  viewDefinition: LogicalPlan): Boolean = synchronized {
+// Only update it when a view with the given name exists.
+if (viewDefinitions.contains(name)) {
+  viewDefinitions.put(name, viewDefinition)
+  true
+} else {
+  false
+}
+  }
--- End diff --

When do we use `update` and when do we use `create`?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80395895
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -197,6 +201,45 @@ case class CreateViewCommand(
   }
 }
 
+
+/**
+ * Create or replace a local/global temporary view with given data source.
+ */
+case class CreateTempViewUsing(
+tableIdent: TableIdentifier,
+userSpecifiedSchema: Option[StructType],
+replace: Boolean,
+global: Boolean,
+provider: String,
+options: Map[String, String]) extends RunnableCommand {
+
+  if (tableIdent.database.isDefined) {
+throw new AnalysisException(
+  s"Temporary view '$tableIdent' should not have specified a database")
+  }
+
+  def run(sparkSession: SparkSession): Seq[Row] = {
+val dataSource = DataSource(
+  sparkSession,
+  userSpecifiedSchema = userSpecifiedSchema,
+  className = provider,
+  options = options)
+
+val catalog = sparkSession.sessionState.catalog
+val viewDefinition = Dataset.ofRows(
+  sparkSession, LogicalRelation(dataSource.resolveRelation())).logicalPlan
+
+if (global) {
+  catalog.createGlobalTempView(tableIdent.table, viewDefinition, replace)
+} else {
+  catalog.createTempView(tableIdent.table, viewDefinition, replace)
+}
+
+Seq.empty[Row]
+  }
+}
--- End diff --

Is this command moved from somewhere? If so, is it possible to move it in 
a follow-up PR?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80393533
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -459,7 +459,8 @@ class Analyzer(
   case u: UnresolvedRelation =>
 val table = u.tableIdentifier
 if (table.database.isDefined && conf.runSQLonFile &&
-(!catalog.databaseExists(table.database.get) || 
!catalog.tableExists(table))) {
+(!(catalog.databaseExists(table.database.get) || 
catalog.isTemporaryTable(table)) ||
+!catalog.tableExists(table))) {
--- End diff --

Do we need to update the comment?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-23 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80280321
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -424,17 +496,24 @@ class SessionCatalog(
   purge: Boolean): Unit = synchronized {
 val db = formatDatabaseName(name.database.getOrElse(currentDb))
 val table = formatTableName(name.table)
-if (name.database.isDefined || !tempTables.contains(table)) {
-  requireDbExists(db)
-  // When ignoreIfNotExists is false, no exception is issued when the 
table does not exist.
-  // Instead, log it as an error message.
-  if (tableExists(TableIdentifier(table, Option(db {
-externalCatalog.dropTable(db, table, ignoreIfNotExists = true, 
purge = purge)
-  } else if (!ignoreIfNotExists) {
-throw new NoSuchTableException(db = db, table = table)
+if (db == globalTempDB) {
+  val viewExists = globalTempViews.remove(table)
+  if (!viewExists && !ignoreIfNotExists) {
+throw new NoSuchTableException(globalTempDB, table)
   }
 } else {
-  tempTables.remove(table)
+  if (name.database.isDefined || !tempTables.contains(table)) {
+requireDbExists(db)
+// When ignoreIfNotExists is false, no lexception is issued when 
the table does not exist.
--- End diff --

Nit: "exception" misspelt.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-23 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80280674
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -453,7 +532,11 @@ class SessionCatalog(
   val db = formatDatabaseName(name.database.getOrElse(currentDb))
   val table = formatTableName(name.table)
   val relationAlias = alias.getOrElse(table)
-  if (name.database.isDefined || !tempTables.contains(table)) {
+  if (db == globalTempDB) {
+globalTempViews.get(table).map { viewDef =>
+  SubqueryAlias(relationAlias, viewDef, Some(name))
+}.getOrElse(throw new NoSuchTableException(db, table))
+  } else if (name.database.isDefined || !tempTables.contains(table)) {
--- End diff --

We should probably update the ScalaDoc to mention the lookup order.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-23 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80282210
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -60,20 +90,21 @@ case class CreateViewCommand(
 child: LogicalPlan,
 allowExisting: Boolean,
 replace: Boolean,
-isTemporary: Boolean)
+viewType: ViewType)
   extends RunnableCommand {
 
   override protected def innerChildren: Seq[QueryPlan[_]] = Seq(child)
 
-  if (!isTemporary) {
-require(originalText.isDefined,
-  "The table to created with CREATE VIEW must have 'originalText'.")
+  if (viewType == PermanentView) {
+require(originalText.isDefined, "'originalText' must be provided to 
create permanent view")
   }
 
   if (allowExisting && replace) {
 throw new AnalysisException("CREATE VIEW with both IF NOT EXISTS and 
REPLACE is not allowed.")
   }
 
+  private def isTemporary = viewType == LocalTempView || viewType == 
GlobalTempView
+
--- End diff --

Having an `isTemporary` method in `ViewType` might be more convenient and 
would help keep things consistent.
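One way to realize this suggestion is to make `isTemporary` a member of the `ViewType` hierarchy itself. A minimal sketch (this is not the code that was merged into Spark, just an illustration of the shape):

```scala
// Sketch: push `isTemporary` into the sealed ViewType hierarchy so every
// call site reads `viewType.isTemporary` instead of re-deriving it.
sealed trait ViewType {
  def isTemporary: Boolean
}

case object LocalTempView extends ViewType {
  val isTemporary = true // session-scoped
}

case object GlobalTempView extends ViewType {
  val isTemporary = true // application-scoped
}

case object PersistedView extends ViewType {
  val isTemporary = false // stored in the external catalog
}
```

With this, `CreateViewCommand` no longer needs its own private `isTemporary` helper.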





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-23 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80080673
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -47,6 +50,8 @@ object SessionCatalog {
  */
 class SessionCatalog(
 externalCatalog: ExternalCatalog,
+globalTempDB: String,
+globalTempViews: GlobalTempViewManager,
--- End diff --

Would it be better if we move `globalTempDB` into `GlobalTempViewManager` 
as a string field to enforce consistency? Essentially, `GlobalTempViewManager` 
is just the global temporary view database.
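The suggested refactoring could look roughly like the following sketch. The class and field names here are hypothetical stand-ins, not the actual Spark implementation:

```scala
import scala.collection.mutable

// Sketch: the manager carries its own database name, so SessionCatalog
// takes a single argument and the two values cannot drift apart.
class GlobalTempViewManagerSketch(val database: String) {
  // name -> view definition (a String stands in for LogicalPlan here)
  private val views = mutable.HashMap.empty[String, String]

  def create(name: String, viewDefinition: String): Unit = views.synchronized {
    views.put(name, viewDefinition)
  }

  def get(name: String): Option[String] = views.synchronized {
    views.get(name)
  }
}
```

A `SessionCatalog` constructed with only the manager would then read `manager.database` wherever it previously needed the separate `globalTempDB` string.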





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-23 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80279937
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -371,16 +431,24 @@ class SessionCatalog(
*/
   def getTempViewOrPermanentTableMetadata(name: TableIdentifier): 
CatalogTable = synchronized {
--- End diff --

So `getTableMetadataOption` is only responsible for retrieving metadata of 
tables from an external catalog? The method names are a little bit confusing.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-23 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80080872
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -142,8 +149,12 @@ class SessionCatalog(
   // 

 
   def createDatabase(dbDefinition: CatalogDatabase, ignoreIfExists: 
Boolean): Unit = {
-val qualifiedPath = 
makeQualifiedPath(dbDefinition.locationUri).toString
 val dbName = formatDatabaseName(dbDefinition.name)
+if (dbName == globalTempDB) {
--- End diff --

Does case sensitivity matter here?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-23 Thread liancheng
Github user liancheng commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80281859
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -19,13 +19,46 @@ package org.apache.spark.sql.execution.command
 
 import scala.util.control.NonFatal
 
-import org.apache.spark.sql.{AnalysisException, Row, SparkSession}
+import org.apache.spark.sql.{AnalysisException, Dataset, Row, SparkSession}
 import org.apache.spark.sql.catalyst.{SQLBuilder, TableIdentifier}
 import org.apache.spark.sql.catalyst.catalog.{CatalogStorageFormat, 
CatalogTable, CatalogTableType}
 import org.apache.spark.sql.catalyst.expressions.Alias
 import org.apache.spark.sql.catalyst.plans.QueryPlan
 import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project}
-import org.apache.spark.sql.types.StructType
+import org.apache.spark.sql.execution.datasources.{DataSource, 
LogicalRelation}
+import org.apache.spark.sql.types.{MetadataBuilder, StructType}
+
+
+/**
+ * ViewType is used to specify the expected view type when we want to 
create or replace a view in
+ * [[CreateViewCommand]].
+ */
+sealed trait ViewType
+
+/**
+ * LocalTempView means session-scoped local temporary views. Its lifetime 
is the lifetime of the
+ * session that created it, i.e. it will be automatically dropped when the 
session terminates. It's
+ * not tied to any databases, i.e. we can't use `db1.view1` to reference a 
local temporary view.
+ */
+object LocalTempView extends ViewType
+
+/**
+ * GlobalTempView means cross-session global temporary views. Its lifetime 
is the lifetime of the
+ * Spark application, i.e. it will be automatically dropped when the 
application terminates. It's
+ * tied to a system preserved database `_global_temp`, and we must use the 
qualified name to refer a
+ * global temp view, e.g. SELECT * FROM _global_temp.view1.
+ */
+object GlobalTempView extends ViewType
+
+/**
+ * PermanentView means cross-session permanent views. Permanent views stay 
until they are
+ * explicitly dropped by user command. It's always tied to a database, 
default to the current
+ * database if not specified.
+ *
+ * Note that, Existing permanent view with the same name are not visible 
to the current session
+ * while the local temporary view exists, unless the view name is 
qualified by database.
+ */
+object PermanentView extends ViewType
--- End diff --

Is `PersistedView` a more conventional name?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80198629
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -188,6 +199,10 @@ class SessionCatalog(
 
   def setCurrentDatabase(db: String): Unit = {
 val dbName = formatDatabaseName(db)
+if (dbName == globalTempDB) {
--- End diff --

good catch!





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-23 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80198287
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -36,6 +36,9 @@ import org.apache.spark.sql.catalyst.util.StringUtils
 
 object SessionCatalog {
   val DEFAULT_DATABASE = "default"
+
+  val GLOBAL_TEMP_DB_CONF_KEY = "spark.sql.database.globalTemp"
--- End diff --

good idea





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-22 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80183259
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -188,6 +199,10 @@ class SessionCatalog(
 
   def setCurrentDatabase(db: String): Unit = {
 val dbName = formatDatabaseName(db)
+if (dbName == globalTempDB) {
--- End diff --

When `globalTempDB` is set to a name that is not in lower case, this 
comparison is not correct. Thus, `formatDatabaseName` needs to be applied to 
both sides. 
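The point can be illustrated with a small sketch. Here `formatDatabaseName` is a stand-in that lower-cases the name, mirroring Spark's default case-insensitive handling; the helper name `isGlobalTempDB` is hypothetical:

```scala
// Normalize BOTH sides before comparing database names; comparing the raw
// user input against a possibly mixed-case configured value would miss
// matches that differ only in case.
def formatDatabaseName(name: String): String = name.toLowerCase

def isGlobalTempDB(db: String, globalTempDB: String): Boolean =
  formatDatabaseName(db) == formatDatabaseName(globalTempDB)
```

Without normalizing the right-hand side, a configured value like `Global_Temp` would never match.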





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-22 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r80180856
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -36,6 +36,9 @@ import org.apache.spark.sql.catalyst.util.StringUtils
 
 object SessionCatalog {
   val DEFAULT_DATABASE = "default"
+
+  val GLOBAL_TEMP_DB_CONF_KEY = "spark.sql.database.globalTemp"
--- End diff --

Should we follow `spark.sql.catalogImplementation` and define it 
[here](https://github.com/apache/spark/blob/2cd1bfa4f0c6625b0ab1dbeba2b9586b9a6a9f42/core/src/main/scala/org/apache/spark/internal/config/package.scala#L95-L99)?






[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-07 Thread skambha
Github user skambha commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r77879554
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 ---
@@ -451,7 +464,7 @@ class SessionCatalog(
 if (isTemporaryTable(name)) {
   true
 } else {
-  externalCatalog.tableExists(db, table)
+  globalTempViews.get(table).isDefined || 
externalCatalog.tableExists(db, table)
--- End diff --

I think `tableExists` should follow the same semantics as `lookupRelation`, 
i.e. if the database in the `TableIdentifier` is defined, then table existence 
should be checked against that database and table.

The code comment is correct, but the code does not enforce it in 
`tableExists`. 

The table/view resolution in `lookupRelation` is: 
- if the database is defined, fetch the table/view from it; 
- if the database is not defined, the following checks happen in this 
particular order:  
   a) check whether a local temp view with that name exists;
   b) check whether a global temp view with that name exists; 
   c) if not, check whether there is a table/view in the current db. 

Currently, `tableExists(db1.t1)` returns true if a global temp view `t1` 
exists even when there is no `db1.t1` in the external catalog. 
   
In the `tableExists` method, if the database is defined we should check 
against the external catalog and not the global temp views, and enhance it to 
follow the resolution semantics of `lookupRelation`. 
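The resolution order described above can be sketched as follows. This is a simplified model of the lookup, not the actual `SessionCatalog` code; `String` stands in for `LogicalPlan` and all names are illustrative:

```scala
import scala.collection.mutable

// Sketch of the lookup order: local temp view, then global temp view,
// then the current database -- unless a database is explicitly given.
class ResolutionSketch(globalTempDB: String) {
  val localTempViews = mutable.Map.empty[String, String]             // name -> plan
  val globalTempViews = mutable.Map.empty[String, String]            // name -> plan
  val externalCatalog = mutable.Map.empty[(String, String), String]  // (db, name) -> plan

  def lookup(db: Option[String], table: String, currentDb: String): Option[String] =
    db match {
      // Qualified with the global temp database: only global temp views.
      case Some(d) if d == globalTempDB => globalTempViews.get(table)
      // Qualified with any other database: only the external catalog.
      case Some(d) => externalCatalog.get((d, table))
      // Unqualified: local temp, then global temp, then current db.
      case None =>
        localTempViews.get(table)
          .orElse(globalTempViews.get(table))
          .orElse(externalCatalog.get((currentDb, table)))
    }
}
```

Under these semantics, a qualified lookup such as `db1.t1` never consults the global temp views, which is exactly the behavior the comment asks `tableExists` to match.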





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-02 Thread srinathshankar
Github user srinathshankar commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r77424940
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -222,10 +245,19 @@ case class AlterViewAsCommand(
 qe.assertAnalyzed()
 val analyzedPlan = qe.analyzed
 
-if (session.sessionState.catalog.isTemporaryTable(name)) {
-  session.sessionState.catalog.createTempView(name.table, 
analyzedPlan, overrideIfExists = true)
-} else {
+if (name.database.isDefined) {
   alterPermanentView(session, analyzedPlan)
+} else if (session.sessionState.catalog.isTemporaryTable(name)) {
--- End diff --

Shouldn't we check the current database for the view name? Or do we assume 
that the database is resolved into the `TableIdentifier`?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-02 Thread srinathshankar
Github user srinathshankar commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r77424851
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -222,10 +245,19 @@ case class AlterViewAsCommand(
 qe.assertAnalyzed()
 val analyzedPlan = qe.analyzed
 
-if (session.sessionState.catalog.isTemporaryTable(name)) {
-  session.sessionState.catalog.createTempView(name.table, 
analyzedPlan, overrideIfExists = true)
-} else {
+if (name.database.isDefined) {
   alterPermanentView(session, analyzedPlan)
+} else if (session.sessionState.catalog.isTemporaryTable(name)) {
+  session.sessionState.catalog.createTempView(
--- End diff --

Not part of this patch, but why is it OK for an `ALTER VIEW` to work on a 
table?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-01 Thread clockfly
Github user clockfly commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r77265693
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -255,3 +287,43 @@ case class AlterViewAsCommand(
 session.sessionState.catalog.alterTable(updatedViewMeta)
   }
 }
+
+/**
+ * Drops a view which matches the given name.
+ *
+ * If the view name contains database prefix, this command will drop a 
permanent view matching the
+ * given name, or throw an exception if view not exist. Else, this command 
will try to drop a local
+ * temporary view first, if view not exist, try global temporary view 
next, if still not exist, try
--- End diff --

I think it is a bit risky to give users the ability to drop a global view 
implicitly. 

I think all global view modification commands should be done explicitly, 
including create, drop, and alter.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-01 Thread clockfly
Github user clockfly commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r77265560
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -198,10 +219,12 @@ case class CreateViewCommand(
 }
 
 /**
- * Alter a view with given query plan. If the view name contains database 
prefix, this command will
- * alter a permanent view matching the given name, or throw an exception 
if view not exist. Else,
- * this command will try to alter a temporary view first, if view not 
exist, try permanent view
- * next, if still not exist, throw an exception.
+ * Alter a view with given query plan.
+ *
+ * If the view name contains database prefix, this command will alter a 
permanent view matching the
--- End diff --

We may want a separate command for altering global views.

It is not safe for a user to be able to alter a global view directly. 





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-01 Thread clockfly
Github user clockfly commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r77264967
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -25,7 +25,41 @@ import 
org.apache.spark.sql.catalyst.catalog.{CatalogStorageFormat, CatalogTable
 import org.apache.spark.sql.catalyst.expressions.Alias
 import org.apache.spark.sql.catalyst.plans.QueryPlan
 import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project}
-import org.apache.spark.sql.types.StructType
+import org.apache.spark.sql.types.MetadataBuilder
+
+
+/**
+ * ViewType is used to specify the expected view type when we want to 
create or replace a view in
+ * [[CreateViewCommand]].
+ */
+sealed trait ViewType
+
+/**
+ * LocalTempView means session-scoped local temporary views. Its lifetime 
is the lifetime of the
+ * session that created it, i.e. it will be automatically dropped when the 
session terminates. It's
+ * not tied to any databases, i.e. we can't use `db1.view1` to reference a 
local temporary view.
+ */
+object LocalTempView extends ViewType
+
+/**
+ * GlobalTempView means cross-session global temporary views. Its lifetime 
is the lifetime of the
+ * Spark application, i.e. it will be automatically dropped when the 
application terminates. It's
+ * not tied to any databases, i.e. we can't use `db1.view1` to reference a 
global temporary view.
+ *
+ * Note that, Existing global temp view with the same name are not visible 
to the current session
--- End diff --

If a local temporary view with the same name exists, then the global 
temporary view is not visible.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-01 Thread clockfly
Github user clockfly commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r77264028
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/TempViewManager.scala
 ---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.catalog
+
+import scala.collection.mutable
+
+import 
org.apache.spark.sql.catalyst.analysis.TempTableAlreadyExistsException
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+
+
+/**
+ * A manager for a list of temp views, providing the interface to 
get/create/update/drop temp views.
+ */
+class TempViewManager {
+  /** List of temporary views, mapping from view name to logical plan. */
+  private val tempViews = new mutable.HashMap[String, LogicalPlan]
+
+  def get(name: String): Option[LogicalPlan] = tempViews.synchronized {
--- End diff --

This lock is very coarse-grained.

Should we use a `ConcurrentHashMap`?
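A `ConcurrentHashMap`-backed variant might look like the sketch below. This is an assumed alternative design, not code from the PR; `Plan` is a type parameter standing in for `LogicalPlan`:

```scala
import java.util.concurrent.ConcurrentHashMap

// Sketch: per-entry concurrency via ConcurrentHashMap instead of a single
// lock around a mutable.HashMap.
class ConcurrentTempViewSketch[Plan] {
  private val views = new ConcurrentHashMap[String, Plan]()

  def get(name: String): Option[Plan] = Option(views.get(name))

  // putIfAbsent returns null when no previous mapping existed, so a null
  // result means the create succeeded; non-null means "already exists".
  def create(name: String, plan: Plan, overrideIfExists: Boolean): Boolean =
    if (overrideIfExists) { views.put(name, plan); true }
    else views.putIfAbsent(name, plan) == null

  def remove(name: String): Boolean = views.remove(name) != null
}
```

The trade-off: `ConcurrentHashMap` avoids contention on reads, but a single coarse lock is simpler to reason about when compound operations (e.g. rename = remove + put) must appear atomic.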





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-09-01 Thread clockfly
Github user clockfly commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r77263886
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/TempViewManager.scala
 ---
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.catalog
+
+import scala.collection.mutable
+
+import 
org.apache.spark.sql.catalyst.analysis.TempTableAlreadyExistsException
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+
+
+/**
+ * A manager for a list of temp views, providing the interface to 
get/create/update/drop temp views.
+ */
+class TempViewManager {
--- End diff --

Will the `TempViewManager` be used by local views?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-08-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r77098400
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -255,3 +287,43 @@ case class AlterViewAsCommand(
 session.sessionState.catalog.alterTable(updatedViewMeta)
   }
 }
+
+/**
+ * Drops a view which matches the given name.
+ *
+ * If the view name contains a database prefix, this command will drop a permanent view matching
+ * the given name, or throw an exception if the view does not exist. Otherwise, this command will
+ * try to drop a local temporary view first; if it does not exist, try the global temporary view
+ * next; if that still does not exist, try
--- End diff --

If that is a design choice, never mind.
I think it's okay as long as this PR provides `DROP GLOBAL VIEW` as well.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-08-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r77097547
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -255,3 +287,43 @@ case class AlterViewAsCommand(
 session.sessionState.catalog.alterTable(updatedViewMeta)
   }
 }
+
+/**
+ * Drops a view which matches the given name.
+ *
+ * If the view name contains a database prefix, this command will drop a permanent view matching
+ * the given name, or throw an exception if the view does not exist. Otherwise, this command will
+ * try to drop a local temporary view first; if it does not exist, try the global temporary view
+ * next; if that still does not exist, try
--- End diff --

Yes, right. Up to now, everything happens in the same session, so a temporary view is always 
created after the permanent view. It works like a stack.

However, this is a global view made by another session, so it does not work like a stack. Even 
in the same session, if we execute `CREATE TEMPORARY VIEW` and then `CREATE GLOBAL VIEW`, we 
must delete the local temporary view first. So it seems more serious to me.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-08-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r77097177
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -255,3 +287,43 @@ case class AlterViewAsCommand(
 session.sessionState.catalog.alterTable(updatedViewMeta)
   }
 }
+
+/**
+ * Drops a view which matches the given name.
+ *
+ * If the view name contains a database prefix, this command will drop a permanent view matching
+ * the given name, or throw an exception if the view does not exist. Otherwise, this command will
+ * try to drop a local temporary view first; if it does not exist, try the global temporary view
+ * next; if that still does not exist, try
--- End diff --

see document here: 
https://github.com/apache/spark/pull/14897/files#diff-cfec39cf8accabd227bd325f0a0a5f30R49

Yea, there is no way to drop a global temp view if a temp view with the same 
name exists. We can have some discussion about it.





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-08-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r77008695
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -255,3 +287,43 @@ case class AlterViewAsCommand(
 session.sessionState.catalog.alterTable(updatedViewMeta)
   }
 }
+
+/**
+ * Drops a view which matches the given name.
+ *
+ * If the view name contains a database prefix, this command will drop a permanent view matching
+ * the given name, or throw an exception if the view does not exist. Otherwise, this command will
+ * try to drop a local temporary view first; if it does not exist, try the global temporary view
+ * next; if that still does not exist, try
--- End diff --

Could you add `DROP GLOBAL VIEW` like `CREATE GLOBAL TEMPORARY VIEW`?





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-08-31 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14897#discussion_r77007610
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ---
@@ -255,3 +287,43 @@ case class AlterViewAsCommand(
 session.sessionState.catalog.alterTable(updatedViewMeta)
   }
 }
+
+/**
+ * Drops a view which matches the given name.
+ *
+ * If the view name contains database prefix, this command will drop a 
permanent view matching the
+ * given name, or throw an exception if view not exist. Else, this command 
will try to drop a local
+ * temporary view first, if view not exist, try global temporary view 
next, if still not exist, try
--- End diff --

Hi, @cloud-fan .

Then, is there no way to drop a global temporary view when a local temporary view with the 
same name exists?

e.g. If I created the following:
1. `default.view1` (permanent view in the `default` schema)
2. `view1` (local temporary view)

And then a global view is created by another session:
3. `view1` (global temporary view)

In my session, is there any way to drop 3 (the global view `view1`) without 
dropping 2 (my local view `view1`)?
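
The shadowing at issue can be sketched with the resolution order from the doc comment; the stores and method names below are illustrative stand-ins for Spark's catalog, not its real API:

```scala
import scala.collection.mutable

// Illustrative model of the DROP VIEW resolution order: a qualified name goes
// straight to the permanent catalog; a bare name tries local temp views first,
// then global temp views, then permanent views in the current database.
object DropViewResolution {
  val localTemp  = mutable.Set.empty[String]
  val globalTemp = mutable.Set.empty[String]
  val permanent  = mutable.Set.empty[String]   // keys like "default.view1"

  /** Returns which store the view was dropped from, if any. */
  def dropView(name: String, currentDb: String = "default"): Option[String] = {
    if (name.contains(".")) {
      if (permanent.remove(name)) Some("permanent") else None
    } else if (localTemp.remove(name)) {
      Some("local temp")
    } else if (globalTemp.remove(name)) {
      Some("global temp")
    } else if (permanent.remove(s"$currentDb.$name")) {
      Some("permanent")
    } else None
  }
}
```

Under this order, a bare `DROP VIEW view1` can never reach the global temp view while a local `view1` exists, which is exactly the concern raised here.
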





[GitHub] spark pull request #14897: [SPARK-17338][SQL] add global temp view

2016-08-31 Thread cloud-fan
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/14897

[SPARK-17338][SQL] add global temp view

## What changes were proposed in this pull request?

A global temporary view is a cross-session temporary view, which means it's 
shared among all sessions. Its lifetime is the lifetime of the Spark 
application, i.e. it will be automatically dropped when the application 
terminates. It's not tied to any database, i.e. we can't use `db1.view1` to 
reference a global temporary view.
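
The session scoping described above can be sketched in plain Scala: each session owns its local temp view store, while one shared store backs global temp views for the whole application. `Session` and its methods are illustrative stand-ins, not Spark's actual API:

```scala
import scala.collection.mutable

// Illustrative model of session scoping: local temp views live in a
// per-session map, global temp views in one map shared by all sessions.
// A String stands in for the view's logical plan.
class Session(globalViews: mutable.Map[String, String]) {
  private val localViews = mutable.Map.empty[String, String]

  def createTempView(name: String, plan: String): Unit = localViews(name) = plan
  def createGlobalTempView(name: String, plan: String): Unit = globalViews(name) = plan

  def lookupLocal(name: String): Option[String] = localViews.get(name)
  def lookupGlobal(name: String): Option[String] = globalViews.get(name)
}
```

A global view created by one session is visible to every other session, while a local view stays invisible outside the session that created it.
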

## How was this patch tested?

New tests in `SQLViewSuite`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark global-temp-view

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14897.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14897


commit 2b97fba6eb754d8d3b24026167749559e5dfe517
Author: Wenchen Fan 
Date:   2016-08-31T14:46:19Z

add global temp view



