[GitHub] [spark] dongjoon-hyun commented on a change in pull request #34489: [SPARK-37210][SQL] Write to static partition in dynamic write mode

2021-11-30 Thread GitBox


dongjoon-hyun commented on a change in pull request #34489:
URL: https://github.com/apache/spark/pull/34489#discussion_r759919225



##
File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveParquetSuite.scala
##
@@ -168,4 +170,23 @@ class HiveParquetSuite extends QueryTest with ParquetTest with TestHiveSingleton
   }
 }
   }
+
+  test("SPARK-37210: Concurrent write to different static partitions") {

Review comment:
   Do we have a JIRA issue for that? If not, could you file one?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] dongjoon-hyun commented on a change in pull request #34489: [SPARK-37210][SQL] Write to static partition in dynamic write mode

2021-11-30 Thread GitBox


dongjoon-hyun commented on a change in pull request #34489:
URL: https://github.com/apache/spark/pull/34489#discussion_r759917788



##
File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveParquetSuite.scala
##
@@ -168,4 +172,30 @@ class HiveParquetSuite extends QueryTest with ParquetTest with TestHiveSingleton
   }
 }
   }
+
+  test("SPARK-37210: Concurrent write to different static partitions") {
+    System.setProperty("spark.sql.test.master", "local[10]")
+    import testImplicits._
+    withSQLConf(SQLConf.LEAF_NODE_DEFAULT_PARALLELISM.key -> "1",
+      SQLConf.MAX_RECORDS_PER_FILE.key -> "1000") {
+      withTable("t") {
+        (1 to 1).map(i => (i, i))
+          .toDF("_1", "_2").createOrReplaceTempView("s")

Review comment:
   If possible, a one-liner would be better.
   ```scala
   (1 to 1).map(i => (i, i)).toDF("_1", "_2").createOrReplaceTempView("s")
   ```







[GitHub] [spark] dongjoon-hyun commented on a change in pull request #34489: [SPARK-37210][SQL] Write to static partition in dynamic write mode

2021-11-30 Thread GitBox


dongjoon-hyun commented on a change in pull request #34489:
URL: https://github.com/apache/spark/pull/34489#discussion_r759917283



##
File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/test/TestHiveSingleton.scala
##
@@ -27,9 +27,9 @@ import org.apache.spark.sql.hive.client.HiveClient
 
 trait TestHiveSingleton extends SparkFunSuite with BeforeAndAfterAll {
   override protected val enableAutoThreadAudit = false
-  protected val spark: SparkSession = TestHive.sparkSession
-  protected val hiveContext: TestHiveContext = TestHive
-  protected val hiveClient: HiveClient =
+  protected lazy val spark: SparkSession = TestHive.sparkSession
+  protected lazy val hiveContext: TestHiveContext = TestHive
+  protected lazy val hiveClient: HiveClient =

Review comment:
   Could you add some comment to explain why these three `lazy` are required, please?
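A comment along these lines might address the request. This is only a sketch; the stated rationale is an assumption inferred from the test's use of `System.setProperty("spark.sql.test.master", ...)` before the session is first touched:

```scala
// SPARK-37210: `lazy` lets a concrete suite adjust JVM-wide settings
// (e.g. System.setProperty("spark.sql.test.master", "local[10]")) in its
// test body before the shared TestHive session is materialized. With eager
// vals, the session would be initialized as soon as the trait is mixed in,
// too early for per-suite configuration.
protected lazy val spark: SparkSession = TestHive.sparkSession
protected lazy val hiveContext: TestHiveContext = TestHive
```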








[GitHub] [spark] dongjoon-hyun commented on a change in pull request #34489: [SPARK-37210][SQL] Write to static partition in dynamic write mode

2021-11-11 Thread GitBox


dongjoon-hyun commented on a change in pull request #34489:
URL: https://github.com/apache/spark/pull/34489#discussion_r747812090



##
File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveParquetSuite.scala
##
@@ -168,4 +170,23 @@ class HiveParquetSuite extends QueryTest with ParquetTest with TestHiveSingleton
   }
 }
   }
+
+  test("SPARK-37210: Concurrent write to different static partitions") {
+    withTable("t") {
+      sql("CREATE TABLE t (c1 int, c2 int) PARTITIONED BY (c3 int, c4 int) STORED AS PARQUET")
+      val selectStatement = "SELECT 1 AS c1, 2 AS c2"
+      val tasks: Seq[Callable[Unit]] = (1 to 4).map(i => {
+        new Callable[Unit] {
+          override def call(): Unit = {
+            sql(s"INSERT OVERWRITE TABLE t PARTITION(c3=3$i, c4=4$i) $selectStatement")
+          }
+        }
+      })
+      import scala.collection.JavaConverters._

Review comment:
   Please move this to the import section at the beginning of this file.
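Concretely, the suggestion is to hoist the local import into the file's import header rather than importing mid-method. A sketch (the neighboring import lines are illustrative, not the file's actual header):

```scala
// At the top of HiveParquetSuite.scala, grouped with the existing imports
// in the usual order: java, scala, then third-party/project packages.
import java.util.concurrent.Callable

import scala.collection.JavaConverters._

import org.apache.spark.sql.QueryTest
```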







[GitHub] [spark] dongjoon-hyun commented on a change in pull request #34489: [SPARK-37210][SQL] Write to static partition in dynamic write mode

2021-11-11 Thread GitBox


dongjoon-hyun commented on a change in pull request #34489:
URL: https://github.com/apache/spark/pull/34489#discussion_r747811433



##
File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveParquetSuite.scala
##
@@ -168,4 +170,23 @@ class HiveParquetSuite extends QueryTest with ParquetTest with TestHiveSingleton
   }
 }
   }
+
+  test("SPARK-37210: Concurrent write to different static partitions") {
+    withTable("t") {
+      sql("CREATE TABLE t (c1 int, c2 int) PARTITIONED BY (c3 int, c4 int) STORED AS PARQUET")
+      val selectStatement = "SELECT 1 AS c1, 2 AS c2"
+      val tasks: Seq[Callable[Unit]] = (1 to 4).map(i => {
+        new Callable[Unit] {
+          override def call(): Unit = {
+            sql(s"INSERT OVERWRITE TABLE t PARTITION(c3=3$i, c4=4$i) $selectStatement")
Review comment:
   Shall we simplify this by using `VALUES` instead of `$selectStatement`?
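For example, the statement could be inlined with a `VALUES` clause, removing the `selectStatement` indirection entirely (a sketch of the suggestion, not necessarily the exact wording the PR adopted):

```scala
// VALUES supplies the row literals directly, so no separate SELECT string
// needs to be interpolated into each INSERT.
sql(s"INSERT OVERWRITE TABLE t PARTITION(c3=3$i, c4=4$i) VALUES (1, 2)")
```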







[GitHub] [spark] dongjoon-hyun commented on a change in pull request #34489: [SPARK-37210][SQL] Write to static partition in dynamic write mode

2021-11-11 Thread GitBox


dongjoon-hyun commented on a change in pull request #34489:
URL: https://github.com/apache/spark/pull/34489#discussion_r747810048



##
File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveParquetSuite.scala
##
@@ -168,4 +170,23 @@ class HiveParquetSuite extends QueryTest with ParquetTest with TestHiveSingleton
   }
 }
   }
+
+  test("SPARK-37210: Concurrent write to different static partitions") {

Review comment:
   This PR aims to support concurrent writes for `INSERT OVERWRITE` only. Do we also support the non-overwrite `INSERT` case?
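For reference, the two statement forms in question. The non-overwrite variant appends to the partition instead of replacing its contents, so it would need its own concurrency coverage if it is in scope (partition values here mirror the test's, purely for illustration):

```scala
// Covered by this test: INSERT OVERWRITE replaces the static partition's data.
sql("INSERT OVERWRITE TABLE t PARTITION(c3=31, c4=41) SELECT 1 AS c1, 2 AS c2")
// Non-overwrite variant: INSERT INTO appends rows to the partition; worth a
// test if concurrent appends are also expected to be safe.
sql("INSERT INTO TABLE t PARTITION(c3=31, c4=41) SELECT 1 AS c1, 2 AS c2")
```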



