[GitHub] spark pull request #16135: [SPARK-18700][SQL] Add StripedLock for each table...

gatorsmile Thu, 05 Jan 2017 15:50:42 -0800

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16135#discussion_r94877047
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/PartitionedTablePerfStatsSuite.scala ---
    @@ -352,4 +353,34 @@ class PartitionedTablePerfStatsSuite
           }
         }
       }
    +
    +  test("SPARK-18700: table loaded only once even when resolved concurrently") {
    +    withSQLConf(SQLConf.HIVE_MANAGE_FILESOURCE_PARTITIONS.key -> "false") {
    +      withTable("test") {
    +        withTempDir { dir =>
    +          HiveCatalogMetrics.reset()
    +          setupPartitionedHiveTable("test", dir, 50)
    +          // select the table in multi-threads
    +          val executorPool = Executors.newFixedThreadPool(10)
    +          (1 to 10).map(threadId => {
    +            val runnable = new Runnable {
    +              override def run(): Unit = {
    +                spark.sql("select * from test where partCol1 = 999").count()
    +              }
    +            }
    +            executorPool.execute(runnable)
    +            None
    +          })
    +          executorPool.shutdown()
    +          executorPool.awaitTermination(30, TimeUnit.SECONDS)
    +          // check the cache hit, we use the metric of METRIC_FILES_DISCOVERED and
    +          // METRIC_PARALLEL_LISTING_JOB_COUNT to check this, while the lock take effect,
    +          // only one thread can really do the build, so the listing job count is 2, the other
    +          // one is cache.load func. Also METRIC_FILES_DISCOVERED is $partition_num * 2
    --- End diff --
    
    Working on a fix to avoid the useless filesystem scan caused by the save() API.


