GitHub user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18866#discussion_r131654292
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -534,4 +534,29 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
           }
         }
       }
    +
    +  test("Insert data into hive bucketized table.") {
    +    sql("""
    +     |CREATE TABLE bucketizedTable (key int, value string)
    +     |CLUSTERED BY (key) SORTED BY (key ASC) into 4 buckets
    +     |ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    +     |""".stripMargin)
    +    val identifier = spark.sessionState.sqlParser.parseTableIdentifier("bucketizedTable")
    +    val data = spark.sparkContext.parallelize((0 until 100)
    +      .map(i => TestData(i, i.toString))).toDF()
    +    data.write.mode(SaveMode.Overwrite).insertInto("bucketizedTable")
    +    val dir = spark.sessionState.catalog.defaultTablePath(identifier)
    +    val bucketFiles = new File(dir).listFiles().sortWith((a: File, b: File) => {
    +      a.getName < b.getName
    +    }).filter(file => file.getName.startsWith("part-"))
    +    assert(bucketFiles.length === 4)
    +    (0 to 3).foreach { bucket =>
    +      spark.read.format("text")
    +        .load(bucketFiles(bucket).getAbsolutePath)
    +        .collect().map(_.getString(0).split("\t")(0).toInt)
    +        .zip((bucket to (100, 4))).foreach { case (a, b) =>
    +        assert(a === b)
    --- End diff --
    
    This looks obscure. To verify it, I need to calculate the HiveHash values
    for `0 until 100` myself. Maybe we should compute the actual Hive hash value
    for each key here, instead of hard-coding `bucket to (100, 4)`.
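
    A rough sketch of what computing the expected keys could look like, so the
    test does not rely on the reader working out the hashes by hand. It assumes
    that Hive hashes an int column to the int value itself and that the bucket
    id is `(hash & Int.MaxValue) % numBuckets`; both are assumptions to check
    against the actual HiveHash implementation. `HiveBucketSketch` and its
    methods are hypothetical names, not existing Spark or Hive APIs.

```scala
// Sketch only: mirrors the assumed Hive bucketing rule without depending on Spark or Hive.
object HiveBucketSketch {
  // Assumption: Hive hashes an int column to the int value itself.
  def hiveHashInt(key: Int): Int = key

  // Assumption: the bucket id is the non-negative hash modulo the bucket count.
  def bucketId(key: Int, numBuckets: Int): Int =
    (hiveHashInt(key) & Int.MaxValue) % numBuckets

  // Keys expected in each bucket, ascending, since the table is SORTED BY (key ASC).
  def expectedKeysByBucket(keys: Seq[Int], numBuckets: Int): Map[Int, Seq[Int]] =
    keys.groupBy(bucketId(_, numBuckets)).map { case (b, ks) => b -> ks.sorted }
}
```

    The test could then compare the keys read from each bucket file against
    `HiveBucketSketch.expectedKeysByBucket(0 until 100, 4)(bucket)`, which makes
    the link between the key and its bucket explicit in the test itself.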


