[GitHub] spark pull request #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest tes...

2018-07-29 Thread gatorsmile
Github user gatorsmile closed the pull request at:

https://github.com/apache/spark/pull/20331


---




[GitHub] spark pull request #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest tes...

2018-03-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20331#discussion_r173071130
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonHadoopFsRelationSuite.scala ---
@@ -110,14 +113,16 @@ class JsonHadoopFsRelationSuite extends HadoopFsRelationTest {
 
   test("invalid json with leading nulls - from file (multiLine=true)") {
     import testImplicits._
-    withTempDir { tempDir =>
-      val path = tempDir.getAbsolutePath
-      Seq(badJson, """{"a":1}""").toDS().write.mode("overwrite").text(path)
-      val expected = s"""$badJson\n{"a":1}\n"""
-      val schema = new StructType().add("a", IntegerType).add("_corrupt_record", StringType)
-      val df =
-        spark.read.format(dataSourceName).option("multiLine", true).schema(schema).load(path)
-      checkAnswer(df, Row(null, expected))
+    withSQLConf(SQLConf.MAX_RECORDS_PER_FILE.key -> "2") {
--- End diff --

I think the default value won't be less than 2; we don't need to be so 
careful...


---




[GitHub] spark pull request #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest tes...

2018-01-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20331#discussion_r163369167
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonHadoopFsRelationSuite.scala ---
@@ -110,14 +113,16 @@ class JsonHadoopFsRelationSuite extends HadoopFsRelationTest {
 
   test("invalid json with leading nulls - from file (multiLine=true)") {
     import testImplicits._
-    withTempDir { tempDir =>
-      val path = tempDir.getAbsolutePath
-      Seq(badJson, """{"a":1}""").toDS().write.mode("overwrite").text(path)
-      val expected = s"""$badJson\n{"a":1}\n"""
-      val schema = new StructType().add("a", IntegerType).add("_corrupt_record", StringType)
-      val df =
-        spark.read.format(dataSourceName).option("multiLine", true).schema(schema).load(path)
-      checkAnswer(df, Row(null, expected))
+    withSQLConf(SQLConf.MAX_RECORDS_PER_FILE.key -> "2") {
--- End diff --

The test will fail if `SQLConf.MAX_RECORDS_PER_FILE` is set to a value less than 2.
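
For context: `spark.sql.files.maxRecordsPerFile` caps how many records the
writer puts into a single output file, and the multiLine JSON reader parses
each file as one document, so both text lines must land in the same file for
the concatenated `expected` string to match. A minimal sketch of the
`withSQLConf` pattern (not the PR's code), assuming a suite that mixes in
`org.apache.spark.sql.test.SQLTestUtils` (which provides `withSQLConf`) with
`spark` and `testImplicits` in scope; the output path is a throwaway example:

    import org.apache.spark.sql.internal.SQLConf
    import testImplicits._

    // Pin the conf for this block only; withSQLConf restores it afterwards.
    withSQLConf(SQLConf.MAX_RECORDS_PER_FILE.key -> "2") {
      // With maxRecordsPerFile = 1 these two records would be split across
      // two files, and a multiLine read would see two documents, not one.
      Seq("""not json""", """{"a":1}""").toDS()
        .write.mode("overwrite").text("/tmp/leading-nulls-example")
    }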


---




[GitHub] spark pull request #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest tes...

2018-01-23 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20331#discussion_r163354261
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonHadoopFsRelationSuite.scala ---
@@ -110,14 +113,16 @@ class JsonHadoopFsRelationSuite extends HadoopFsRelationTest {
 
   test("invalid json with leading nulls - from file (multiLine=true)") {
     import testImplicits._
-    withTempDir { tempDir =>
-      val path = tempDir.getAbsolutePath
-      Seq(badJson, """{"a":1}""").toDS().write.mode("overwrite").text(path)
-      val expected = s"""$badJson\n{"a":1}\n"""
-      val schema = new StructType().add("a", IntegerType).add("_corrupt_record", StringType)
-      val df =
-        spark.read.format(dataSourceName).option("multiLine", true).schema(schema).load(path)
-      checkAnswer(df, Row(null, expected))
+    withSQLConf(SQLConf.MAX_RECORDS_PER_FILE.key -> "2") {
--- End diff --

Just curious, why this change?


---




[GitHub] spark pull request #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest tes...

2018-01-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/20331#discussion_r162838622
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcHadoopFsRelationSuite.scala ---
@@ -82,44 +80,4 @@ class OrcHadoopFsRelationSuite extends HadoopFsRelationTest {
       }
     }
   }
-
-  test("SPARK-13543: Support for specifying compression codec for ORC via option()") {
-    withTempPath { dir =>
-      val path = s"${dir.getCanonicalPath}/table1"
-      val df = (1 to 5).map(i => (i, (i % 2).toString)).toDF("a", "b")
-      df.write
-        .option("compression", "ZlIb")
-        .orc(path)
-
-      // Check if this is compressed as ZLIB.
-      val maybeOrcFile = new File(path).listFiles().find { f =>
-        !f.getName.startsWith("_") && f.getName.endsWith(".zlib.orc")
-      }
-      assert(maybeOrcFile.isDefined)
-      val orcFilePath = maybeOrcFile.get.toPath.toString
-      val expectedCompressionKind =
-        OrcFileOperator.getFileReader(orcFilePath).get.getCompression
-      assert("ZLIB" === expectedCompressionKind.name())
-
-      val copyDf = spark
-        .read
-        .orc(path)
-      checkAnswer(df, copyDf)
-    }
-  }
-
-  test("Default compression codec is snappy for ORC compression") {
-    withTempPath { file =>
-      spark.range(0, 10).write
-        .orc(file.getCanonicalPath)
-      val expectedCompressionKind =
-        OrcFileOperator.getFileReader(file.getCanonicalPath).get.getCompression
--- End diff --

@gatorsmile. This test case should be run against the `native` implementation, 
too; `HiveOrcHadoopFsRelationSuite` covers only the `hive` implementation.
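
For reference: since Spark 2.3 the ORC code path is selected by
`spark.sql.orc.impl` (`native` in sql/core vs `hive` in sql/hive). A hedged
sketch of one way to cover both, with a base suite in sql/core and a sql/hive
subclass that flips the conf; the overrides here are illustrative, not
necessarily what the PR ends up with:

    import org.apache.spark.sql.internal.SQLConf

    // sql/core: run the shared tests against the native ORC reader/writer.
    class OrcHadoopFsRelationSuite extends HadoopFsRelationTest {
      override protected def beforeAll(): Unit = {
        super.beforeAll()
        spark.conf.set(SQLConf.ORC_IMPLEMENTATION.key, "native")
      }
    }

    // sql/hive: rerun the same tests against the Hive ORC code path.
    class HiveOrcHadoopFsRelationSuite extends OrcHadoopFsRelationSuite {
      override protected def beforeAll(): Unit = {
        super.beforeAll()
        spark.conf.set(SQLConf.ORC_IMPLEMENTATION.key, "hive")
      }
    }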


---




[GitHub] spark pull request #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest tes...

2018-01-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/20331#discussion_r162838329
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcHadoopFsRelationSuite.scala ---
@@ -82,44 +80,4 @@ class OrcHadoopFsRelationSuite extends HadoopFsRelationTest {
       }
     }
   }
-
-  test("SPARK-13543: Support for specifying compression codec for ORC via option()") {
-    withTempPath { dir =>
-      val path = s"${dir.getCanonicalPath}/table1"
-      val df = (1 to 5).map(i => (i, (i % 2).toString)).toDF("a", "b")
-      df.write
-        .option("compression", "ZlIb")
-        .orc(path)
-
-      // Check if this is compressed as ZLIB.
-      val maybeOrcFile = new File(path).listFiles().find { f =>
-        !f.getName.startsWith("_") && f.getName.endsWith(".zlib.orc")
-      }
-      assert(maybeOrcFile.isDefined)
-      val orcFilePath = maybeOrcFile.get.toPath.toString
-      val expectedCompressionKind =
-        OrcFileOperator.getFileReader(orcFilePath).get.getCompression
-      assert("ZLIB" === expectedCompressionKind.name())
-
-      val copyDf = spark
-        .read
-        .orc(path)
-      checkAnswer(df, copyDf)
-    }
-  }
-
-  test("Default compression codec is snappy for ORC compression") {
-    withTempPath { file =>
-      spark.range(0, 10).write
-        .orc(file.getCanonicalPath)
-      val expectedCompressionKind =
-        OrcFileOperator.getFileReader(file.getCanonicalPath).get.getCompression
-      assert("SNAPPY" === expectedCompressionKind.name())
-    }
-  }
-}
-
-class HiveOrcHadoopFsRelationSuite extends OrcHadoopFsRelationSuite {
--- End diff --

Thank you!


---




[GitHub] spark pull request #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest tes...

2018-01-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20331#discussion_r162683746
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcHadoopFsRelationSuite.scala ---
@@ -82,44 +80,4 @@ class OrcHadoopFsRelationSuite extends HadoopFsRelationTest {
       }
     }
   }
-
-  test("SPARK-13543: Support for specifying compression codec for ORC via option()") {
-    withTempPath { dir =>
-      val path = s"${dir.getCanonicalPath}/table1"
-      val df = (1 to 5).map(i => (i, (i % 2).toString)).toDF("a", "b")
-      df.write
-        .option("compression", "ZlIb")
-        .orc(path)
-
-      // Check if this is compressed as ZLIB.
-      val maybeOrcFile = new File(path).listFiles().find { f =>
-        !f.getName.startsWith("_") && f.getName.endsWith(".zlib.orc")
-      }
-      assert(maybeOrcFile.isDefined)
-      val orcFilePath = maybeOrcFile.get.toPath.toString
-      val expectedCompressionKind =
-        OrcFileOperator.getFileReader(orcFilePath).get.getCompression
--- End diff --

The same here.


---




[GitHub] spark pull request #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest tes...

2018-01-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20331#discussion_r162683705
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcHadoopFsRelationSuite.scala ---
@@ -82,44 +80,4 @@ class OrcHadoopFsRelationSuite extends HadoopFsRelationTest {
       }
     }
   }
-
-  test("SPARK-13543: Support for specifying compression codec for ORC via option()") {
-    withTempPath { dir =>
-      val path = s"${dir.getCanonicalPath}/table1"
-      val df = (1 to 5).map(i => (i, (i % 2).toString)).toDF("a", "b")
-      df.write
-        .option("compression", "ZlIb")
-        .orc(path)
-
-      // Check if this is compressed as ZLIB.
-      val maybeOrcFile = new File(path).listFiles().find { f =>
-        !f.getName.startsWith("_") && f.getName.endsWith(".zlib.orc")
-      }
-      assert(maybeOrcFile.isDefined)
-      val orcFilePath = maybeOrcFile.get.toPath.toString
-      val expectedCompressionKind =
-        OrcFileOperator.getFileReader(orcFilePath).get.getCompression
-      assert("ZLIB" === expectedCompressionKind.name())
-
-      val copyDf = spark
-        .read
-        .orc(path)
-      checkAnswer(df, copyDf)
-    }
-  }
-
-  test("Default compression codec is snappy for ORC compression") {
-    withTempPath { file =>
-      spark.range(0, 10).write
-        .orc(file.getCanonicalPath)
-      val expectedCompressionKind =
-        OrcFileOperator.getFileReader(file.getCanonicalPath).get.getCompression
--- End diff --

`OrcFileOperator` is defined in `sql/hive`.
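
Since `OrcFileOperator` lives under sql/hive, a suite moved to sql/core cannot
reference it. A sketch of an alternative that stays inside sql/core, assuming
the standalone ORC reader API (`org.apache.orc`), which sql/core already
depends on; the helper name is illustrative:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.orc.OrcFile

    // Read the compression kind straight from the ORC file footer, as a
    // stand-in for the Hive-side OrcFileOperator helper.
    def compressionKindOf(orcFilePath: String): String = {
      val reader = OrcFile.createReader(
        new Path(orcFilePath), OrcFile.readerOptions(new Configuration()))
      reader.getCompressionKind.name() // e.g. "ZLIB" or "SNAPPY"
    }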


---




[GitHub] spark pull request #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest tes...

2018-01-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/20331#discussion_r162683627
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcHadoopFsRelationSuite.scala ---
@@ -82,44 +80,4 @@ class OrcHadoopFsRelationSuite extends HadoopFsRelationTest {
       }
     }
   }
-
-  test("SPARK-13543: Support for specifying compression codec for ORC via option()") {
-    withTempPath { dir =>
-      val path = s"${dir.getCanonicalPath}/table1"
-      val df = (1 to 5).map(i => (i, (i % 2).toString)).toDF("a", "b")
-      df.write
-        .option("compression", "ZlIb")
-        .orc(path)
-
-      // Check if this is compressed as ZLIB.
-      val maybeOrcFile = new File(path).listFiles().find { f =>
-        !f.getName.startsWith("_") && f.getName.endsWith(".zlib.orc")
-      }
-      assert(maybeOrcFile.isDefined)
-      val orcFilePath = maybeOrcFile.get.toPath.toString
-      val expectedCompressionKind =
-        OrcFileOperator.getFileReader(orcFilePath).get.getCompression
-      assert("ZLIB" === expectedCompressionKind.name())
-
-      val copyDf = spark
-        .read
-        .orc(path)
-      checkAnswer(df, copyDf)
-    }
-  }
-
-  test("Default compression codec is snappy for ORC compression") {
-    withTempPath { file =>
-      spark.range(0, 10).write
-        .orc(file.getCanonicalPath)
-      val expectedCompressionKind =
-        OrcFileOperator.getFileReader(file.getCanonicalPath).get.getCompression
-      assert("SNAPPY" === expectedCompressionKind.name())
-    }
-  }
-}
-
-class HiveOrcHadoopFsRelationSuite extends OrcHadoopFsRelationSuite {
--- End diff --

This is Hive only.


---




[GitHub] spark pull request #20331: [SPARK-23158] [SQL] Move HadoopFsRelationTest tes...

2018-01-19 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/20331

[SPARK-23158] [SQL] Move HadoopFsRelationTest test suites from sql/hive to sql/core

## What changes were proposed in this pull request?
The test suites that extend HadoopFsRelationTest are not in sql/hive packages, 
but their source files live under the sql/hive directory. We should move them 
to sql/core.

## How was this patch tested?
The existing tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark moveTests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20331.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20331


commit f7693f0abfe0923868c1918ddcaeaece2c107c5d
Author: gatorsmile 
Date:   2018-01-19T16:57:50Z

fix




---
