[ https://issues.apache.org/jira/browse/SPARK-22967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316278#comment-16316278 ]
Hyukjin Kwon commented on SPARK-22967:
--------------------------------------
Hm .. I haven't taken a close look at it yet, but can we ignore it in that
case somehow? If that's difficult or impossible, I think we can just skip the
test on Windows, as it's going to fail anyway. I have done this several times;
for example, you can refer to https://github.com/apache/spark/pull/16999.
I know it takes a while to investigate whether other tests fail on Windows,
but it would be really nice if we could identify some more test cases that
fail on Windows and fix them in a batch.
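For reference, a minimal sketch of what such a skip can look like (the suite
and test names here are hypothetical; the guard itself is ScalaTest's assume()
plus org.apache.spark.util.Utils.isWindows):

{code:java}
import org.apache.spark.SparkFunSuite
import org.apache.spark.util.Utils

// Hypothetical suite; only the guard pattern matters.
class WindowsSkipExampleSuite extends SparkFunSuite {
  test("read avro file containing decimal") {
    // assume() cancels (rather than fails) the test when the condition
    // does not hold, so the suite still passes on Windows.
    assume(!Utils.isWindows)
    // ... the Windows-sensitive test body would go here ...
  }
}
{code}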
> VersionsSuite failed on Windows caused by unescapeSQLString()
> ------------------------------------------------------------
>
> Key: SPARK-22967
> URL: https://issues.apache.org/jira/browse/SPARK-22967
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.1
> Environment: Windows 7
> Reporter: wuyi
> Priority: Minor
> Labels: build, test, windows
>
> On Windows, two unit test cases fail while running VersionsSuite ("A simple
> set of tests that call the methods of a `HiveClient`, loading different
> versions of Hive from Maven Central."):
> Failed test A: test(s"$version: read avro file containing decimal")
> {code:java}
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
> {code}
> Failed test B: test(s"$version: SPARK-17920: Insert into/overwrite avro table")
> {code:java}
> Unable to infer the schema. The schema specification is required to create the table `default`.`tab2`.;
> org.apache.spark.sql.AnalysisException: Unable to infer the schema. The schema specification is required to create the table `default`.`tab2`.;
> {code}
> As I dug into this problem, I found it is related to
> ParserUtils#unescapeSQLString().
> These are the two lines at the beginning of failed test A:
> {code:java}
> val url = Thread.currentThread().getContextClassLoader.getResource("avroDecimal")
> val location = new File(url.getFile)
> {code}
> In my environment, `location` (the path value) is
> {code:java}
> D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal
> {code}
> And then, in SparkSqlParser#visitCreateHiveTable()#L1128:
> {code:java}
> val location = Option(ctx.locationSpec).map(visitLocationSpec)
> {code}
> This line first extracts the LocationSpecContext's content, which is equal
> to `location` above. The content is then passed to visitLocationSpec(), and
> finally to unescapeSQLString().
> Let's have a look at unescapeSQLString():
> {code:java}
> /** Unescape backslash-escaped string enclosed by quotes. */
> def unescapeSQLString(b: String): String = {
>   var enclosure: Character = null
>   val sb = new StringBuilder(b.length())
>
>   def appendEscapedChar(n: Char) {
>     n match {
>       case '0' => sb.append('\u0000')
>       case '\'' => sb.append('\'')
>       case '"' => sb.append('\"')
>       case 'b' => sb.append('\b')
>       case 'n' => sb.append('\n')
>       case 'r' => sb.append('\r')
>       case 't' => sb.append('\t')
>       case 'Z' => sb.append('\u001A')
>       case '\\' => sb.append('\\')
>       // The following 2 lines are exactly what MySQL does TODO: why do we do this?
>       case '%' => sb.append("\\%")
>       case '_' => sb.append("\\_")
>       case _ => sb.append(n)
>     }
>   }
>
>   var i = 0
>   val strLength = b.length
>   while (i < strLength) {
>     val currentChar = b.charAt(i)
>     if (enclosure == null) {
>       if (currentChar == '\'' || currentChar == '\"') {
>         enclosure = currentChar
>       }
>     } else if (enclosure == currentChar) {
>       enclosure = null
>     } else if (currentChar == '\\') {
>       if ((i + 6 < strLength) && b.charAt(i + 1) == 'u') {
>         // \u0000 style character literals.
>         val base = i + 2
>         val code = (0 until 4).foldLeft(0) { (mid, j) =>
>           val digit = Character.digit(b.charAt(j + base), 16)
>           (mid << 4) + digit
>         }
>         sb.append(code.asInstanceOf[Char])
>         i += 5
>       } else if (i + 4 < strLength) {
>         // \000 style character literals.
>         val i1 = b.charAt(i + 1)
>         val i2 = b.charAt(i + 2)
>         val i3 = b.charAt(i + 3)
>
>         if ((i1 >= '0' && i1 <= '1') && (i2 >= '0' && i2 <= '7') && (i3 >= '0' && i3 <= '7')) {
>           val tmp = ((i3 - '0') + ((i2 - '0') << 3) + ((i1 - '0') << 6)).asInstanceOf[Char]
>           sb.append(tmp)
>           i += 3
>         } else {
>           appendEscapedChar(i1)
>           i += 1
>         }
>       } else if (i + 2 < strLength) {
>         // escaped character literals.
>         val n = b.charAt(i + 1)
>         appendEscapedChar(n)
>         i += 1
>       }
>     } else {
>       // non-escaped character literals.
>       sb.append(currentChar)
>     }
>     i += 1
>   }
>   sb.toString()
> }
> {code}
> Again, here, variable `b` is equal to the content above, i.e. `location`, whose value is
> {code:java}
> D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal
> {code}
> From unescapeSQLString()'s strategy, we can see that it transforms the
> two-character string "\t" into the escape character '\t' and removes all
> other backslashes.
> So, our originally correct location turns into:
> {code:java}
> D:workspaceIdeaProjectssparksqlhive\targetscala-2.11\test-classesavroDecimal
> {code}
> after unescapeSQLString() completes.
> Note that here [ \t ] is no longer a two-character string but a single tab
> escape character.
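> This mangling can be reproduced in isolation; here is a minimal sketch (the
> object name is just for illustration, and it assumes the parser hands
> unescapeSQLString() the string token including its surrounding quotes, since
> the function only appends characters once it has seen an enclosing quote):
> {code:java}
> import org.apache.spark.sql.catalyst.parser.ParserUtils
>
> object UnescapeRepro {
>   def main(args: Array[String]): Unit = {
>     // The Windows path as a quoted SQL string token; "\\" in Scala source
>     // is a single backslash in the actual string.
>     val token = "'D:\\workspace\\IdeaProjects\\spark\\sql\\hive\\target" +
>       "\\scala-2.11\\test-classes\\avroDecimal'"
>     // Prints the mangled path: "\t" becomes a real tab and every other
>     // backslash is silently dropped.
>     println(ParserUtils.unescapeSQLString(token))
>   }
> }
> {code}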
> Then, returning to SparkSqlParser#visitCreateHiveTable(), move on to L1134:
> {code:java}
> val locUri = location.map(CatalogUtils.stringToURI(_))
> {code}
> `location` is passed to stringToURI(), resulting in:
> {code:java}
> file:/D:workspaceIdeaProjectssparksqlhive%09argetscala-2.11%09est-classesavroDecimal
> {code}
> because the escape character '\t' is percent-encoded as '%09' in the URI.
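> This step can also be reproduced on its own; a minimal sketch (again, the
> object name is illustrative; CatalogUtils.stringToURI() is the method the
> parser calls above):
> {code:java}
> import org.apache.spark.sql.catalyst.catalog.CatalogUtils
>
> object StringToUriRepro {
>   def main(args: Array[String]): Unit = {
>     // The mangled path from the previous step; each \t below is a real tab
>     // character in the string.
>     val mangled = "D:workspaceIdeaProjectssparksqlhive\targetscala-2.11" +
>       "\test-classesavroDecimal"
>     // On the reporter's Windows machine this yields the broken URI above,
>     // with the tabs percent-encoded as %09.
>     println(CatalogUtils.stringToURI(mangled))
>   }
> }
> {code}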
> Although I'm not clear on how this wrong path directly causes that
> exception, since I know almost nothing about Hive, I can verify that this
> wrong path is the real cause of the exception.
> When I append these lines (to fix the wrong path) after
> HiveExternalCatalog#doCreateTable() Line 236-240:
> {code:java}
> if (tableLocation.get.getPath.startsWith("/D")) {
>   tableLocation = Some(CatalogUtils.stringToURI(
>     "file:/D:/workspace/IdeaProjects/spark/sql/hive/target/scala-2.11/test-classes/avroDecimal"))
> }
> {code}
>
> then failed test A passes (test B still fails).
> Below is the stack trace of the exception:
> {code:java}
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string)
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:602)
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply$mcV$sp(HiveClientImpl.scala:469)
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:467)
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:467)
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:273)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:210)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:209)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:256)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:467)
>   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply$mcV$sp(HiveExternalCatalog.scala:263)
>   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
>   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
>   at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
>   at org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:216)
>   at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:119)
>   at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:304)
>   at org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:128)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:186)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:186)
>   at org.apache.spark.sql.Dataset$$anonfun$51.apply(Dataset.scala:3196)
>   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3195)
>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:71)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:638)
>   at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
>   at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24$$anonfun$apply$mcV$sp$3.apply$mcV$sp(VersionsSuite.scala:829)
>   at org.apache.spark.sql.hive.client.VersionsSuite.withTable(VersionsSuite.scala:70)
>   at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply$mcV$sp(VersionsSuite.scala:828)
>   at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply(VersionsSuite.scala:805)
>   at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply(VersionsSuite.scala:805)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68)
>   at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:183)
>   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196)
>   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
>   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:196)
>   at org.scalatest.FunSuite.runTest(FunSuite.scala:1560)
>   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
>   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
>   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396)
>   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
>   at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:379)
>   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
>   at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:229)
>   at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
>   at org.scalatest.Suite$class.run(Suite.scala:1147)
>   at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
>   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
>   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
>   at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
>   at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:233)
>   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:31)
>   at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:213)
>   at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
>   at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:31)
>   at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
>   at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1340)
>   at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1334)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1334)
>   at org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1011)
>   at org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1010)
>   at org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1500)
>   at org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:1010)
>   at org.scalatest.tools.Runner$.run(Runner.scala:850)
>   at org.scalatest.tools.Runner.run(Runner.scala)
>   at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2(ScalaTestRunner.java:138)
>   at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:28)
> Caused by: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string)
>   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1121)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
>   at com.sun.proxy.$Proxy31.create_table_with_environment_context(Unknown Source)
>   at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:482)
>   at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:471)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
>   at com.sun.proxy.$Proxy32.createTable(Unknown Source)
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:596)
>   ... 78 more
> Caused by: java.lang.IllegalArgumentException: Can not create a Path from an empty string
>   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:184)
>   at org.apache.hadoop.fs.Path.getParent(Path.java:357)
>   at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:427)
>   at org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:690)
>   at org.apache.hadoop.hive.metastore.Warehouse.mkdirs(Warehouse.java:194)
>   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1059)
>   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1107)
>   ... 93 more
> {code}
> As for test B, I didn't do a careful inspection, but I found the same wrong
> path as in test A, so I guess both exceptions were caused by the same factor.
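> One untested idea for avoiding the mangling in the test itself: build the
> LOCATION string from the resource URI, which uses forward slashes even on
> Windows, so unescapeSQLString() has no backslashes to strip (a sketch only;
> I have not verified it against the suite):
> {code:java}
> // Untested sketch: URL#toURI yields a path like
> // /D:/workspace/IdeaProjects/spark/sql/hive/target/scala-2.11/test-classes/avroDecimal
> // whose forward slashes pass through unescapeSQLString() unchanged.
> val url = Thread.currentThread().getContextClassLoader.getResource("avroDecimal")
> val location = url.toURI.getPath
> {code}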
>