[
https://issues.apache.org/jira/browse/SPARK-22967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316161#comment-16316161
]
wuyi commented on SPARK-22967:
------------------------------
I understand what you mean now, and Hive works well after I tried this. But another weird problem arises.
The tmp dir is created at the beginning of test B (mentioned above):
{code:java}
protected def withTempDir(f: File => Unit): Unit = {
  val dir = Utils.createTempDir().getCanonicalFile
  try f(dir) finally Utils.deleteRecursively(dir)
}
{code}
It is then deleted in the finally clause.
Test B runs against the Hive versions below, sequentially:
{code:java}
private val versions =
  Seq("0.12", "0.13", "0.14", "1.0", "1.1", "1.2", "2.0", "2.1")
{code}
Every version deletes the tmp dir successfully except version 0.12.
When I try to delete the tmp file manually, Windows warns me that the file may be open in another program. It seems that an open stream is occupying the file.
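If an open stream really is the culprit, the Windows behavior can be demonstrated standalone (a hypothetical sketch, not code from the suite; on Windows a file with an open handle cannot be deleted, unlike on Linux):
{code:java}
import java.io.{File, FileInputStream}

val f = File.createTempFile("spark-test", ".tmp")
val in = new FileInputStream(f) // simulate a leaked, unclosed handle
println(f.delete())             // on Windows: false, the file is still locked
in.close()
println(f.delete())             // true once the handle is released
{code}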
However, the tmp file can be deleted once the test for another version starts running.
I also tried swapping the order of 0.12 and 0.13, but the result remains the same.
That really confuses me. Maybe there's something incompatible with version 0.12.
> VersionsSuite fails on Windows, caused by unescapeSQLString()
> ------------------------------------------------------------
>
> Key: SPARK-22967
> URL: https://issues.apache.org/jira/browse/SPARK-22967
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.1
> Environment: Windows 7
> Reporter: wuyi
> Priority: Minor
> Labels: build, test, windows
>
> On Windows, two unit test cases fail while running VersionsSuite ("A simple
> set of tests that call the methods of a `HiveClient`, loading different
> versions of Hive from maven central.")
> Failed A: test(s"$version: read avro file containing decimal")
> {code:java}
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
> {code}
> Failed B: test(s"$version: SPARK-17920: Insert into/overwrite avro table")
> {code:java}
> Unable to infer the schema. The schema specification is required to create the table `default`.`tab2`.;
> org.apache.spark.sql.AnalysisException: Unable to infer the schema. The schema specification is required to create the table `default`.`tab2`.;
> {code}
> As I dug into this problem, I found it is related to
> ParserUtils#unescapeSQLString().
> These are the first two lines of failed test A:
> {code:java}
> val url = Thread.currentThread().getContextClassLoader.getResource("avroDecimal")
> val location = new File(url.getFile)
> {code}
> In my environment, `location` (the path value) is:
> {code:java}
> D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal
> {code}
> Then, in SparkSqlParser#visitCreateHiveTable() at L1128:
> {code:java}
> val location = Option(ctx.locationSpec).map(visitLocationSpec)
> {code}
> This line first retrieves the LocationSpecContext's content, which equals
> `location` above.
> The content is then passed to visitLocationSpec(), and finally to
> unescapeSQLString().
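> For reference, the call chain in Spark 2.2 looks roughly like this (quoted from memory from AstBuilder and ParserUtils, so treat it as a sketch):
> {code:java}
> override def visitLocationSpec(ctx: LocationSpecContext): String = withOrigin(ctx) {
>   // Delegates to ParserUtils.string, which unescapes the STRING token:
>   string(ctx.STRING)
> }
>
> // In ParserUtils:
> def string(node: TerminalNode): String = unescapeSQLString(node.getText)
> {code}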
> Let's have a look at unescapeSQLString():
> {code:java}
> /** Unescape backslash-escaped string enclosed by quotes. */
> def unescapeSQLString(b: String): String = {
>   var enclosure: Character = null
>   val sb = new StringBuilder(b.length())
>
>   def appendEscapedChar(n: Char) {
>     n match {
>       case '0' => sb.append('\u0000')
>       case '\'' => sb.append('\'')
>       case '"' => sb.append('\"')
>       case 'b' => sb.append('\b')
>       case 'n' => sb.append('\n')
>       case 'r' => sb.append('\r')
>       case 't' => sb.append('\t')
>       case 'Z' => sb.append('\u001A')
>       case '\\' => sb.append('\\')
>       // The following 2 lines are exactly what MySQL does TODO: why do we do this?
>       case '%' => sb.append("\\%")
>       case '_' => sb.append("\\_")
>       case _ => sb.append(n)
>     }
>   }
>
>   var i = 0
>   val strLength = b.length
>   while (i < strLength) {
>     val currentChar = b.charAt(i)
>     if (enclosure == null) {
>       if (currentChar == '\'' || currentChar == '\"') {
>         enclosure = currentChar
>       }
>     } else if (enclosure == currentChar) {
>       enclosure = null
>     } else if (currentChar == '\\') {
>       if ((i + 6 < strLength) && b.charAt(i + 1) == 'u') {
>         // \u0000 style character literals.
>         val base = i + 2
>         val code = (0 until 4).foldLeft(0) { (mid, j) =>
>           val digit = Character.digit(b.charAt(j + base), 16)
>           (mid << 4) + digit
>         }
>         sb.append(code.asInstanceOf[Char])
>         i += 5
>       } else if (i + 4 < strLength) {
>         // \000 style character literals.
>         val i1 = b.charAt(i + 1)
>         val i2 = b.charAt(i + 2)
>         val i3 = b.charAt(i + 3)
>         if ((i1 >= '0' && i1 <= '1') && (i2 >= '0' && i2 <= '7') && (i3 >= '0' && i3 <= '7')) {
>           val tmp = ((i3 - '0') + ((i2 - '0') << 3) + ((i1 - '0') << 6)).asInstanceOf[Char]
>           sb.append(tmp)
>           i += 3
>         } else {
>           appendEscapedChar(i1)
>           i += 1
>         }
>       } else if (i + 2 < strLength) {
>         // escaped character literals.
>         val n = b.charAt(i + 1)
>         appendEscapedChar(n)
>         i += 1
>       }
>     } else {
>       // non-escaped character literals.
>       sb.append(currentChar)
>     }
>     i += 1
>   }
>   sb.toString()
> }
> {code}
> Here, variable `b` holds the same value as the content and `location` above:
> {code:java}
> D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal
> {code}
> From unescapeSQLString()'s logic, we can see that it transforms the
> two-character string "\t" into the single escape character '\t' and removes
> all other backslashes.
> So, after unescapeSQLString() completes, our originally correct location
> becomes:
> {code:java}
> D:workspaceIdeaProjectssparksqlhive\targetscala-2.11\test-classesavroDecimal
> {code}
> Note that [ \t ] here is no longer a two-character string, but a single
> escape character.
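> A minimal sketch that reproduces the mangling (assuming the spark-catalyst test classpath; the path literal is only an illustration). Note that the STRING token reaches unescapeSQLString() with its quotes still attached, which is what arms the unescaping loop:
> {code:java}
> import org.apache.spark.sql.catalyst.parser.ParserUtils
>
> // Triple quotes keep the backslashes literal, as in the original SQL text:
> val raw = """'D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal'"""
> println(ParserUtils.unescapeSQLString(raw))
> // prints: D:workspaceIdeaProjectssparksqlhive\targetscala-2.11\test-classesavroDecimal
> // (the two remaining "\t" above stand for real tab characters)
> {code}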
> Then, back in SparkSqlParser#visitCreateHiveTable(), at L1134:
> {code:java}
> val locUri = location.map(CatalogUtils.stringToURI(_))
> {code}
> `location` is passed to stringToURI(), which yields:
> {code:java}
> file:/D:workspaceIdeaProjectssparksqlhive%09argetscala-2.11%09est-classesavroDecimal
> {code}
> because the escape character '\t' is percent-encoded as '%09' in the URI.
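> The same step can be seen in isolation (a sketch; "\t" in the double-quoted string literal compiles to a real tab, matching the mangled value):
> {code:java}
> import org.apache.spark.sql.catalyst.catalog.CatalogUtils
>
> // The mangled path with embedded tabs, as produced by unescapeSQLString above:
> val mangled = "D:workspaceIdeaProjectssparksqlhive\targetscala-2.11\test-classesavroDecimal"
> // Hadoop's Path percent-encodes the tabs when converting to a URI.
> // On my Windows machine this printed the URI shown above:
> println(CatalogUtils.stringToURI(mangled))
> // file:/D:workspaceIdeaProjectssparksqlhive%09argetscala-2.11%09est-classesavroDecimal
> {code}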
> Although I'm not clear on exactly how this wrong path causes that exception
> (I know almost nothing about Hive), I can verify that the wrong path is the
> real cause.
> When I append these lines (to correct the wrong path) after
> HiveExternalCatalog#doCreateTable(), lines 236-240:
> {code:java}
> if (tableLocation.get.getPath.startsWith("/D")) {
>   tableLocation = Some(CatalogUtils.stringToURI(
>     "file:/D:/workspace/IdeaProjects/spark/sql/hive/target/scala-2.11/test-classes/avroDecimal"))
> }
> {code}
>
> then failing test A passes, though test B still fails.
> Below is the stack trace of the exception:
> {code:java}
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string)
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:602)
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply$mcV$sp(HiveClientImpl.scala:469)
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:467)
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:467)
>   at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:273)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:210)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:209)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:256)
>   at org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:467)
>   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply$mcV$sp(HiveExternalCatalog.scala:263)
>   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
>   at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
>   at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
>   at org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:216)
>   at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:119)
>   at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:304)
>   at org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:128)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:186)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:186)
>   at org.apache.spark.sql.Dataset$$anonfun$51.apply(Dataset.scala:3196)
>   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3195)
>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:71)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:638)
>   at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
>   at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24$$anonfun$apply$mcV$sp$3.apply$mcV$sp(VersionsSuite.scala:829)
>   at org.apache.spark.sql.hive.client.VersionsSuite.withTable(VersionsSuite.scala:70)
>   at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply$mcV$sp(VersionsSuite.scala:828)
>   at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply(VersionsSuite.scala:805)
>   at org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply(VersionsSuite.scala:805)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68)
>   at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:183)
>   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196)
>   at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
>   at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:196)
>   at org.scalatest.FunSuite.runTest(FunSuite.scala:1560)
>   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
>   at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
>   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396)
>   at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
>   at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:379)
>   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
>   at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:229)
>   at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
>   at org.scalatest.Suite$class.run(Suite.scala:1147)
>   at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
>   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
>   at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
>   at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
>   at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:233)
>   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:31)
>   at org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:213)
>   at org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
>   at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:31)
>   at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
>   at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1340)
>   at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1334)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1334)
>   at org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1011)
>   at org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1010)
>   at org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1500)
>   at org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:1010)
>   at org.scalatest.tools.Runner$.run(Runner.scala:850)
>   at org.scalatest.tools.Runner.run(Runner.scala)
>   at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2(ScalaTestRunner.java:138)
>   at org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:28)
> Caused by: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string)
>   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1121)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
>   at com.sun.proxy.$Proxy31.create_table_with_environment_context(Unknown Source)
>   at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:482)
>   at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:471)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
>   at com.sun.proxy.$Proxy32.createTable(Unknown Source)
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:596)
>   ... 78 more
> Caused by: java.lang.IllegalArgumentException: Can not create a Path from an empty string
>   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)
>   at org.apache.hadoop.fs.Path.<init>(Path.java:184)
>   at org.apache.hadoop.fs.Path.getParent(Path.java:357)
>   at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:427)
>   at org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:690)
>   at org.apache.hadoop.hive.metastore.Warehouse.mkdirs(Warehouse.java:194)
>   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1059)
>   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1107)
>   ... 93 more
> {code}
> As for test B, I didn't inspect it carefully, but I found the same wrong
> path as in test A, so I guess both exceptions were caused by the same factor.
>
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]