[ https://issues.apache.org/jira/browse/SPARK-22967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316368#comment-16316368 ]

wuyi edited comment on SPARK-22967 at 1/8/18 2:27 PM:
------------------------------------------------------

Maybe we can give a warning rather than an exception, since the main goal of 
the tests has been achieved.
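
A rough sketch of what I mean (hypothetical names: pathIsValid() stands in for 
whatever check currently throws; Utils.isWindows and logWarning do exist in Spark):
{code:java}
import org.apache.spark.util.Utils

// Hypothetical sketch: on Windows, log a warning instead of failing the test.
if (Utils.isWindows) {
  logWarning(s"Skipping path check on Windows: $path")
} else {
  assert(pathIsValid(path), s"Invalid path: $path")
}
{code}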

I will check it again tomorrow...
 
I also know of another case (ChildProcAppHandleSuite) that may be related to 
Windows. I can post it once I have time.


was (Author: ngone51):
Maybe we can give a warning rather than an exception, since the main goal of 
the tests has been achieved.

I will check it again tomorrow...
 

> VersionsSuite failed on Windows caused by unescapeSQLString()
> -------------------------------------------------------------
>
>                 Key: SPARK-22967
>                 URL: https://issues.apache.org/jira/browse/SPARK-22967
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.1
>            Environment: Windows 7
>            Reporter: wuyi
>            Priority: Minor
>              Labels: build, test, windows
>
> On Windows, two unit test cases fail while running VersionsSuite ("A simple 
> set of tests that call the methods of a `HiveClient`, loading different 
> versions of Hive from maven central.")
> Failed test A: test(s"$version: read avro file containing decimal")
> {code:java}
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
> {code}
> Failed test B: test(s"$version: SPARK-17920: Insert into/overwrite avro table")
> {code:java}
> Unable to infer the schema. The schema specification is required to create the table `default`.`tab2`.;
> org.apache.spark.sql.AnalysisException: Unable to infer the schema. The schema specification is required to create the table `default`.`tab2`.;
> {code}
> As I dug into this problem, I found it is related to 
> ParserUtils#unescapeSQLString().
> Here are the two lines at the beginning of failed test A:
> {code:java}
> val url = Thread.currentThread().getContextClassLoader.getResource("avroDecimal")
> val location = new File(url.getFile)
> {code}
> In my environment, `location` (the path value) is
> {code:java}
> D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal
> {code}
> And then, in SparkSqlParser#visitCreateHiveTable()#L1128:
> {code:java}
> val location = Option(ctx.locationSpec).map(visitLocationSpec)
> {code}
> This line first gets the LocationSpecContext's content, which equals the 
> `location` value above. The content is then passed to visitLocationSpec(), 
> and finally to unescapeSQLString().
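> For reference, here is a paraphrase (from memory, so treat it as a sketch) 
> of the relevant code in AstBuilder and ParserUtils:
> {code:java}
> override def visitLocationSpec(ctx: LocationSpecContext): String = withOrigin(ctx) {
>   string(ctx.STRING)
> }
> // ParserUtils.string hands the token text, quotes included, to unescapeSQLString():
> def string(node: TerminalNode): String = unescapeSQLString(node.getText)
> {code}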
> Let's have a look at unescapeSQLString():
> {code:java}
> /** Unescape backslash-escaped string enclosed by quotes. */
>   def unescapeSQLString(b: String): String = {
>     var enclosure: Character = null
>     val sb = new StringBuilder(b.length())
>     def appendEscapedChar(n: Char) {
>       n match {
>         case '0' => sb.append('\u0000')
>         case '\'' => sb.append('\'')
>         case '"' => sb.append('\"')
>         case 'b' => sb.append('\b')
>         case 'n' => sb.append('\n')
>         case 'r' => sb.append('\r')
>         case 't' => sb.append('\t')
>         case 'Z' => sb.append('\u001A')
>         case '\\' => sb.append('\\')
> // The following 2 lines are exactly what MySQL does TODO: why do we do this?
>         case '%' => sb.append("\\%")
>         case '_' => sb.append("\\_")
>         case _ => sb.append(n)
>       }
>     }
>     var i = 0
>     val strLength = b.length
>     while (i < strLength) {
>       val currentChar = b.charAt(i)
>       if (enclosure == null) {
>         if (currentChar == '\'' || currentChar == '\"') {
>           enclosure = currentChar
>         }
>       } else if (enclosure == currentChar) {
>         enclosure = null
>       } else if (currentChar == '\\') {
>         if ((i + 6 < strLength) && b.charAt(i + 1) == 'u') {
>           // \u0000 style character literals.
>           val base = i + 2
>           val code = (0 until 4).foldLeft(0) { (mid, j) =>
>             val digit = Character.digit(b.charAt(j + base), 16)
>             (mid << 4) + digit
>           }
>           sb.append(code.asInstanceOf[Char])
>           i += 5
>         } else if (i + 4 < strLength) {
>           // \000 style character literals.
>           val i1 = b.charAt(i + 1)
>           val i2 = b.charAt(i + 2)
>           val i3 = b.charAt(i + 3)
>           if ((i1 >= '0' && i1 <= '1') && (i2 >= '0' && i2 <= '7') && (i3 >= '0' && i3 <= '7')) {
>             val tmp = ((i3 - '0') + ((i2 - '0') << 3) + ((i1 - '0') << 6)).asInstanceOf[Char]
>             sb.append(tmp)
>             i += 3
>           } else {
>             appendEscapedChar(i1)
>             i += 1
>           }
>         } else if (i + 2 < strLength) {
>           // escaped character literals.
>           val n = b.charAt(i + 1)
>           appendEscapedChar(n)
>           i += 1
>         }
>       } else {
>         // non-escaped character literals.
>         sb.append(currentChar)
>       }
>       i += 1
>     }
>     sb.toString()
>   }
> {code}
> Again, here, variable `b` equals the content (the `location` value above):
> {code:java}
> D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal
> {code}
> From unescapeSQLString()'s handling we can see that it turns the 
> two-character sequence "\t" into the escape character '\t' (a TAB) and 
> strips the backslash from every other escape, keeping just the following character.
> So our original, correct location ends up as:
> {code:java}
> D:workspaceIdeaProjectssparksqlhive\targetscala-2.11\test-classesavroDecimal
> {code}
> after unescapeSQLString() completes.
> Note that here, [ \t ] is no longer a two-character string but an escape (TAB) character. 
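> The mangling can be reproduced in isolation (a sketch; the path is taken 
> from this report, and note that the token text reaching unescapeSQLString() 
> still includes the surrounding quotes):
> {code:java}
> import org.apache.spark.sql.catalyst.parser.ParserUtils
>
> val token =
>   """'D:\workspace\IdeaProjects\spark\sql\hive\target\scala-2.11\test-classes\avroDecimal'"""
> // "\t" becomes a TAB and every other backslash is simply dropped:
> // D:workspaceIdeaProjectssparksqlhive<TAB>argetscala-2.11<TAB>est-classesavroDecimal
> println(ParserUtils.unescapeSQLString(token))
> {code}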
> Then, back in SparkSqlParser#visitCreateHiveTable(), move to L1134:
> {code:java}
> val locUri = location.map(CatalogUtils.stringToURI(_))
> {code}
> `location` is passed to stringToURI(), resulting in:
> {code:java}
> file:/D:workspaceIdeaProjectssparksqlhive%09argetscala-2.11%09est-classesavroDecimal
> {code}
> finally, since the escape character '\t' is percent-encoded as '%09' in the URI.
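> The percent-encoding itself is easy to see in isolation (a sketch using 
> java.net.URI directly, not the exact code path Spark takes through 
> CatalogUtils.stringToURI()):
> {code:java}
> // A TAB (\u0009) is an illegal URI character, so the multi-argument URI
> // constructor quotes it as %09 ("\t" below is a real TAB in the string).
> val mangled = "/D:workspaceIdeaProjectssparksqlhive\targetscala-2.11\test-classesavroDecimal"
> val uri = new java.net.URI("file", null, mangled, null)
> // prints: file:/D:workspaceIdeaProjectssparksqlhive%09argetscala-2.11%09est-classesavroDecimal
> println(uri.toASCIIString)
> {code}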
> Although I'm not clear on exactly how this wrong path directly causes that 
> exception, as I know almost nothing about Hive, I can verify that the wrong 
> path is the real cause.
> When I append these lines (to fix the wrong path) after 
> HiveExternalCatalog#doCreateTable() lines 236-240:
> {code:java}
> if (tableLocation.get.getPath.startsWith("/D")) {
>   tableLocation = Some(CatalogUtils.stringToURI(
>     "file:/D:/workspace/IdeaProjects/spark/sql/hive/target/scala-2.11/test-classes/avroDecimal"))
> }
> {code}
> then failed unit test A passes (test B still fails).
> Below is the stack trace of the exception:
> {code:java}
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:java.lang.IllegalArgumentException: Can not create a 
> Path from an empty string)
>       at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:602)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply$mcV$sp(HiveClientImpl.scala:469)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:467)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:467)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:273)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:210)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:209)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:256)
>       at 
> org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:467)
>       at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply$mcV$sp(HiveExternalCatalog.scala:263)
>       at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
>       at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$doCreateTable$1.apply(HiveExternalCatalog.scala:216)
>       at 
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
>       at 
> org.apache.spark.sql.hive.HiveExternalCatalog.doCreateTable(HiveExternalCatalog.scala:216)
>       at 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalog.createTable(ExternalCatalog.scala:119)
>       at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:304)
>       at 
> org.apache.spark.sql.execution.command.CreateTableCommand.run(tables.scala:128)
>       at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>       at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>       at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>       at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:186)
>       at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:186)
>       at org.apache.spark.sql.Dataset$$anonfun$51.apply(Dataset.scala:3196)
>       at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
>       at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3195)
>       at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
>       at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:71)
>       at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:638)
>       at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
>       at 
> org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24$$anonfun$apply$mcV$sp$3.apply$mcV$sp(VersionsSuite.scala:829)
>       at 
> org.apache.spark.sql.hive.client.VersionsSuite.withTable(VersionsSuite.scala:70)
>       at 
> org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply$mcV$sp(VersionsSuite.scala:828)
>       at 
> org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply(VersionsSuite.scala:805)
>       at 
> org.apache.spark.sql.hive.client.VersionsSuite$$anonfun$6$$anonfun$apply$24.apply(VersionsSuite.scala:805)
>       at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>       at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>       at org.scalatest.Transformer.apply(Transformer.scala:22)
>       at org.scalatest.Transformer.apply(Transformer.scala:20)
>       at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>       at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:68)
>       at 
> org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:183)
>       at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196)
>       at 
> org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:196)
>       at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
>       at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:196)
>       at org.scalatest.FunSuite.runTest(FunSuite.scala:1560)
>       at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
>       at 
> org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:229)
>       at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:396)
>       at 
> org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:384)
>       at scala.collection.immutable.List.foreach(List.scala:381)
>       at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
>       at 
> org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:379)
>       at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
>       at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:229)
>       at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
>       at org.scalatest.Suite$class.run(Suite.scala:1147)
>       at 
> org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
>       at 
> org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
>       at 
> org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:233)
>       at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
>       at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:233)
>       at 
> org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:31)
>       at 
> org.scalatest.BeforeAndAfterAll$class.liftedTree1$1(BeforeAndAfterAll.scala:213)
>       at 
> org.scalatest.BeforeAndAfterAll$class.run(BeforeAndAfterAll.scala:210)
>       at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:31)
>       at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:45)
>       at 
> org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1340)
>       at 
> org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$1.apply(Runner.scala:1334)
>       at scala.collection.immutable.List.foreach(List.scala:381)
>       at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1334)
>       at 
> org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1011)
>       at 
> org.scalatest.tools.Runner$$anonfun$runOptionallyWithPassFailReporter$2.apply(Runner.scala:1010)
>       at 
> org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1500)
>       at 
> org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:1010)
>       at org.scalatest.tools.Runner$.run(Runner.scala:850)
>       at org.scalatest.tools.Runner.run(Runner.scala)
>       at 
> org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.runScalaTest2(ScalaTestRunner.java:138)
>       at 
> org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner.main(ScalaTestRunner.java:28)
> Caused by: MetaException(message:java.lang.IllegalArgumentException: Can not 
> create a Path from an empty string)
>       at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1121)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
>       at com.sun.proxy.$Proxy31.create_table_with_environment_context(Unknown 
> Source)
>       at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:482)
>       at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:471)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
>       at com.sun.proxy.$Proxy32.createTable(Unknown Source)
>       at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:596)
>       ... 78 more
> Caused by: java.lang.IllegalArgumentException: Can not create a Path from an 
> empty string
>       at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)
>       at org.apache.hadoop.fs.Path.<init>(Path.java:184)
>       at org.apache.hadoop.fs.Path.getParent(Path.java:357)
>       at 
> org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:427)
>       at 
> org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:690)
>       at org.apache.hadoop.hive.metastore.Warehouse.mkdirs(Warehouse.java:194)
>       at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1059)
>       at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1107)
>       ... 93 more
> {code}
> As for test B, I didn't do a careful inspection, but I found the same wrong 
> path as in test A, so I guess both exceptions are caused by the same factor.


