zero323 commented on issue #27328: [WIP][SPARK-23435][SPARKR][TESTS] Update 
testthat to >= 2.0.0
URL: https://github.com/apache/spark/pull/27328#issuecomment-577975049
 
 
   I've checked some of the failures, and it is pretty clear that these tests 
must have been silenced somehow.
   
   - `node stack overflow` in `cleanClosure` is an obvious bug and it is 
trivial to reproduce outside tests - I've opened 
[SPARK-30629](https://issues.apache.org/jira/browse/SPARK-30629) to track this 
one.
   - Mismatches in `collect() support Unicode characters` seem to be caused 
by Windows / R encoding quirks. As things stand, `lines` isn't even valid 
JSON input once written to disk:
   
       ```r
       library(SparkR)
       SparkR::sparkR.session()
       Sys.info()
       #           sysname           release           version 
       #         "Windows"      "Server x64"     "build 17763" 
       #          nodename           machine             login 
       # "WIN-5BLT6Q610KH"          "x86-64"   "Administrator" 
       #              user    effective_user 
       #   "Administrator"   "Administrator" 
       
       Sys.getlocale()
       
       # [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
       
       lines <- c("{\"name\":\"안녕하세요\"}",
                  "{\"name\":\"您好\", \"age\":30}",
                  "{\"name\":\"こんにちは\", \"age\":19}",
                  "{\"name\":\"Xin chào\"}")
       
       jsonPath <- tempfile(pattern = "sparkr-test", fileext = ".tmp")
       writeLines(lines, jsonPath)
       
       df <- read.df(jsonPath, "json")
       
       
       printSchema(df)
       # root
       #  |-- _corrupt_record: string (nullable = true)
       #  |-- age: long (nullable = true)
       #  |-- name: string (nullable = true)
       
       head(df)
       #              _corrupt_record age                                     name
       # 1                       <NA>  NA <U+C548><U+B155><U+D558><U+C138><U+C694>
       # 2                       <NA>  30                         <U+60A8><U+597D>
       # 3                       <NA>  19 <U+3053><U+3093><U+306B><U+3061><U+306F>
       # 4 {"name":"Xin ch<U+FFFD>o"}  NA                                     <NA>
       ```
   
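   For comparison, a minimal sketch of writing the same kind of input as 
   explicit UTF-8, assuming the root cause is `writeLines` re-encoding the 
   strings to the native code page (1252 here). That is a guess from the 
   output above, not something confirmed. The `\u` escapes and the 
   `enc2utf8` + `useBytes = TRUE` combination are my additions, not what 
   the test currently does:

   ```r
   # Assumption: the test input should reach Spark as UTF-8 bytes.
   # \u escapes keep this snippet independent of the source file encoding.
   lines <- c("{\"name\":\"\uc548\ub155\ud558\uc138\uc694\"}",
              "{\"name\":\"\u60a8\u597d\", \"age\":30}",
              "{\"name\":\"Xin ch\u00e0o\"}")

   jsonPath <- tempfile(pattern = "sparkr-test", fileext = ".tmp")

   # Binary connection plus useBytes = TRUE writes the UTF-8 bytes as-is,
   # skipping the conversion to the native encoding.
   con <- file(jsonPath, open = "wb")
   writeLines(enc2utf8(lines), con, useBytes = TRUE)
   close(con)

   # Sanity check: inspect the raw bytes on disk for UTF-8 validity.
   bytes <- readBin(jsonPath, what = "raw", n = file.size(jsonPath))
   validUTF8(rawToChar(bytes))
   ```

   If the file passes that check, `read.df(jsonPath, "json")` should at 
   least get well-formed input. Whether `collect()` then round-trips the 
   names correctly on Windows is a separate question.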
   
   Let's see what else surfaces downstream.
