[GitHub] [spark] kimtkyeom edited a comment on issue #27888: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter

GitBox Mon, 16 Mar 2020 02:03:27 -0700

kimtkyeom edited a comment on issue #27888: [SPARK-31116][SQL] Fix nested 
schema case-sensitivity in ParquetRowConverter
URL: https://github.com/apache/spark/pull/27888#issuecomment-599419426
 
 
   > following, but it was `JSON` (`Relation[StructColumn#329] json`). Could you confirm your failure report again?
   > 
   > ```
   > ORC passed case insensitive test cases, but it failed case sensitive manner.
   > 
   > [info] - [SPARK-31116](https://issues.apache.org/jira/browse/SPARK-31116): Select nested columns correctly in case sensitive manner *** FAILED *** (871 milliseconds)
   > [info]   Results do not match for query:
   > [info]   Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
   > [info]   Timezone Env:
   > [info]
   > [info]   == Parsed Logical Plan ==
   > [info]   Relation[StructColumn#329] json
   > [info]
   > [info]   == Analyzed Logical Plan ==
   > [info]   StructColumn: struc
   > ```
   
   @dongjoon-hyun Ah, sorry, I pasted the wrong test result. ORC also fails with the result below, regardless of the value of `spark.sql.optimizer.nestedSchemaPruning.enabled`:
   
   ```
   [info] - SPARK-31116: Select nested columns correctly in case sensitive manner *** FAILED *** (905 milliseconds)
   [info]   Results do not match for query:
   [info]   Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
   [info]   Timezone Env:
   [info]
   [info]   == Parsed Logical Plan ==
   [info]   Relation[StructColumn#331] orc
   [info]
   [info]   == Analyzed Logical Plan ==
   [info]   StructColumn: struct<LowerCase:bigint,camelcase:bigint>
   [info]   Relation[StructColumn#331] orc
   [info]
   [info]   == Optimized Logical Plan ==
   [info]   Relation[StructColumn#331] orc
   [info]
   [info]   == Physical Plan ==
   [info]   FileScan orc [StructColumn#331] Batched: false, DataFilters: [], Format: ORC, Location: InMemoryFileIndex[file:/Users/kimtkyeom/Dev/spark_devel/target/tmp/spark-f1fb325e-9ff3-4945-81c7-..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<StructColumn:struct<LowerCase:bigint,camelcase:bigint>>
   [info]
   [info]   == Results ==
   [info]
   [info]   == Results ==
   [info]   !== Correct Answer - 1 ==   == Spark Answer - 1 ==
   [info]   !struct<>                   struct<StructColumn:struct<LowerCase:bigint,camelcase:bigint>>
   [info]   ![null]                     [[null,null]] (QueryTest.scala:248)
   ```
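
   For anyone who wants to try this outside the test suite, here is a rough standalone sketch of the kind of read that the log above describes. It is not the test from the PR. The read schema is copied from the `ReadSchema` line in the log; the field names written to the file (`lowercase`, `camelCase`), the object name `NestedCaseRepro`, and the temp directory name are my assumptions for illustration only.

   ```scala
   import java.nio.file.Files
   import org.apache.spark.sql.{DataFrame, SparkSession}

   // Hypothetical reproduction sketch; names here are illustrative, not from the PR.
   object NestedCaseRepro {
     // Read schema as it appears in the ReadSchema line of the log.
     val requestedDdl = "StructColumn struct<LowerCase: bigint, camelcase: bigint>"

     // Assumption: the file stores the same nested fields with different casing.
     val writtenExpr = "named_struct('lowercase', id, 'camelCase', id) AS StructColumn"

     // Reads the ORC data back in case-sensitive mode with pruning on or off.
     def readBack(spark: SparkSession, path: String, pruning: Boolean): DataFrame = {
       spark.conf.set("spark.sql.caseSensitive", "true")
       spark.conf.set("spark.sql.optimizer.nestedSchemaPruning.enabled", pruning.toString)
       spark.read.schema(requestedDdl).orc(path)
     }

     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder()
         .master("local[1]")
         .appName("nested-case-repro")
         .getOrCreate()
       try {
         // Write one row into a fresh child path of a temp directory.
         val path = Files.createTempDirectory("nested-case").resolve("data.orc").toString
         spark.range(1).selectExpr(writtenExpr).write.orc(path)

         // Show the result for both values of the pruning flag.
         Seq(true, false).foreach { pruning =>
           println(s"nestedSchemaPruning.enabled = $pruning")
           readBack(spark, path, pruning).show(truncate = false)
         }
       } finally {
         spark.stop()
       }
     }
   }
   ```

   Swapping `.orc(path)` for `.parquet(path)` or `.json(path)` on both the write and the read should give a comparable check for the other file sources.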

