[
https://issues.apache.org/jira/browse/SPARK-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012320#comment-15012320
]
Antonio Piccolboni commented on SPARK-10754:
--------------------------------------------
This may be related. I created a table from a CSV file using spark-csv. The
column names in the file mix upper and lower case, and the table preserves
that, as DESCRIBE shows. I then created a second table with CREATE TABLE AS
SELECT, and the new table has all-lowercase column names. So identifier
handling seems to be case-sensitive in some code paths and case-insensitive in
others. Please let me know if I should open a separate report. A test case
follows.
Sample data:
"playerID","yearID","stint","teamID","lgID","G","G_batting","AB","R","H","X2B","X3B","HR","RBI","SB","CS","BB","SO","IBB","HBP","SH","SF","GIDP","G_old"
"aardsda01",2004,1,"SFN","NL",11,11,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,11
"aardsda01",2006,1,"CHN","NL",45,43,2,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,45
"aardsda01",2007,1,"CHA","AL",25,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2
"aardsda01",2008,1,"BOS","AL",47,5,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,5
"aardsda01",2009,1,"SEA","AL",73,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,NA
"aardsda01",2010,1,"SEA","AL",53,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,NA
"aardsda01",2012,1,"NYA","AL",1,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
"aaronha01",1954,1,"ML1","NL",122,122,468,58,131,27,6,13,69,2,2,28,39,NA,3,6,4,13,122
"aaronha01",1955,1,"ML1","NL",153,153,602,105,189,37,9,27,106,3,1,49,61,5,3,7,4,20,153
Create the table with:
CREATE TABLE `batting` USING com.databricks.spark.csv OPTIONS (path
'/var/folders/_p/1gx4vy311_x4syn2xq6f2xtc0000gr/T//Rtmp0E8pqi/file11a8546f94ed6',
header 'TRUE', delimiter ',', quote '"', parserLib 'commons', mode
'PERMISSIVE', charset 'UTF-8', inferSchema 'TRUE', comment '#')
Upper and lower case are preserved:
Browse[6]> qy("describe batting", my_db)
   col_name   data_type  comment
1  playerID   string
2  yearID     int
3  stint      int
4  teamID     string
5  lgID       string
6  G          int
7  G_batting  string
8  AB         string
9  R          string
10 H          string
11 X2B        string
12 X3B        string
13 HR         string
14 RBI        string
15 SB         string
16 CS         string
17 BB         string
18 SO         string
19 IBB        string
20 HBP        string
21 SH         string
22 SF         string
23 GIDP       string
24 G_old      string
Create the second table with:
CREATE TABLE `xxhcteugas` AS SELECT `playerID` AS `playerID`, `yearID` AS
`yearID`, `teamID` AS `teamID`, `G` AS `G`, `AB` AS `AB`, `R` AS `R`, `H` AS `H`
FROM `batting`
ORDER BY `playerID`, `yearID`, `teamID`
Upper case is gone from the column names:
Browse[6]> qy("describe xxhcteugas", my_db)
  col_name  data_type  comment
1 playerid  string     <NA>
2 yearid    int        <NA>
3 teamid    string     <NA>
4 g         int        <NA>
5 ab        string     <NA>
6 r         string     <NA>
7 h         string     <NA>
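
To make the mismatch concrete, here is a minimal plain-Java sketch (no Spark
involved) that checks the column names selected in the CTAS statement against
the names DESCRIBE reported for the new table, first with an exact comparison
and then ignoring case. The class and method names are made up for
illustration; only the column names come from the output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class ColumnCaseCheck {

    /** Names in `expected` that have no exact (case-sensitive) match in `actual`. */
    static List<String> missingExact(List<String> expected, List<String> actual) {
        List<String> missing = new ArrayList<>();
        for (String name : expected) {
            if (!actual.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** Names in `expected` that have no match in `actual` even when case is ignored. */
    static List<String> missingIgnoringCase(List<String> expected, List<String> actual) {
        List<String> lowered = new ArrayList<>();
        for (String name : actual) {
            lowered.add(name.toLowerCase(Locale.ROOT));
        }
        List<String> missing = new ArrayList<>();
        for (String name : expected) {
            if (!lowered.contains(name.toLowerCase(Locale.ROOT))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Column names as written in the CREATE TABLE ... AS SELECT statement.
        List<String> selected =
            Arrays.asList("playerID", "yearID", "teamID", "G", "AB", "R", "H");
        // Column names as reported by "describe xxhcteugas".
        List<String> described =
            Arrays.asList("playerid", "yearid", "teamid", "g", "ab", "r", "h");

        System.out.println("exact mismatches: " + missingExact(selected, described));
        System.out.println("case-insensitive mismatches: "
            + missingIgnoringCase(selected, described));
    }
}
```

Any client that looks columns up by their original spelling would hit the
first kind of mismatch; one that normalizes case first would not notice the
change at all.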
> table and column name are case sensitive when json Dataframe was registered
> as tempTable using JavaSparkContext.
> -----------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-10754
> URL: https://issues.apache.org/jira/browse/SPARK-10754
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.3.0, 1.3.1, 1.4.1
> Environment: Linux ,Hadoop Version 1.3
> Reporter: Babulal
>
> Create a DataFrame using the JSON data source:
> SparkConf conf = new SparkConf().setMaster("spark://xyz:7077").setAppName("Spark Tabble");
> JavaSparkContext javacontext = new JavaSparkContext(conf);
> SQLContext sqlContext = new SQLContext(javacontext);
>
> DataFrame df = sqlContext.jsonFile("/user/root/examples/src/main/resources/people.json");
>
> df.registerTempTable("sparktable");
>
> Run the queries:
>
> sqlContext.sql("select * from sparktable").show(); // this passes
>
> sqlContext.sql("select * from sparkTable").show(); // this fails
>
> java.lang.RuntimeException: Table Not Found: sparkTable
> at scala.sys.package$.error(package.scala:27)
> at org.apache.spark.sql.catalyst.analysis.SimpleCatalog$$anonfun$1.apply(Catalog.scala:115)
> at org.apache.spark.sql.catalyst.analysis.SimpleCatalog$$anonfun$1.apply(Catalog.scala:115)
> at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
> at scala.collection.AbstractMap.getOrElse(Map.scala:58)
> at org.apache.spark.sql.catalyst.analysis.SimpleCatalog.lookupRelation(Catalog.scala:115)
> at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:233)
>
>
>
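
The failure quoted above is what an exact-match lookup keyed on the registered
name would produce. The following plain-Java sketch is a toy catalog, not
Spark's SimpleCatalog code: the class name, the caseSensitive flag, and the
stored "plan" string are assumptions made for illustration. It shows how
normalizing names on both registration and lookup lets sparkTable resolve to
sparktable.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ToyCatalog {
    // Maps a (possibly normalized) table name to a placeholder for its plan.
    private final Map<String, String> tables = new HashMap<>();
    private final boolean caseSensitive;

    public ToyCatalog(boolean caseSensitive) {
        this.caseSensitive = caseSensitive;
    }

    // Hypothetical normalization step: lowercase unless case-sensitive.
    private String key(String name) {
        return caseSensitive ? name : name.toLowerCase(Locale.ROOT);
    }

    public void register(String name, String plan) {
        tables.put(key(name), plan);
    }

    public String lookup(String name) {
        String plan = tables.get(key(name));
        if (plan == null) {
            throw new RuntimeException("Table Not Found: " + name);
        }
        return plan;
    }

    public static void main(String[] args) {
        // Exact-match catalog: the mixed-case lookup fails.
        ToyCatalog strict = new ToyCatalog(true);
        strict.register("sparktable", "people.json");
        try {
            strict.lookup("sparkTable");
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }

        // Normalizing catalog: the same lookup succeeds.
        ToyCatalog relaxed = new ToyCatalog(false);
        relaxed.register("sparktable", "people.json");
        System.out.println(relaxed.lookup("sparkTable"));
    }
}
```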