[ https://issues.apache.org/jira/browse/SPARK-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14198570#comment-14198570 ]
Venkata Ramana G commented on SPARK-4252:
-----------------------------------------

I see the same result when I run the query from the Hive command line on Hive 0.12:

hive> select * from user;
OK
Alice 12
Bob 13

> SparkSQL behaves differently from Hive when encountering illegal record
> -----------------------------------------------------------------------
>
>                 Key: SPARK-4252
>                 URL: https://issues.apache.org/jira/browse/SPARK-4252
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.1.0
>            Reporter: patrickliu
>
> Hive ignores an illegal record, while SparkSQL tries to convert it.
> Assume I have a text file user.txt with 2 records (userName, age):
> Alice,12.4
> Bob,13
> Then I create a Hive table to query the data:
> CREATE TABLE user(
>   name string,
>   age int  -- pay attention: the field is int
> ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n';
> LOAD DATA LOCAL INPATH 'user' INTO TABLE user;
> Then I use Hive and SparkSQL to query the 'user' table:
> SQL: select * from user;
> Result by Hive:
> Alice NULL (Hive ignores Alice's age because it is a float number)
> Bob 13
> Result by SparkSQL:
> Alice 12 (SparkSQL converts Alice's age from float to int)
> Bob 13
> So if I run "select sum(age) from user;", I will get a different result.
> Maybe SparkSQL should be compatible with Hive in this scenario.
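
To make the two behaviours in the report concrete, here is a minimal Python sketch of both ways the text field "12.4" can be read into an int column. It is a toy model only, not code from Hive or Spark: the function names strict_int, lenient_int and sum_ignoring_null are hypothetical, and the only thing assumed about SQL is that SUM skips NULL values.

```python
from typing import List, Optional


def strict_int(field: str) -> Optional[int]:
    """Accept only plain integer text; anything else becomes None (NULL)."""
    try:
        return int(field.strip())
    except ValueError:
        return None


def lenient_int(field: str) -> Optional[int]:
    """Parse as a number first, then truncate toward zero."""
    try:
        return int(float(field.strip()))
    except (ValueError, OverflowError):
        return None


def sum_ignoring_null(values: List[Optional[int]]) -> Optional[int]:
    """Mimic SQL SUM: skip NULLs, return NULL if nothing is left."""
    present = [v for v in values if v is not None]
    return sum(present) if present else None


# The two records from user.txt in the report.
rows = ["Alice,12.4", "Bob,13"]
ages = [line.split(",")[1] for line in rows]

strict_ages = [strict_int(a) for a in ages]    # [None, 13]
lenient_ages = [lenient_int(a) for a in ages]  # [12, 13]

strict_sum = sum_ignoring_null(strict_ages)    # 13
lenient_sum = sum_ignoring_null(lenient_ages)  # 25
```

Under the strict reading the sum of age is 13; under the lenient reading it is 25, which is the kind of discrepancy the reporter expects from "select sum(age) from user;".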