Suhas Nalapure created PHOENIX-2336:
---------------------------------------

             Summary: Queries with small case column-names return empty result-set when working with Spark Datasource Plugin
                 Key: PHOENIX-2336
                 URL: https://issues.apache.org/jira/browse/PHOENIX-2336
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.5.3
            Reporter: Suhas Nalapure


Hi,

The Spark DataFrame filter operation returns an empty result set when the 
column name is in lowercase. Example below:

DataFrame df = sqlContext.read().format("org.apache.phoenix.spark").options(params).load();
df.filter("\"col1\" = '5.0'").show();

Result:
+---+----+---+---+---+---+
| ID|col1| c1| d2| d3| d4|
+---+----+---+---+---+---+
+---+----+---+---+---+---+

However, the table does contain rows that match the filter condition. If the 
double quotes around the column name are removed, i.e. 
df.filter("col1 = '5.0'").show(); , a ColumnNotFoundException is thrown:
Exception in thread "main" java.lang.RuntimeException: org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): Undefined column. columnName=D1
        at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:125)
        at org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:80)
        at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
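
For context, the unquoted case is probably related to SQL identifier case folding: Phoenix upper-cases unquoted identifiers and treats double-quoted identifiers as case-sensitive, which would be consistent with the upper-case column name in the exception. The sketch below only illustrates that rule. The class IdentifierCase and its methods are hypothetical helpers written for this report, not Phoenix or Spark APIs, and the column list is taken from the result header above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class IdentifierCase {
    // Hypothetical helper (not Phoenix code): mimics the rule that unquoted
    // identifiers are folded to upper case, while double-quoted identifiers
    // keep their exact case once the quotes are stripped.
    static String normalize(String identifier) {
        if (identifier.length() >= 2
                && identifier.startsWith("\"")
                && identifier.endsWith("\"")) {
            return identifier.substring(1, identifier.length() - 1);
        }
        return identifier.toUpperCase(Locale.ROOT);
    }

    // True if the normalized identifier names one of the table's columns.
    static boolean resolves(List<String> tableColumns, String identifier) {
        return tableColumns.contains(normalize(identifier));
    }

    public static void main(String[] args) {
        // Column names as shown in the result header of this report.
        List<String> columns = Arrays.asList("ID", "col1", "c1", "d2", "d3", "d4");
        System.out.println(resolves(columns, "col1"));     // false: looked up as COL1
        System.out.println(resolves(columns, "\"col1\"")); // true: case preserved
    }
}
```

Under this assumption, a lowercase column can only be addressed with the quoted form, so the quotes would have to survive all the way into the query that the plugin generates.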





