[ https://issues.apache.org/jira/browse/PHOENIX-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073742#comment-15073742 ]
Suhas Nalapure commented on PHOENIX-2336:
-----------------------------------------
Another pointer: the root cause of this behavior seems to be the same as
PHOENIX-2547. For columns with lowercase names, the select query generated
by PhoenixInputFormat doesn't enclose the column names in double quotes (""),
so Phoenix looks for a column with the same letter sequence in upper case,
which results in the error. This also applies to tables created directly
through Phoenix, not just to views that map to an existing HBase table.
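
To make the case-folding point concrete, here is a minimal sketch in plain Java. It does not use any Phoenix API: the class QuotedSelectSketch and the helpers resolve, quote and buildSelect are hypothetical names made up for illustration, and the table name T is a placeholder. It only mimics the rule that an unquoted identifier is folded to upper case while a double-quoted one keeps its case, and shows what a generated select would look like with and without quoting.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class QuotedSelectSketch {

    // Hypothetical helper: mimics how an identifier in a SQL statement is
    // resolved. Double-quoted identifiers keep their exact case; unquoted
    // ones are folded to upper case.
    public static String resolve(String identifier) {
        if (identifier.length() >= 2
                && identifier.startsWith("\"")
                && identifier.endsWith("\"")) {
            return identifier.substring(1, identifier.length() - 1);
        }
        return identifier.toUpperCase(Locale.ROOT);
    }

    // Hypothetical helper: wraps a column name in double quotes.
    // Assumes the name itself contains no double quotes.
    public static String quote(String name) {
        return "\"" + name + "\"";
    }

    // Hypothetical helper: builds a select statement, optionally quoting
    // every column name. This is an illustration, not the query builder
    // that PhoenixInputFormat actually uses.
    public static String buildSelect(String table, List<String> columns,
                                     boolean quoteColumns) {
        String cols = columns.stream()
                .map(c -> quoteColumns ? quote(c) : c)
                .collect(Collectors.joining(", "));
        return "SELECT " + cols + " FROM " + table;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("ID", "col1");

        // Unquoted: col1 would be looked up as COL1, which does not exist.
        System.out.println(buildSelect("T", columns, false));
        System.out.println(resolve("col1"));       // COL1

        // Quoted: the lowercase name survives as written.
        System.out.println(buildSelect("T", columns, true));
        System.out.println(resolve(quote("col1"))); // col1
    }
}
```

With quoting enabled the sketch produces SELECT "ID", "col1" FROM T, which is the form the generated query would need for the lowercase column to be found.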
> Queries with small case column-names return empty result-set when working
> with Spark Datasource Plugin
> -------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-2336
> URL: https://issues.apache.org/jira/browse/PHOENIX-2336
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.6.0
> Reporter: Suhas Nalapure
>
> Hi,
> The Spark DataFrame filter operation returns an empty result-set when
> the column-name is in lower case. Example below:
> DataFrame df =
> sqlContext.read().format("org.apache.phoenix.spark").options(params).load();
> df.filter("\"col1\" = '5.0'").show();
> Result:
> +---+----+---+---+---+---+
> | ID|col1| c1| d2| d3| d4|
> +---+----+---+---+---+---+
> +---+----+---+---+---+---+
> However, the table actually has some rows matching the filter condition. If
> the double quotes are removed from around the column name, i.e.
> df.filter("col1 = '5.0'").show(); , a ColumnNotFoundException is thrown:
> Exception in thread "main" java.lang.RuntimeException:
> org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703):
> Undefined column. columnName=D1
> at
> org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:125)
> at
> org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:80)
> at
> org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
> at scala.Option.getOrElse(Option.scala:120)