[jira] [Commented] (PHOENIX-2336) Queries with small case column-names return empty result-set when working with Spark Datasource Plugin

Suhas Nalapure (JIRA) Tue, 29 Dec 2015 02:03:07 -0800

    [ 
https://issues.apache.org/jira/browse/PHOENIX-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073742#comment-15073742
 ]


Suhas Nalapure commented on PHOENIX-2336:
-----------------------------------------

Another pointer: the root cause of this behavior seems to be same as 
PHOENIX-2547. For columns with small letter names, the select query generated 
by PhoenixInputFormat doesn't enclose the column names in double quotes ("") 
which means Phoenix will actually look for a column with the same letter 
sequence but in upper case and this results in the error. This is also true of 
tables created directly through Phoenix and not just Views that map to existing 
HBase table.

> Queries with small case column-names return empty result-set when working 
> with Spark Datasource Plugin 
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2336
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2336
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.6.0
>            Reporter: Suhas Nalapure
>
> Hi,
> The Spark DataFrame filter operation returns empty result-set when 
> column-name is in the smaller case. Example below:
> DataFrame df = 
> sqlContext.read().format("org.apache.phoenix.spark").options(params).load();
> df.filter("\"col1\" = '5.0'").show(); 
> Result:
> +---+----+---+---+---+---
> | ID|col1| c1| d2| d3| d4|
> +---+----+---+---+---+---+
> +---+----+---+---+---+---+
> Whereas the table actually has some rows matching the filter condition. And 
> if double quotes are removed from around the column name i.e. df.filter("col1 
> = '5.0'").show(); , a ColumnNotFoundException is thrown:
> Exception in thread "main" java.lang.RuntimeException: 
> org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): 
> Undefined column. columnName=D1
>         at 
> org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:125)
>         at 
> org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:80)
>         at 
> org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
>         at scala.Option.getOrElse(Option.scala:120)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (PHOENIX-2336) Queries with small case column-names return empty result-set when working with Spark Datasource Plugin

Reply via email to