[ https://issues.apache.org/jira/browse/PHOENIX-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421230#comment-15421230 ]

James Taylor commented on PHOENIX-2336:
---------------------------------------

Thanks for the patch [~kalyanhadoop]. I've added you as a Phoenix contributor 
and assigned this issue to you. Thanks too for shepherding this along, 
[~jmahonin]. Since it sounds like this is a bug fix, I'd recommend checking 
this into master, 4.x, and 4.8 branches. I believe the following ones will be 
necessary:
- master, 4.x-HBase-0.98, 4.8-HBase-0.98, 4.8-HBase-1.2
- not sure if we'll need 4.8-HBase-1.1 and 4.x-HBase-1.1, but we won't continue 
with the 4.x-HBase-1.0 branch and I doubt we'll continue with the 4.8-HBase-1.0 
branch either.

> Queries with small case column-names return empty result-set when working 
> with Spark Datasource Plugin 
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2336
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2336
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.6.0
>            Reporter: Suhas Nalapure
>            Assignee: Kalyan
>              Labels: verify
>             Fix For: 4.9.0
>
>         Attachments: 
> PHOENIX-2336_PHOENIX-2290_PHOENIX-2547_code_changes.patch, 
> PHOENIX-2336_PHOENIX-2290_PHOENIX-2547_unit_tests.patch
>
>
> Hi,
> The Spark DataFrame filter operation returns an empty result set when the 
> column name is lower case. Example below:
> DataFrame df = sqlContext.read().format("org.apache.phoenix.spark").options(params).load();
> df.filter("\"col1\" = '5.0'").show();
> Result:
> +---+----+---+---+---+---+
> | ID|col1| c1| d2| d3| d4|
> +---+----+---+---+---+---+
> +---+----+---+---+---+---+
> Whereas the table actually has rows matching the filter condition. And if the 
> double quotes around the column name are removed, i.e. 
> df.filter("col1 = '5.0'").show();, a ColumnNotFoundException is thrown:
> Exception in thread "main" java.lang.RuntimeException: 
> org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): 
> Undefined column. columnName=D1
>         at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:125)
>         at org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:80)
>         at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
>         at scala.Option.getOrElse(Option.scala:120)
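
For anyone verifying the fix, here is a minimal reproduction sketch against the 
Spark 1.x Java API. The table name, DDL, and ZooKeeper quorum are placeholder 
assumptions for illustration only; it assumes a Phoenix table created with a 
case-sensitive column, e.g. 
CREATE TABLE TEST_TABLE (ID BIGINT NOT NULL PRIMARY KEY, "col1" DOUBLE).

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class Phoenix2336Repro {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("PHOENIX-2336 repro").setMaster("local[1]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        // Load the table through the phoenix-spark datasource.
        // "TEST_TABLE" and the zkUrl below are placeholder values.
        DataFrame df = sqlContext.read()
                .format("org.apache.phoenix.spark")
                .option("table", "TEST_TABLE")
                .option("zkUrl", "localhost:2181")
                .load();

        // Phoenix upper-cases unquoted identifiers, so an unquoted reference to a
        // case-sensitive column fails with ColumnNotFoundException:
        //   df.filter("col1 = '5.0'").show();

        // Double-quoting preserves the lower-case name. Before the patch this
        // returned an empty result set; with the fix it returns the matching rows.
        df.filter("\"col1\" = '5.0'").show();

        sc.stop();
    }
}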



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
