[ https://issues.apache.org/jira/browse/PHOENIX-5035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16694456#comment-16694456 ]

Hadoop QA commented on PHOENIX-5035:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12948992/PHOENIX-5035.patch
  against master branch at commit cfcf615d98c682df3b60aa7bd82c6706082bdac2.
  ATTACHMENT ID: 12948992

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:red}-1 release audit{color}.  The applied patch generated 1 release audit warnings (more than the master's current 0 warnings).

    {color:red}-1 lineLengths{color}.  The patch introduces the following lines longer than 100:
    +      // Borrowed from 'elasticsearch-hadoop', support these internal UTF types across Spark versions
    +      case utf if (isClass(utf, "org.apache.spark.sql.types.UTF8String")) => s"'${escapeStringConstant(utf.toString)}'"
    +      case utf if (isClass(utf, "org.apache.spark.unsafe.types.UTF8String")) => s"'${escapeStringConstant(utf.toString)}'"
    +        val dateTimeFormat = config.get(QueryServices.TIMESTAMP_FORMAT_ATTRIB, DateUtil.DEFAULT_TIMESTAMP_FORMAT)

     {color:red}-1 core tests{color}.  The patch failed these unit tests:
     
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.MutableIndexSplitReverseScanIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.UpsertSelectAutoCommitIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.PartialIndexRebuilderIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ConcurrentMutationsIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.index.MutableIndexSplitForwardScanIT

Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/2154//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/2154//artifact/patchprocess/patchReleaseAuditWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/2154//console

This message is automatically generated.

> phoenix-spark dataframe filters date or timestamp type with error
> -----------------------------------------------------------------
>
>                 Key: PHOENIX-5035
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5035
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.13.0, 4.14.0, 4.13.1, 5.0.0, 4.14.1
>         Environment: HBase:apache 1.2
> Phoenix:4.13.1-HBase-1.2
> Hadoop:CDH 2.6
> Spark:2.3.1
>            Reporter: zhongyuhai
>            Priority: Critical
>              Labels: patch, pull-request-available
>         Attachments: PHOENIX-5035.patch, table desc.png
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> *Table description:*
> See the attached "table desc.png".
>  
> *Code:*
> val df = SparkUtil.getActiveSession().read.format("org.apache.phoenix.spark").options(options).load()
> df.filter("INCREATEDDATE = date'2018-07-14'")
>  
> *Exception:*
> java.lang.RuntimeException: org.apache.phoenix.schema.TypeMismatchException: ERROR 203 (22005): Type mismatch. DATE and BIGINT for "INCREATEDDATE" = 1997
>  at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:201)
>  at org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:87)
>  at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:127)
>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
>  
> *Analysis:*
> In org.apache.phoenix.spark.PhoenixRelation.compileValue(value: Any): Any:
>  
>  
> {code:java}
> private def compileValue(value: Any): Any = {
>   value match {
>     case stringValue: String => s"'${escapeStringConstant(stringValue)}'"
>
>     // Borrowed from 'elasticsearch-hadoop', support these internal UTF types across Spark versions
>     // Spark 1.4
>     case utf if (isClass(utf, "org.apache.spark.sql.types.UTF8String")) => s"'${escapeStringConstant(utf.toString)}'"
>     // Spark 1.5
>     case utf if (isClass(utf, "org.apache.spark.unsafe.types.UTF8String")) => s"'${escapeStringConstant(utf.toString)}'"
>
>     // Pass through anything else
>     case _ => value
>   }
> }
> {code}
>  
> It only handles the String type; every other type falls through to the default case and is
> rendered via toString. As a result, the Spark filter condition "INCREATEDDATE = date'2018-07-14'"
> is translated into the Phoenix filter condition "INCREATEDDATE = 2018-07-14". Phoenix parses the
> unquoted literal as the integer expression 2018 - 07 - 14 = 1997, so it cannot evaluate the
> predicate against a DATE column and throws ERROR 203 (22005): Type mismatch. DATE and BIGINT
> for "INCREATEDDATE" = 1997.
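> A minimal standalone sketch of that translation (plain Scala, no Spark or Phoenix on the
> classpath; UnpatchedFilterDemo and compileValueUnpatched are hypothetical names, a simplified
> stand-in for the real filter-building code in PhoenixRelation):
> {code:java}
> import java.sql.Date
>
> object UnpatchedFilterDemo extends App {
>   // Simplified stand-in for the unpatched compileValue: only String is quoted,
>   // everything else falls through and is later rendered via toString.
>   def compileValueUnpatched(value: Any): Any = value match {
>     case s: String => s"'$s'"
>     case _         => value
>   }
>
>   val predicate = s"INCREATEDDATE = ${compileValueUnpatched(Date.valueOf("2018-07-14"))}"
>   println(predicate)   // INCREATEDDATE = 2018-07-14
>   // Phoenix rejects this predicate with ERROR 203 (22005): Type mismatch.
> }
> {code}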
> *Solution:*
> Add handling for the other types, such as Date and Timestamp:
> {code:java}
> private def compileValue(value: Any): Any = {
>   value match {
>     case stringValue: String => s"'${escapeStringConstant(stringValue)}'"
>
>     // Borrowed from 'elasticsearch-hadoop', support these internal UTF types across Spark versions
>     // Spark 1.4
>     case utf if (isClass(utf, "org.apache.spark.sql.types.UTF8String")) => s"'${escapeStringConstant(utf.toString)}'"
>     // Spark 1.5
>     case utf if (isClass(utf, "org.apache.spark.unsafe.types.UTF8String")) => s"'${escapeStringConstant(utf.toString)}'"
>
>     // Render Date values as Phoenix date literals using the configured date format
>     case d if (isClass(d, "java.util.Date") || isClass(d, "java.sql.Date")) => {
>       val config: Configuration = HBaseFactoryProvider.getConfigurationFactory.getConfiguration
>       val dateFormat = config.get(QueryServices.DATE_FORMAT_ATTRIB, DateUtil.DEFAULT_DATE_FORMAT)
>       val df = new SimpleDateFormat(dateFormat)
>       s"date'${df.format(d)}'"
>     }
>
>     // Render Timestamp values as Phoenix timestamp literals using the configured timestamp format
>     case dt if (isClass(dt, "java.sql.Timestamp")) => {
>       val config: Configuration = HBaseFactoryProvider.getConfigurationFactory.getConfiguration
>       val dateTimeFormat = config.get(QueryServices.TIMESTAMP_FORMAT_ATTRIB, DateUtil.DEFAULT_TIMESTAMP_FORMAT)
>       val df = new SimpleDateFormat(dateTimeFormat)
>       s"timestamp'${df.format(dt)}'"
>     }
>
>     // Pass through anything else
>     case _ => value
>   }
> }
> {code}
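> For reference, a standalone sketch of the literals the new branches are expected to emit. The
> format patterns are hard-coded here as an assumption so the snippet runs without a Phoenix/HBase
> classpath; the patch itself reads them from QueryServices and DateUtil:
> {code:java}
> import java.sql.{Date, Timestamp}
> import java.text.SimpleDateFormat
>
> object PatchedFilterDemo extends App {
>   // Assumed ISO-style patterns; the real values come from the cluster configuration.
>   val dateFormat      = new SimpleDateFormat("yyyy-MM-dd")
>   val timestampFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS")
>
>   println(s"date'${dateFormat.format(Date.valueOf("2018-07-14"))}'")
>   // date'2018-07-14'
>   println(s"timestamp'${timestampFormat.format(Timestamp.valueOf("2018-07-14 10:30:00"))}'")
>   // timestamp'2018-07-14 10:30:00.000'
> }
> {code}
> With such handling in place, the pushed-down predicate becomes e.g. INCREATEDDATE = date'2018-07-14',
> which Phoenix can parse as a date literal instead of integer arithmetic.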
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
