Alvaro Fernandez created PHOENIX-6321:
-----------------------------------------
Summary: Array of Shorts/Smallint returned as Array of Integers
Key: PHOENIX-6321
URL: https://issues.apache.org/jira/browse/PHOENIX-6321
Project: Phoenix
Issue Type: Bug
Components: spark-connector
Affects Versions: 5.0.0
Reporter: Alvaro Fernandez
When using the Spark connector to read a Phoenix table with at least one column
defined as an Array of Shorts (SMALLINT ARRAY), the resulting Dataset infers the
schema as an Array of Integers.
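For context, a minimal reproduction sketch (the table name, column name, and ZooKeeper quorum are hypothetical; this requires a running Phoenix cluster and a Spark session):

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical table TEST_TABLE with a column VALS defined as SMALLINT ARRAY
val spark = SparkSession.builder().appName("smallint-array-repro").getOrCreate()

val df = spark.read
  .format("org.apache.phoenix.spark")
  .option("table", "TEST_TABLE")
  .option("zkUrl", "localhost:2181")
  .load()

// Expected schema: VALS: array<smallint>
// Actual schema:   VALS: array<int>
df.printSchema()
```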
I believe this is due to the following code:
phoenix/phoenix-spark/src/main/scala/org/apache/phoenix/spark/PhoenixRDD.scala:182
case t if t.isInstanceOf[PSmallintArray] || t.isInstanceOf[PUnsignedSmallintArray] => ArrayType(IntegerType, containsNull = true)
phoenix-connectors/phoenix-spark-base/src/main/scala/org/apache/phoenix/spark/SparkSchemaUtil.scala:82
case t if t.isInstanceOf[PSmallintArray] || t.isInstanceOf[PUnsignedSmallintArray] => ArrayType(IntegerType, containsNull = true)
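A possible fix would be to map SMALLINT arrays to Spark's ShortType instead, mirroring how the scalar PSmallint is presumably handled. This is an untested sketch of the changed case, assuming ShortType round-trips correctly through the connector's read path:

```scala
// Sketch of the corrected mapping in SparkSchemaUtil.scala / PhoenixRDD.scala:
// use ShortType so the Dataset schema matches the Phoenix column definition
case t if t.isInstanceOf[PSmallintArray] ||
          t.isInstanceOf[PUnsignedSmallintArray] => ArrayType(ShortType, containsNull = true)
```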
Subsequent attempts to programmatically cast the column to Shorts fail with a
ClassCastException.
It is also impossible to supply the original schema through a DataFrameReader,
as that fails with: "org.apache.spark.sql.AnalysisException:
org.apache.phoenix.spark does not allow user-specified schemas.;"
As far as I know, this makes it impossible to work with tables containing this
kind of data type.
Is there any reason for this code to interpret SmallInts/Shorts as Integers?
Thanks
--
This message was sent by Atlassian Jira
(v8.3.4#803005)