[
https://issues.apache.org/jira/browse/PHOENIX-6559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419372#comment-17419372
]
ASF GitHub Bot commented on PHOENIX-6559:
-----------------------------------------
stoty commented on pull request #63:
URL: https://github.com/apache/phoenix-connectors/pull/63#issuecomment-926033322
:broken_heart: **-1 overall**
| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 33s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 :x: | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
||| _ master Compile Tests _ |
| +1 :green_heart: | mvninstall | 32m 49s | master passed |
| +1 :green_heart: | compile | 2m 9s | master passed |
| +1 :green_heart: | scaladoc | 1m 23s | master passed |
||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 23m 41s | the patch passed |
| +1 :green_heart: | compile | 2m 8s | the patch passed |
| +1 :green_heart: | scalac | 2m 8s | the patch passed |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | scaladoc | 1m 23s | the patch passed |
||| _ Other Tests _ |
| -1 :x: | unit | 54m 52s | phoenix-spark-base in the patch failed. |
| -1 :x: | asflicense | 0m 9s | The patch generated 2 ASF License warnings. |
| | | 119m 29s | |
| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-Connectors-PreCommit-GitHub-PR/job/PR-63/1/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/phoenix-connectors/pull/63 |
| Optional Tests | dupname asflicense scalac scaladoc unit compile |
| uname | Linux 055b7903f8ca 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev/phoenix-connectors-personality.sh |
| git revision | master / 349a2b2 |
| unit | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-Connectors-PreCommit-GitHub-PR/job/PR-63/1/artifact/yetus-general-check/output/patch-unit-phoenix-spark-base.txt |
| Test Results | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-Connectors-PreCommit-GitHub-PR/job/PR-63/1/testReport/ |
| asflicense | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-Connectors-PreCommit-GitHub-PR/job/PR-63/1/artifact/yetus-general-check/output/patch-asflicense-problems.txt |
| Max. process+thread count | 1612 (vs. ulimit of 30000) |
| modules | C: phoenix-spark-base U: phoenix-spark-base |
| Console output | https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-Connectors-PreCommit-GitHub-PR/job/PR-63/1/console |
| versions | git=2.7.4 maven=3.3.9 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |
This message was automatically generated.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
> spark connector access to SmallintArray / UnsignedSmallintArray columns
> -----------------------------------------------------------------------
>
> Key: PHOENIX-6559
> URL: https://issues.apache.org/jira/browse/PHOENIX-6559
> Project: Phoenix
> Issue Type: Bug
> Components: connectors, spark-connector
> Affects Versions: connectors-6.0.0
> Reporter: Alvaro Fernandez
> Assignee: Alvaro Fernandez
> Priority: Major
> Fix For: connectors-6.0.0
>
> Attachments: PHOENIX-6559.master.v1.patch
>
>
> We have some tables defined with SMALLINT ARRAY[] columns that cannot be read
> correctly with the Spark connector.
> It seems that the connector incorrectly infers the Spark data type as an
> array of integers, ArrayType(IntegerType), instead of ArrayType(ShortType).
> A table example:
> {code:java}
> CREATE TABLE IF NOT EXISTS AEIDEV.ARRAY_TABLE (ID BIGINT NOT NULL PRIMARY KEY, COL1 SMALLINT ARRAY[]);
> UPSERT INTO AEIDEV.ARRAY_TABLE VALUES (1, ARRAY[-32678,-9876,-234,-1]);
> UPSERT INTO AEIDEV.ARRAY_TABLE VALUES (2, ARRAY[0,8,9,10]);
> UPSERT INTO AEIDEV.ARRAY_TABLE VALUES (3, ARRAY[123,1234,12345,32767]);{code}
> Reading the values from Spark returns incorrect values:
>
> {code:java}
> scala> val df = spark.sqlContext.read.format("org.apache.phoenix.spark").option("table","AEIDEV.ARRAY_TABLE").option("zkUrl","ithdp1101.cern.ch:2181").load
> df: org.apache.spark.sql.DataFrame = [ID: bigint, COL1: array<int>]
> scala> df.show
> ---------------------+
> ID COL1
> ---------------------+
> 1 [-647200678, -234... 2 [524288, 655369, ... 3 [80871547, 214743...
> ---------------------+
> scala> df.collect
> res3: Array[org.apache.spark.sql.Row] = Array([1,WrappedArray(-647200678,
> -234, 0, 0)], [2,WrappedArray(524288, 655369, 0, 0)],
> [3,WrappedArray(80871547, 2147430457, 0, 0)])
> {code}
> We have identified the problem in the SparkSchemaUtil class and applied the
> small patch attached to this report. With the patch, the data type is
> inferred correctly and the results are as expected:
>
> {code:java}
> scala> val df = spark.sqlContext.read.format("org.apache.phoenix.spark").option("table","AEIDEV.ARRAY_TABLE").option("zkUrl","ithdp1101.cern.ch:2181").load
> df: org.apache.spark.sql.DataFrame = [ID: bigint, COL1: array<smallint>]
> scala> df.show
> ---------------------+
> ID COL1
> ---------------------+
> 1 [-32678, -9876, -... 2 [0, 8, 9, 10] 3 [123, 1234, 12345...
> ---------------------+
> scala> df.collect
> res1: Array[org.apache.spark.sql.Row] = Array([1,WrappedArray(-32678, -9876,
> -234, -1)], [2,WrappedArray(0, 8, 9, 10)], [3,WrappedArray(123, 1234, 12345,
> 32767)])
> {code}
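> For reference, a hypothetical sketch of the kind of Phoenix-to-Catalyst
> mapping the patch corrects (this is not the actual SparkSchemaUtil code,
> and the method and type-name strings here are assumptions): the element
> type of a SMALLINT / UNSIGNED_SMALLINT array must map to ShortType rather
> than IntegerType:
> {code:java}
> import org.apache.spark.sql.types._
>
> // Hypothetical helper, for illustration only: maps a Phoenix array
> // column's element SQL type name to the corresponding Catalyst ArrayType.
> def phoenixArrayType(elementType: String): ArrayType = elementType match {
>   case "SMALLINT" | "UNSIGNED_SMALLINT" => ArrayType(ShortType)   // 2-byte elements
>   case "INTEGER"  | "UNSIGNED_INT"      => ArrayType(IntegerType) // 4-byte elements
>   case "BIGINT"   | "UNSIGNED_LONG"     => ArrayType(LongType)    // 8-byte elements
>   case other => throw new IllegalArgumentException(s"unmapped element type: $other")
> }
> {code}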
>
>
> We can provide more information and submit a pull request if needed.
>
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)