[ https://issues.apache.org/jira/browse/PHOENIX-7377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Istvan Toth updated PHOENIX-7377:
---------------------------------
    Component/s: connectors
                 spark-connector

> phoenix5-spark dataframe issue with schema inference
> ----------------------------------------------------
>
>                 Key: PHOENIX-7377
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7377
>             Project: Phoenix
>          Issue Type: Bug
>          Components: connectors, spark-connector
>            Reporter: rejeb ben rejeb
>            Priority: Major
>
> The fix for PHOENIX-4981 introduced a breaking change in the way the
> schema is inferred.
> In previous versions of the connector, columns in a non-default column
> family were mapped to "columnName" in the DataFrame. Now they are mapped to
> "columnFamily.columnName".
> No unit tests cover this case; all tests use tables with the default
> column family "0".
> The change was made in this [pull
> request|https://github.com/apache/phoenix/pull/402] (the project has since
> moved to another git repo):
> * In the previous version, the code used `ColumnInfo.getDisplayName` to
> define the name of the column in the DataFrame.
> * In the new class SparkSchemaUtil, the method used is
> `ColumnInfo.getColumnName`, which returns the column name as
> `columnFamilyName.columnName`.
> The pull request is related to ticket PHOENIX-4981, and the change is not
> documented.
> This change breaks jobs that read from tables with a non-default column
> family.
> The spark3 connector has the same issue, since the code was duplicated from
> the spark2 module into the spark3 module.
> The V1 API was modified to use the same method to resolve the schema, so it
> has the same behavior. It should not, because its classes are now deprecated
> and should not contain a breaking change.
>
> *Resolution proposal:*
> The best way to fix the issue is to add a property that lets users choose
> between the two column name mappings for columns in a non-default column
> family.
> The issue is in the Spark connector, and its resolution will not have side
> effects on other phoenix-connectors such as phoenix5-hive.
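
For illustration, the two mappings described in the report and the proposed switch between them could look roughly like the sketch below. It is plain Python rather than connector code: the `Column` class, the `dataframe_field_name` function and the `keep_family_prefix` flag are hypothetical names standing in for the proposed property. It also assumes that columns in the default column family "0" are exposed without a prefix in both modes, which is consistent with the existing tests not catching the change.

```python
from dataclasses import dataclass

# Default column family name, as mentioned in the report.
DEFAULT_COLUMN_FAMILY = "0"


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a Phoenix column (not the connector's ColumnInfo)."""
    family: str
    name: str


def dataframe_field_name(column: Column, keep_family_prefix: bool) -> str:
    """Return the DataFrame field name for a column.

    keep_family_prefix is a hypothetical option representing the proposed property:
    False gives the older "columnName" mapping, True gives "columnFamily.columnName".
    Assumption: default-family columns are never prefixed.
    """
    if column.family == DEFAULT_COLUMN_FAMILY or not keep_family_prefix:
        return column.name
    return f"{column.family}.{column.name}"


columns = [Column("0", "ID"), Column("CF1", "AMOUNT")]

# Older behavior: the column family is dropped from the field name.
legacy_names = [dataframe_field_name(c, keep_family_prefix=False) for c in columns]
# Behavior after the change: non-default families are kept as a prefix.
current_names = [dataframe_field_name(c, keep_family_prefix=True) for c in columns]

print(legacy_names)   # ['ID', 'AMOUNT']
print(current_names)  # ['ID', 'CF1.AMOUNT']
```

A job written against the older mapping selects `AMOUNT`, while the same job after the change would have to select `CF1.AMOUNT`, which is why a property that restores the older naming would let existing jobs keep working.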
--
This message was sent by Atlassian Jira
(v8.20.10#820010)