[ 
https://issues.apache.org/jira/browse/PHOENIX-2135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15115849#comment-15115849
 ] 

Thomas D'Silva commented on PHOENIX-2135:
-----------------------------------------

Yes, I think we wanted to have DataFrameFunctions and ProductRDDFunctions call 
ColumnInfoToStringEncoder directly instead of calling 
getUpsertColumnMetadataList. ConfigurationUtil.getOutputConfiguration is used 
to serialize the columns, which ends up calling 
PhoenixConfigurationUtil.setUpsertColumnNames.

Previously the columns were serialized to a single string, which could cause 
issues if the delimiter was used in a column name. We changed this so that 
the column count and each column value are written to the config separately. 
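
As a rough illustration of the difference (not the actual Phoenix code), here 
is a minimal sketch. The class name ColumnListCodec, the ".count" and 
".<index>" key suffixes, and the use of a plain Map in place of the Hadoop 
Configuration are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the real encoder. A plain Map plays the role of
// the Hadoop Configuration so the sketch has no external dependencies.
final class ColumnListCodec {
    private ColumnListCodec() {}

    // Old style: join everything into one string. This is ambiguous when a
    // column name itself contains the delimiter.
    static String joinWithDelimiter(List<String> columns, String delimiter) {
        return String.join(delimiter, columns);
    }

    // New style: store the count under "<prefix>.count" and each value under
    // "<prefix>.<index>", so no delimiter is needed at all.
    static void encode(Map<String, String> conf, String prefix, List<String> columns) {
        conf.put(prefix + ".count", Integer.toString(columns.size()));
        for (int i = 0; i < columns.size(); i++) {
            conf.put(prefix + "." + i, columns.get(i));
        }
    }

    // Reads the count back, then fetches each indexed value in order.
    // Returns an empty list if nothing was encoded under the prefix.
    static List<String> decode(Map<String, String> conf, String prefix) {
        String count = conf.get(prefix + ".count");
        if (count == null) {
            return new ArrayList<>();
        }
        int n = Integer.parseInt(count);
        List<String> columns = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            columns.add(conf.get(prefix + "." + i));
        }
        return columns;
    }
}
```

With a column named "FIRST,LAST", splitting the joined string on "," yields 
an extra entry, while encode followed by decode returns the original list 
unchanged.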

Both ColumnInfoToStringEncoder.encode and 
PhoenixConfigurationUtil.setUpsertColumnNames use this new way to 
serialize/deserialize, so I think it's fine if we don't make this change.

> Use ColumnInfoToStringEncoderDecoder.encode/decode to store the column 
> metadata in the configuration
> ----------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2135
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2135
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Thomas D'Silva
>            Assignee: Josh Mahonin
>             Fix For: 4.8.0
>
>
> In the saveToPhoenix method ProductRDDFunctions and DataFrameFunctions see if 
> we can serialize the column metadata to the configuration object using 
> ColumnInfoToStringEncoderDecoder.encode 
> and then in data.mapPartitions use ColumnInfoToStringEncoderDecoder.decode to 
> get the list of column infos. 
> Currently we call PhoenixConfigurationUtil.getUpsertColumnMetadataList for 
> each partition to get the column metadata. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
