[
https://issues.apache.org/jira/browse/CARBONDATA-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ajantha Bhat resolved CARBONDATA-3565.
--------------------------------------
Fix Version/s: 2.0.0
Resolution: Fixed
> Binary to string issue when loading dataframe data in NewRddIterator
> --------------------------------------------------------------------
>
> Key: CARBONDATA-3565
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3565
> Project: CarbonData
> Issue Type: Bug
> Components: spark-integration
> Affects Versions: 1.6.0
> Reporter: ChenKai
> Priority: Major
> Fix For: 2.0.0
>
> Time Spent: 6h
> Remaining Estimate: 0h
>
> * issue
> When a Spark DataFrame (SQL) loads complex binary data into a Hive table,
> the data is corrupted when it is read back. As far as I can see, in
> RddIterator the data is converted to a string and then converted back.
> * test case
> The binary data can be produced with *DataOutputStream#writeDouble* and
> similar methods.
> * discussion
> I think the *CarbonScalaUtil#getString* operation can be removed now. I dug
> into the code history back to 2016, where it was used in the kettle
> *CsvInput* (commit: 0018756d). That code has since been removed, so I think
> this conversion is redundant. (UPDATE: The follow-up GenericParser code
> still uses this string-conversion logic, so it should be considered here.)
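
To illustrate the kind of corruption described above, here is a minimal sketch in plain Java. It does not use CarbonData or Spark: the class name `BinaryRoundTrip` and the helper `throughString` are hypothetical, and decoding the bytes as UTF-8 is only an assumed stand-in for the string conversion in the load path, not the actual *CarbonScalaUtil#getString* logic.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BinaryRoundTrip {

    // Serialize a double as in the test case, via DataOutputStream#writeDouble.
    static byte[] encodeDouble(double value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeDouble(value);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked for brevity.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Hypothetical stand-in for the string conversion (assumption: UTF-8).
    // Byte sequences that are not valid UTF-8 are replaced while decoding,
    // so encoding the string again does not restore the original bytes.
    static byte[] throughString(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] original = encodeDouble(3.14);
        byte[] roundTripped = throughString(original);

        System.out.println("original:      " + Arrays.toString(original));
        System.out.println("round-tripped: " + Arrays.toString(roundTripped));
        System.out.println("identical:     " + Arrays.equals(original, roundTripped));
    }
}
```

Under these assumptions the two arrays differ, because the encoded double contains bytes that are not valid UTF-8. Passing the `byte[]` through unchanged avoids the problem, which is the direction the discussion above suggests.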
--
This message was sent by Atlassian Jira
(v8.3.4#803005)