https://issues.apache.org/jira/browse/HUDI-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084560#comment-17084560
Vinoth Chandar commented on HUDI-69:
------------------------------------

I was wondering if we can just wrap the FileFormat (Parquet and ORC both have FileFormat implementations inside Spark), reuse its record reader for reading Parquet/ORC into Row, and also use our existing LogReader classes to read the log blocks as Row (instead of GenericRecord; or, for now, convert GenericRecord to Row). This means we need to redesign our CompactedRecordScanner and related classes to be generic, rather than implicitly assuming they merge Avro/ArrayWritable per se. Must be doable.

> Support realtime view in Spark datasource #136
> ----------------------------------------------
>
>                 Key: HUDI-69
>                 URL: https://issues.apache.org/jira/browse/HUDI-69
>             Project: Apache Hudi (incubating)
>          Issue Type: New Feature
>          Components: Spark Integration
>            Reporter: Vinoth Chandar
>            Assignee: Yanjia Gary Li
>            Priority: Major
>             Fix For: 0.6.0
>
> https://github.com/uber/hudi/issues/136

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
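The generic-scanner redesign the comment proposes could be sketched roughly as follows. This is a minimal illustration, not actual Hudi code: the names `GenericMergeScanner`, `keyExtractor`, and `merger` are hypothetical, and the real change would plug Spark `Row`s (from the wrapped FileFormat readers and the log-block readers) into the type parameter `R` instead of hard-coding Avro `GenericRecord` or Hive `ArrayWritable`.

```java
import java.util.*;
import java.util.function.*;

// Hypothetical sketch of a merged-record scanner made generic over the
// record type R, so the merge logic no longer assumes Avro/ArrayWritable.
class GenericMergeScanner<R> {
    private final Function<R, String> keyExtractor; // record -> record key
    private final BinaryOperator<R> merger;         // (older, newer) -> merged

    GenericMergeScanner(Function<R, String> keyExtractor, BinaryOperator<R> merger) {
        this.keyExtractor = keyExtractor;
        this.merger = merger;
    }

    // Overlay log records onto base-file records, merging per key;
    // insertion order of the base file is preserved.
    List<R> scan(Iterable<R> baseRecords, Iterable<R> logRecords) {
        Map<String, R> merged = new LinkedHashMap<>();
        for (R r : baseRecords) merged.put(keyExtractor.apply(r), r);
        for (R r : logRecords) merged.merge(keyExtractor.apply(r), r, merger);
        return new ArrayList<>(merged.values());
    }
}
```

With this shape, an Avro-based path and a Row-based path would differ only in the key extractor and merge function they supply, which is the gist of making the scanner classes agnostic to the record representation.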