Re: [DISCUSS] Support Spark Structured Streaming read from Hudi table

2020-08-20 Thread Vinoth Chandar
I would like all these new things to be revamped on top of Spark 3's newer
APIs
(it's kind of frustrating that the datasource APIs don't stabilize easily
in Spark).

I am thinking we can implement a "hudi3" format using Spark 3, with support
for SQL merges, the existing functionality, and redone Spark Structured
Streaming support.

I know I may be increasing the scope. So feel free to push back and have
this just be about getting the streaming reads working as well.

On Thu, Aug 20, 2020 at 10:50 AM Balaji Varadarajan wrote:

>  Hi linshan,
> Sorry for the delay in responding. It is better to discuss code changes
> over a draft PR. Can you open one and tag us there? At a high level, it
> looks like you are using the Spark Datasource V2 APIs, while the structured
> streaming write is currently implemented using the V1 API. Let's discuss
> this over a PR. We have a few folks (Gary, Udit) who know this part better
> than I do. They can help you out here.
> Balaji.V
>
> On Tuesday, August 18, 2020, 08:03:01 PM PDT, linshan <
> mabin194...@163.com> wrote:
>
>  Hi team:
> I need help. After a few days of thinking and trial and error, I am out of
> ideas. I wrote up the relevant information on this page. Please follow this
> link (https://issues.apache.org/jira/browse/HUDI-1126).
>
> Best,
> linshan-ma


Re: [DISCUSS] Support Spark Structured Streaming read from Hudi table

2020-08-20 Thread Balaji Varadarajan
 Hi linshan,
Sorry for the delay in responding. It is better to discuss code changes over
a draft PR. Can you open one and tag us there? At a high level, it looks like
you are using the Spark Datasource V2 APIs, while the structured streaming
write is currently implemented using the V1 API. Let's discuss this over a PR.
We have a few folks (Gary, Udit) who know this part better than I do. They can
help you out here.
Balaji.V

On Tuesday, August 18, 2020, 08:03:01 PM PDT, linshan wrote:
 
 Hi team:
    I need help. After a few days of thinking and trial and error, I am out
of ideas. I wrote up the relevant information on this page. Please follow this
link (https://issues.apache.org/jira/browse/HUDI-1126).
  
Best,
linshan-ma  

[DISCUSS] Support Spark Structured Streaming read from Hudi table

2020-08-18 Thread linshan
Hi team:
 I need help. After a few days of thinking and trial and error, I am out of
ideas. I wrote up the relevant information on this page. Please follow this
link (https://issues.apache.org/jira/browse/HUDI-1126).
   
Best,
linshan-ma