On 2019/03/15 16:30:55, "[email protected]" <[email protected]> wrote: 
>  
> Hi Rahul,
> Thanks for creating the HIP. I have reformatted your HIP just to standardize 
> it. 
> https://docs.google.com/document/d/1bj-xpkRomVtbzvLb_4BRngDIGkkMR5yzxXRRzkA7QVo/edit?usp=sharing
> 
> A few points you need to think about and elaborate on in the HIP:
> 1. You can mention how the schema of the CSV file is going to be handled. We 
> should probably use the current schemaProvider and use it to decode CSV data.
> 2. Using a config, make the source generic to support any configurable 
> delimiter (e.g., tab instead of comma).
> 3. Also, you can write about how to handle the presence/absence of a header 
> row. I am assuming for Kafka, this should not be a concern, but I am not sure 
> if there is a standard way of storing CSV files in a data lake. Do they 
> include a header or not?
> 
> Let us know if you have any thoughts/questions and we will be happy to help.
> Also, if not already done, can you create a JIRA account so that we can 
> assign the ticket to you?
> Balaji.V
>     On Friday, March 15, 2019, 7:34:18 AM PDT, <[email protected]> 
> wrote:  
>  
>  
> 
> On 2019/03/15 06:09:04, Umesh Kacha <[email protected]> wrote: 
> > Hi Rahul, I am happy to volunteer for this task in case you don't have
> > bandwidth for the same. Please advise.
> > 
> > Regards,
> > Umesh.
> > 
> > On Fri, Mar 15, 2019, 6:59 AM [email protected] <[email protected]> wrote:
> > 
> > >
> > > Hi Rahul,
> > > We do not have any ready-made CSV support in DeltaStreamer yet, but it
> > > should be simple to extend DeltaStreamer by implementing a CSV source.
> > > Would you be interested in writing a HIP
> > > (https://cwiki.apache.org/confluence/display/HUDI/Hudi+Improvement+Plan+Details+and+Process)
> > > for CSV support and implementing it?
> > > We will be very happy to assist you on this.
> > > Thanks,
> > > Balaji.V
> > >
> > >    On Thursday, March 14, 2019, 2:48:49 AM PDT, <[email protected]>
> > > wrote:
> > >
> > >  Dear Team
> > >
> > > I tested DeltaStreamer with the JsonKafka, JsonDFS, etc. sources. If possible,
> > > please suggest how I can consume CSV data from Kafka/HDFS and insert it
> > > into Hudi.
> > >
> > >
> > > Thanks & Regards
> > > Rahul
> > >
> > 
> Dear Balaji
> 
> I am initiating a HIP for CSV source support for Hudi DeltaStreamer. 
> Please find the HIP document in the below link.
> https://docs.google.com/document/d/1bj-xpkRomVtbzvLb_4BRngDIGkkMR5yzxXRRzkA7QVo/edit?usp=sharing
> 
> I am new to this kind of open source project discussion. If there is any 
> mistake in the way I have requested the HIP, please correct me.
> 
> @Umesh Thanks, you are always welcome.
> 
> Thanks & Regards
> Rahul P
> 
> 
>   
Dear Balaji

    I have edited the HIP as per your suggestion. Please advise if any further 
modification is required.
Jira Id: rahuledavalath

Thanks & Regards
Rahul 
