Hi Raghvendra,

You would have to rewrite your Parquet dataset in Hudi format. Here are the
links you can follow to get started:
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hudi-work-with-dataset.html
https://hudi.apache.org/docs/querying_data.html#spark-incr-pull
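
As a rough illustration of what those guides walk through, here is a minimal
PySpark sketch, not a definitive recipe. It assumes the Hudi bundle is already
on the Spark classpath (as it is on EMR releases that ship Hudi). The helper
names, paths, and column names are hypothetical placeholders. Newer Hudi
releases select the incremental view with "hoodie.datasource.query.type";
older 0.5.x releases used "hoodie.datasource.view.type" instead, so check the
key against the version you run.

```python
def hudi_write_options(table_name, record_key, partition_field, precombine_field):
    """Build Hudi DataSource write options for a one-time bulk load.

    All four arguments are placeholders for values from your own dataset.
    """
    return {
        "hoodie.table.name": table_name,
        # bulk_insert suits the initial load; later writes would use "upsert".
        "hoodie.datasource.write.operation": "bulk_insert",
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
    }


def hudi_incremental_read_options(begin_instant):
    """Build read options that return only records committed after begin_instant."""
    return {
        # Assumption: newer key name; Hudi 0.5.x used "hoodie.datasource.view.type".
        "hoodie.datasource.query.type": "incremental",
        "hoodie.datasource.read.begin.instanttime": begin_instant,
    }


def rewrite_parquet_as_hudi(spark, source_path, target_path, options):
    """Read the existing Parquet files and write them back out as a Hudi dataset."""
    df = spark.read.parquet(source_path)
    (df.write.format("org.apache.hudi")
        .options(**options)
        .mode("overwrite")
        .save(target_path))


def read_incremental(spark, target_path, begin_instant):
    """Load only the changes committed after the given instant time."""
    return (spark.read.format("org.apache.hudi")
            .options(**hudi_incremental_read_options(begin_instant))
            .load(target_path))
```

With a SparkSession named spark, usage would look something like
rewrite_parquet_as_hudi(spark, "s3://my-bucket/parquet/", "s3://my-bucket/hudi/my_table",
hudi_write_options("my_table", "id", "event_date", "updated_at")), followed by
read_incremental(spark, "s3://my-bucket/hudi/my_table", "20200212000000").
The bucket, table, and column names here are made up for the example.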

Thanks,
Udit

On 2/12/20, 10:27 AM, "Raghvendra Dhar Dubey" 
<[email protected]> wrote:

    Hi Team,
    
    I want to set up an incremental view of my AWS S3 Parquet data through
    Apache Hudi and query this data through Athena, but Athena does not
    currently support Hudi datasets.
    
    So there are a few questions I would like to understand:
    
    1 - How do I stream S3 Parquet files into a Hudi dataset running on EMR?
    
    2 - How do I query a Hudi dataset running on EMR?
    
    Please help me to understand this.
    
    Thanks
    
    Raghvendra
    
