[ 
https://issues.apache.org/jira/browse/HUDI-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5202:
----------------------------
    Description: 
For Azure blob storage, similar to HUDI-1897

GH issue request
 - [https://github.com/apache/hudi/issues/7158]

Streaming ingestion with DeltaStreamer from a DFS source that contains parquet 
files has a problem, as explained in this blog post: 
[https://medium.com/apache-hudi-blogs/reliable-ingestion-from-aws-s3-using-hudi-b7d5590c78a9]
I'm working on implementing a setup similar to the one described in that post, 
for ingesting parquet files stored in Azure Blob Storage, with event triggers 
delivered to an Azure Storage queue.
Currently, hudi-utilities/sources contains support only for the S3 events source 
(S3EventsSource.java) and incremental pulls from S3 
(S3EventsHoodieIncrSource.java). The ingestion pattern doesn't seem to support 
the equivalent Azure cloud stack.

  was:
For Azure blob storage, similar to HUDI-1897

GH issue request
- https://github.com/apache/hudi/issues/7158


> Implement DeltaStreamer EventsSource and EventsHoodieIncrSource for Azure 
> Blob storage
> --------------------------------------------------------------------------------------
>
>                 Key: HUDI-5202
>                 URL: https://issues.apache.org/jira/browse/HUDI-5202
>             Project: Apache Hudi
>          Issue Type: New Feature
>          Components: deltastreamer
>            Reporter: Raymond Xu
>            Priority: Major
>             Fix For: 0.14.0
>
>
> For Azure blob storage, similar to HUDI-1897
> GH issue request
>  - [https://github.com/apache/hudi/issues/7158]
> Streaming ingestion with DeltaStreamer from a DFS source that contains parquet 
> files has a problem, as explained in this blog post: 
> [https://medium.com/apache-hudi-blogs/reliable-ingestion-from-aws-s3-using-hudi-b7d5590c78a9]
> I'm working on implementing a setup similar to the one described in that post, 
> for ingesting parquet files stored in Azure Blob Storage, with event triggers 
> delivered to an Azure Storage queue.
> Currently, hudi-utilities/sources contains support only for the S3 events source 
> (S3EventsSource.java) and incremental pulls from S3 
> (S3EventsHoodieIncrSource.java). The ingestion pattern doesn't seem to 
> support the equivalent Azure cloud stack.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
