[ 
https://issues.apache.org/jira/browse/FLINK-17590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Gao updated FLINK-17590:
----------------------------
    Description: 
The Hive sink will reuse the Buckets class of StreamingFileSink, which 
encapsulates most of the logic of StreamingFileSink. The Hive sink needs to 
write a piece of meta-info to the Hive metastore after a partition (namely a 
Bucket in StreamingFileSink) has been terminated. Currently, termination is 
determined by event time or processing time 
([FLIP-115|https://cwiki.apache.org/confluence/display/FLINK/FLIP-115%3A+Filesystem+connector+in+Table]).

To support this requirement of the Hive sink, we would add a listener that is 
notified when a bucket is created and when it becomes inactive. A bucket 
becomes inactive once all of its previous records have been committed. The 
Hive sink can then safely write the meta-info once the time has passed the 
bucket's boundary and the bucket has become inactive. 
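
As a rough illustration of the idea above, the following self-contained sketch 
shows how such a listener could drive the decision to write partition 
meta-info. All names here (BucketLifecycleListener, PartitionCommitTracker, 
onTime, the bucketEnd function) are hypothetical and chosen for this example 
only; they are not taken from the Flink code base or the pull request. A 
bucket is reported as ready only when it is inactive and the current time has 
reached its boundary.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.ToLongFunction;

/** Hypothetical listener; illustrative only, not the actual Flink interface. */
interface BucketLifecycleListener {
    void bucketCreated(String bucketId);

    void bucketInactive(String bucketId);
}

/** Tracks bucket state and reports which partitions may be committed. */
class PartitionCommitTracker implements BucketLifecycleListener {
    // bucketId -> true if the bucket is currently inactive
    private final Map<String, Boolean> inactive = new TreeMap<>();
    // Assumption: the end time (boundary) of a bucket can be derived from its id.
    private final ToLongFunction<String> bucketEnd;

    PartitionCommitTracker(ToLongFunction<String> bucketEnd) {
        this.bucketEnd = bucketEnd;
    }

    @Override
    public void bucketCreated(String bucketId) {
        // A new (or re-created) bucket has pending data, so it is active.
        inactive.put(bucketId, false);
    }

    @Override
    public void bucketInactive(String bucketId) {
        // All records previously written to the bucket have been committed.
        inactive.put(bucketId, true);
    }

    /** Returns and forgets the buckets that are inactive and past their boundary. */
    List<String> onTime(long currentTime) {
        List<String> ready = new ArrayList<>();
        Iterator<Map.Entry<String, Boolean>> it = inactive.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Boolean> e = it.next();
            if (e.getValue() && currentTime >= bucketEnd.applyAsLong(e.getKey())) {
                ready.add(e.getKey()); // safe to write meta-info for this partition
                it.remove();
            }
        }
        return ready;
    }
}
```

In this sketch the sink would call onTime with the current watermark or 
processing time and write the meta-info for each returned bucket. Re-creating 
a bucket after it went inactive (for example, because of late records) marks 
it active again, so it is not committed until it becomes inactive once more.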

  was:
The Hive sink will reuse the Buckets class of StreamingFileSink, which 
encapsulates most of the logic of StreamingFileSink. The Hive sink needs to 
write a piece of meta-info to the Hive metastore after a partition (namely a 
Bucket in StreamingFileSink) has been terminated. Currently, termination is 
determined by event time or processing time 
([FLIP-115|[https://cwiki.apache.org/confluence/display/FLINK/FLIP-115%3A+Filesystem+connector+in+Table]]).

To support this requirement of the Hive sink, we would add a listener that is 
notified when a bucket is created and when it becomes inactive. A bucket 
becomes inactive once all of its previous records have been committed. The 
Hive sink can then safely write the meta-info once the time has passed the 
bucket's boundary and the bucket has become inactive. 


> Add Bucket lifecycle listener to support acquiring bucket state
> ---------------------------------------------------------------
>
>                 Key: FLINK-17590
>                 URL: https://issues.apache.org/jira/browse/FLINK-17590
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Connectors / FileSystem
>            Reporter: Yun Gao
>            Assignee: Yun Gao
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.11.0
>
>
> The Hive sink will reuse the Buckets class of StreamingFileSink, which 
> encapsulates most of the logic of StreamingFileSink. The Hive sink needs to 
> write a piece of meta-info to the Hive metastore after a partition (namely a 
> Bucket in StreamingFileSink) has been terminated. Currently, termination is 
> determined by event time or processing time 
> ([FLIP-115|https://cwiki.apache.org/confluence/display/FLINK/FLIP-115%3A+Filesystem+connector+in+Table]).
>
> To support this requirement of the Hive sink, we would add a listener that 
> is notified when a bucket is created and when it becomes inactive. A bucket 
> becomes inactive once all of its previous records have been committed. The 
> Hive sink can then safely write the meta-info once the time has passed the 
> bucket's boundary and the bucket has become inactive. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
