[ https://issues.apache.org/jira/browse/FLINK-10949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698862#comment-16698862 ]

Stefan Richter commented on FLINK-10949:
----------------------------------------

Hi, as commented on the mailing list, I think this needs to be rephrased: the 
seek itself is not blocking, but multiple seek calls are slow. Are you running 
on spinning disks or SSDs? Starting with 1.7 we also expose many internal 
RocksDB metrics. Maybe you can have a look there and see if anything looks 
suspicious? From my experience, seeks can sometimes become slow if there are 
many tombstones for deleted elements: the seek is not precise and has to skip 
over all the tombstones. A more aggressive compaction config could help in 
that case. Or are you only experiencing the slowdown when going from memory 
(caches) to hitting the disk?
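For reference, a minimal sketch of enabling some of the RocksDB native metrics 
mentioned above via flink-conf.yaml — this assumes an upgrade to Flink 1.7, 
where these options were introduced, and the compaction/tombstone-related 
metrics chosen here are just one plausible selection:

{code:yaml}
# RocksDB native metrics are off by default; each one must be enabled explicitly.
state.backend.rocksdb.metrics.estimate-num-keys: true
state.backend.rocksdb.metrics.compaction-pending: true
state.backend.rocksdb.metrics.num-running-compactions: true
state.backend.rocksdb.metrics.estimate-pending-compaction-bytes: true
{code}

If pending-compaction bytes keep growing while seeks slow down, that would 
support the tombstone/compaction theory.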

> When using flink-1.6.2's intervalJoin function, the thread is stuck in 
> RocksDB's seek for too long
> -----------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-10949
>                 URL: https://issues.apache.org/jira/browse/FLINK-10949
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.6.2
>         Environment: flink1.6.2, linux
>            Reporter: Liu
>            Priority: Major
>              Labels: features, performance
>
> Cross-posted on Stack Overflow: 
> [https://stackoverflow.com/questions/53393775/when-use-flink-1-6-2s-intervaljoin-funtion-the-thread-is-stucked-in-seek-for-t]
>   
>  I am using IntervalJoin function to join two streams within 10 minutes. As 
> below:
>  
>  {code:java}
>  labelStream.intervalJoin(adLogStream)
>             .between(Time.milliseconds(0), Time.milliseconds(600000))
>             .process(new processFunction())
>             .sink(kafkaProducer)
>  {code}
>  labelStream and adLogStream are protobuf classes keyed by a Long id.
> Our two input streams are huge. After running for about 30 minutes, the 
> output to Kafka slows down, like this: 
> !https://i.stack.imgur.com/UW4V1.png!
> When the data output begins going down, I used jstack and pstack several 
> times and got these: 
> !https://i.stack.imgur.com/uxOZn.png!
> !https://i.stack.imgur.com/JTyIC.png!
> It seems the program is stuck in RocksDB's seek. And I find that some of 
> RocksDB's SST files are accessed slowly by iteration. 
> [!https://i.stack.imgur.com/Avdyo.png!|https://i.stack.imgur.com/Avdyo.png]
> I have tried several ways:
> 1) Reduce the input amount by half. This works well.
> 2) Replace labelStream and adLogStream with simple Strings, so that the data 
> amount does not change. This works well.
> 3) Use PredefinedOptions like SPINNING_DISK_OPTIMIZED and 
> SPINNING_DISK_OPTIMIZED_HIGH_MEM. This still fails.
> 4) Use newer versions of rocksdbjni. This still fails.
> Can anyone give me some suggestions? Thank you very much.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
