[ 
https://issues.apache.org/jira/browse/LUCENE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10265:
-----------------------------
    Description: 
The merge IO write throttle rate is controlled by `targetMBPerSec` in 
ConcurrentMergeScheduler, and it should never exceed the ceiling (1024 MB/s).

`targetMBPerSec` is shared by many merge threads and is updated as follows:

{code:java}
if (newBacklog) {
      // This new merge adds to the backlog: increase IO throttle by 20%
      targetMBPerSec *= 1.20; 
      if (targetMBPerSec > MAX_MERGE_MB_PER_SEC) {
        targetMBPerSec = MAX_MERGE_MB_PER_SEC;
      }
      ......
} else {
      // We are not falling behind: decrease IO throttle by 10%
      targetMBPerSec /= 1.10;
      if (targetMBPerSec < MIN_MERGE_MB_PER_SEC) {
        targetMBPerSec = MIN_MERGE_MB_PER_SEC;
      }
     ......
}
{code}


The update is not an atomic operation:
# The first merge thread changes `targetMBPerSec` from 1024 to 1024*1.2.
# Another merge thread reads the new value (1024*1.2) before it is clamped.
# The first merge thread then clamps the value back to 1024.
The second thread then works with a rate above the ceiling.
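
The three steps above can be replayed deterministically in a single thread to make the window visible. This is only a sketch: `ThrottleRaceSketch` and `replayInterleaving` are hypothetical names, the field is a stand-in for the scheduler's shared state, and the 1024 ceiling is the value quoted in this report rather than one read from the Lucene source.

```java
// Hypothetical stand-in for the scheduler's shared state. The ceiling is the
// value quoted in this report, not one taken from the Lucene source.
final class ThrottleRaceSketch {
  static final double MAX_MERGE_MB_PER_SEC = 1024.0;

  // Shared field updated in two separate steps, as in the snippet above.
  static double targetMBPerSec = MAX_MERGE_MB_PER_SEC;

  /** Replays the three steps in order and returns the value thread B observed. */
  static double replayInterleaving() {
    targetMBPerSec = MAX_MERGE_MB_PER_SEC;

    // Step 1: thread A applies the 20% increase but has not clamped yet.
    targetMBPerSec *= 1.20;

    // Step 2: thread B reads the shared field inside that window.
    double seenByThreadB = targetMBPerSec;

    // Step 3: thread A clamps the value back to the ceiling.
    if (targetMBPerSec > MAX_MERGE_MB_PER_SEC) {
      targetMBPerSec = MAX_MERGE_MB_PER_SEC;
    }
    return seenByThreadB;
  }

  public static void main(String[] args) {
    double seen = replayInterleaving();
    System.out.println("thread B saw " + seen + ", final value " + targetMBPerSec);
  }
}
```

The field ends up clamped, but the value captured in step 2 is above the ceiling, which is the situation described in the list.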

In production, we do see the IO write throttle rate exceed the 
ceiling (1024 MB/s) during merges.

{code:java}
[2021-11-26T15:27:19,861][TRACE][o.e.i.e.E.MS             ] [data1] [test1][25] elasticsearch[data1][refresh][T#5] MS: io throttle: current merge backlog; leave IO rate at 3589.1 MB/sec
[2021-11-26T15:27:20,304][TRACE][o.e.i.e.E.MS             ] [data1] [test1][13] elasticsearch[data1][write][T#3] MS: io throttle: current merge backlog; leave IO rate at 192.4 MB/sec
[2021-11-26T15:27:25,330][TRACE][o.e.i.e.E.MS             ] [data1] [test1][22] elasticsearch[data1][[test1][22]: Lucene Merge Thread #1026] MS: io throttle: current merge backlog; leave IO rate at 96.3 MB/sec
[2021-11-26T15:27:25,995][TRACE][o.e.i.e.E.MS             ] [data1] [test1][16] elasticsearch[data1][[test1][16]: Lucene Merge Thread #1063] MS: io throttle: current merge backlog; leave IO rate at 419.2 MB/sec
[2021-11-26T15:27:38,335][TRACE][o.e.i.e.E.MS             ] [data1] [test1][19] elasticsearch[data1][write][T#2] MS: io throttle: current merge backlog; leave IO rate at 3051.5 MB/sec
{code}


Should we do the following?
1. Update it with an atomic operation.
2. Add the `volatile` modifier to `targetMBPerSec`.
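
One way the first suggestion could look is sketched below. It is not the Lucene implementation: `AtomicThrottleSketch` and its methods are hypothetical names, the 1024 ceiling is the value quoted in this report, and the floor of 5 is a placeholder. The rate is stored as raw `double` bits in an `AtomicLong`, so scaling and clamping happen inside one atomic update and no reader can observe an unclamped value.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of an atomically updated throttle rate.
final class AtomicThrottleSketch {
  // Assumed limits: 1024 is the ceiling quoted in this report, 5 is a placeholder floor.
  static final double MIN_MERGE_MB_PER_SEC = 5.0;
  static final double MAX_MERGE_MB_PER_SEC = 1024.0;

  // The double is kept as its bit pattern so it can be updated with compare-and-set.
  private final AtomicLong targetBits;

  AtomicThrottleSketch(double startMBPerSec) {
    targetBits = new AtomicLong(Double.doubleToLongBits(startMBPerSec));
  }

  /** Scales and clamps in a single atomic step and returns the new rate. */
  double update(boolean newBacklog) {
    long bits = targetBits.updateAndGet(old -> {
      double current = Double.longBitsToDouble(old);
      // Increase by 20% on new backlog, otherwise decrease by 10%.
      double next = newBacklog ? current * 1.20 : current / 1.10;
      // Clamp before the value is ever published.
      next = Math.max(MIN_MERGE_MB_PER_SEC, Math.min(MAX_MERGE_MB_PER_SEC, next));
      return Double.doubleToLongBits(next);
    });
    return Double.longBitsToDouble(bits);
  }

  /** Returns the current rate; it is always within the two limits. */
  double get() {
    return Double.longBitsToDouble(targetBits.get());
  }
}
```

With an `AtomicLong` the separate `volatile` modifier is not needed, because atomic classes already give visibility across threads. A smaller change with a similar effect on readers would be to compute the new rate in a local variable and assign only the clamped result to a `volatile` field.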



> IO write throttle rate can exceed the ceiling (1024 MB/s) during merges
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-10265
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10265
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/other
>    Affects Versions: 8.6.2
>            Reporter: kkewwei
>            Priority: Major


