[jira] [Commented] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-05-19 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539903#comment-17539903
 ] 

kkewwei commented on LUCENE-10516:
--

For the sparse raw doc values, the data is stored in compressed form as 
(sameCount, detailValue) pairs. In the BKDReader, the visitor still compares 
every docID of such a same-value batch in the loop, so the per-document 
iteration seems useless.

{code:java}
// read cardinality and point
private void visitSparseRawDocValues(int[] commonPrefixLengths, byte[] scratchPackedValue,
    IndexInput in, BKDReaderDocIDSetIterator scratchIterator, int count,
    IntersectVisitor visitor) throws IOException {
  int i;
  for (i = 0; i < count;) {
    // read the number of docs that share the same value
    int length = in.readVInt();
    // read the detail value
    for (int dim = 0; dim < numDataDims; dim++) {
      int prefix = commonPrefixLengths[dim];
      in.readBytes(scratchPackedValue, dim * bytesPerDim + prefix, bytesPerDim - prefix);
    }
    scratchIterator.reset(i, length);
    // the visitor iterates over the batch and compares the same value for every doc
    visitor.visit(scratchIterator, scratchPackedValue);
    i += length;
  }
  if (i != count) {
    throw new CorruptIndexException(
        "Sub blocks do not add up to the expected count: " + count + " != " + i, in);
  }
}
{code}
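
The effect described in this comment can be illustrated with a small, self-contained sketch. It does not use any Lucene classes: `SparseRunSketch`, `runLengths`, `runValues`, `visitPerDoc`, and `visitPerRun` are hypothetical names that only model the (sameCount, detailValue) layout, and a range check on a `long` stands in for the real packed-value comparison.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model only; none of these names exist in Lucene. */
final class SparseRunSketch {

  /** Number of range comparisons performed by the most recent visit call. */
  static int comparisons;

  /** Stand-in for the range check; counts how often it is evaluated. */
  static boolean matches(long value, long min, long max) {
    comparisons++;
    return value >= min && value <= max;
  }

  /** Models the per-document path: the shared value is compared once per docID. */
  static List<Integer> visitPerDoc(int[] docIds, int[] runLengths, long[] runValues,
                                   long min, long max) {
    comparisons = 0;
    List<Integer> hits = new ArrayList<>();
    int offset = 0;
    for (int run = 0; run < runLengths.length; run++) {
      for (int j = 0; j < runLengths[run]; j++) {
        if (matches(runValues[run], min, max)) {
          hits.add(docIds[offset + j]);
        }
      }
      offset += runLengths[run];
    }
    return hits;
  }

  /** Models the proposed path: the shared value is compared once per run. */
  static List<Integer> visitPerRun(int[] docIds, int[] runLengths, long[] runValues,
                                   long min, long max) {
    comparisons = 0;
    List<Integer> hits = new ArrayList<>();
    int offset = 0;
    for (int run = 0; run < runLengths.length; run++) {
      if (matches(runValues[run], min, max)) {
        for (int j = 0; j < runLengths[run]; j++) {
          hits.add(docIds[offset + j]);
        }
      }
      offset += runLengths[run];
    }
    return hits;
  }

  public static void main(String[] args) {
    int[] docIds = {3, 7, 9, 12, 15, 20};
    int[] runLengths = {3, 1, 2};      // sameCount of each run
    long[] runValues = {5L, 42L, 8L};  // detailValue of each run

    List<Integer> perDoc = visitPerDoc(docIds, runLengths, runValues, 0, 10);
    int perDocComparisons = comparisons;
    List<Integer> perRun = visitPerRun(docIds, runLengths, runValues, 0, 10);
    int perRunComparisons = comparisons;

    System.out.println(perDoc.equals(perRun));                          // true
    System.out.println(perDocComparisons + " vs " + perRunComparisons); // 6 vs 3
  }
}
```

Both paths collect the same docIDs; only the number of comparisons differs, and the gap grows with the length of each same-value run.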


> reduce unnecessary loop matches in BKDReader
> 
>
> Key: LUCENE-10516
> URL: https://issues.apache.org/jira/browse/LUCENE-10516
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Affects Versions: 8.6.2
>Reporter: kkewwei
>Priority: Major
>
> In *BKDReader.visitSparseRawDocValues()*, we read a batch of docIDs that all 
> share the same point value, *scratchPackedValue*, and then call 
> *visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs 
> match the range.
> {code:java}
> default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
>   int docID;
>   while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
>     visit(docID, packedValue);
>   }
> }
> {code}
> The packedValue is the same for the whole batch of docIDs: if the first doc 
> matches the range, all other docIDs in the batch match as well, so comparing 
> the value again for every doc in the loop seems useless.
> We should implement the method as follows:
> {code:java}
> public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
>   if (matches(packedValue)) {
>     int docID;
>     while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
>       visit(docID);
>     }
>   }
> }
> {code}
> https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196
> We should also override *visit(DocIdSetIterator iterator, byte[] 
> packedValue)* in *ExitableDirectoryReader$ExitableIntersectVisitor* to avoid 
> falling back to the default implementation:
> {code:java}
> @Override
> public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
>   queryCancellation.checkCancelled();
>   in.visit(iterator, packedValue);
> }
> {code}
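
Why the wrapper needs its own batch override can be shown with a minimal sketch. The types below are simplified stand-ins rather than Lucene's API: `Visitor`, `RangeVisitor`, `PerDocWrapper`, and `BatchWrapper` are hypothetical names, a `PrimitiveIterator.OfInt` replaces `DocIdSetIterator`, and the cancellation check is left out. The sketch assumes, as the description states, that a wrapper without the override inherits the default batch method.

```java
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

/** Illustrative model only; the types below are simplified stand-ins, not Lucene's API. */
final class DelegatingVisitorSketch {

  /** Simplified visitor with a per-document method and a batch method. */
  interface Visitor {
    void visit(int docID, byte[] packedValue);

    /** Default batch method: falls back to one per-document call for every docID. */
    default void visit(PrimitiveIterator.OfInt docs, byte[] packedValue) {
      while (docs.hasNext()) {
        visit(docs.nextInt(), packedValue);
      }
    }
  }

  /** Inner visitor whose batch method checks the shared value only once. */
  static final class RangeVisitor implements Visitor {
    private final byte min;
    private final byte max;
    int comparisons;
    int hits;

    RangeVisitor(byte min, byte max) {
      this.min = min;
      this.max = max;
    }

    private boolean matches(byte[] packedValue) {
      comparisons++;
      return packedValue[0] >= min && packedValue[0] <= max;
    }

    @Override
    public void visit(int docID, byte[] packedValue) {
      if (matches(packedValue)) {
        hits++;
      }
    }

    @Override
    public void visit(PrimitiveIterator.OfInt docs, byte[] packedValue) {
      if (matches(packedValue)) {
        while (docs.hasNext()) {
          docs.nextInt();
          hits++;
        }
      }
    }
  }

  /** Wrapper that forwards only the per-document method and inherits the default batch method. */
  static final class PerDocWrapper implements Visitor {
    private final Visitor in;

    PerDocWrapper(Visitor in) {
      this.in = in;
    }

    @Override
    public void visit(int docID, byte[] packedValue) {
      in.visit(docID, packedValue);
    }
  }

  /** Wrapper that also forwards the batch method, as the description proposes. */
  static final class BatchWrapper implements Visitor {
    private final Visitor in;

    BatchWrapper(Visitor in) {
      this.in = in;
    }

    @Override
    public void visit(int docID, byte[] packedValue) {
      in.visit(docID, packedValue);
    }

    @Override
    public void visit(PrimitiveIterator.OfInt docs, byte[] packedValue) {
      in.visit(docs, packedValue);
    }
  }

  /** Visits 100 docIDs that share one value and returns {comparisons, hits}. */
  static int[] run(boolean forwardBatch) {
    RangeVisitor inner = new RangeVisitor((byte) 0, (byte) 10);
    Visitor wrapper = forwardBatch ? new BatchWrapper(inner) : new PerDocWrapper(inner);
    wrapper.visit(IntStream.range(0, 100).iterator(), new byte[] {5});
    return new int[] {inner.comparisons, inner.hits};
  }

  public static void main(String[] args) {
    System.out.println(run(false)[0]); // 100 comparisons through the default batch method
    System.out.println(run(true)[0]);  // 1 comparison when the batch call is forwarded
  }
}
```

In this model the inner visitor's optimized batch method is only reached when the wrapper forwards the batch call; otherwise every docID goes through the per-document path again.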



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-04-15 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei resolved LUCENE-10448.
--
Resolution: Not A Problem

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
>
>   double rate = mbPerSec;
>   double secondsToPause = (bytes / 1024. / 1024.) / rate;
>   long targetNS = lastNS + (long) (1000000000 * secondsToPause);
>   long curPauseNS = targetNS - curNS;
>   // We don't bother with thread pausing if the pause is smaller than 2 msec.
>   if (curPauseNS <= MIN_PAUSE_NS) {
>     // Set to curNS, not targetNS, to enforce the instant rate, not
>     // the "averaged over all history" rate:
>     lastNS = curNS;
>     return -1;
>   }
>   ..
> }
> {code}
> Suppose a segment is being merged: *maybePause* is called at 7:00, so 
> lastNS=7:00, and it is called again at 7:05. *targetNS* (*lastNS* plus the 
> computed pause time) then ends up smaller than *curNS* almost regardless of 
> how many bytes were written, so we return -1 and skip the pause. 
> I counted the total number of *maybePause* calls (callTimes), the number of 
> skipped pauses (ignorePauseTimes), and the bytes involved in each skipped 
> pause (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = 
> [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 
> 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 
> 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 
> 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
> {code}
> *maybePause* was called 857 times, and 25 of those calls skipped the pause. 
> The byte counts of the skipped calls (for example 0.28125 MB) are not small.
> As long as the interval between two *maybePause* calls is relatively long, 
> a pause that should have been executed is skipped.
>  
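
The arithmetic behind this report can be reproduced with a short sketch. It is not the Lucene class: `MergePauseSketch` and `pauseNS` are hypothetical names, `MIN_PAUSE_NS` is assumed to be 2 ms based on the comment in the quoted code, the pause target is assumed to be in nanoseconds, and the rate and byte count are taken from the log line above.

```java
/** Illustrative arithmetic only; this is not the Lucene class. */
final class MergePauseSketch {

  /** Assumed from the "smaller than 2 msec" comment in the quoted code. */
  static final long MIN_PAUSE_NS = 2_000_000L;

  /** Returns the pause in nanoseconds, or -1 when the pause is skipped. */
  static long pauseNS(double mbPerSec, long bytes, long lastNS, long curNS) {
    double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
    // Assumption: targetNS is in nanoseconds, hence the factor of 10^9.
    long targetNS = lastNS + (long) (1_000_000_000L * secondsToPause);
    long curPauseNS = targetNS - curNS;
    return curPauseNS <= MIN_PAUSE_NS ? -1 : curPauseNS;
  }

  public static void main(String[] args) {
    double mbPerSec = 11.2; // throttle rate reported in the log line
    long bytes = 294_912L;  // 0.28125 MB, one of the skipped writes

    // Next call arrives 1 ms after the previous one: a pause of roughly 24 ms is due.
    System.out.println(pauseNS(mbPerSec, bytes, 0L, 1_000_000L));
    // Next call arrives 5 s after the previous one: the pause is skipped.
    System.out.println(pauseNS(mbPerSec, bytes, 0L, 5_000_000_000L));
  }
}
```

At 11.2 MB/sec a write of 0.28125 MB corresponds to about 25 ms of pause, so in this model any gap between calls longer than that already drives the result to -1.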






[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Description: 
In *BKDReader.visitSparseRawDocValues()*, we read a batch of docIDs that all 
share the same point value, *scratchPackedValue*, and then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
The packedValue is the same for the whole batch of docIDs: if the first doc 
matches the range, all other docIDs in the batch match as well, so comparing 
the value again for every doc in the loop is useless.

We should implement the method as follows:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196

We should also override *visit(DocIdSetIterator iterator, byte[] 
packedValue)* in *ExitableDirectoryReader$ExitableIntersectVisitor* to avoid 
falling back to the default implementation:
{code:java}
@Override
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  queryCancellation.checkCancelled();
  in.visit(iterator, packedValue);
}
{code}



  was:
In *BKDReader.visitSparseRawDocValues()*, we will read a batch of docIds which 
have the same point value:scratchPackedValue, then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws 
IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) { 
visit(docID, packedValue); 
  }
}
{code}
We know that the packedValue are same for the batch of docIds, if the first doc 
match the range, the batch of other docIds will also match the range, so the 
loop is useless.

We should call the method as follow:

{code:java}
  public void visit(DocIdSetIterator iterator, byte[] packedValue) 
throws IOException {
if (matches(packedValue)) {
  int docID;
  while ((docID = iterator.nextDoc()) != 
DocIdSetIterator.NO_MORE_DOCS) {
visit(docID);
  }
}
  }
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196

If we should override the *visit(DocIdSetIterator iterator, byte[] 
packedValue)* in *ExitableDirectoryReader$ExitableIntersectVisitor* to avoid 
calling the default implement:
{code:java}
@Override
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws 
IOException {
queryCancellation.checkCancelled();
in.visit(iterator, packedValue);
}
{code}




> reduce unnecessary loop matches in BKDReader
> 
>
> Key: LUCENE-10516
> URL: https://issues.apache.org/jira/browse/LUCENE-10516
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Affects Versions: 8.6.2
>Reporter: kkewwei
>Priority: Major
>

[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Description: 
In *BKDReader.visitSparseRawDocValues()*, we read a batch of docIDs that all 
share the same point value, *scratchPackedValue*, and then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
The packedValue is the same for the whole batch of docIDs: if the first doc 
matches the range, all other docIDs in the batch match as well, so comparing 
the value again for every doc in the loop seems useless.

We should implement the method as follows:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196

We should also override *visit(DocIdSetIterator iterator, byte[] 
packedValue)* in *ExitableDirectoryReader$ExitableIntersectVisitor* to avoid 
falling back to the default implementation:
{code:java}
@Override
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  queryCancellation.checkCancelled();
  in.visit(iterator, packedValue);
}
{code}



  was:
In *BKDReader.visitSparseRawDocValues()*, we will read a batch of docIds which 
have the same point value:*scratchPackedValue*, then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws 
IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) { 
visit(docID, packedValue); 
  }
}
{code}
We know that the packedValue are same for the batch of docIds, if the first doc 
match the range, the batch of other docIds will also match the range, so the 
loop is useless.

We should call the method as follow:

{code:java}
  public void visit(DocIdSetIterator iterator, byte[] packedValue) 
throws IOException {
if (matches(packedValue)) {
  int docID;
  while ((docID = iterator.nextDoc()) != 
DocIdSetIterator.NO_MORE_DOCS) {
visit(docID);
  }
}
  }
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196

If we should override the *visit(DocIdSetIterator iterator, byte[] 
packedValue)* in *ExitableDirectoryReader$ExitableIntersectVisitor* to avoid 
calling the default implement:
{code:java}
@Override
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws 
IOException {
queryCancellation.checkCancelled();
in.visit(iterator, packedValue);
}
{code}




> reduce unnecessary loop matches in BKDReader
> 
>
> Key: LUCENE-10516
> URL: https://issues.apache.org/jira/browse/LUCENE-10516
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Affects Versions: 8.6.2
>Reporter: kkewwei
>Priority: Major
>

[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Affects Version/s: (was: 8.11.1)

> reduce unnecessary loop matches in BKDReader
> 
>
> Key: LUCENE-10516
> URL: https://issues.apache.org/jira/browse/LUCENE-10516
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Affects Versions: 8.6.2
>Reporter: kkewwei
>Priority: Major
>






[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Description: 
In *BKDReader.visitSparseRawDocValues()*, we read a batch of docIDs that all 
share the same point value, scratchPackedValue, and then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
The packedValue is the same for the whole batch of docIDs: if the first doc 
matches the range, all other docIDs in the batch match as well, so comparing 
the value again for every doc in the loop is useless.

We should implement the method as follows:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196

We should also override *visit(DocIdSetIterator iterator, byte[] 
packedValue)* in *ExitableDirectoryReader$ExitableIntersectVisitor* to avoid 
falling back to the default implementation:
{code:java}
@Override
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  queryCancellation.checkCancelled();
  in.visit(iterator, packedValue);
}
{code}



  was:
In *BKDReader.visitSparseRawDocValues()*, we will read a batch of docIds which 
have the same point value:scratchPackedValue, then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws 
IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) { 
visit(docID, packedValue); 
  }
}
{code}
we know that the packedValue are same for the batch of docIds, if the first doc 
match the range, the batch of other docIds will also match the range, so the 
loop is useless.

we should call the method as follow:

{code:java}
  public void visit(DocIdSetIterator iterator, byte[] packedValue) 
throws IOException {
if (matches(packedValue)) {
  int docID;
  while ((docID = iterator.nextDoc()) != 
DocIdSetIterator.NO_MORE_DOCS) {
visit(docID);
  }
}
  }
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196



> reduce unnecessary loop matches in BKDReader
> 
>
> Key: LUCENE-10516
> URL: https://issues.apache.org/jira/browse/LUCENE-10516
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: kkewwei
>Priority: Major
>

[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Description: 
In *BKDReader.visitSparseRawDocValues()*, we read a batch of docIDs that all 
share the same point value, scratchPackedValue, and then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
The packedValue is the same for the whole batch of docIDs: if the first doc 
matches the range, all other docIDs in the batch match as well, so comparing 
the value again for every doc in the loop is useless.

We should implement the method as follows:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196

We should also override *visit(DocIdSetIterator iterator, byte[] 
packedValue)* in *ExitableDirectoryReader$ExitableIntersectVisitor* to avoid 
falling back to the default implementation:
{code:java}
@Override
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  queryCancellation.checkCancelled();
  in.visit(iterator, packedValue);
}
{code}



  was:
In *BKDReader.visitSparseRawDocValues()*, we will read a batch of docIds which 
have the same point value:scratchPackedValue, then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws 
IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) { 
visit(docID, packedValue); 
  }
}
{code}
We know that the packedValue are same for the batch of docIds, if the first doc 
match the range, the batch of other docIds will also match the range, so the 
loop is useless.

We should call the method as follow:

{code:java}
  public void visit(DocIdSetIterator iterator, byte[] packedValue) 
throws IOException {
if (matches(packedValue)) {
  int docID;
  while ((docID = iterator.nextDoc()) != 
DocIdSetIterator.NO_MORE_DOCS) {
visit(docID);
  }
}
  }
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196

if we should override the *visit(DocIdSetIterator iterator, byte[] 
packedValue)* in *ExitableDirectoryReader$ExitableIntersectVisitor* to avoid 
calling the default implement:
{code:java}
@Override
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws 
IOException {
queryCancellation.checkCancelled();
in.visit(iterator, packedValue);
}
{code}




> reduce unnecessary loop matches in BKDReader
> 
>
> Key: LUCENE-10516
> URL: https://issues.apache.org/jira/browse/LUCENE-10516
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: kkewwei
>Priority: Major
>

[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Affects Version/s: 8.11.1
   8.6.2

> reduce unnecessary loop matches in BKDReader
> 
>
> Key: LUCENE-10516
> URL: https://issues.apache.org/jira/browse/LUCENE-10516
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Affects Versions: 8.6.2, 8.11.1
>Reporter: kkewwei
>Priority: Major
>






[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Description: 
In *BKDReader.visitSparseRawDocValues()*, we read a batch of docIDs that all 
share the same point value, scratchPackedValue, and then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
The packedValue is the same for the whole batch of docIDs: if the first doc 
matches the range, all other docIDs in the batch match as well, so comparing 
the value again for every doc in the loop is useless.

We should implement the method as follows:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196


  was:
In *BKDReader.visitSparseRawDocValues()*, we will read a batch of docIds which 
have the same point value:scratchPackedValue, then call 
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match 
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws 
IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) { 
visit(docID, packedValue); 
  }
}
{code}
we know that the packedValue are same, if the first doc match the range, the 
batch of docIds will also match the range, so the loop is useless.

we should call the method as follow:

{code:java}
  public void visit(DocIdSetIterator iterator, byte[] packedValue) 
throws IOException {
if (matches(packedValue)) {
  int docID;
  while ((docID = iterator.nextDoc()) != 
DocIdSetIterator.NO_MORE_DOCS) {
visit(docID);
  }
}
  }
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196



> reduce unnecessary loop matches in BKDReader
> 
>
> Key: LUCENE-10516
> URL: https://issues.apache.org/jira/browse/LUCENE-10516
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/other
>Reporter: kkewwei
>Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10516:
-
Description: 
In *BKDReader.visitSparseRawDocValues()*, we read a batch of docIds that share
the same point value (scratchPackedValue), then call
*visitor.visit(scratchIterator, scratchPackedValue)* to find which docIDs match
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
Because packedValue is the same for the whole batch, if the first doc matches
the range then every other docId in the batch matches too, so re-checking the
value inside the loop is unnecessary.

We should implement the method as follows:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196


  was:
In `BKDReader.visitSparseRawDocValues()`, we read a batch of docIds that share
the same point value (scratchPackedValue), then call
`visitor.visit(scratchIterator, scratchPackedValue)` to find which docIDs match
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
Because packedValue is the same for the whole batch, if the first doc matches
the range then every other docId in the batch matches too, so re-checking the
value inside the loop is unnecessary.

We should implement the method as follows:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196






[jira] [Created] (LUCENE-10516) reduce unnecessary loop matches in BKDReader

2022-04-14 Thread kkewwei (Jira)
kkewwei created LUCENE-10516:


 Summary: reduce unnecessary loop matches in BKDReader
 Key: LUCENE-10516
 URL: https://issues.apache.org/jira/browse/LUCENE-10516
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/other
Reporter: kkewwei


In `BKDReader.visitSparseRawDocValues()`, we read a batch of docIds that share
the same point value (scratchPackedValue), then call
`visitor.visit(scratchIterator, scratchPackedValue)` to find which docIDs match
the range.

{code:java}
default void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  int docID;
  while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    visit(docID, packedValue);
  }
}
{code}
Because packedValue is the same for the whole batch, if the first doc matches
the range then every other docId in the batch matches too, so re-checking the
value inside the loop is unnecessary.

We should implement the method as follows:

{code:java}
public void visit(DocIdSetIterator iterator, byte[] packedValue) throws IOException {
  if (matches(packedValue)) {
    int docID;
    while ((docID = iterator.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
      visit(docID);
    }
  }
}
{code}

https://github.com/apache/lucene/blob/2e941fcfed6cad3d9c8667ff5324cd04858ba547/lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java#L196







[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-23 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17511062#comment-17511062
 ] 

kkewwei edited comment on LUCENE-10448 at 3/23/22, 7:16 AM:


For an instant rate limit, the question is whether we care about these
non-paused chunk writes:
* If we measure the elapsed time from the previous chunk write (the next chunk
write may happen after a long gap), the computed average rate will be small.
* If we measure the elapsed time from when we start writing the first byte of
the current chunk ([PR-#741|https://github.com/apache/lucene/pull/741]), the
rate will far exceed the configured limit (the actual rate depends on the disk).

I don't know whether this is a bug; if it is not, I will close the issue.


was (Author: kkewwei):
For an instant rate limit, the question is whether we care about these
non-paused chunk writes:
* If we measure the elapsed time from the previous chunk write (the next chunk
write may happen after a long gap), the computed average rate will be small.
* If we measure the elapsed time from when we start writing the first byte of
the current chunk ([PR-#741|https://github.com/apache/lucene/pull/741]), the
rate will far exceed the configured limit.

I don't know whether this is a bug; if it is not, I will close the issue.
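
The two ways of measuring elapsed time can be compared with a rough sketch. InstantRateSketch and mbPerSec are hypothetical names, and the 5 second idle gap and 1 ms write duration are assumed numbers chosen only for illustration; the 0.28125 MB chunk size and the 11.2 MB/sec limit are taken from the log quoted in the issue.

```java
/** Hypothetical helper; not part of Lucene. */
class InstantRateSketch {
  /** Observed rate in MB/sec for a chunk of the given size over the given elapsed time. */
  static double mbPerSec(long bytes, long elapsedNS) {
    return (bytes / 1024. / 1024.) / (elapsedNS / 1e9);
  }

  public static void main(String[] args) {
    long chunkBytes = 294912L;        // 0.28125 MB, as in the quoted log
    long idleGapNS = 5_000_000_000L;  // assumed: 5 s with no writes before this chunk
    long writeNS = 1_000_000L;        // assumed: the chunk itself takes 1 ms to write

    // Baseline 1: time since the previous chunk write, so the idle gap is included.
    double fromPreviousWrite = mbPerSec(chunkBytes, idleGapNS + writeNS);
    // Baseline 2: time since the first byte of the current chunk, so only the write counts.
    double fromChunkStart = mbPerSec(chunkBytes, writeNS);

    System.out.println("rate measured from previous write: " + fromPreviousWrite + " MB/sec");
    System.out.println("rate measured from chunk start:    " + fromChunkStart + " MB/sec");
  }
}
```

With these assumed numbers the first baseline falls far below the 11.2 MB/sec limit and the second far above it, which is the trade-off described in the comment.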

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
>
>   double rate = mbPerSec;
>   double secondsToPause = (bytes / 1024. / 1024.) / rate;
>   long targetNS = lastNS + (long) (1000000000 * secondsToPause);
>   long curPauseNS = targetNS - curNS;
>   // We don't bother with thread pausing if the pause is smaller than 2 msec.
>   if (curPauseNS <= MIN_PAUSE_NS) {
>     // Set to curNS, not targetNS, to enforce the instant rate, not
>     // the "averaged over all history" rate:
>     lastNS = curNS;
>     return -1;
>   }
>   ..
> }
> {code}
> Suppose a segment is being merged and *maybePause* is called at 7:00, so
> lastNS=7:00. If *maybePause* is next called at 7:05, then
> *targetNS=lastNS + (long) (1000000000 * secondsToPause)* will be smaller than
> *curNS* for practically any value of bytes, so we return -1 and skip the
> pause.
> I counted the total number of *maybePause* calls (callTimes), the number of
> calls where the pause was skipped (ignorePauseTimes), and the byte counts of
> those skipped calls (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = 
> [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 
> 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 
> 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 
> 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
> {code}
> Of the 857 calls to *maybePause*, 25 skipped the pause, and the byte counts
> of those skipped calls (such as 0.28125 MB) are not small.
> As long as the interval between two *maybePause* calls is relatively long,
> a pause that should be executed is skipped.
>  
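
The skipped pause can be reproduced with a minimal sketch of the arithmetic in the quoted branch. PauseSkipSketch and pauseNS are hypothetical names, not Lucene's API. The sketch assumes that the multiplier converts seconds to nanoseconds (1e9) and that MIN_PAUSE_NS is 2 msec, as the quoted comment suggests. The chunk size and rate come from the log above.

```java
/** Hypothetical re-statement of the quoted branch; not Lucene's MergeRateLimiter. */
class PauseSkipSketch {
  /** Assumed threshold: 2 msec expressed in nanoseconds. */
  static final long MIN_PAUSE_NS = 2_000_000L;

  /** Returns the pause in nanoseconds, or -1 when the pause is skipped. */
  static long pauseNS(long bytes, double mbPerSec, long lastNS, long curNS) {
    double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
    // Assumption: seconds are converted to nanoseconds here.
    long targetNS = lastNS + (long) (1_000_000_000L * secondsToPause);
    long curPauseNS = targetNS - curNS;
    return curPauseNS <= MIN_PAUSE_NS ? -1 : curPauseNS;
  }

  public static void main(String[] args) {
    long bytes = 294912L;   // 0.28125 MB, one of the skipped chunks in the log
    double rate = 11.2;     // MB/sec throttle reported in the log

    // Previous call 1 ms ago: a pause of roughly 24 ms is requested.
    System.out.println("short gap: " + pauseNS(bytes, rate, 0L, 1_000_000L));
    // Previous call 5 minutes ago (7:00 then 7:05): the pause is skipped.
    System.out.println("long gap:  " + pauseNS(bytes, rate, 0L, 300_000_000_000L));
  }
}
```

Because targetNS is derived from lastNS, any gap longer than the computed pause makes curPauseNS negative, so the chunk is written without throttling.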






[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-23 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17511062#comment-17511062
 ] 

kkewwei edited comment on LUCENE-10448 at 3/23/22, 7:15 AM:


For an instant rate limit, the question is whether we care about these
non-paused chunk writes:
* If we measure the elapsed time from the previous chunk write (the next chunk
write may happen after a long gap), the computed average rate will be small.
* If we measure the elapsed time from when we start writing the first byte of
the current chunk ([PR-#741|https://github.com/apache/lucene/pull/741]), the
rate will far exceed the configured limit.
I don't know whether this is a bug; if it is not, I will close the issue.


was (Author: kkewwei):
For an instant rate limit, the question is whether we care about these
non-paused chunk writes:
* If we measure the elapsed time from the previous chunk write (the next chunk
write may happen after a long gap), the computed average rate will be small.
* If we measure the elapsed time from when we start writing the first byte of
the current chunk ([PR-#741|https://github.com/apache/lucene/pull/741]), the
rate will far exceed the configured limit.




[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-23 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17511062#comment-17511062
 ] 

kkewwei edited comment on LUCENE-10448 at 3/23/22, 7:15 AM:


For an instant rate limit, the question is whether we care about these
non-paused chunk writes:
* If we measure the elapsed time from the previous chunk write (the next chunk
write may happen after a long gap), the computed average rate will be small.
* If we measure the elapsed time from when we start writing the first byte of
the current chunk ([PR-#741|https://github.com/apache/lucene/pull/741]), the
rate will far exceed the configured limit.

I don't know whether this is a bug; if it is not, I will close the issue.


was (Author: kkewwei):
For an instant rate limit, the question is whether we care about these
non-paused chunk writes:
* If we measure the elapsed time from the previous chunk write (the next chunk
write may happen after a long gap), the computed average rate will be small.
* If we measure the elapsed time from when we start writing the first byte of
the current chunk ([PR-#741|https://github.com/apache/lucene/pull/741]), the
rate will far exceed the configured limit.
I don't know whether this is a bug; if it is not, I will close the issue.




[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-23 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17511062#comment-17511062
 ] 

kkewwei edited comment on LUCENE-10448 at 3/23/22, 7:14 AM:


For an instant rate limit, the question is whether we care about these
non-paused chunk writes:
* If we measure the elapsed time from the previous chunk write (the next chunk
write may happen after a long gap), the computed average rate will be small.
* If we measure the elapsed time from when we start writing the first byte of
the current chunk ([PR-#741|https://github.com/apache/lucene/pull/741]), the
rate will far exceed the configured limit.


was (Author: kkewwei):
For an instant rate limit, the question is whether we care about these
non-paused chunk writes:
* If we measure the elapsed time from the previous chunk write (the next chunk
write may happen after a long gap), the computed average rate will be small.
* If we measure the elapsed time from when we start writing the first byte of
the current chunk (PR-#741), the rate will far exceed the configured limit.




[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-23 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17511062#comment-17511062
 ] 

kkewwei edited comment on LUCENE-10448 at 3/23/22, 7:13 AM:


For an instant rate limit, the question is whether we care about these
non-paused chunk writes:
* If we measure the elapsed time from the previous chunk write (the next chunk
write may happen after a long gap), the computed average rate will be small.
* If we measure the elapsed time from when we start writing the first byte of
the current chunk (PR-#741), the rate will far exceed the configured limit.


was (Author: kkewwei):
For an instant rate limit, the question is whether we care about these
non-paused writes:
* If we measure the elapsed time from the previous chunk write (the next chunk
write may happen after a long gap), the computed average rate will be small.
* If we measure the elapsed time from when we start writing the first byte of
the current chunk (PR-#741), the rate will far exceed the configured limit.




[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-23 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17511062#comment-17511062
 ] 

kkewwei commented on LUCENE-10448:
--

For an instant rate limit, the question is whether we care about these
non-paused writes:
* If we measure the elapsed time from the previous chunk write (the next chunk
write may happen after a long gap), the computed average rate will be small.
* If we measure the elapsed time from when we start writing the first byte of
the current chunk (PR-#741), the rate will far exceed the configured limit.




[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-22 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17510477#comment-17510477
 ] 

kkewwei edited comment on LUCENE-10448 at 3/22/22, 1:09 PM:


The optimization seems to have nothing to do with memory pressure. In theory,
it is guaranteed that no big chunks will be written to disk; what am I missing?


was (Author: kkewwei):
The optimization seems to have nothing to do with memory pressure. In theory,
it is guaranteed that no big chunks will be written to disk. What am I missing?




[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-22 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17510477#comment-17510477
 ] 

kkewwei edited comment on LUCENE-10448 at 3/22/22, 1:08 PM:


The optimization seems to have nothing to do with memory pressure. In theory,
it is guaranteed that no big chunks will be written to disk. What am I missing?


was (Author: kkewwei):
The optimization seems to have nothing to do with memory pressure. In theory,
it is guaranteed that no big chunks will be written to disk.

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
>   private long maybePause(long bytes, long curNS) throws 
>       MergePolicy.MergeAbortedException {
>     double rate = mbPerSec;
>     double secondsToPause = (bytes / 1024. / 1024.) / rate;
>     long targetNS = lastNS + (long) (1000000000 * secondsToPause);
>     long curPauseNS = targetNS - curNS;
>     // We don't bother with thread pausing if the pause is smaller than 2 msec.
>     if (curPauseNS <= MIN_PAUSE_NS) {
>       // Set to curNS, not targetNS, to enforce the instant rate, not
>       // the "averaged over all history" rate:
>       lastNS = curNS;
>       return -1;
>     }
>     ..
>   }
> {code}
> While a segment is being merged, suppose *maybePause* is called at 7:00, so 
> lastNS=7:00, and is then called again at 7:05. The value of 
> *targetNS=lastNS + (long) (1000000000 * secondsToPause)* is then smaller than 
> *curNS* for any realistic number of bytes, so we return -1 and skip the 
> pause. 
> I counted the total number of *maybePause* calls (callTimes), the number of 
> skipped pauses (ignorePauseTimes), and the bytes involved in each skipped 
> pause (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = 
> [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 
> 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 
> 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 
> 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
> {code}
> *maybePause* was called 857 times and the pause was skipped 25 times; the 
> byte counts of the skipped pauses (such as 0.28125 MB) are not small.
> As long as the interval between two *maybePause* calls is relatively long, 
> a pause that should be executed is skipped.
>  
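
The skip condition described in the quoted issue can be reproduced in isolation. The following is only a minimal sketch, not Lucene's implementation: the class name MaybePauseSketch and the method pauseNS are hypothetical, MIN_PAUSE_NS is assumed to be 2 ms (taken from the "smaller than 2 msec" comment), and the multiplier is assumed to convert seconds to nanoseconds (1,000,000,000). The byte count and rate come from the log line above (0.28125 MB at 11.2 MB/sec).

```java
// Sketch of the pause decision only; names and constants are assumptions.
class MaybePauseSketch {
    // Assumed: 2 ms, per the "smaller than 2 msec" comment in the quoted code.
    static final long MIN_PAUSE_NS = 2_000_000L;
    static final long NS_PER_SEC = 1_000_000_000L;

    /** Returns the pause length in nanoseconds, or -1 when the pause is skipped. */
    static long pauseNS(long bytes, double mbPerSec, long lastNS, long curNS) {
        double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
        long targetNS = lastNS + (long) (NS_PER_SEC * secondsToPause);
        long curPauseNS = targetNS - curNS;
        return curPauseNS <= MIN_PAUSE_NS ? -1 : curPauseNS;
    }

    public static void main(String[] args) {
        long bytes = (long) (0.28125 * 1024 * 1024); // one of the skipped sizes in the log
        double rate = 11.2;                          // MB/sec throttle from the log

        // Calls 1 ms apart: the limiter asks for a pause of roughly 24 ms.
        System.out.println(pauseNS(bytes, rate, 0L, 1_000_000L));

        // Calls five minutes apart (7:00 then 7:05): targetNS is far behind curNS,
        // so the result is -1 and the bytes go out unthrottled.
        System.out.println(pauseNS(bytes, rate, 0L, 300L * NS_PER_SEC));
    }
}
```

With the same bytes and rate, only the gap between the two calls decides whether a pause happens, which is the behaviour the issue reports.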






[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-22 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17510477#comment-17510477
 ] 

kkewwei commented on LUCENE-10448:
--

The optimization seems to have nothing to do with memory pressure; in theory, 
it is guaranteed that there will be no big chunks.

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h






[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-22 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17510477#comment-17510477
 ] 

kkewwei edited comment on LUCENE-10448 at 3/22/22, 1:00 PM:


The optimization seems to have nothing to do with memory pressure; in theory, 
it is guaranteed that no big chunks will be written to disk.


was (Author: kkewwei):
The optimization seems to have nothing to do with memory pressure; in theory, 
it is guaranteed that there will be no big chunks.

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h






[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-21 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509637#comment-17509637
 ] 

kkewwei commented on LUCENE-10448:
--

{quote}
I think it is better to check and pause before actually writing instead of 
after
{quote}
In that case, the code should be:
{code:java}
  @Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
    checkRate();
    bytesSinceLastPause += length;
    delegate.writeBytes(b, offset, length);
  }
{code}
instead of the current order. It may be a minor change that is not worth making.


> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h






[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-18 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509171#comment-17509171
 ] 

kkewwei edited comment on LUCENE-10448 at 3/19/22, 5:39 AM:


[~vigyas] I tested again, looking for any skipped-pause byte counts more than 
1.5 times minPauseCheckBytes, but found none. Maybe we should write it in 
chunks logically.
 
If, as you say, we should include the pause time, MergeRateLimiter is already 
doing just that.

[~jpountz], please help confirm RateLimitedIndexOutput.writeBytes():
{code:java}
  @Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
    bytesSinceLastPause += length;
    checkRate();
    delegate.writeBytes(b, offset, length);
  }
{code}

We should execute delegate.writeBytes(b, offset, length) first and checkRate() 
afterwards. In the current order the bytes are counted as written before they 
have actually been written, which is a small logical problem. We should change 
the order as follows:
{code:java}
  @Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
    bytesSinceLastPause += length;
    delegate.writeBytes(b, offset, length);
    checkRate();
  }
{code}
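
The difference between the two orderings can be made visible with a small stand-in. This is only an illustrative sketch under stated assumptions: WriteOrderSketch, bytesCounted, bytesWritten and the checks list are hypothetical names, the delegate write is replaced by a counter, and checkRate() here merely records what it sees (the real method decides whether to pause and resets the counter, which is omitted).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a rate-limited output; it records the state
// visible at each rate check instead of pausing.
class WriteOrderSketch {
    long bytesCounted; // plays the role of bytesSinceLastPause
    long bytesWritten; // bytes actually handed to the (simulated) delegate
    final List<long[]> checks = new ArrayList<>(); // {counted, written} at each check

    void checkRate() {
        checks.add(new long[] {bytesCounted, bytesWritten});
    }

    // Ordering quoted above: count, check, then write.
    void writeCheckFirst(int length) {
        bytesCounted += length;
        checkRate();
        bytesWritten += length; // stands in for delegate.writeBytes(...)
    }

    // Proposed ordering: count, write, then check.
    void writeThenCheck(int length) {
        bytesCounted += length;
        bytesWritten += length; // stands in for delegate.writeBytes(...)
        checkRate();
    }

    public static void main(String[] args) {
        WriteOrderSketch current = new WriteOrderSketch();
        current.writeCheckFirst(100);
        long[] seen = current.checks.get(0);
        System.out.println("check-first: counted=" + seen[0] + " written=" + seen[1]);

        WriteOrderSketch proposed = new WriteOrderSketch();
        proposed.writeThenCheck(100);
        seen = proposed.checks.get(0);
        System.out.println("write-first: counted=" + seen[0] + " written=" + seen[1]);
    }
}
```

In the check-first ordering the rate check sees bytes that have been counted but not yet written; in the write-first ordering the two counters agree at check time.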





was (Author: kkewwei):
[~vigyas] I tested again, looking for any skipped-pause byte counts more than 
1.5 times minPauseCheckBytes, but found none. Maybe we should write it in 
chunks logically.
 
If, as you say, we should include the pause time, MergeRateLimiter is already 
doing just that.

[~jpountz], please help confirm RateLimitedIndexOutput.writeBytes():
{code:java}
  @Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
    bytesSinceLastPause += length;
    checkRate();
    delegate.writeBytes(b, offset, length);
  }
{code}

We should execute delegate.writeBytes(b, offset, length) first and checkRate() 
afterwards. In the current order the bytes are counted as written before they 
have actually been written, which is a small logical problem. We should change 
the order as follows:
{code:java}
  @Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
    bytesSinceLastPause += length;
    delegate.writeBytes(b, offset, length);
    checkRate();
  }
{code}




> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h





[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-18 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509171#comment-17509171
 ] 

kkewwei edited comment on LUCENE-10448 at 3/19/22, 5:38 AM:


[~vigyas] I tested again, looking for any skipped-pause byte counts more than 
1.5 times minPauseCheckBytes, but found none. Maybe we should write it in 
chunks logically.
 
If, as you say, we should include the pause time, MergeRateLimiter is already 
doing just that.

[~jpountz], please help confirm RateLimitedIndexOutput.writeBytes():
{code:java}
  @Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
    bytesSinceLastPause += length;
    checkRate();
    delegate.writeBytes(b, offset, length);
  }
{code}

We should execute delegate.writeBytes(b, offset, length) first and checkRate() 
afterwards. In the current order the bytes are counted as written before they 
have actually been written, which is a small logical problem. We should change 
the order as follows:
{code:java}
  @Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
    bytesSinceLastPause += length;
    delegate.writeBytes(b, offset, length);
    checkRate();
  }
{code}





was (Author: kkewwei):
[~vigyas] I tested again, looking for any skipped-pause byte counts more than 
1.5 times minPauseCheckBytes, but found none. If, as you say, we should 
include the pause time, MergeRateLimiter is already doing just that. Maybe we 
should write it in chunks.

[~jpountz], please help confirm:
RateLimitedIndexOutput.writeBytes():
{code:java}
  @Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
    bytesSinceLastPause += length;
    checkRate();
    delegate.writeBytes(b, offset, length);
  }
{code}

We should execute delegate.writeBytes(b, offset, length) first and checkRate() 
afterwards. In the current order the bytes are counted as written before they 
have actually been written, which is a small logical problem. We should change 
the order as follows:
{code:java}
  @Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
    bytesSinceLastPause += length;
    delegate.writeBytes(b, offset, length);
    checkRate();
  }
{code}




> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h





[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-18 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509171#comment-17509171
 ] 

kkewwei commented on LUCENE-10448:
--

[~vigyas] I tested again, looking for any skipped-pause byte counts more than 
1.5 times minPauseCheckBytes, but found none. If, as you say, we should 
include the pause time, MergeRateLimiter is already doing just that. Maybe we 
should write it in chunks.

[~jpountz], please help confirm:
RateLimitedIndexOutput.writeBytes():
{code:java}
  @Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
    bytesSinceLastPause += length;
    checkRate();
    delegate.writeBytes(b, offset, length);
  }
{code}

We should execute delegate.writeBytes(b, offset, length) first and checkRate() 
afterwards. In the current order the bytes are counted as written before they 
have actually been written, which is a small logical problem. We should change 
the order as follows:
{code:java}
  @Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
    bytesSinceLastPause += length;
    delegate.writeBytes(b, offset, length);
    checkRate();
  }
{code}




> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h






[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-10 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504753#comment-17504753
 ] 

kkewwei edited comment on LUCENE-10448 at 3/11/22, 7:15 AM:


When we start writing a new chunk, the time is time1; when the chunk's bytes 
reach currentMinPauseCheckBytes, the time is time2, and 
detailRate=bytesSinceLastPause/(time2-time1). The instant rate is so high 
because time2-time1 is too small, so to meet the rate limit we must pause.

For example, with chunk=0.73mb, io limit=800mb/s and throttle=29.2mb/s, 
time1=0.73/800=0.9125ms and time2=0.73/29.2=25.0ms. If we don't stop, the write 
costs time1=0.9125ms; to slow down we have to pause for 24.0875ms (time2-time1).
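
The arithmetic in this example can be checked with a few lines of Java. This is a sketch for illustration only: PauseArithmeticSketch, millisFor and requiredPauseMs are hypothetical names, and the chunk size is taken as 0.73 MB, the value for which the quoted times of 0.9125 ms and 25.0 ms come out exactly.

```java
// Worked version of the example: how long a chunk takes at raw I/O speed
// versus at the throttle rate, and the pause needed to make up the difference.
class PauseArithmeticSketch {
    /** Milliseconds needed to move mb megabytes at mbPerSec. */
    static double millisFor(double mb, double mbPerSec) {
        return mb / mbPerSec * 1000.0;
    }

    /** Pause that stretches an unthrottled write to the throttled duration. */
    static double requiredPauseMs(double mb, double ioMbPerSec, double throttleMbPerSec) {
        return millisFor(mb, throttleMbPerSec) - millisFor(mb, ioMbPerSec);
    }

    public static void main(String[] args) {
        double chunkMb = 0.73;     // assumed chunk size in MB
        double ioLimit = 800.0;    // raw I/O speed in MB/s
        double throttle = 29.2;    // merge throttle in MB/s

        System.out.printf("unthrottled write: %.4f ms%n", millisFor(chunkMb, ioLimit));
        System.out.printf("throttled budget:  %.4f ms%n", millisFor(chunkMb, throttle));
        System.out.printf("required pause:    %.4f ms%n",
            requiredPauseMs(chunkMb, ioLimit, throttle));
    }
}
```

The required pause is simply the throttled duration minus the time the write would take at raw I/O speed.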


was (Author: kkewwei):
When we start writing a new chunk, the time is time1; when the chunk's bytes 
reach currentMinPauseCheckBytes, the time is time2, and 
detailRate=bytesSinceLastPause/(time2-time1). The instant rate is so high 
because time2-time1 is too small, so to meet the rate limit we must pause.

For example, with chunk=0.73mb, io limit=800mb/s and throttle=29.2mb/s, 
rate1=0.73/800=0.9125ms and rate2=0.73/29.2=25.0ms. If we don't stop, the write 
costs rate1=0.9125ms; to slow down we have to pause for 24.0875ms (rate2-rate1).

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h






[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-10 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504753#comment-17504753
 ] 

kkewwei edited comment on LUCENE-10448 at 3/11/22, 7:14 AM:


When we start writing a new chunk, the time is time1; when the chunk's bytes 
reach currentMinPauseCheckBytes, the time is time2, and 
detailRate=bytesSinceLastPause/(time2-time1). The instant rate is so high 
because time2-time1 is too small, so to meet the rate limit we must pause.

For example, with chunk=0.73mb, io limit=800mb/s and throttle=29.2mb/s, 
rate1=0.73/800=0.9125ms and rate2=0.73/29.2=25.0ms. If we don't stop, the write 
costs rate1=0.9125ms; to slow down we have to pause for 24.0875ms (rate2-rate1).


was (Author: kkewwei):
When we start writing a new chunk, the time is time1; when the chunk's bytes 
reach currentMinPauseCheckBytes, the time is time2, and 
detailRate=bytesSinceLastPause/(time2-time1). The instant rate is so high 
because time2-time1 is too small, so to meet the rate limit we must pause.

For example, with chunk=0.73mb, io limit=800mb/s and throttle=29.2mb/s, 
time1=0.73/800=0.9125ms and time2=0.73/29.2=25.0ms. If we don't stop, the write 
costs time=0.9125ms; to slow down we have to pause for 24.0875ms (25.0-0.9125).

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS)
>     throws MergePolicy.MergeAbortedException {
>
>   double rate = mbPerSec;
>   double secondsToPause = (bytes / 1024. / 1024.) / rate;
>   long targetNS = lastNS + (long) (1000000000 * secondsToPause);
>   long curPauseNS = targetNS - curNS;
>   // We don't bother with thread pausing if the pause is smaller than 2 msec.
>   if (curPauseNS <= MIN_PAUSE_NS) {
>     // Set to curNS, not targetNS, to enforce the instant rate, not
>     // the "averaged over all history" rate:
>     lastNS = curNS;
>     return -1;
>   }
>   ..
> }
> {code}
> Suppose a segment is being merged and *maybePause* is called at 7:00, so
> lastNS=7:00. If *maybePause* is next called at 7:05, then
> *targetNS=lastNS + (long) (1000000000 * secondsToPause)* is smaller than
> *curNS* for any realistic value of bytes, so we return -1 and skip the pause.
> I counted the total number of *maybePause* calls (callTimes), the number of
> skipped pauses (ignorePauseTimes), and the bytes involved in each skipped
> pause (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = 
> [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 
> 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 
> 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 
> 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
> {code}
> Of the 857 calls to *maybePause*, 25 skipped the pause, and the bytes
> involved in each skipped pause (such as 0.28125 MB) are not small.
> Whenever the interval between two *maybePause* calls is relatively long,
> a pause that should have been executed is skipped.
>  
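
To make the skipped pause concrete, the decision in the quoted method can be
modelled in isolation. This is a minimal sketch, assuming MIN_PAUSE_NS is 2 ms
(taken from the comment in the quoted code) and that the multiplier converts
seconds to nanoseconds. The class MaybePauseSketch is hypothetical; it leaves
out the abort check, the update of lastNS and the actual sleeping.

```java
final class MaybePauseSketch {
    // Assumption: 2 ms, as stated in the comment of the quoted method.
    static final long MIN_PAUSE_NS = 2_000_000L;

    /** Returns the pause in nanoseconds, or -1 when the pause is skipped. */
    static long pauseNanos(long bytes, double mbPerSec, long lastNS, long curNS) {
        double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
        long targetNS = lastNS + (long) (1_000_000_000L * secondsToPause);
        long curPauseNS = targetNS - curNS;
        return curPauseNS <= MIN_PAUSE_NS ? -1 : curPauseNS;
    }

    public static void main(String[] args) {
        long bytes = 294_912L;   // 0.28125 MB, one of the values in the log
        double throttle = 11.2;  // MB/s, the throttle reported in the log

        // Previous call 1 ms ago: the pause is applied.
        System.out.println(pauseNanos(bytes, throttle, 0L, 1_000_000L));
        // Previous call 5 minutes ago: targetNS is far behind curNS, so -1.
        System.out.println(pauseNanos(bytes, throttle, 0L, 300_000_000_000L));
    }
}
```

The second call shows the reported behaviour: once the gap since lastNS exceeds
the time the bytes are allowed to take, the method returns -1 regardless of how
fast those bytes were actually written.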



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-10 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504753#comment-17504753
 ] 

kkewwei edited comment on LUCENE-10448 at 3/11/22, 7:13 AM:


Let time1 be the time at which we start writing a new chunk and time2 the time
at which the chunk's bytes reach currentMinPauseCheckBytes. The instant rate is
then detailRate = bytesSinceLastPause / (time2 - time1). It is so high because
time2 - time1 is very small, so to stay within the limit we must pause.

For example, take chunk = 0.73 MB, I/O limit = 800 MB/s and throttle = 29.2 MB/s.
Unthrottled, the write takes 0.73/800 s = 0.9125 ms; at the throttled rate it
should take 0.73/29.2 s = 25.0 ms. If we do not pause, the chunk costs only
0.9125 ms, so to slow down to the throttle we have to pause for 24.0875 ms
(25.0 - 0.9125).
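
The arithmetic in this example can be checked with a few lines of Java. This is
only a sketch: the names PauseArithmetic, millisFor and pauseMillis are made up
for illustration and are not part of Lucene, and the chunk is taken as 0.73 MB,
the value that reproduces the 0.9125 ms and 25.0 ms figures.

```java
final class PauseArithmetic {
    /** Milliseconds needed to move the given number of megabytes at mbPerSec. */
    static double millisFor(double megabytes, double mbPerSec) {
        return megabytes / mbPerSec * 1000.0;
    }

    /** Extra sleep (ms) that brings the chunk's effective rate down to the throttle. */
    static double pauseMillis(double megabytes, double diskMbPerSec, double throttleMbPerSec) {
        double actual = millisFor(megabytes, diskMbPerSec);       // unthrottled write time
        double allowed = millisFor(megabytes, throttleMbPerSec);  // time the throttle permits
        return Math.max(0.0, allowed - actual);
    }

    public static void main(String[] args) {
        double chunkMb = 0.73, disk = 800.0, throttle = 29.2;
        System.out.printf("write time   : %.4f ms%n", millisFor(chunkMb, disk));
        System.out.printf("allowed time : %.4f ms%n", millisFor(chunkMb, throttle));
        System.out.printf("pause needed : %.4f ms%n", pauseMillis(chunkMb, disk, throttle));
    }
}
```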


was (Author: kkewwei):
When we start write into the new chunk, the time is time1, when the chunk bytes 
reach currentMinPauseCheckBytes, the time is time2, 
detailRate=bytesSinceLastPause/(time2-time1), the reason why  instant rate is 
so high is that time2-time1 is too small, to meet the requirement, we must 
pause.

 

For example, the chunk=0.72mb/s, the io limit=800mb/s, the throttle=29.2mb/s,  
time1=0.72/800=0.9125ms,  time2=0.72/29.2=25.0ms. If we don't stop, the cost 
time=0.9125ms, to slow down we have to pause for 24.0875ms(25.0-0.9125).

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>



[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-10 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504753#comment-17504753
 ] 

kkewwei edited comment on LUCENE-10448 at 3/11/22, 7:11 AM:


Let time1 be the time at which we start writing a new chunk and time2 the time
at which the chunk's bytes reach currentMinPauseCheckBytes. The instant rate is
then detailRate = bytesSinceLastPause / (time2 - time1). It is so high because
time2 - time1 is very small, so to stay within the limit we must pause.

For example, take chunk = 0.73 MB, I/O limit = 800 MB/s and throttle = 29.2 MB/s.
Unthrottled, the write takes 0.73/800 s = 0.9125 ms; at the throttled rate it
should take 0.73/29.2 s = 25.0 ms. If we do not pause, the chunk costs only
0.9125 ms, so to slow down to the throttle we have to pause for 24.0875 ms
(25.0 - 0.9125).


was (Author: kkewwei):
When we start write into the new chunk, the time is time1, when the chunk bytes 
reach currentMinPauseCheckBytes, the time is time2 
detailRate=bytesSinceLastPause/(time2-time1), the reason why  instant rate is 
so high is that time2-time1 is too small, to meet the requirement, we must 
pause.

 

For example, the chunk=0.72mb/s, the io limit=800mb/s, the throttle=29.2mb/s,  
time1=0.72/800=0.9125ms,  time2=0.72/29.2=25.0ms. If we don't stop, the cost 
time=0.9125ms, to slow down we have to pause for 24.0875ms(25.0-0.9125).

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>



[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-10 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504753#comment-17504753
 ] 

kkewwei edited comment on LUCENE-10448 at 3/11/22, 7:10 AM:


Let time1 be the time at which we start writing a new chunk and time2 the time
at which the chunk's bytes reach currentMinPauseCheckBytes. The instant rate is
then detailRate = bytesSinceLastPause / (time2 - time1). It is so high because
time2 - time1 is very small, so to stay within the limit we must pause.

For example, take chunk = 0.73 MB, I/O limit = 800 MB/s and throttle = 29.2 MB/s.
Unthrottled, the write takes 0.73/800 s = 0.9125 ms; at the throttled rate it
should take 0.73/29.2 s = 25.0 ms. If we do not pause, the chunk costs only
0.9125 ms, so to slow down to the throttle we have to pause for 24.0875 ms
(25.0 - 0.9125).


was (Author: kkewwei):
When we write into the new chunk, the time is time1, when the chunk bytes reach 
currentMinPauseCheckBytes, the time is time1, 
detailRate=bytesSinceLastPause/(time2-time1), the reason why  instant rate is 
so high is that time2-time1 is too small, to meet the requirement, we must 
pause.

 

For example, the chunk=0.72mb/s, the io limit=800mb/s, the throttle=29.2mb/s,  
time1=0.72/800=0.9125ms,  time2=0.72/29.2=25.0ms. If we don't stop, the cost 
time=0.9125ms, to slow down we have to pause for 24.0875ms(25.0-0.9125).

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>



[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-10 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504753#comment-17504753
 ] 

kkewwei edited comment on LUCENE-10448 at 3/11/22, 7:10 AM:


Let time1 be the time at which we start writing a new chunk and time2 the time
at which the chunk's bytes reach currentMinPauseCheckBytes. The instant rate is
then detailRate = bytesSinceLastPause / (time2 - time1). It is so high because
time2 - time1 is very small, so to stay within the limit we must pause.

For example, take chunk = 0.73 MB, I/O limit = 800 MB/s and throttle = 29.2 MB/s.
Unthrottled, the write takes 0.73/800 s = 0.9125 ms; at the throttled rate it
should take 0.73/29.2 s = 25.0 ms. If we do not pause, the chunk costs only
0.9125 ms, so to slow down to the throttle we have to pause for 24.0875 ms
(25.0 - 0.9125).


was (Author: kkewwei):
When we start write into the new chunk, the time is time1, when the chunk bytes 
reach currentMinPauseCheckBytes, the time is time1, 
detailRate=bytesSinceLastPause/(time2-time1), the reason why  instant rate is 
so high is that time2-time1 is too small, to meet the requirement, we must 
pause.

 

For example, the chunk=0.72mb/s, the io limit=800mb/s, the throttle=29.2mb/s,  
time1=0.72/800=0.9125ms,  time2=0.72/29.2=25.0ms. If we don't stop, the cost 
time=0.9125ms, to slow down we have to pause for 24.0875ms(25.0-0.9125).

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>



[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-10 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504753#comment-17504753
 ] 

kkewwei commented on LUCENE-10448:
--

Let time1 be the time at which we start writing a new chunk and time2 the time
at which the chunk's bytes reach currentMinPauseCheckBytes. The instant rate is
then detailRate = bytesSinceLastPause / (time2 - time1). It is so high because
time2 - time1 is very small, so to stay within the limit we must pause.

For example, take chunk = 0.73 MB, I/O limit = 800 MB/s and throttle = 29.2 MB/s.
Unthrottled, the write takes 0.73/800 s = 0.9125 ms; at the throttled rate it
should take 0.73/29.2 s = 25.0 ms. If we do not pause, the chunk costs only
0.9125 ms, so to slow down to the throttle we have to pause for 24.0875 ms
(25.0 - 0.9125).

> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>



[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-10 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504268#comment-17504268
 ] 

kkewwei edited comment on LUCENE-10448 at 3/10/22, 2:28 PM:


Across many statistics, the detailBytes values are always ~0.73 MB, so it is
very likely that there are no large chunks. The instant rate reaches up to
460 MB/s because there is no wait: if we write 1 MB to disk, the write rate can
exceed 500 MB/s.

[~vigyas] I have looked at your code, and it does not seem to solve this case:
if there is a long interval between two chunk writes, the second chunk write is
not paused. As a result, the instant write rate of the second chunk is high,
far above the configured limit.

I raised [PR-#741|https://github.com/apache/lucene/pull/741]. To keep a long
interval between two chunks from hiding the pause, the elapsed time of a chunk
write is measured from the time we start writing its bytes, not from the time
the previous chunk write ended.
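
The effect of changing the baseline can be illustrated with a small sketch. It
is not the code from PR #741: the class ChunkStartSketch and its method are
hypothetical, and the numbers (a chunk of about 0.73 MB, a 29.2 MB/s throttle, a
five-minute idle gap, a 0.9125 ms write) are assumptions chosen to match the
figures discussed in this issue.

```java
final class ChunkStartSketch {
    static final long NANOS_PER_SECOND = 1_000_000_000L;

    /**
     * Nanoseconds to sleep so that the bytes written since baselineNS
     * average at most mbPerSec; 0 when no sleep is needed.
     */
    static long pauseNanos(long bytes, double mbPerSec, long baselineNS, long curNS) {
        double secondsAllowed = (bytes / 1024. / 1024.) / mbPerSec;
        long targetNS = baselineNS + (long) (NANOS_PER_SECOND * secondsAllowed);
        return Math.max(0L, targetNS - curNS);
    }

    public static void main(String[] args) {
        long bytes = 765_460L;                        // about 0.73 MB
        double throttle = 29.2;                       // MB/s
        long lastEndNS = 0L;                          // previous chunk write ended here
        long chunkStartNS = 300L * NANOS_PER_SECOND;  // merge was idle for 5 minutes
        long curNS = chunkStartNS + 912_500L;         // chunk written in 0.9125 ms

        // Measured from the end of the previous write: the idle gap hides the burst.
        System.out.println("baseline = last end    : "
            + pauseNanos(bytes, throttle, lastEndNS, curNS) + " ns");
        // Measured from the start of this chunk: roughly 24 ms of pause is required.
        System.out.println("baseline = chunk start : "
            + pauseNanos(bytes, throttle, chunkStartNS, curNS) + " ns");
    }
}
```

With the end of the previous write as the baseline, the idle time counts as if
it had been spent writing, so no pause is computed. With the chunk's own start
time as the baseline, only the time actually spent on this chunk is credited.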



was (Author: kkewwei):
With many the statistics, all the detailBytes(mb) are always ~0.73mb, it is a 
high probability that there are no large chunk bytes.  the reason why instant 
rate of up to 460 mb/s is that there is no wait. if we write 1mb to disk, the 
writing rate can be 500mb/s+.

[~vigyas] I have seen you code, it seems don't solve the case: 
If there is a long interval between two chunks writing, then the second chunk 
write will be not paused, as the result, the instant writing rate of the second 
chunk is high, which is far more than the limited rate. 

I raised [PR-#741|https://github.com/apache/lucene/pull/741], to avoid the long 
interval between two chunks, when we start write bytes, we count the starting 
time, not the last end time.


> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>



[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-10 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504268#comment-17504268
 ] 

kkewwei edited comment on LUCENE-10448 at 3/10/22, 2:25 PM:


Across many statistics, the detailBytes values are always ~0.73 MB, so it is
very likely that there are no large chunks. The instant rate reaches up to
460 MB/s because there is no wait: if we write 1 MB to disk, the write rate can
exceed 500 MB/s.

[~vigyas] I have looked at your code, and it does not seem to solve this case:
if there is a long interval between two chunk writes, the second chunk write is
not paused. As a result, the instant write rate of the second chunk is high,
far above the configured limit.

I raised [PR-#741|https://github.com/apache/lucene/pull/741]. To avoid the long
interval between two chunks, we record the start time when we begin writing
bytes, rather than using the end time of the previous write.



was (Author: kkewwei):
With many the statistics, all the detailBytes(mb) are always ~0.73mb, it is a 
high probability that there are no large chunk bytes.  the reason why instant 
rate of up to 460 mb/s is that there is no wait. if we write 1mb to disk, the 
writing rate can be 500mb/s+.

[~vigyas] I have seen you code, it seems don't solve the case: 
If there is a long interval between two chunks writing, then the second chunk 
write will be not paused, as the result, the instant writing rate of the second 
chunk is high, which is far more than the limited rate. 

I raised [PR-#741|https://github.com/apache/lucene/pull/741], to avoid the long 
interval between two chunks, when we start write bytes, we begin to count the 
starting time, not the last end time.


> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>



[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-10 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504268#comment-17504268
 ] 

kkewwei edited comment on LUCENE-10448 at 3/10/22, 2:24 PM:


Across many statistics, the detailBytes values are always ~0.73 MB, so it is
very likely that there are no large chunks. The instant rate reaches up to
460 MB/s because there is no wait: if we write 1 MB to disk, the write rate can
exceed 500 MB/s.

[~vigyas] I have looked at your code, and it does not seem to solve this case:
if there is a long interval between two chunk writes, the second chunk write is
not paused. As a result, the instant write rate of the second chunk is high,
far above the configured limit.

I raised [PR-#741|https://github.com/apache/lucene/pull/741]. To avoid the long
interval between two chunks, we start counting from the time we begin writing
bytes, rather than from the end time of the previous write.



was (Author: kkewwei):
With many the statistics, all the detailBytes(mb) are always ~0.73mb, it is a 
high probability that there are no large chunk bytes.  the reason why instant 
rate of up to 460 mb/s is that there is no wait. if we write 1mb to disk, the 
writing rate can be 500mb/s+.

[~vigyas] I have see you code, it seems doesn't solve the case: 
If there is a long interval between two chunks writing, then the second chunk 
write will be not paused, as the result, the instant writing rate of the second 
chunk is high, which is far more than the limited rate. 

I raised [PR-#741|https://github.com/apache/lucene/pull/741], to avoid the long 
interval between two chunks, when we start write bytes, we begin to count the 
starting time, not the last end time.


> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The relevant code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
>
>   double rate = mbPerSec; 
>   double secondsToPause = (bytes / 1024. / 1024.) / rate;
>   // Target time: the last recorded time plus the time these bytes may take.
>   long targetNS = lastNS + (long) (1000000000 * secondsToPause);
>   long curPauseNS = targetNS - curNS;
>   // We don't bother with thread pausing if the pause is smaller than 2 msec.
>   if (curPauseNS <= MIN_PAUSE_NS) {
>     // Set to curNS, not targetNS, to enforce the instant rate, not
>     // the "averaged over all history" rate:
>     lastNS = curNS;
>     return -1;
>   }
>   ..
> }
> {code}
> Suppose a segment is being merged: *maybePause* is called at 7:00, so 
> lastNS=7:00, and is then called again at 7:05. *targetNS=lastNS + (long) 
> (1000000000 * secondsToPause)* is then smaller than *curNS* whenever 
> *secondsToPause* is shorter than the gap between the two calls, so for any 
> realistic chunk size we return -1 and skip the pause. 
> I counted the total number of *maybePause* calls (callTimes), the number of 
> skipped pauses (ignorePauseTimes), and the bytes of each skipped pause 
> (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = 
> [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 
> 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 
> 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 
> 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
> {code}
> *maybePause* was called 857 times and the pause was skipped 25 times; the 
> skipped chunks (for example 0.28125mb) are not small.
> Whenever the interval between two *maybePause* calls is relatively long, 
> the pause that should be executed is skipped.
>  
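
The skipped pause described in the issue can be reproduced with a small standalone sketch. It is not Lucene's MergeRateLimiter: the class name SkippedPauseSketch, the method pauseNanos and the 2 ms constant are assumptions made for illustration, and the sketch assumes the offset added to lastNS is expressed in nanoseconds. The chunk size and limit are borrowed from the log in the issue.

```java
/** Illustrative only: mirrors the arithmetic quoted from maybePause, not Lucene's class. */
final class SkippedPauseSketch {
  // Assumed threshold, matching the "smaller than 2 msec" comment in the quoted code.
  static final long MIN_PAUSE_NS = 2_000_000L;

  /** Returns the pause in nanoseconds, or -1 when the pause would be skipped. */
  static long pauseNanos(long bytes, double mbPerSec, long lastNS, long curNS) {
    double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
    long targetNS = lastNS + (long) (1_000_000_000L * secondsToPause);
    long curPauseNS = targetNS - curNS;
    return curPauseNS <= MIN_PAUSE_NS ? -1 : curPauseNS;
  }

  public static void main(String[] args) {
    long chunk = 288 * 1024; // 0.28125 MB
    double limit = 11.2;     // MB/sec throttle
    long lastNS = 0L;

    // Checked 1 ms after the previous check: a pause of roughly 24 ms is due.
    System.out.println(pauseNanos(chunk, limit, lastNS, 1_000_000L));
    // Checked 5 s after the previous check: targetNS is already in the past.
    System.out.println(pauseNanos(chunk, limit, lastNS, 5_000_000_000L));
  }
}
```

The second call returns -1 regardless of how fast the chunk itself was written, which is the behaviour the issue reports.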



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-10 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504268#comment-17504268
 ] 

kkewwei edited comment on LUCENE-10448 at 3/10/22, 1:39 PM:


Across many statistics, the detailBytes(mb) values are always ~0.73mb, so it is 
very likely that there are no large chunks. The instant rate reaches 460 mb/s 
because there is no wait: if we write 1mb to disk, the write rate can be 
500mb/s+.

[~vigyas] I have read your code, and it does not seem to solve this case: 
if there is a long interval between writing two chunks, the second chunk 
write will not be paused. As a result, the instant write rate of the second 
chunk is high, far above the limited rate. 

I raised [PR-#741|https://github.com/apache/lucene/pull/741] to avoid the effect 
of a long interval between two chunks: the time is counted from the moment we 
start writing bytes, not from the end of the last write.











[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-10 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504268#comment-17504268
 ] 

kkewwei commented on LUCENE-10448:
--

Across many statistics, the detailBytes(mb) values are always ~0.73mb, so it is 
very likely that there are no large chunks. The instant rate reaches 460 mb/s 
because there is no wait: if we write 1mb to disk, the write rate can be 
500mb/s+.

I have read your code, and it does not seem to solve this case: 
if there is a long interval between writing two chunks, the second chunk 
write will not be paused. As a result, the instant write rate of the second 
chunk is high, far above the limited rate. 

I raised [PR-#741|https://github.com/apache/lucene/pull/741] to avoid the effect 
of a long interval between two chunks: the time is counted from the moment we 
start writing bytes, not from the end of the last write.








[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-08 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17503238#comment-17503238
 ] 

kkewwei commented on LUCENE-10448:
--

In our product, the instant write rate reaches 200mb/s in a few cases (the 
statistics are collected with the method above, so they should be reliable), 
which is far above the limited rate. In that case we should pause for 10ms+, 
but the pause is skipped. 

[~jpountz] [~vigyas], I am looking forward to your confirmation. If you think 
it is not a bug, I will close the issue.
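
As a rough check of the "10ms+" figure, the pause that a single chunk would need can be computed from its size, the observed rate and the limit. The class and method names below are hypothetical, and the chunk size (0.28125 MB) and limit (11.2 MB/s) are assumptions borrowed from the log in the issue description, not values stated in this comment.

```java
/** Hypothetical helper: how long a chunk should pause to respect the limit. */
final class RequiredPauseSketch {
  /** Pause in ms so that chunkMb, written at observedMbPerSec, averages limitMbPerSec. */
  static double requiredPauseMillis(double chunkMb, double observedMbPerSec, double limitMbPerSec) {
    double allowedMs = chunkMb / limitMbPerSec * 1000.0;  // time the chunk may take
    double spentMs = chunkMb / observedMbPerSec * 1000.0; // time the write actually took
    return Math.max(0.0, allowedMs - spentMs);
  }

  public static void main(String[] args) {
    // 0.28125 MB written at 200 MB/s under an 11.2 MB/s limit.
    System.out.println(requiredPauseMillis(0.28125, 200.0, 11.2));
  }
}
```

Under these assumptions the required pause is a little under 24 ms, which is consistent with the "10ms+" estimate above.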







[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-07 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502076#comment-17502076
 ] 

kkewwei edited comment on LUCENE-10448 at 3/8/22, 12:43 AM:


[~vigyas], [~jpountz] I measured the burst write rate of the no-pause bytes 
under heavy write pressure:
{code:java}
[2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][26] merge segment [_43a] done: took [25.2s], [317.8 MB], [176,633 
docs], [0s stopped], [19.8s throttled], [625.8 MB written], [29.2 MB/sec 
throttle], [callTimes=852],[ignorePauseTimes=49],  [detailBytes(mb) = 
[0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 0.7303219, 0.7305584, 
0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 0.73030186, 
0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 0.73110104, 0.7306318, 
0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 0.73031235, 0.7302904, 
0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 0.7303152, 0.7303295, 
0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 0.734375, 0.734375, 
0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 0.734375, 0.734375, 
0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 3.8783703, 
2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 4.484641, 
3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 2.7105507E-8, 
0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 0.77062935, 0.7450095, 
2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 4.7222166, 1.9614059, 7.041826, 
8.008545, 2.534279, 17.670755, 6.7497787, 2.7129008E-8, 19.349627, 26.39924, 
16.710173, 5.9312387, 12.802376, 10.644308, 4.5160117, 14.152909, 2.8590457, 
460.67938, 432.62634, 92.555466, 290.15073, 12.046124]], 
[biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 290.15073]]
{code}
*callTimes=852* means that MergeRateLimiter.pause was called 852 times.
*ignorePauseTimes=49* means that 49 of those 852 calls did not pause.
*detailBytes(mb)* lists the bytes of each no-pause call, 49 entries in total.
*detailRate(mb/s)* lists the instant rate for each of the 49 *detailBytes* entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates above the limited 
rate (29.2 MB/sec throttle). The maximum instant rate is 460.67938mb/s, 
about 16 times the limited rate.

No-pause calls make up about 0-10% of all calls, depending on the write 
pressure. In my test, the write thread is relatively busy.

This is how I collect the statistics:
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // Start timing at the first write after the last rate check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time since startTime along with the bytes written.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
With the elapsed time and the bytes written, it is easy to compute the instant rate.
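
The rate computation itself can be sketched as below. The class and method names are hypothetical, and the elapsed time in the example is back-computed from the log (assuming the 0.734375 MB entry pairs with the 460.67938 MB/s entry); it is not a value reported in the log.

```java
/** Hypothetical helper: instant write rate from bytes written and elapsed time. */
final class InstantRateSketch {
  /** Instant rate in MB/s for the given bytes written over elapsedNanos. */
  static double instantRateMbPerSec(long bytes, long elapsedNanos) {
    double mb = bytes / 1024.0 / 1024.0;
    double seconds = elapsedNanos / 1_000_000_000.0;
    return mb / seconds;
  }

  public static void main(String[] args) {
    long bytes = 770_048L;           // 0.734375 MB
    long elapsedNanos = 1_594_000L;  // about 1.6 ms, an assumed duration
    System.out.println(instantRateMbPerSec(bytes, elapsedNanos));
  }
}
```

A 0.734375 MB chunk written in about 1.6 ms gives roughly 460 MB/s, far above the 29.2 MB/sec throttle shown in the log.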






[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-07 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502076#comment-17502076
 ] 

kkewwei edited comment on LUCENE-10448 at 3/8/22, 12:42 AM:


[~vigyas], [~jpountz] I measured the burst write rate of the no-pause bytes 
under heavy write pressure:
{code:java}
[2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][26] merge segment [_43a] done: took [25.2s], [317.8 MB], [176,633 
docs], [0s stopped], [19.8s throttled], [625.8 MB written], [29.2 MB/sec 
throttle], [callTimes=852],[ignorePauseTimes=49],  [detailBytes(mb) = 
[0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 0.7303219, 0.7305584, 
0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 0.73030186, 
0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 0.73110104, 0.7306318, 
0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 0.73031235, 0.7302904, 
0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 0.7303152, 0.7303295, 
0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 0.734375, 0.734375, 
0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 0.734375, 0.734375, 
0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 3.8783703, 
2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 4.484641, 
3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 2.7105507E-8, 
0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 0.77062935, 0.7450095, 
2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 4.7222166, 1.9614059, 7.041826, 
8.008545, 2.534279, 17.670755, 6.7497787, 2.7129008E-8, 19.349627, 26.39924, 
16.710173, 5.9312387, 12.802376, 10.644308, 4.5160117, 14.152909, 2.8590457, 
460.67938, 432.62634, 92.555466, 290.15073, 12.046124]], 
[biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 290.15073]]
{code}
*callTimes=852* means that MergeRateLimiter.pause was called 852 times.
*ignorePauseTimes=49* means that 49 of those 852 calls did not pause.
*detailBytes(mb)* lists the bytes of each no-pause call, 49 entries in total.
*detailRate(mb/s)* lists the instant rate for each of the 49 *detailBytes* entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates above the limited 
rate (29.2 MB/sec throttle). The maximum instant rate is 460.67938mb/s, 
about 16 times the limited rate.

The no-pause-write frequency is about 0-10%, depending on the write 
pressure. In my test, the write thread is relatively busy.

This is how I collect the statistics:
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // Start timing at the first write after the last rate check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time since startTime along with the bytes written.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
With the elapsed time and the bytes written, it is easy to compute the instant rate.






[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-07 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502076#comment-17502076
 ] 

kkewwei edited comment on LUCENE-10448 at 3/7/22, 9:31 AM:
---

[~vigyas], [~jpountz] I measured the burst write rate of the no-pause bytes 
under heavy write pressure:
{code:java}
[2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][26] merge segment [_43a] done: took [25.2s], [317.8 MB], [176,633 
docs], [0s stopped], [19.8s throttled], [625.8 MB written], [29.2 MB/sec 
throttle], [callTimes=852],[ignorePauseTimes=49],  [detailBytes(mb) = 
[0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 0.7303219, 0.7305584, 
0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 0.73030186, 
0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 0.73110104, 0.7306318, 
0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 0.73031235, 0.7302904, 
0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 0.7303152, 0.7303295, 
0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 0.734375, 0.734375, 
0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 0.734375, 0.734375, 
0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 3.8783703, 
2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 4.484641, 
3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 2.7105507E-8, 
0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 0.77062935, 0.7450095, 
2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 4.7222166, 1.9614059, 7.041826, 
8.008545, 2.534279, 17.670755, 6.7497787, 2.7129008E-8, 19.349627, 26.39924, 
16.710173, 5.9312387, 12.802376, 10.644308, 4.5160117, 14.152909, 2.8590457, 
460.67938, 432.62634, 92.555466, 290.15073, 12.046124]], 
[biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 290.15073]]
{code}
*callTimes=852* means that MergeRateLimiter.pause was called 852 times.
*ignorePauseTimes=49* means that 49 of those 852 calls did not pause.
*detailBytes(mb)* lists the bytes of each no-pause call, 49 entries in total.
*detailRate(mb/s)* lists the instant rate for each of the 49 *detailBytes* entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates above the limited 
rate (29.2 MB/sec throttle). The maximum instant rate is 460.67938mb/s, 
about 16 times the limited rate.

The no-pause-write frequency is about 0-10%, depending on the write 
pressure. In my test, the write thread is relatively busy.

This is how I collect the statistics:
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // Start timing at the first write after the last rate check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time since startTime along with the bytes written.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
With the elapsed time and the bytes written, it is easy to compute the instant rate.






[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-07 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502076#comment-17502076
 ] 

kkewwei edited comment on LUCENE-10448 at 3/7/22, 9:25 AM:
---

[~vigyas] [~jpountz] I measured the burst write rate of the no-pause bytes 
under heavy write pressure:
{code:java}
[2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][26] merge segment [_43a] done: took [25.2s], [317.8 MB], [176,633 
docs], [0s stopped], [19.8s throttled], [625.8 MB written], [29.2 MB/sec 
throttle], [callTimes=852],[ignorePauseTimes=49],  [detailBytes(mb) = 
[0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 0.7303219, 0.7305584, 
0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 0.73030186, 
0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 0.73110104, 0.7306318, 
0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 0.73031235, 0.7302904, 
0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 0.7303152, 0.7303295, 
0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 0.734375, 0.734375, 
0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 0.734375, 0.734375, 
0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 3.8783703, 
2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 4.484641, 
3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 2.7105507E-8, 
0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 0.77062935, 0.7450095, 
2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 4.7222166, 1.9614059, 7.041826, 
8.008545, 2.534279, 17.670755, 6.7497787, 2.7129008E-8, 19.349627, 26.39924, 
16.710173, 5.9312387, 12.802376, 10.644308, 4.5160117, 14.152909, 2.8590457, 
460.67938, 432.62634, 92.555466, 290.15073, 12.046124]], 
[biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 290.15073]]
{code}
*callTimes=852* means that MergeRateLimiter.pause was called 852 times.
*ignorePauseTimes=49* means that 49 of those 852 calls did not pause.
*detailBytes(mb)* lists the bytes of each no-pause call, 49 entries in total.
*detailRate(mb/s)* lists the instant rate for each of the 49 *detailBytes* entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates above the limited 
rate (29.2 MB/sec throttle). The maximum instant rate is 460.67938mb/s, 
about 16 times the limited rate.

The no-pause-write frequency is about 0-10%, depending on the write 
pressure. In my test, the write thread is relatively busy.

This is how I collect the statistics:
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // Start timing at the first write after the last rate check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time since startTime along with the bytes written.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
With the elapsed time and the bytes written, it is easy to compute the instant rate.






[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-06 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502076#comment-17502076
 ] 

kkewwei edited comment on LUCENE-10448 at 3/7/22, 6:26 AM:
---

[~vigyas] I measured the burst write rate of the no-pause bytes under heavy 
write pressure:
{code:java}
[2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][26] merge segment [_43a] done: took [25.2s], [317.8 MB], [176,633 
docs], [0s stopped], [19.8s throttled], [625.8 MB written], [29.2 MB/sec 
throttle], [callTimes=852],[ignorePauseTimes=49],  [detailBytes(mb) = 
[0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 0.7303219, 0.7305584, 
0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 0.73030186, 
0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 0.73110104, 0.7306318, 
0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 0.73031235, 0.7302904, 
0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 0.7303152, 0.7303295, 
0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 0.734375, 0.734375, 
0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 0.734375, 0.734375, 
0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 3.8783703, 
2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 4.484641, 
3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 2.7105507E-8, 
0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 0.77062935, 0.7450095, 
2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 4.7222166, 1.9614059, 7.041826, 
8.008545, 2.534279, 17.670755, 6.7497787, 2.7129008E-8, 19.349627, 26.39924, 
16.710173, 5.9312387, 12.802376, 10.644308, 4.5160117, 14.152909, 2.8590457, 
460.67938, 432.62634, 92.555466, 290.15073, 12.046124]], 
[biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 290.15073]]
{code}
*callTimes=852* means that MergeRateLimiter.pause was called 852 times.
*ignorePauseTimes=49* means that 49 of those 852 calls did not pause.
*detailBytes(mb)* lists the bytes written for each no-pause call (49 entries).
*detailRate(mb/s)* lists the instant rate of each of the 49 *detailBytes* 
entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates that exceed the limit 
(29.2 MB/sec throttle). The maximum instant rate is 460.67938 MB/s, which is 
more than 15 times the limit.

The burst write rate (in addition to / instead of the no-pause-write 
frequency) is about 0-10%. It depends on the write pressure; in my test, the 
write thread was relatively busy.

This is how I collected the statistics.
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // Start timing at the first write after the previous pause check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time along with the byte count.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
With the elapsed time and the number of bytes written, it is easy to compute 
the instant rate.
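
As a minimal sketch of that computation (not part of any patch; the class and 
method names are hypothetical, and 1 MB is assumed to be 1024 * 1024 bytes, as 
in the MergeRateLimiter code quoted in this issue), the instant rate is the 
byte count divided by the elapsed time:

```java
/** Hypothetical helper: converts a (bytes, elapsed nanoseconds) sample to MB/s. */
final class InstantRate {
  private static final double BYTES_PER_MB = 1024.0 * 1024.0;
  private static final double NANOS_PER_SEC = 1_000_000_000.0;

  private InstantRate() {}

  /** Returns the instant rate in MB/s for bytes written over elapsedNanos. */
  static double mbPerSec(long bytes, long elapsedNanos) {
    if (elapsedNanos <= 0) {
      throw new IllegalArgumentException("elapsedNanos must be positive");
    }
    return (bytes / BYTES_PER_MB) / (elapsedNanos / NANOS_PER_SEC);
  }

  /** True if the sample's instant rate is above the configured limit. */
  static boolean exceedsLimit(long bytes, long elapsedNanos, double limitMbPerSec) {
    return mbPerSec(bytes, elapsedNanos) > limitMbPerSec;
  }
}
```

For instance, a hypothetical write of 770,048 bytes (0.734375 MB) that took 
2 ms would come out at roughly 367 MB/s, well above a 29.2 MB/s throttle.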





was (Author: kkewwei):
[~vigyas] I measured the burst write rate of the no-pause bytes under heavy 
write pressure:
{code:java}
[2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[log.wpt_mopadres_tracelog.20220307_280][26] merge segment [_43a] done: took 
[25.2s], [317.8 MB], [176,633 docs], [0s stopped], [19.8s throttled], [625.8 MB 
written], [29.2 MB/sec throttle], [callTimes=852],[ignorePauseTimes=49],  
[detailBytes(mb) = [0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 
0.7303219, 0.7305584, 0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 
0.73030186, 0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 
0.73110104, 0.7306318, 0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 
0.73031235, 0.7302904, 0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 
0.7303152, 0.7303295, 0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 
0.734375, 0.734375, 0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 
0.734375, 0.734375, 0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 
3.8783703, 2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 
4.484641, 3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 
2.7105507E-8, 0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 
0.77062935, 0.7450095, 2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 
4.7222166, 1.9614059, 7.041826, 8.008545, 2.534279, 17.670755, 6.7497787, 
2.7129008E-8, 19.349627, 26.39924, 16.710173, 5.9312387, 12.802376, 10.644308, 
4.5160117, 14.152909, 2.8590457, 460.67938, 432.62634, 92.555466, 290.15073, 
12.046124]], [biggerThanLimitedRate(mb/s) = [460.67938, 

[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-06 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502076#comment-17502076
 ] 

kkewwei edited comment on LUCENE-10448 at 3/7/22, 4:59 AM:
---

[~vigyas] I measured the burst write rate of the no-pause bytes under heavy 
write pressure:
{code:java}
[2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[log.wpt_mopadres_tracelog.20220307_280][26] merge segment [_43a] done: took 
[25.2s], [317.8 MB], [176,633 docs], [0s stopped], [19.8s throttled], [625.8 MB 
written], [29.2 MB/sec throttle], [callTimes=852],[ignorePauseTimes=49],  
[detailBytes(mb) = [0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 
0.7303219, 0.7305584, 0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 
0.73030186, 0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 
0.73110104, 0.7306318, 0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 
0.73031235, 0.7302904, 0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 
0.7303152, 0.7303295, 0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 
0.734375, 0.734375, 0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 
0.734375, 0.734375, 0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 
3.8783703, 2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 
4.484641, 3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 
2.7105507E-8, 0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 
0.77062935, 0.7450095, 2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 
4.7222166, 1.9614059, 7.041826, 8.008545, 2.534279, 17.670755, 6.7497787, 
2.7129008E-8, 19.349627, 26.39924, 16.710173, 5.9312387, 12.802376, 10.644308, 
4.5160117, 14.152909, 2.8590457, 460.67938, 432.62634, 92.555466, 290.15073, 
12.046124]], [biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 
290.15073]]
{code}
*callTimes=852* means that MergeRateLimiter.pause was called 852 times.
*ignorePauseTimes=49* means that 49 of those 852 calls did not pause.
*detailBytes(mb)* lists the bytes written for each no-pause call (49 entries).
*detailRate(mb/s)* lists the instant rate of each of the 49 *detailBytes* 
entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates that exceed the limit 
(29.2 MB/sec throttle). The maximum instant rate is 460.67938 MB/s, which is 
more than 15 times the limit.

The burst write rate (in addition to / instead of the no-pause-write 
frequency) is about 0-10%. It depends on the write pressure; in my test, the 
write thread was relatively busy.

This is how I collected the statistics.
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // Start timing at the first write after the previous pause check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time along with the byte count.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
With the elapsed time and the number of bytes written, it is easy to compute 
the instant rate.





was (Author: kkewwei):
[~vigyas] I measured the burst write rate of the no-pause bytes under heavy 
write pressure:
{code:java}
[2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[log.wpt_mopadres_tracelog.20220307_280][26] merge segment [_43a] done: took 
[25.2s], [317.8 MB], [176,633 docs], [0s stopped], [19.8s throttled], [625.8 MB 
written], [29.2 MB/sec throttle], [callTimes=852],[ignorePauseTimes=49],  
[detailBytes(mb) = [0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 
0.7303219, 0.7305584, 0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 
0.73030186, 0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 
0.73110104, 0.7306318, 0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 
0.73031235, 0.7302904, 0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 
0.7303152, 0.7303295, 0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 
0.734375, 0.734375, 0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 
0.734375, 0.734375, 0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 
3.8783703, 2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 
4.484641, 3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 
2.7105507E-8, 0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 
0.77062935, 0.7450095, 2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 
4.7222166, 1.9614059, 7.041826, 8.008545, 2.534279, 17.670755, 6.7497787, 
2.7129008E-8, 19.349627, 26.39924, 16.710173, 5.9312387, 12.802376, 10.644308, 
4.5160117, 14.152909, 2.8590457, 460.67938, 432.62634, 92.555466, 290.15073, 
12.046124]], 

[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-06 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502076#comment-17502076
 ] 

kkewwei edited comment on LUCENE-10448 at 3/7/22, 4:59 AM:
---

[~vigyas] I measured the burst write rate of the no-pause bytes under heavy 
write pressure:
{code:java}
[2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[log.wpt_mopadres_tracelog.20220307_280][26] merge segment [_43a] done: took 
[25.2s], [317.8 MB], [176,633 docs], [0s stopped], [19.8s throttled], [625.8 MB 
written], [29.2 MB/sec throttle], [callTimes=852],[ignorePauseTimes=49],  
[detailBytes(mb) = [0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 
0.7303219, 0.7305584, 0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 
0.73030186, 0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 
0.73110104, 0.7306318, 0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 
0.73031235, 0.7302904, 0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 
0.7303152, 0.7303295, 0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 
0.734375, 0.734375, 0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 
0.734375, 0.734375, 0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 
3.8783703, 2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 
4.484641, 3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 
2.7105507E-8, 0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 
0.77062935, 0.7450095, 2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 
4.7222166, 1.9614059, 7.041826, 8.008545, 2.534279, 17.670755, 6.7497787, 
2.7129008E-8, 19.349627, 26.39924, 16.710173, 5.9312387, 12.802376, 10.644308, 
4.5160117, 14.152909, 2.8590457, 460.67938, 432.62634, 92.555466, 290.15073, 
12.046124]], [biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 
290.15073]]
{code}
*callTimes=852* means that MergeRateLimiter.pause was called 852 times.
*ignorePauseTimes=49* means that 49 of those 852 calls did not pause.
*detailBytes(mb)* lists the bytes written for each no-pause call (49 entries).
*detailRate(mb/s)* lists the instant rate of each of the 49 *detailBytes* 
entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates that exceed the limit 
(29.2 MB/sec throttle). The maximum instant rate is 460.67938 MB/s, which is 
more than 15 times the limit.

The burst write rate (in addition to / instead of the no-pause-write 
frequency) is about 0-10%. It depends on the write pressure; in my test, the 
write thread was relatively busy.

This is how I collected the statistics.
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // Start timing at the first write after the previous pause check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time along with the byte count.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
With the elapsed time and the number of bytes written, it is easy to compute 
the instant rate.





was (Author: kkewwei):
[~vigyas] I measured the burst write rate of the no-pause bytes under heavy 
write pressure:
{code:java}
[2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[log.wpt_mopadres_tracelog.20220307_280][26] merge segment [_43a] done: took 
[25.2s], [317.8 MB], [176,633 docs], [0s stopped], [19.8s throttled], [625.8 MB 
written], [29.2 MB/sec throttle], [callTimes=852],[ignorePauseTimes=49],  
[detailBytes(mb) = [0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 
0.7303219, 0.7305584, 0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 
0.73030186, 0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 
0.73110104, 0.7306318, 0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 
0.73031235, 0.7302904, 0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 
0.7303152, 0.7303295, 0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 
0.734375, 0.734375, 0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 
0.734375, 0.734375, 0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 
3.8783703, 2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 
4.484641, 3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 
2.7105507E-8, 0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 
0.77062935, 0.7450095, 2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 
4.7222166, 1.9614059, 7.041826, 8.008545, 2.534279, 17.670755, 6.7497787, 
2.7129008E-8, 19.349627, 26.39924, 16.710173, 5.9312387, 12.802376, 10.644308, 
4.5160117, 14.152909, 2.8590457, 460.67938, 432.62634, 92.555466, 290.15073, 
12.046124]], 

[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-06 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17502076#comment-17502076
 ] 

kkewwei commented on LUCENE-10448:
--

[~vigyas] I measured the burst write rate of the no-pause bytes under heavy 
write pressure:
{code:java}
[2022-03-07T08:23:05,864][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[log.wpt_mopadres_tracelog.20220307_280][26] merge segment [_43a] done: took 
[25.2s], [317.8 MB], [176,633 docs], [0s stopped], [19.8s throttled], [625.8 MB 
written], [29.2 MB/sec throttle], [callTimes=852],[ignorePauseTimes=49],  
[detailBytes(mb) = [0.7422037, 0.73268795, 0.7350941, 0.730608, 0.7306595, 
0.7303219, 0.7305584, 0.73028755, 0.7304802, 0.73048687, 0.7303219, 0.73038864, 
0.73030186, 0.7305927, 0.7303219, 0.73028755, 0.73043823, 0.7314129, 
0.73110104, 0.7306318, 0.7303457, 0.7315569, 0.731061, 0.73035717, 0.73029804, 
0.73031235, 0.7302904, 0.7303295, 0.73033714, 0.7304115, 0.7304363, 0.73035145, 
0.7303152, 0.7303295, 0.7309208, 0.73061085, 0.7315531, 0.7372618, 0.734375, 
0.734375, 0.734375, 0.73813057, 0.734375, 0.7342024, 0.734375, 0.734375, 
0.734375, 0.734375, 0.734375]], [detailRate(mb/s) = [2.75478E-8, 2.3733156, 
3.8783703, 2.7117402E-8, 2.7119311E-8, 2.710678E-8, 2.711556E-8, 2.7105507E-8, 
4.484641, 3.1927187, 2.4854445, 2.0364974, 1.722687, 1.4875951, 1.2919687, 
2.7105507E-8, 0.98034406, 0.9306443, 0.8845948, 0.8425208, 0.80407506, 
0.77062935, 0.7450095, 2.7108092E-8, 4.1841693, 1.4971824, 16.385893, 
4.7222166, 1.9614059, 7.041826, 8.008545, 2.534279, 17.670755, 6.7497787, 
2.7129008E-8, 19.349627, 26.39924, 16.710173, 5.9312387, 12.802376, 10.644308, 
4.5160117, 14.152909, 2.8590457, 460.67938, 432.62634, 92.555466, 290.15073, 
12.046124]], [biggerThanLimitedRate(mb/s) = [460.67938, 432.62634, 92.555466, 
290.15073]]
{code}
*callTimes=852* means that MergeRateLimiter.pause was called 852 times.
*ignorePauseTimes=49* means that 49 of those 852 calls did not pause.
*detailBytes(mb)* lists the bytes written for each no-pause call (49 entries).
*detailRate(mb/s)* lists the instant rate of each of the 49 *detailBytes* 
entries.
*biggerThanLimitedRate(mb/s)* lists the instant rates that exceed the limit 
(29.2 MB/sec throttle). The maximum instant rate is 460.67938 MB/s, which is 
more than 15 times the limit.

The burst write rate (in addition to / instead of the no-pause-write 
frequency) is about 0-10%. It depends on the write pressure; in my test, the 
write thread was relatively busy.

This is how I collected the statistics.
{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  if (bytesSinceLastPause == 0) {
    // Start timing at the first write after the previous pause check.
    startTime = System.nanoTime();
  }
  bytesSinceLastPause += length;
  delegate.writeBytes(b, offset, length);
  checkRate();
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    // Pass the elapsed time along with the byte count.
    rateLimiter.pause(bytesSinceLastPause, System.nanoTime() - startTime);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}






> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> If a Segment is been merged, *maybePause* is called in 7:00, lastNS=7:00, 
> then the *maybePause* is called in 7:05 again,  so the value of 
> *targetNS=lastNS + (long) (10 * secondsToPause)* must be smaller than 
> *curNS*, no matter how big the bytes is, we will return -1 and ignore to 
> pause. 
> I count the total times(callTimes) calling *maybePause* and ignored pause 
> times(ignorePauseTimes) and detail ignored bytes(detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], 

[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-05 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17501314#comment-17501314
 ] 

kkewwei edited comment on LUCENE-10448 at 3/6/22, 6:21 AM:
---

Yes, the bytes may exceed the amount implied by MIN_PAUSE_CHECK_MSEC, because 
*byte[] b* in RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int 
length) may be large:

{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  bytesSinceLastPause += length;
  checkRate();
  delegate.writeBytes(b, offset, length);
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    rateLimiter.pause(bytesSinceLastPause);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
For example, suppose byte[] b is 500 MB, which is larger than 
*currentMinPauseCheckBytes*. If *checkRate* does not pause, it still resets 
bytesSinceLastPause to 0, so the 500 MB of no-pause bytes are written without 
any limit, and the following writes are not paused on account of those 500 MB 
either.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
It seems to be a good way.
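
One way to picture such an upper bound is the sketch below. It is only an 
illustration under stated assumptions, not the Lucene implementation: ByteSink 
and Pauser are hypothetical stand-ins for the delegate IndexOutput and the rate 
limiter, checked exceptions are omitted, and the bound is simply taken to be a 
fixed pause-check threshold. A large array is split so that no more than 
minPauseCheckBytes are handed to the delegate between two pause checks.

```java
final class ChunkedWriteSketch {
  /** Hypothetical stand-in for the delegate output. */
  interface ByteSink {
    void writeBytes(byte[] b, int offset, int length);
  }

  /** Hypothetical stand-in for the rate limiter's pause call. */
  interface Pauser {
    void pause(long bytes);
  }

  private final ByteSink delegate;
  private final Pauser limiter;
  private final int minPauseCheckBytes; // assumed fixed here for simplicity
  private int bytesSinceLastPause;      // stays below minPauseCheckBytes between calls

  ChunkedWriteSketch(ByteSink delegate, Pauser limiter, int minPauseCheckBytes) {
    if (minPauseCheckBytes <= 0) {
      throw new IllegalArgumentException("minPauseCheckBytes must be positive");
    }
    this.delegate = delegate;
    this.limiter = limiter;
    this.minPauseCheckBytes = minPauseCheckBytes;
  }

  void writeBytes(byte[] b, int offset, int length) {
    while (length > 0) {
      // Never hand over more than what is left before the next pause check.
      int chunk = Math.min(length, minPauseCheckBytes - bytesSinceLastPause);
      delegate.writeBytes(b, offset, chunk);
      bytesSinceLastPause += chunk;
      offset += chunk;
      length -= chunk;
      if (bytesSinceLastPause >= minPauseCheckBytes) {
        // Consult the limiter as soon as the threshold is reached.
        limiter.pause(bytesSinceLastPause);
        bytesSinceLastPause = 0;
      }
    }
  }
}
```

With a threshold of 4 bytes, a single 10-byte write is passed on as chunks of 
4, 4 and 2 bytes, and the limiter is consulted after each full chunk, so one 
oversized array can no longer slip past the check in one shot.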

In the detailed log, every no-pause write is ~0.28 MB. That size is indeed 
determined by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC values, but 
there is no guarantee that all no-pause bytes are bounded by those values.

Because of the effect of the ongoing writes, I can't tell which high instant 
burst rates are due to no-pause bytes. The rate may be higher than 11.2 MB/s 
(I can't verify this). No-pause writes are frequent when writing is busy, so 
this is easy to reproduce.

In addition, *writeBytes* should execute *delegate.writeBytes(b, offset, 
length)* first and *checkRate()* afterwards, because currently the bytes are 
counted as written before they have actually been written.



was (Author: kkewwei):
Yes, the bytes may exceed the amount implied by MIN_PAUSE_CHECK_MSEC, because 
*byte[] b* in RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int 
length) may be large:

{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  bytesSinceLastPause += length;
  checkRate();
  delegate.writeBytes(b, offset, length);
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    rateLimiter.pause(bytesSinceLastPause);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
For example, suppose byte[] b is 500 MB, which is larger than 
*currentMinPauseCheckBytes*. If *checkRate* does not pause, it still resets 
bytesSinceLastPause to 0, so the 500 MB of no-pause bytes are written without 
any limit, and the following writes are not paused on account of those 500 MB 
either.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
It seems to be a good way.

In the detailed log, every no-pause write is ~0.28 MB. That size is indeed 
determined by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC values, but 
there is no guarantee that all no-pause bytes are bounded by those values.

Because of the effect of the ongoing writes, I can't tell which high instant 
burst rates are due to no-pause bytes. The rate may be higher than 11.2 MB/s 
(I can't verify this). No-pause writes are frequent when writing is busy, so 
this is easy to reproduce.

In addition, why does *checkRate()* need to be executed before 
*delegate.writeBytes(b, offset, length)*? We should write first and check 
afterwards, because currently the bytes are counted as written before they 
have actually been written.


> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> If a Segment is been merged, *maybePause* is called in 7:00, lastNS=7:00, 
> then the *maybePause* is called in 7:05 again,  so the value of 

[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-05 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17501314#comment-17501314
 ] 

kkewwei edited comment on LUCENE-10448 at 3/6/22, 5:57 AM:
---

Yes, the bytes may exceed the amount implied by MIN_PAUSE_CHECK_MSEC, because 
*byte[] b* in RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int 
length) may be large:

{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  bytesSinceLastPause += length;
  checkRate();
  delegate.writeBytes(b, offset, length);
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    rateLimiter.pause(bytesSinceLastPause);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
For example, suppose byte[] b is 500 MB, which is larger than 
*currentMinPauseCheckBytes*. If *checkRate* does not pause, it still resets 
bytesSinceLastPause to 0, so the 500 MB of no-pause bytes are written without 
any limit, and the following writes are not paused on account of those 500 MB 
either.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
It seems to be a good way.

In the detailed log, every no-pause write is ~0.28 MB. That size is indeed 
determined by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC values, but 
there is no guarantee that all no-pause bytes are bounded by those values.

Because of the effect of the ongoing writes, I can't tell which high instant 
burst rates are due to no-pause bytes. The rate may be higher than 11.2 MB/s 
(I can't verify this). No-pause writes are frequent when writing is busy, so 
this is easy to reproduce.

In addition, why does *checkRate()* need to be executed before 
*delegate.writeBytes(b, offset, length)*? We should write first and check 
afterwards, because currently the bytes are counted as written before they 
have actually been written.



was (Author: kkewwei):
Yes, the bytes may exceed the amount implied by MIN_PAUSE_CHECK_MSEC, because 
*byte[] b* in RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int 
length) may be large:

{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  bytesSinceLastPause += length;
  checkRate();
  delegate.writeBytes(b, offset, length);
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    rateLimiter.pause(bytesSinceLastPause);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
For example, suppose byte[] b is 500 MB, which is larger than 
*currentMinPauseCheckBytes*. If *checkRate* does not pause, it still resets 
bytesSinceLastPause to 0, so the 500 MB of no-pause bytes are written without 
any limit, and the following writes are not paused on account of those 500 MB 
either.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
It seems to be a good way.

In the detailed log, every no-pause write is ~0.28 MB. That size is indeed 
determined by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC values, but 
there is no guarantee that all no-pause bytes are bounded by those values.

Because of the effect of the ongoing writes, I can't tell which high instant 
burst rates are due to no-pause bytes. The rate may be higher than 11.2 MB/s 
(I can't verify this). No-pause writes are frequent when writing is busy, so 
this is easy to reproduce.

In addition, why does *checkRate* need to be executed before 
*delegate.writeBytes*? We should write first and check afterwards.


> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> If a Segment is been merged, *maybePause* is called in 7:00, lastNS=7:00, 
> then the *maybePause* is called in 7:05 again,  so the value of 
> *targetNS=lastNS + (long) (10 * secondsToPause)* must be smaller than 
> 

[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-05 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17501314#comment-17501314
 ] 

kkewwei edited comment on LUCENE-10448 at 3/6/22, 5:54 AM:
---

Yes, the bytes may exceed the amount implied by MIN_PAUSE_CHECK_MSEC, because 
*byte[] b* in RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int 
length) may be large:

{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  bytesSinceLastPause += length;
  checkRate();
  delegate.writeBytes(b, offset, length);
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    rateLimiter.pause(bytesSinceLastPause);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
For example, suppose byte[] b is 500 MB, which is larger than 
*currentMinPauseCheckBytes*. If *checkRate* does not pause, it still resets 
bytesSinceLastPause to 0, so the 500 MB of no-pause bytes are written without 
any limit, and the following writes are not paused on account of those 500 MB 
either.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
It seems to be a good way.

In the detailed log, every no-pause write is ~0.28 MB. That size is indeed 
determined by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC values, but 
there is no guarantee that all no-pause bytes are bounded by those values.

Because of the effect of the ongoing writes, I can't tell which high instant 
burst rates are due to no-pause bytes. The rate may be higher than 11.2 MB/s 
(I can't verify this). No-pause writes are frequent when writing is busy, so 
this is easy to reproduce.

In addition, why does *checkRate* need to be executed before 
*delegate.writeBytes*? We should write first and check afterwards.



was (Author: kkewwei):
Yes, bytes may be bigger than MIN_PAUSE_CHECK_MSEC, because *byte[] b* in 
RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int length) may be large

{code:java}
@Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
bytesSinceLastPause += length;
checkRate();
delegate.writeBytes(b, offset, length);
  }
  
  private void checkRate() throws IOException {
if (bytesSinceLastPause > currentMinPauseCheckBytes) {
  rateLimiter.pause(bytesSinceLastPause);
  bytesSinceLastPause = 0;
  currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes(); 
}
  }
{code}
For example, byte[] b=500MB>*currentMinPauseCheckBytes*, it doesn't pause in 
*checkRate*, and after *checkRate*, bytesSinceLastPause=0, so the 500MB 
no-pause bytes will be writed unlimited, the next write bytes will not be 
paused for the 500MB bytes.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
It seems to be a good way.

All the no-pause bytes is ~0.28 MB from the detailed log, It indeeds determined 
by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC values, but there's no 
guarantee that all  no-pause bytes are limited by the conditions.

Because of the writing affect, I can't tell which high instant burst rate are 
due to no-pause bytes. it may be bigger than 11.2 MB/s(I can't verify), the 
frequency of no-pause bytes is high with the busy write, this is easy to 
reproduce.




> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> Suppose a segment is being merged and *maybePause* is called at 7:00, so
> lastNS=7:00. If *maybePause* is then called again at 7:05, the value of
> *targetNS=lastNS + (long) (10 * secondsToPause)* must be smaller than
> *curNS*; no matter how large bytes is, we return -1 and skip the
> pause.
> I counted the total number of calls to *maybePause* (callTimes), the number
> of skipped pauses (ignorePauseTimes), and the size of each skipped write
> (detailBytes):

[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-05 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17501314#comment-17501314
 ] 

kkewwei edited comment on LUCENE-10448 at 3/6/22, 5:44 AM:
---

Yes, the bytes written between checks may exceed what MIN_PAUSE_CHECK_MSEC
allows for, because the *byte[] b* passed to
RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int length) may be large:

{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  bytesSinceLastPause += length;
  checkRate();
  delegate.writeBytes(b, offset, length);
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    rateLimiter.pause(bytesSinceLastPause);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
For example, if byte[] b=500MB>*currentMinPauseCheckBytes* and *checkRate*
does not pause for it, bytesSinceLastPause is still 0 after *checkRate*. The
500MB of no-pause bytes are then written without any limit, and the next write
will not be paused for those 500MB either.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
That seems like a good approach.

According to the detailed log, every batch of no-pause bytes is ~0.28 MB. That
size is indeed determined by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC
values, but there is no guarantee that all no-pause bytes are bounded by those
conditions.

Because of the effect of the ongoing writes, I can't tell which of the high
instant burst rates are due to no-pause bytes. The instant rate may be higher
than 11.2 MB/s (I can't verify this). The frequency of no-pause bytes is high
under busy writes, and this is easy to reproduce.





was (Author: kkewwei):
Yes, bytes may be bigger than MIN_PAUSE_CHECK_MSEC, because *byte[] b* in 
RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int length) may be large

{code:java}
@Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
bytesSinceLastPause += length;
checkRate();
delegate.writeBytes(b, offset, length);
  }
  
  private void checkRate() throws IOException {
if (bytesSinceLastPause > currentMinPauseCheckBytes) {
  rateLimiter.pause(bytesSinceLastPause);
  bytesSinceLastPause = 0;
  currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes(); 
}
  }
{code}
For example, byte[] b=500MB>*currentMinPauseCheckBytes*, it doesn't pause in 
*checkRate*, and after *checkRate*, bytesSinceLastPause=0, so the 500MB 
no-pause bytes will be writed unlimited, the next write bytes will not be 
paused for the 500MB bytes.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
It seems to be a good way.

All the no-pause bytes is ~0.28 MB from the detailed log, It indeeds determined 
by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC values, but there's no 
guarantee that all  no-pause bytes are limited by the conditions.

Because of the writing effect, I can't tell which high instant burst rate are 
due to no-pause bytes. it may be bigger than 11.2 MB/s(I can't verify), the 
frequency of no-pause bytes is high with the busy write, this is easy to 
reproduce.




> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> Suppose a segment is being merged and *maybePause* is called at 7:00, so
> lastNS=7:00. If *maybePause* is then called again at 7:05, the value of
> *targetNS=lastNS + (long) (10 * secondsToPause)* must be smaller than
> *curNS*; no matter how large bytes is, we return -1 and skip the
> pause.
> I counted the total number of calls to *maybePause* (callTimes), the number
> of skipped pauses (ignorePauseTimes), and the size of each skipped write
> (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment 

[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-05 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17501314#comment-17501314
 ] 

kkewwei edited comment on LUCENE-10448 at 3/6/22, 5:43 AM:
---

Yes, the bytes written between checks may exceed what MIN_PAUSE_CHECK_MSEC
allows for, because the *byte[] b* passed to
RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int length) may be large:

{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  bytesSinceLastPause += length;
  checkRate();
  delegate.writeBytes(b, offset, length);
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    rateLimiter.pause(bytesSinceLastPause);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
For example, if byte[] b=500MB>*currentMinPauseCheckBytes* and *checkRate*
does not pause for it, bytesSinceLastPause is still 0 after *checkRate*. The
500MB of no-pause bytes are then written without any limit, and the next write
will not be paused for those 500MB either.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
That seems like a good approach.

According to the detailed log, every batch of no-pause bytes is ~0.28 MB. That
size is indeed determined by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC
values, but there is no guarantee that all no-pause bytes are bounded by those
conditions.

Because of the effect of the ongoing writes, I can't tell which of the high
instant burst rates are due to no-pause bytes. The instant rate may be higher
than 11.2 MB/s (I can't verify this). The frequency of no-pause bytes is high
under busy writes, and this is easy to reproduce.





was (Author: kkewwei):
Yes, bytes may be bigger than MIN_PAUSE_CHECK_MSEC, because *byte[] b* in 
RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int length) may be large

{code:java}
@Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
bytesSinceLastPause += length;
checkRate();
delegate.writeBytes(b, offset, length);
  }
  
  private void checkRate() throws IOException {
if (bytesSinceLastPause > currentMinPauseCheckBytes) {
  rateLimiter.pause(bytesSinceLastPause);
  bytesSinceLastPause = 0;
  currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes(); 
}
  }
{code}
For example, byte[] b=500MB>*currentMinPauseCheckBytes*, it doesn't pause in 
*checkRate*, and after *checkRate*, bytesSinceLastPause=0, so the 500MB 
no-pause bytes will be writed unlimited, the next write bytes will not be 
paused for the 500MB bytes.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
It seems to be a good way.

After detailed log, all the no-pause bytes is ~0.28 MB, It indeeds determined 
by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC values.

Because of the writing effect, I can't tell which high instant burst rate are 
due to no-pause bytes. it may be bigger than 11.2 MB/s(I can't verify), the 
frequency of no-pause bytes is high with the busy write, this is easy to 
reproduce.




> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> Suppose a segment is being merged and *maybePause* is called at 7:00, so
> lastNS=7:00. If *maybePause* is then called again at 7:05, the value of
> *targetNS=lastNS + (long) (10 * secondsToPause)* must be smaller than
> *curNS*; no matter how large bytes is, we return -1 and skip the
> pause.
> I counted the total number of calls to *maybePause* (callTimes), the number
> of skipped pauses (ignorePauseTimes), and the size of each skipped write
> (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s 

[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-04 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17501314#comment-17501314
 ] 

kkewwei edited comment on LUCENE-10448 at 3/4/22, 1:27 PM:
---

Yes, the bytes written between checks may exceed what MIN_PAUSE_CHECK_MSEC
allows for, because the *byte[] b* passed to
RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int length) may be large:

{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  bytesSinceLastPause += length;
  checkRate();
  delegate.writeBytes(b, offset, length);
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    rateLimiter.pause(bytesSinceLastPause);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
For example, if byte[] b=500MB>*currentMinPauseCheckBytes* and *checkRate*
does not pause for it, bytesSinceLastPause is still 0 after *checkRate*. The
500MB of no-pause bytes are then written without any limit, and the next write
will not be paused for those 500MB either.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
That seems like a good approach.

According to the detailed log, every batch of no-pause bytes is ~0.28 MB. That
size is indeed determined by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC
values.

Because of the effect of the ongoing writes, I can't tell which of the high
instant burst rates are due to no-pause bytes. The instant rate may be higher
than 11.2 MB/s (I can't verify this). The frequency of no-pause bytes is high
under busy writes, and this is easy to reproduce.





was (Author: kkewwei):
Yes, bytes may be bigger than MIN_PAUSE_CHECK_MSEC, because *byte[] b* in 
RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int length) may be large

{code:java}
@Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
bytesSinceLastPause += length;
checkRate();
delegate.writeBytes(b, offset, length);
  }
  
  private void checkRate() throws IOException {
if (bytesSinceLastPause > currentMinPauseCheckBytes) {
  rateLimiter.pause(bytesSinceLastPause);
  bytesSinceLastPause = 0;
  currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes(); 
}
  }
{code}
For example, byte[] b=500MB>*currentMinPauseCheckBytes*, it doesn't pause in 
*checkRate*, and after *checkRate*, bytesSinceLastPause=0, so the 500MB 
no-pause bytes will be write unlimited, the next write bytes will not be paused 
for the 500MB bytes.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
It seems to be a good way.

After detailed log, all the no-pause bytes is ~0.28 MB, It indeeds determined 
by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC values.

Because of the writing effect, I can't tell which high instant burst rate are 
due to no-pause bytes. it may be bigger than 11.2 MB/s(I can't verify), the 
frequency of no-pause bytes is high with the busy write, this is easy to 
reproduce.




> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> Suppose a segment is being merged and *maybePause* is called at 7:00, so
> lastNS=7:00. If *maybePause* is then called again at 7:05, the value of
> *targetNS=lastNS + (long) (10 * secondsToPause)* must be smaller than
> *curNS*; no matter how large bytes is, we return -1 and skip the
> pause.
> I counted the total number of calls to *maybePause* (callTimes), the number
> of skipped pauses (ignorePauseTimes), and the size of each skipped write
> (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [callTimes=857], 

[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-04 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17501314#comment-17501314
 ] 

kkewwei edited comment on LUCENE-10448 at 3/4/22, 1:26 PM:
---

Yes, the bytes written between checks may exceed what MIN_PAUSE_CHECK_MSEC
allows for, because the *byte[] b* passed to
RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int length) may be large:

{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  bytesSinceLastPause += length;
  checkRate();
  delegate.writeBytes(b, offset, length);
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    rateLimiter.pause(bytesSinceLastPause);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
For example, if byte[] b=500MB>*currentMinPauseCheckBytes* and *checkRate*
does not pause for it, bytesSinceLastPause is still 0 after *checkRate*. The
500MB of no-pause bytes are then written without any limit, and the next write
will not be paused for those 500MB either.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
That seems like a good approach.

According to the detailed log, every batch of no-pause bytes is ~0.28 MB. That
size is indeed determined by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC
values.

Because of the effect of the ongoing writes, I can't tell which of the high
instant burst rates are due to no-pause bytes. The instant rate may be higher
than 11.2 MB/s (I can't verify this). The frequency of no-pause bytes is high
under busy writes, and this is easy to reproduce.





was (Author: kkewwei):
Yes, bytes may be bigger than MIN_PAUSE_CHECK_MSEC, because *byte[] b* in 
RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int length) may be large

{code:java}
@Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
bytesSinceLastPause += length;
checkRate();
delegate.writeBytes(b, offset, length);
  }
  
  private void checkRate() throws IOException {
if (bytesSinceLastPause > currentMinPauseCheckBytes) {
  rateLimiter.pause(bytesSinceLastPause);
  bytesSinceLastPause = 0;
  currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes(); 
}
  }
{code}
For example, byte[] b=500MB>*currentMinPauseCheckBytes*, it doesn't pause in 
*checkRate*, and after *checkRate*, bytesSinceLastPause=0, so the 500MB 
no-pause bytes will be write unlimited, the next write bytes will not be paused 
for the 500MB bytes.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
It seems to be a good way.

After detailed log, all the no-pause bytes is ~0.28 MB, It indeeds determined 
by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC values.

Because of the writing effect, I can't tell which high instant burst rate are 
due to no-pause bytes. it may be bigger than 11.2 MB/s(I can't verify)




> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> Suppose a segment is being merged and *maybePause* is called at 7:00, so
> lastNS=7:00. If *maybePause* is then called again at 7:05, the value of
> *targetNS=lastNS + (long) (10 * secondsToPause)* must be smaller than
> *curNS*; no matter how large bytes is, we return -1 and skip the
> pause.
> I counted the total number of calls to *maybePause* (callTimes), the number
> of skipped pauses (ignorePauseTimes), and the size of each skipped write
> (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = 
> [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 

[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-04 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17501314#comment-17501314
 ] 

kkewwei edited comment on LUCENE-10448 at 3/4/22, 1:09 PM:
---

Yes, the bytes written between checks may exceed what MIN_PAUSE_CHECK_MSEC
allows for, because the *byte[] b* passed to
RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int length) may be large:

{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  bytesSinceLastPause += length;
  checkRate();
  delegate.writeBytes(b, offset, length);
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    rateLimiter.pause(bytesSinceLastPause);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
For example, if byte[] b=500MB>*currentMinPauseCheckBytes* and *checkRate*
does not pause for it, bytesSinceLastPause is still 0 after *checkRate*. The
500MB of no-pause bytes are then written without any limit, and the next write
will not be paused for those 500MB either.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
That seems like a good approach.

According to the detailed log, every batch of no-pause bytes is ~0.28 MB. That
size is indeed determined by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC
values.

Because of the effect of the ongoing writes, I can't tell which of the high
instant burst rates are due to no-pause bytes. The instant rate may be higher
than 11.2 MB/s (I can't verify this).





was (Author: kkewwei):
Yes, bytes may be bigger than MIN_PAUSE_CHECK_MSEC, because byte[] b in 
RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int length) may be large

{code:java}
@Override
  public void writeBytes(byte[] b, int offset, int length) throws IOException {
bytesSinceLastPause += length;
checkRate();
delegate.writeBytes(b, offset, length);
  }
  
  private void checkRate() throws IOException {
if (bytesSinceLastPause > currentMinPauseCheckBytes) {
  rateLimiter.pause(bytesSinceLastPause);
  bytesSinceLastPause = 0;
  currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes(); 
}
  }
{code}
for exmple, byte[] b=500MB>currentMinPauseCheckBytes, it doesn't pause in 
`checkRate`, and after `checkRate`, bytesSinceLastPause=0, so the 500MB 
no-pause bytes will be write unlimited, the next write bytes will not be paused 
for the 500MB bytes.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
It seems to be a good way.

After detailed log, all the no-pause bytes is ~0.28 MB, It indeeds determined 
by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC values.

Because of the writing effect, I can't tell which high instant burst rate are 
due to no-pause bytes. it may be bigger than 11.2 MB/s(I can't verify)




> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> Suppose a segment is being merged and *maybePause* is called at 7:00, so
> lastNS=7:00. If *maybePause* is then called again at 7:05, the value of
> *targetNS=lastNS + (long) (10 * secondsToPause)* must be smaller than
> *curNS*; no matter how large bytes is, we return -1 and skip the
> pause.
> I counted the total number of calls to *maybePause* (callTimes), the number
> of skipped pauses (ignorePauseTimes), and the size of each skipped write
> (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = 
> [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 
> 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 
> 0.28144264, 

[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-04 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17501314#comment-17501314
 ] 

kkewwei commented on LUCENE-10448:
--

Yes, the bytes written between checks may exceed what MIN_PAUSE_CHECK_MSEC
allows for, because the byte[] b passed to
RateLimitedIndexOutput.writeBytes(byte[] b, int offset, int length) may be large:

{code:java}
@Override
public void writeBytes(byte[] b, int offset, int length) throws IOException {
  bytesSinceLastPause += length;
  checkRate();
  delegate.writeBytes(b, offset, length);
}

private void checkRate() throws IOException {
  if (bytesSinceLastPause > currentMinPauseCheckBytes) {
    rateLimiter.pause(bytesSinceLastPause);
    bytesSinceLastPause = 0;
    currentMinPauseCheckBytes = rateLimiter.getMinPauseCheckBytes();
  }
}
{code}
For example, if byte[] b=500MB>currentMinPauseCheckBytes and `checkRate`
does not pause for it, bytesSinceLastPause is still 0 after `checkRate`. The
500MB of no-pause bytes are then written without any limit, and the next write
will not be paused for those 500MB either.
 {quote}
We could potentially add an upper bound on the bytes that writeBytes 
attempts to write in one shot
{quote}
That seems like a good approach.

According to the detailed log, every batch of no-pause bytes is ~0.28 MB. That
size is indeed determined by the configured mbPerSec and MIN_PAUSE_CHECK_MSEC
values.

Because of the effect of the ongoing writes, I can't tell which of the high
instant burst rates are due to no-pause bytes. The instant rate may be higher
than 11.2 MB/s (I can't verify this).




> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws
>     MergePolicy.MergeAbortedException {
>
>   double rate = mbPerSec;
>   double secondsToPause = (bytes / 1024. / 1024.) / rate;
>   long targetNS = lastNS + (long) (10 * secondsToPause);
>   long curPauseNS = targetNS - curNS;
>   // We don't bother with thread pausing if the pause is smaller than 2 msec.
>   if (curPauseNS <= MIN_PAUSE_NS) {
>     // Set to curNS, not targetNS, to enforce the instant rate, not
>     // the "averaged over all history" rate:
>     lastNS = curNS;
>     return -1;
>   }
>   ..
> }
> {code}
> Suppose a segment is being merged and *maybePause* is called at 7:00, so
> lastNS=7:00. If *maybePause* is then called again at 7:05, the value of
> *targetNS=lastNS + (long) (10 * secondsToPause)* must be smaller than
> *curNS*; no matter how large bytes is, we return -1 and skip the
> pause.
> I counted the total number of calls to *maybePause* (callTimes), the number
> of skipped pauses (ignorePauseTimes), and the size of each skipped write
> (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = 
> [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 
> 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 
> 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 
> 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
> {code}
> *maybePause* was called 857 times, and the pause was skipped in 25 of those
> calls. The skipped writes (such as 0.28125mb) are not small.
> As long as the interval between two *maybePause* calls is relatively long,
> the pause that should have been executed is skipped.
>  
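
The skip condition described in the issue can be reproduced in isolation. The
following is a sketch, not the Lucene method: the class and method names are
hypothetical, the 2 ms threshold is taken from the code comment quoted above,
and the multiplier is assumed to convert seconds to nanoseconds.

```java
import java.util.concurrent.TimeUnit;

class PauseSkipSketch {
  // 2 ms, matching the "smaller than 2 msec" comment in the quoted snippet.
  static final long MIN_PAUSE_NS = TimeUnit.MILLISECONDS.toNanos(2);

  /** Returns the pause length in ns, or -1 when the pause would be skipped. */
  static long pauseNanos(long bytes, double mbPerSec, long lastNS, long curNS) {
    double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
    // Assumption: seconds are converted to nanoseconds here.
    long targetNS = lastNS + (long) (1_000_000_000L * secondsToPause);
    long curPauseNS = targetNS - curNS;
    return curPauseNS <= MIN_PAUSE_NS ? -1 : curPauseNS;
  }
}
```

With lastNS five minutes before curNS, both a ~0.28 MB write at 11.2 MB/s and
a 500 MB write at 10 MB/s return -1 in this sketch, because the computed
pause (about 25 ms and 50 s respectively) is shorter than the idle gap. In
other words, the pause is skipped whenever bytes divided by the rate is
smaller than the time since the previous call.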



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-03 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500655#comment-17500655
 ] 

kkewwei edited comment on LUCENE-10448 at 3/3/22, 10:46 AM:


[~vigyas] you missed nothing, but there may be one small catch:
if, after 50s, we want to write, say, 500MB, the instant rate (50mb/s) is
higher than the rate we set (10mb/s), and that instant write puts pressure on
IO. According to my statistics, the frequency of no-pause bytes is in the
range [2%-20%]. This proportion is especially high when IO pressure is high,
and an excessively high instant rate leads to even higher IO pressure.



> MergeRateLimiter doesn't always limit instant rate.
> ---------------------------------------------------
>
>                 Key: LUCENE-10448
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10448
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/other
>    Affects Versions: 8.11.1
>            Reporter: kkewwei
>            Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
>
>   double rate = mbPerSec;
>   double secondsToPause = (bytes / 1024. / 1024.) / rate;
>   // Time to sleep until: lastNS plus the pause converted from seconds to nanoseconds.
>   long targetNS = lastNS + (long) (1000000000 * secondsToPause);
>   long curPauseNS = targetNS - curNS;
>   // We don't bother with thread pausing if the pause is smaller than 2 msec.
>   if (curPauseNS <= MIN_PAUSE_NS) {
>     // Set to curNS, not targetNS, to enforce the instant rate, not
>     // the "averaged over all history" rate:
>     lastNS = curNS;
>     return -1;
>   }
>   ..
> }
> {code}
> If a segment is being merged and *maybePause* is called at 7:00, then 
> lastNS=7:00. If *maybePause* is next called at 7:05, the value of 
> *targetNS = lastNS + (long) (1000000000 * secondsToPause)* will be smaller than 
> *curNS* for any realistic number of bytes, so we return -1 and skip the 
> pause. 
> I counted the total number of *maybePause* calls (callTimes), the number of 
> skipped pauses (ignorePauseTimes), and the bytes involved in each skipped 
> pause (detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = 
> [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 
> 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 
> 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 
> 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
> {code}
> *maybePause* was called 857 times, and in 25 of those calls the pause was 
> skipped. The bytes involved in each skipped pause (for example 0.28125mb) are 
> not small.
> As long as the interval between two *maybePause* calls is relatively long, 
> a pause that should be executed is skipped.
>  
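
To put the log line above in proportion, the skipped bytes can be summed and compared with the total written. The following is only a rough sketch: the class and method names are hypothetical, the 25 values are copied from detailBytes(mb) in the log, and it assumes that "242.5 MB written" is the right denominator for the share of no-pause bytes.

```java
public class NoPauseBytesSketch {
  // detailBytes(mb) values copied from the merge log in the issue description.
  static final double[] DETAIL_MB = {
    0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104,
    0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673,
    0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447,
    0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125
  };

  /** Sum of all bytes (in MB) for which the pause was skipped. */
  static double totalSkippedMB() {
    double sum = 0;
    for (double mb : DETAIL_MB) {
      sum += mb;
    }
    return sum;
  }

  /** Share of the written bytes that went through without a pause. */
  static double skippedFraction(double writtenMB) {
    return totalSkippedMB() / writtenMB;
  }

  public static void main(String[] args) {
    double writtenMB = 242.5; // "[242.5 MB written]" from the log (assumed denominator)
    System.out.printf("skipped: %.2f MB, %.1f%% of written%n",
        totalSkippedMB(), 100 * skippedFraction(writtenMB));
  }
}
```

For this particular merge the sum is about 7.0 MB, roughly 2.9% of the bytes written, which sits at the low end of the 2%-20% range reported in the comments.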






[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-03 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500655#comment-17500655
 ] 

kkewwei edited comment on LUCENE-10448 at 3/3/22, 10:45 AM:


[~vigyas] you missed nothing; there is just one small point: 
if, after 50s, we want to write, say, 500MB, the instant rate (50mb/s) is higher 
than the limit we set (10mb/s), and the burst of writes puts pressure on IO. 
According to my statistics, the proportion of no-pause bytes is between 2% and 
20%. It is especially high when IO pressure is already high, and an overly high 
instant rate then leads to even higher IO pressure. 






[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-03 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500655#comment-17500655
 ] 

kkewwei edited comment on LUCENE-10448 at 3/3/22, 10:44 AM:


[~vigyas] you missed nothing; there is just one small point: 
if, after 50s, we want to write, say, 500MB, the instant rate (50mb/s) is higher 
than the limit we set (10mb/s), and the burst of writes puts pressure on IO. 
According to my statistics, the proportion of no-pause bytes is between 2% and 
20%. It is especially high when IO pressure is already high, and an overly high 
instant rate then leads to even higher IO pressure. 






[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-03 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500655#comment-17500655
 ] 

kkewwei commented on LUCENE-10448:
--

[~vigyas] you missed nothing; there is just one small point: 
if, after 50s, we want to write, say, 500MB, the instant rate (50mb/s) is higher 
than the limit we set (10mb/s), and the burst of writes puts pressure on IO. 
According to my statistics, the proportion of no-pause bytes is between 2% and 
20%. It is especially high when IO pressure is already high, and an overly high 
instant rate then leads to even higher IO pressure. 




[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-02 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500172#comment-17500172
 ] 

kkewwei edited comment on LUCENE-10448 at 3/2/22, 1:56 PM:
---

We know that if we reach bytesSinceLastPause, maybePause is called at least 
twice.

I plan to solve this within the loop of calls: when curPauseNS is not larger 
than MIN_PAUSE_NS, we additionally check whether this is the first call to 
maybePause. If it is, we set curPauseNS = (long) ((bytes / 1024. / 1024.) / 
rate * 1000000000). This may cost some time, but the instant rate is then fully 
limited.
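
The plan above could look roughly like the following. This is a minimal standalone sketch, not a patch against Lucene: the class name, the static method and the firstCall flag are hypothetical, MIN_PAUSE_NS is assumed to be 2 msec (from the comment in the quoted code), and the multiplier is assumed to convert seconds to nanoseconds.

```java
public class FirstCallPauseSketch {
  // Assumed value: 2 msec, taken from the "smaller than 2 msec" comment.
  static final long MIN_PAUSE_NS = 2_000_000L;

  /**
   * Returns the pause in nanoseconds, or -1 if no pause is needed.
   * firstCall is a hypothetical flag: true for the first maybePause call
   * in the current loop of calls.
   */
  static long pauseNS(long bytes, double mbPerSec, long lastNS, long curNS, boolean firstCall) {
    double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
    long fullPauseNS = (long) (1_000_000_000L * secondsToPause);
    long curPauseNS = lastNS + fullPauseNS - curNS;
    if (curPauseNS <= MIN_PAUSE_NS) {
      if (!firstCall) {
        return -1; // later calls keep the existing behaviour
      }
      // Proposed change: on the first call, pause for the full duration
      // implied by the bytes instead of skipping the pause.
      curPauseNS = fullPauseNS;
    }
    return curPauseNS;
  }

  public static void main(String[] args) {
    long bytes = 294_912L;               // 0.28125 MB, one of the sizes from the log
    long fiveMinutesNS = 300_000_000_000L;
    System.out.println(pauseNS(bytes, 11.2, 0L, fiveMinutesNS, true));
    System.out.println(pauseNS(bytes, 11.2, 0L, fiveMinutesNS, false));
  }
}
```

With a five-minute gap between calls, the first call now yields a positive pause while a later call still returns -1. The cost is that a merge which was genuinely idle is paused anyway, which is the "may cost some time" trade-off mentioned above.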






[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-02 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500172#comment-17500172
 ] 

kkewwei commented on LUCENE-10448:
--

We know that if we reach bytesSinceLastPause, maybePause is called at least 
twice.

I plan to solve this within the loop of calls: when curPauseNS is not larger 
than MIN_PAUSE_NS, we check whether this is the first call to maybePause. If it 
is, we set curPauseNS = (long) ((bytes / 1024. / 1024.) / rate * 1000000000). 
This may cost some time, but the instant rate is then fully limited.




[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-02 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500172#comment-17500172
 ] 

kkewwei edited comment on LUCENE-10448 at 3/2/22, 1:47 PM:
---

We know that if we reach bytesSinceLastPause, maybePause is called at least 
twice.

I plan to solve this within the loop of calls: when curPauseNS is not larger 
than MIN_PAUSE_NS, we check whether this is the first call to maybePause. If it 
is, we set curPauseNS = (long) ((bytes / 1024. / 1024.) / rate * 1000000000). 
This may cost some time, but the instant rate is then fully limited.






[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-02 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1754#comment-1754
 ] 

kkewwei edited comment on LUCENE-10448 at 3/2/22, 9:28 AM:
---

Yes, this is what I want to describe: the instant rate is unlimited.






[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-02 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1754#comment-1754
 ] 

kkewwei commented on LUCENE-10448:
--

Yes, this is what I want to describe: the instant rate is unlimited.




[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-02 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1745#comment-1745
 ] 

kkewwei edited comment on LUCENE-10448 at 3/2/22, 9:17 AM:
---

For example, after the first call to maybePause, it may take a long time to read 
segment data from the disk (maybe the disk is very busy). By the second call to 
maybePause the interval is long, so lastNS is stale and 
targetNS = lastNS + (long) (1000000000 * secondsToPause) is smaller than 
curNS. As a result, the second call to maybePause does not pause, although the 
number of bytes is relatively large.

I counted the no-pause bytes; their size is not small, which also confirms 
this.
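
The scenario above can be reproduced with a small standalone calculation. This is only a sketch under stated assumptions: the class and method names are hypothetical, MIN_PAUSE_NS is assumed to be 2 msec (from the comment in the quoted code), the multiplier is assumed to convert seconds to nanoseconds, and the byte count and rate are taken from the merge log in the issue description.

```java
public class InstantRateSketch {
  // Assumed value: 2 msec, taken from the "smaller than 2 msec" comment.
  static final long MIN_PAUSE_NS = 2_000_000L;

  /** Mirrors the quoted check: pause in nanoseconds, or -1 if the pause is skipped. */
  static long pauseNS(long bytes, double mbPerSec, long lastNS, long curNS) {
    double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
    long targetNS = lastNS + (long) (1_000_000_000L * secondsToPause);
    long curPauseNS = targetNS - curNS;
    return curPauseNS <= MIN_PAUSE_NS ? -1 : curPauseNS;
  }

  public static void main(String[] args) {
    long bytes = 294_912L;   // 0.28125 MB, one of the skipped sizes in the log
    double mbPerSec = 11.2;  // throttle rate reported in the log

    // Second call 1 ms after the first: a pause of a few tens of msec is requested.
    System.out.println(pauseNS(bytes, mbPerSec, 0L, 1_000_000L));

    // Second call 5 minutes after the first (slow disk read in between):
    // targetNS is far behind curNS, so the pause is skipped.
    System.out.println(pauseNS(bytes, mbPerSec, 0L, 300_000_000_000L));
  }
}
```

The same bytes at the same rate lead to a pause in the first case and to -1 in the second; the only difference is how stale lastNS is.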



> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> If a Segment is been merged, *maybePause* is called in 7:00, lastNS=7:00, 
> then the *maybePause* is called in 7:05 again,  so the value of 
> *targetNS=lastNS + (long) (10 * secondsToPause)* must be smaller than 
> *curNS*, no matter how big the bytes is, we will return -1 and ignore to 
> pause. 
> I count the total times(callTimes) calling *maybePause* and ignored pause 
> times(ignorePauseTimes) and detail ignored bytes(detailBytes):
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = 
> [0.28899956, 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 
> 0.27990723, 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 
> 0.28144264, 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 
> 0.28002167, 0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
> {code}
> *maybePause* was called 857 times, and 25 of those calls skipped the pause. 
> The byte counts of the skipped pauses (such as 0.28125 MB) are not small.
> As long as the interval between two *maybePause* calls is relatively long, 
> the pause that should happen is skipped.
>  
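
The skip described above can be reproduced in isolation. The following is a 
minimal sketch, not Lucene's implementation: the class name MaybePauseSketch 
and the helper pauseNanos are hypothetical, MIN_PAUSE_NS is assumed to be 
2 msec as the quoted comment says, and the real method sleeps rather than 
returning the pause length.

```java
// Hypothetical stand-in for the pause decision quoted from MergeRateLimiter.
final class MaybePauseSketch {
  // Assumption: 2 msec, taken from the comment in the quoted code.
  static final long MIN_PAUSE_NS = 2_000_000L;

  /** Returns the pause length in nanoseconds, or -1 when the pause is skipped. */
  static long pauseNanos(long lastNS, long curNS, long bytes, double mbPerSec) {
    double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
    // Seconds are scaled to nanoseconds so they can be added to lastNS.
    long targetNS = lastNS + (long) (1_000_000_000L * secondsToPause);
    long curPauseNS = targetNS - curNS;
    return curPauseNS <= MIN_PAUSE_NS ? -1 : curPauseNS;
  }

  public static void main(String[] args) {
    long bytes = 10L * 1024 * 1024; // 10 MB
    double mbPerSec = 10.0;         // so the budget for these bytes is 1 second
    long lastNS = 0L;

    // Called 1 msec after lastNS: most of the 1 second budget is still owed.
    System.out.println(pauseNanos(lastNS, 1_000_000L, bytes, mbPerSec));

    // Called 5 seconds after lastNS: targetNS is already in the past.
    System.out.println(pauseNanos(lastNS, 5_000_000_000L, bytes, mbPerSec));
  }
}
```

In this sketch the first call reports a pause of 999000000 ns, while the second 
returns -1 even though the same 10 MB were written.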



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-02 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1745#comment-1745
 ] 

kkewwei edited comment on LUCENE-10448 at 3/2/22, 9:17 AM:
---

For example, after the first call to maybePause, it takes a long time to read 
segment data from the disk (maybe the disk is very busy). By the second call to 
maybePause, the interval is long, so lastNS is too old and 
targetNS = lastNS + (long) (1000000000 * secondsToPause) will be smaller than 
curNS. As a result, the second call to maybePause will not pause, even though 
the number of bytes is relatively large.

I counted the bytes passed in the calls that did not pause; they are not small, 
which also confirms this.
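
To put a number on "not small", here is a rough estimate. It assumes that the 
11.2 MB/sec throttle reported at the end of the merge log in the issue 
description also applied at the moment the pause was skipped (the rate can 
change during a merge). The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper: how long a pause the skipped bytes would have warranted.
final class IgnoredPauseEstimate {
  /** Pause implied by secondsToPause = megabytes / mbPerSec, in milliseconds. */
  static double pauseMillis(double megabytes, double mbPerSec) {
    return megabytes / mbPerSec * 1000.0;
  }

  public static void main(String[] args) {
    // 0.28125 MB is one of the detailBytes values from the issue description.
    double millis = pauseMillis(0.28125, 11.2);
    System.out.println(String.format(Locale.ROOT, "%.1f ms", millis));
  }
}
```

Under that assumption each skipped call would have warranted a pause of roughly 
25 msec, more than ten times the 2 msec threshold below which pausing is not 
worth the bother.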


was (Author: kkewwei):
For example, after the first call to maybePause, it takes a long time to read 
segment data from the disk (maybe the disk is very busy). By the second call to 
maybePause, the interval is long, so lastNS is too old and 
targetNS = lastNS + (long) (1000000000 * secondsToPause) will be smaller than 
curNS. As a result, the second call to maybePause will not pause, even though 
the number of bytes is relatively large.

I counted the bytes passed in the calls that did not pause; they are not small, 
which also confirms this.







[jira] [Comment Edited] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-02 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1745#comment-1745
 ] 

kkewwei edited comment on LUCENE-10448 at 3/2/22, 9:12 AM:
---

For example, after the first call to maybePause, it takes a long time to read 
segment data from the disk (maybe the disk is very busy). By the second call to 
maybePause, the interval is long, so lastNS is too old and 
targetNS = lastNS + (long) (1000000000 * secondsToPause) will be smaller than 
curNS. As a result, the second call to maybePause will not pause, even though 
the number of bytes is relatively large.


was (Author: kkewwei):
For example, after the first call to maybePause, it takes a long time to read 
segment data from the disk (maybe the disk is very busy). By the second call to 
maybePause, the interval is long. Because lastNS is too old, 
targetNS = lastNS + (long) (1000000000 * secondsToPause) will be smaller than 
curNS, so the second call to maybePause will never pause, even though the 
number of bytes is relatively large.







[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-02 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1745#comment-1745
 ] 

kkewwei commented on LUCENE-10448:
--

For example, after the first call to maybePause, it takes a long time to read 
segment data from the disk (maybe the disk is very busy). By the second call to 
maybePause, the interval is long. Because lastNS is too old, 
targetNS = lastNS + (long) (1000000000 * secondsToPause) will be smaller than 
curNS, so the second call to maybePause will never pause, even though the 
number of bytes is relatively large.
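
How long an interval counts as "long" follows from the same condition. The 
pause is skipped when targetNS - curNS <= MIN_PAUSE_NS, which rearranges to 
curNS - lastNS >= (nanoseconds budgeted for the bytes) - MIN_PAUSE_NS. Below is 
a small sketch of that threshold. The class and method names are hypothetical, 
MIN_PAUSE_NS is assumed to be 2 msec as stated in the code comment, and the 
example rate of 11.2 MB/sec is taken from the merge log in the issue 
description.

```java
// Hypothetical helper: the smallest gap between two calls that skips the pause.
final class SkipGapSketch {
  // Assumption: 2 msec, as stated in the comment of the quoted code.
  static final long MIN_PAUSE_NS = 2_000_000L;

  /** Smallest value of curNS - lastNS for which no pause happens. */
  static long minGapToSkipNanos(long bytes, double mbPerSec) {
    double secondsToPause = (bytes / 1024. / 1024.) / mbPerSec;
    long budgetNS = (long) (1_000_000_000L * secondsToPause);
    // A budget at or below MIN_PAUSE_NS is skipped even with no gap at all.
    return Math.max(0L, budgetNS - MIN_PAUSE_NS);
  }

  public static void main(String[] args) {
    long bytes = 294_912L; // 0.28125 MB, one of the reported detailBytes values
    System.out.println(minGapToSkipNanos(bytes, 11.2) + " ns");
  }
}
```

For a write of 0.28125 MB at that rate the threshold is about 23 msec, so an 
interval of seconds or minutes between two calls is far beyond it.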







[jira] [Commented] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-02 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17499960#comment-17499960
 ] 

kkewwei commented on LUCENE-10448:
--

[~jpountz] could you help confirm this?







[jira] [Updated] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-01 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10448:
-
Description: 
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS)
    throws MergePolicy.MergeAbortedException {

  double rate = mbPerSec;
  double secondsToPause = (bytes / 1024. / 1024.) / rate;
  long targetNS = lastNS + (long) (1000000000 * secondsToPause);
  long curPauseNS = targetNS - curNS;

  // We don't bother with thread pausing if the pause is smaller than 2 msec.
  if (curPauseNS <= MIN_PAUSE_NS) {
    // Set to curNS, not targetNS, to enforce the instant rate, not
    // the "averaged over all history" rate:
    lastNS = curNS;
    return -1;
  }
  ..
}
{code}

Suppose a segment is being merged and *maybePause* is called at 7:00, so 
lastNS=7:00. If *maybePause* is then called again at 7:05, the value of 
*targetNS=lastNS + (long) (1000000000 * secondsToPause)* must be smaller than 
*curNS*. No matter how large the byte count is, we return -1 and skip the pause. 

I counted the total number of calls to *maybePause* (callTimes), the number of 
skipped pauses (ignorePauseTimes), and the byte count of each skipped pause 
(detailBytes):
{code:java}
[2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], 
[0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], 
[callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = [0.28899956, 
0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 
0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 
0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 
0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
{code}
*maybePause* was called 857 times, and 25 of those calls skipped the pause. 
The byte counts of the skipped pauses (such as 0.28125 MB) are not small.

As long as the interval between two *maybePause* calls is relatively long, the 
pause that should happen is skipped.

 




[jira] [Updated] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-01 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10448:
-
Description: 
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS)
    throws MergePolicy.MergeAbortedException {

  double rate = mbPerSec;
  double secondsToPause = (bytes / 1024. / 1024.) / rate;
  long targetNS = lastNS + (long) (1000000000 * secondsToPause);
  long curPauseNS = targetNS - curNS;

  // We don't bother with thread pausing if the pause is smaller than 2 msec.
  if (curPauseNS <= MIN_PAUSE_NS) {
    // Set to curNS, not targetNS, to enforce the instant rate, not
    // the "averaged over all history" rate:
    lastNS = curNS;
    return -1;
  }
  ..
}
{code}

Suppose a segment is being merged and *maybePause* is called at 7:00, so 
lastNS=7:00. If *maybePause* is then called again at 7:05, the value of 
*targetNS=lastNS + (long) (1000000000 * secondsToPause)* must be smaller than 
*curNS*. No matter how large the byte count is, we return -1 and skip the pause. 

I counted the total number of calls to *maybePause* (callTimes), the number of 
skipped pauses (ignorePauseTimes), and the byte count of each skipped pause 
(detailBytes):
{code:java}
[2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], 
[0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], 
[callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = [0.28899956, 
0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 
0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 
0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 
0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
{code}
*maybePause* was called 857 times, and 25 of those calls skipped the pause. 
The byte counts of the skipped pauses (such as 0.28125 MB) are not small.

As long as the interval between two *maybePause* calls is relatively long, the 
pause that should happen is skipped.

 




[jira] [Updated] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-01 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10448:
-
Description: 
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS)
    throws MergePolicy.MergeAbortedException {

  double rate = mbPerSec;
  double secondsToPause = (bytes / 1024. / 1024.) / rate;
  long targetNS = lastNS + (long) (1000000000 * secondsToPause);
  long curPauseNS = targetNS - curNS;

  // We don't bother with thread pausing if the pause is smaller than 2 msec.
  if (curPauseNS <= MIN_PAUSE_NS) {
    // Set to curNS, not targetNS, to enforce the instant rate, not
    // the "averaged over all history" rate:
    lastNS = curNS;
    return -1;
  }
  ..
}
{code}

Suppose a segment is being merged and *maybePause* is called at 7:00, so 
lastNS=7:00. If *maybePause* is then called again at 7:05, the value of 
*targetNS=lastNS + (long) (1000000000 * secondsToPause)* must be smaller than 
*curNS*. No matter how large the byte count is, we return -1 and skip the pause. 

I counted the total number of calls to *maybePause* (callTimes), the number of 
skipped pauses (ignorePauseTimes), and the byte count of each skipped pause 
(detailBytes):
{code:java}
[2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], 
[0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], 
[callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = [0.28899956, 
0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 
0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 
0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 
0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
{code}
*maybePause* was called 857 times, and 25 of those calls skipped the pause. 
The byte counts of the skipped pauses are not small.

As long as the interval between two *maybePause* calls is relatively long, the 
pause that should happen is skipped.

 




[jira] [Updated] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-01 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10448:
-
Description: 
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS)
    throws MergePolicy.MergeAbortedException {

  double rate = mbPerSec;
  double secondsToPause = (bytes / 1024. / 1024.) / rate;
  long targetNS = lastNS + (long) (1000000000 * secondsToPause);
  long curPauseNS = targetNS - curNS;

  // We don't bother with thread pausing if the pause is smaller than 2 msec.
  if (curPauseNS <= MIN_PAUSE_NS) {
    // Set to curNS, not targetNS, to enforce the instant rate, not
    // the "averaged over all history" rate:
    lastNS = curNS;
    return -1;
  }
  ..
}
{code}

Suppose a segment is being merged and *maybePause* is called at 7:00, so 
lastNS=7:00. If *maybePause* is then called again at 7:05, the value of 
*targetNS=lastNS + (long) (1000000000 * secondsToPause)* must be smaller than 
*curNS*. No matter how large the byte count is, we return -1 and skip the pause. 

I counted the total number of calls (callTimes), the number of skipped pauses 
(ignorePauseTimes), and the byte count of each skipped pause (detailBytes):
{code:java}
[2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], 
[0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], 
[callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = [0.28899956, 
0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 
0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 
0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 
0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
{code}
*maybePause* was called 857 times, and 25 of those calls skipped the pause. 
The byte counts of the skipped pauses are not small.

As long as the interval between two *maybePause* calls is relatively long, the 
pause that should happen is skipped.

 

  was:
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS)
    throws MergePolicy.MergeAbortedException {

  double rate = mbPerSec;
  double secondsToPause = (bytes / 1024. / 1024.) / rate;
  long targetNS = lastNS + (long) (1000000000 * secondsToPause);
  long curPauseNS = targetNS - curNS;

  // We don't bother with thread pausing if the pause is smaller than 2 msec.
  if (curPauseNS <= MIN_PAUSE_NS) {
    // Set to curNS, not targetNS, to enforce the instant rate, not
    // the "averaged over all history" rate:
    lastNS = curNS;
    return -1;
  }
  ..
}
{code}

Suppose a segment is being merged and *maybePause* is called at 7:00, so 
lastNS=7:00. If *maybePause* is then called again at 7:05, the value of 
*targetNS=lastNS + (long) (1000000000 * secondsToPause)* must be smaller than 
*curNS*, so we return -1. 

I counted the total number of calls (callTimes), the number of skipped pauses 
(ignorePauseTimes), and the byte count of each skipped pause (detailBytes):
{code:java}
[2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], 
[0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], 
[callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = [0.28899956, 
0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 
0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 
0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 
0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
{code}
There are 857 times calling *maybePause*, including 25 times which is ignored 
to pause, we can see that the ignored detail bytes is not small.

As long as the interval between two *maybePause* calls is relatively long, the 
pause action that should be executed will not be executed.

 


> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {

[jira] [Updated] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-01 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10448:
-
Description: 
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
  double rate = mbPerSec;
  double secondsToPause = (bytes / 1024. / 1024.) / rate;
  long targetNS = lastNS + (long) (10 * secondsToPause);
  long curPauseNS = targetNS - curNS;

  // We don't bother with thread pausing if the pause is smaller than 2 msec.
  if (curPauseNS <= MIN_PAUSE_NS) {
    // Set to curNS, not targetNS, to enforce the instant rate, not
    // the "averaged over all history" rate:
    lastNS = curNS;
    return -1;
  }
  ..
}
{code}

Suppose a segment is being merged and *maybePause* is called at 7:00, so 
lastNS=7:00. If *maybePause* is then called again at 7:05, the value of 
*targetNS=lastNS + (long) (10 * secondsToPause)* will be smaller than *curNS*, 
so we return -1. 

I counted the total number of calls (callTimes), the number of skipped pauses 
(ignorePauseTimes), and the byte count of each skipped call (detailBytes):
{code:java}
[2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], 
[0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], 
[callTimes=857], [ignorePauseTimes=25],  [detailBytes(mb) = [0.28899956, 
0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 
0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 
0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 
0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
{code}
*maybePause* was called 857 times, and 25 of those calls skipped the pause. 
The byte counts of the skipped calls (about 0.28 MB each) are not small.

As long as the interval between two *maybePause* calls is relatively long, the 
pause that should happen is skipped.

 

  was:
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS) throws 
MergePolicy.MergeAbortedException {
   
double rate = mbPerSec; 
double secondsToPause = (bytes / 1024. / 1024.) / rate;
long targetNS = lastNS + (long) (10 * secondsToPause);
long curPauseNS = targetNS - curNS;

// We don't bother with thread pausing if the pause is smaller than 2 msec.
if (curPauseNS <= MIN_PAUSE_NS) {
  // Set to curNS, not targetNS, to enforce the instant rate, not
  // the "averaged over all history" rate:
  lastNS = curNS;
  return -1;
}
   ..
  }
{code}

If a Segment is been merged, *maybePause* is called in 7:00, lastNS=7:00, then 
the *maybePause* is called in 7:05 again,  so the value of *targetNS=lastNS + 
(long) (10 * secondsToPause)* must be smaller than *curNS*, so we will 
return -1. 

I count the total call times(callTimes) and ignored times(returnQuickTimes) and 
detail ignored bytes(detailBytes):
{code:java}
[2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], 
[0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], 
[callTimes=857], [returnQuickTimes=25],  [detailBytes(mb) = [0.28899956, 
0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 
0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 
0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 
0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
{code}
There are 857 times calling maybePause, including 25 times which is ignored to 
pause, we can see that the ignored detail bytes is not small.

As long as the interval between two *maybePause* calls is relatively long, the 
pause action that should be executed will not be executed.

 


> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the 

[jira] [Updated] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-01 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10448:
-
Description: 
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
  double rate = mbPerSec;
  double secondsToPause = (bytes / 1024. / 1024.) / rate;
  long targetNS = lastNS + (long) (10 * secondsToPause);
  long curPauseNS = targetNS - curNS;

  // We don't bother with thread pausing if the pause is smaller than 2 msec.
  if (curPauseNS <= MIN_PAUSE_NS) {
    // Set to curNS, not targetNS, to enforce the instant rate, not
    // the "averaged over all history" rate:
    lastNS = curNS;
    return -1;
  }
  ..
}
{code}

Suppose a segment is being merged and *maybePause* is called at 7:00, so 
lastNS=7:00. If *maybePause* is then called again at 7:05, the value of 
*targetNS=lastNS + (long) (10 * secondsToPause)* will be smaller than *curNS*, 
so we return -1. 

I counted the total number of calls (callTimes), the number of skipped pauses 
(returnQuickTimes), and the byte count of each skipped call (detailBytes):
{code:java}
[2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], 
[0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], 
[callTimes=857], [returnQuickTimes=25],  [detailBytes(mb) = [0.28899956, 
0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 
0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 
0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 
0.27992058, 0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
{code}
*maybePause* was called 857 times, and 25 of those calls skipped the pause. 
The byte counts of the skipped calls (about 0.28 MB each) are not small.

As long as the interval between two *maybePause* calls is relatively long, the 
pause that should happen is skipped.

 

  was:
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS) throws 
MergePolicy.MergeAbortedException {
   
double rate = mbPerSec; 
double secondsToPause = (bytes / 1024. / 1024.) / rate;
long targetNS = lastNS + (long) (10 * secondsToPause);
long curPauseNS = targetNS - curNS;

// We don't bother with thread pausing if the pause is smaller than 2 msec.
if (curPauseNS <= MIN_PAUSE_NS) {
  // Set to curNS, not targetNS, to enforce the instant rate, not
  // the "averaged over all history" rate:
  lastNS = curNS;
  return -1;
}
   ..
  }
{code}

If a Segment is been merged, *maybePause* is called in 7:00, lastNS=7:00, then 
the *maybePause* is called in 7:05 again,  so the value of *targetNS=lastNS + 
(long) (10 * secondsToPause)* must be smaller than *curNS*, so we will 
return -1. 

I count the times(returnQuickTimes) and detail bytes(detailBytes) which is 
ignored to pause:
{code:java}
[2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], 
[0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], 
[returnQuickTimes=25],  [detailBytes(mb) = [0.28899956, 0.28140354, 0.28015518, 
0.27990818, 0.2801447, 0.27991104, 0.27990723, 0.27990913, 0.2799101, 
0.28010082, 0.2799921, 0.2799673, 0.28144264, 0.27991295, 0.27990818, 
0.27993107, 0.2799387, 0.27998447, 0.28002167, 0.27992058, 0.27998066, 
0.28098202, 0.28125, 0.28125, 0.28125]]
{code}
There are 25 times which is ignored to pause, the detail bytes is not small.

As long as the interval between two *maybePause* calls is relatively long, the 
pause action that should be executed will not be executed.

 


> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> 

[jira] [Updated] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-01 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10448:
-
Description: 
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
  double rate = mbPerSec;
  double secondsToPause = (bytes / 1024. / 1024.) / rate;
  long targetNS = lastNS + (long) (10 * secondsToPause);
  long curPauseNS = targetNS - curNS;

  // We don't bother with thread pausing if the pause is smaller than 2 msec.
  if (curPauseNS <= MIN_PAUSE_NS) {
    // Set to curNS, not targetNS, to enforce the instant rate, not
    // the "averaged over all history" rate:
    lastNS = curNS;
    return -1;
  }
  ..
}
{code}

Suppose a segment is being merged and *maybePause* is called at 7:00, so 
lastNS=7:00. If *maybePause* is then called again at 7:05, the value of 
*targetNS=lastNS + (long) (10 * secondsToPause)* will be smaller than *curNS*, 
so we return -1. 

I counted the number of skipped pauses (returnQuickTimes) and the byte count 
of each skipped call (detailBytes):
{code:java}
[2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], 
[0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], 
[returnQuickTimes=25],  [detailBytes(mb) = [0.28899956, 0.28140354, 0.28015518, 
0.27990818, 0.2801447, 0.27991104, 0.27990723, 0.27990913, 0.2799101, 
0.28010082, 0.2799921, 0.2799673, 0.28144264, 0.27991295, 0.27990818, 
0.27993107, 0.2799387, 0.27998447, 0.28002167, 0.27992058, 0.27998066, 
0.28098202, 0.28125, 0.28125, 0.28125]]
{code}
The pause was skipped 25 times, and the byte counts of those calls (about 
0.28 MB each) are not small.

As long as the interval between two *maybePause* calls is relatively long, the 
pause that should happen is skipped.

 

  was:
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS) throws 
MergePolicy.MergeAbortedException {
   
double rate = mbPerSec; 
double secondsToPause = (bytes / 1024. / 1024.) / rate;
long targetNS = lastNS + (long) (10 * secondsToPause);
long curPauseNS = targetNS - curNS;

// We don't bother with thread pausing if the pause is smaller than 2 msec.
if (curPauseNS <= MIN_PAUSE_NS) {
  // Set to curNS, not targetNS, to enforce the instant rate, not
  // the "averaged over all history" rate:
  lastNS = curNS;
  return -1;
}
   ..
  }
{code}

If a Segment is been merged, *maybePause* is called in 7:00, lastNS=7:00, then 
the *maybePause* is called in 7:05 again,  so the value of *targetNS=lastNS + 
(long) (10 * secondsToPause)* must be smaller than *curNS*, so we will 
return -1. 

I count the count and detail bytes which is ignored to pause:
{code:java}
[2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], 
[0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], 
[returnQuickTimes=25],  [detail bytes(mb) = [0.28899956, 0.28140354, 
0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 0.27990913, 
0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 0.27991295, 
0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 0.27992058, 
0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
{code}
There are 25 times which is ignored to pause, the detail bytes is not small.

As long as the interval between two *maybePause* calls is relatively long, the 
pause action that should be executed will not be executed.

 


> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> If a Segment is been merged, *maybePause* is called in 7:00, lastNS=7:00, 
> then the *maybePause* is 

[jira] [Updated] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-01 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10448:
-
Description: 
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
  double rate = mbPerSec;
  double secondsToPause = (bytes / 1024. / 1024.) / rate;
  long targetNS = lastNS + (long) (10 * secondsToPause);
  long curPauseNS = targetNS - curNS;

  // We don't bother with thread pausing if the pause is smaller than 2 msec.
  if (curPauseNS <= MIN_PAUSE_NS) {
    // Set to curNS, not targetNS, to enforce the instant rate, not
    // the "averaged over all history" rate:
    lastNS = curNS;
    return -1;
  }
  ..
}
{code}

Suppose a segment is being merged and *maybePause* is called at 7:00, so 
lastNS=7:00. If *maybePause* is then called again at 7:05, the value of 
*targetNS=lastNS + (long) (10 * secondsToPause)* will be smaller than *curNS*, 
so we return -1. 

I counted the number of skipped pauses and the byte count of each skipped call:
{code:java}
[2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
[index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 docs], 
[0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec throttle], 
[returnQuickTimes=25],  [detail bytes(mb) = [0.28899956, 0.28140354, 
0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 0.27990913, 
0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 0.27991295, 
0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 0.27992058, 
0.27998066, 0.28098202, 0.28125, 0.28125, 0.28125]]
{code}
The pause was skipped 25 times, and the byte counts of those calls (about 
0.28 MB each) are not small.

As long as the interval between two *maybePause* calls is relatively long, the 
pause that should happen is skipped.

 

  was:
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS) throws 
MergePolicy.MergeAbortedException {
   
double rate = mbPerSec; 
double secondsToPause = (bytes / 1024. / 1024.) / rate;
long targetNS = lastNS + (long) (10 * secondsToPause);
long curPauseNS = targetNS - curNS;

// We don't bother with thread pausing if the pause is smaller than 2 msec.
if (curPauseNS <= MIN_PAUSE_NS) {
  // Set to curNS, not targetNS, to enforce the instant rate, not
  // the "averaged over all history" rate:
  lastNS = curNS;
  return -1;
}
   ..
  }
{code}

If a Segment is been merged, *maybePause* is called in 7:00, lastNS=7:00, then 
the *maybePause* is called in 7:05 again,  so the value of *targetNS=lastNS + 
(long) (10 * secondsToPause)* must be smaller than *curNS*, so we will 
return -1. 

As long as the interval between two *maybePause* calls is relatively long, the 
pause that should be executed will not be executed.

 


> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> If a Segment is been merged, *maybePause* is called in 7:00, lastNS=7:00, 
> then the *maybePause* is called in 7:05 again,  so the value of 
> *targetNS=lastNS + (long) (10 * secondsToPause)* must be smaller than 
> *curNS*, so we will return -1. 
> I count the count and detail bytes which is ignored to pause:
> {code:java}
> [2022-03-02T15:16:51,972][DEBUG][o.e.i.e.I.EngineMergeScheduler] [node1] 
> [index1][21] merge segment [_4h] done: took [26.8s], [123.6 MB], [61,219 
> docs], [0s stopped], [24.4s throttled], [242.5 MB written], [11.2 MB/sec 
> throttle], [returnQuickTimes=25],  [detail bytes(mb) = [0.28899956, 
> 0.28140354, 0.28015518, 0.27990818, 0.2801447, 0.27991104, 0.27990723, 
> 0.27990913, 0.2799101, 0.28010082, 0.2799921, 0.2799673, 0.28144264, 
> 0.27991295, 0.27990818, 0.27993107, 0.2799387, 0.27998447, 0.28002167, 
> 0.27992058, 

[jira] [Updated] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-01 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10448:
-
Description: 
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
  double rate = mbPerSec;
  double secondsToPause = (bytes / 1024. / 1024.) / rate;
  long targetNS = lastNS + (long) (10 * secondsToPause);
  long curPauseNS = targetNS - curNS;

  // We don't bother with thread pausing if the pause is smaller than 2 msec.
  if (curPauseNS <= MIN_PAUSE_NS) {
    // Set to curNS, not targetNS, to enforce the instant rate, not
    // the "averaged over all history" rate:
    lastNS = curNS;
    return -1;
  }
  ..
}
{code}

Suppose a segment is being merged and *maybePause* is called at 7:00, so 
lastNS=7:00. If *maybePause* is then called again at 7:05, the value of 
*targetNS=lastNS + (long) (10 * secondsToPause)* will be smaller than *curNS*, 
so we return -1. 

As long as the interval between two *maybePause* calls is relatively long, the 
pause that should happen is skipped.

 

  was:
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS) throws 
MergePolicy.MergeAbortedException {
   
double rate = mbPerSec; 
double secondsToPause = (bytes / 1024. / 1024.) / rate;
long targetNS = lastNS + (long) (10 * secondsToPause);
long curPauseNS = targetNS - curNS;

// We don't bother with thread pausing if the pause is smaller than 2 msec.
if (curPauseNS <= MIN_PAUSE_NS) {
  // Set to curNS, not targetNS, to enforce the instant rate, not
  // the "averaged over all history" rate:
  lastNS = curNS;
  return -1;
}
   ..
  }
{code}

If a Segment is been merged, *maybePause* is called in 7:00, lastNS=7:00, then 
the *maybePause* is called in 7:05,  so the value of *targetNS=lastNS + (long) 
(10 * secondsToPause)* must smaller than *curNS*, so we will return -1. 
As long as the interval between two *maybePause* calls is relatively long, the 
pause that should be executed will not be executed.

 


> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> If a Segment is been merged, *maybePause* is called in 7:00, lastNS=7:00, 
> then the *maybePause* is called in 7:05 again,  so the value of 
> *targetNS=lastNS + (long) (10 * secondsToPause)* must be smaller than 
> *curNS*, so we will return -1. 
> As long as the interval between two *maybePause* calls is relatively long, 
> the pause that should be executed will not be executed.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-01 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10448:
-
Description: 
We can see the code in *MergeRateLimiter*:
{code:java}
private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
  double rate = mbPerSec;
  double secondsToPause = (bytes / 1024. / 1024.) / rate;
  long targetNS = lastNS + (long) (10 * secondsToPause);
  long curPauseNS = targetNS - curNS;

  // We don't bother with thread pausing if the pause is smaller than 2 msec.
  if (curPauseNS <= MIN_PAUSE_NS) {
    // Set to curNS, not targetNS, to enforce the instant rate, not
    // the "averaged over all history" rate:
    lastNS = curNS;
    return -1;
  }
  ..
}
{code}

Suppose a segment is being merged and *maybePause* is called at 7:00, so 
lastNS=7:00. If *maybePause* is then called at 7:05, the value of 
*targetNS=lastNS + (long) (10 * secondsToPause)* will be smaller than *curNS*, 
so we return -1. 
As long as the interval between two *maybePause* calls is relatively long, the 
pause that should happen is skipped.

 

  was:
We can see the code:
{code:java}
private long maybePause(long bytes, long curNS) throws 
MergePolicy.MergeAbortedException {
   
double rate = mbPerSec; 
double secondsToPause = (bytes / 1024. / 1024.) / rate;
long targetNS = lastNS + (long) (10 * secondsToPause);
long curPauseNS = targetNS - curNS;

// We don't bother with thread pausing if the pause is smaller than 2 msec.
if (curPauseNS <= MIN_PAUSE_NS) {
  // Set to curNS, not targetNS, to enforce the instant rate, not
  // the "averaged over all history" rate:
  lastNS = curNS;
  return -1;
}
   ..
  }
{code}

If a Segment is been merged, `maybePause` is called in `7:00`, lastNS=7:00, 
then the `maybePause` is called in `7:05`,  so the value of `targetNS=lastNS + 
(long) (10 * secondsToPause)` must smaller than `curNS`, so we will 
return -1. As long as the interval between two `maybePause` calls is relatively 
long, the pause that should be executed will not be executed.

 


> MergeRateLimiter doesn't always limit instant rate.
> ---
>
> Key: LUCENE-10448
> URL: https://issues.apache.org/jira/browse/LUCENE-10448
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.11.1
>Reporter: kkewwei
>Priority: Major
>
> We can see the code in *MergeRateLimiter*:
> {code:java}
> private long maybePause(long bytes, long curNS) throws 
> MergePolicy.MergeAbortedException {
>
> double rate = mbPerSec; 
> double secondsToPause = (bytes / 1024. / 1024.) / rate;
> long targetNS = lastNS + (long) (10 * secondsToPause);
> long curPauseNS = targetNS - curNS;
> // We don't bother with thread pausing if the pause is smaller than 2 
> msec.
> if (curPauseNS <= MIN_PAUSE_NS) {
>   // Set to curNS, not targetNS, to enforce the instant rate, not
>   // the "averaged over all history" rate:
>   lastNS = curNS;
>   return -1;
> }
>..
>   }
> {code}
> If a Segment is been merged, *maybePause* is called in 7:00, lastNS=7:00, 
> then the *maybePause* is called in 7:05,  so the value of *targetNS=lastNS + 
> (long) (10 * secondsToPause)* must smaller than *curNS*, so we will 
> return -1. As long as the interval between two *maybePause* calls is 
> relatively long, the pause that should be executed will not be executed.
>  






[jira] [Created] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-03-01 Thread kkewwei (Jira)
kkewwei created LUCENE-10448:


 Summary: MergeRateLimiter doesn't always limit instant rate.
 Key: LUCENE-10448
 URL: https://issues.apache.org/jira/browse/LUCENE-10448
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 8.11.1
Reporter: kkewwei


We can see the code:
{code:java}
private long maybePause(long bytes, long curNS) throws MergePolicy.MergeAbortedException {
  double rate = mbPerSec;
  double secondsToPause = (bytes / 1024. / 1024.) / rate;
  long targetNS = lastNS + (long) (10 * secondsToPause);
  long curPauseNS = targetNS - curNS;

  // We don't bother with thread pausing if the pause is smaller than 2 msec.
  if (curPauseNS <= MIN_PAUSE_NS) {
    // Set to curNS, not targetNS, to enforce the instant rate, not
    // the "averaged over all history" rate:
    lastNS = curNS;
    return -1;
  }
  ..
}
{code}

Suppose a segment is being merged and `maybePause` is called at `7:00`, so 
lastNS=7:00. If `maybePause` is then called at `7:05`, the value of 
`targetNS=lastNS + (long) (10 * secondsToPause)` will be smaller than `curNS`, 
so we return -1. As long as the interval between two `maybePause` calls is 
relatively long, the pause that should happen is skipped.

 






[jira] [Resolved] (LUCENE-10433) we should pass l instead of d to getFallbackSelector(d).select in RadixSelector.select()

2022-02-22 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei resolved LUCENE-10433.
--
Resolution: Resolved

> we should pass l instead of d to getFallbackSelector(d).select in 
> RadixSelector.select()
> 
>
> Key: LUCENE-10433
> URL: https://issues.apache.org/jira/browse/LUCENE-10433
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.6.2
>Reporter: kkewwei
>Priority: Major
>
> In the `RadixSelector.select`
> {code:java}
>   private void select(int from, int to, int k, int d, int l) {
> if (to - from <= LENGTH_THRESHOLD || d >= LEVEL_THRESHOLD) { 
>   getFallbackSelector(d).select(from, to, k); 
> } else {
>   radixSelect(from, to, k, d, l); 
> }
>   }
> {code}
> We know that `l`, not `d`, represents the level of recursion, but the 
> recursion-level check uses `d >= LEVEL_THRESHOLD`, which is not right.






[jira] [Updated] (LUCENE-10433) we should pass l instead of d to getFallbackSelector(d).select in RadixSelector.select()

2022-02-22 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10433:
-
Component/s: core/other







[jira] [Created] (LUCENE-10433) we should pass l instead of d to getFallbackSelector(d).select in RadixSelector.select()

2022-02-22 Thread kkewwei (Jira)
kkewwei created LUCENE-10433:


 Summary: we should pass l instead of d to 
getFallbackSelector(d).select in RadixSelector.select()
 Key: LUCENE-10433
 URL: https://issues.apache.org/jira/browse/LUCENE-10433
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 8.6.2
Reporter: kkewwei


In `RadixSelector.select`:
{code:java}
  private void select(int from, int to, int k, int d, int l) {
    if (to - from <= LENGTH_THRESHOLD || d >= LEVEL_THRESHOLD) {
      getFallbackSelector(d).select(from, to, k);
    } else {
      radixSelect(from, to, k, d, l);
    }
  }
{code}
`l` represents the level of recursion, not `d`, but the recursion-level check 
uses `d >= LEVEL_THRESHOLD`, which is not right.
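
A minimal sketch of the guard this report argues for, assuming the level check 
should read `l`. `SelectGuardSketch`, `useFallback`, and both threshold values 
are hypothetical stand-ins for illustration; they are not Lucene's code or 
constants, and the meaning given for `d` is an assumption.

```java
// Hypothetical stand-in for the guard in RadixSelector.select(); not Lucene code.
final class SelectGuardSketch {
  // Illustrative values only; the real constants are defined in RadixSelector.
  static final int LENGTH_THRESHOLD = 100;
  static final int LEVEL_THRESHOLD = 8;

  /**
   * Returns true when the fallback selector should take over.
   *
   * @param d byte position currently being compared (assumed meaning)
   * @param l level of recursion, as stated in the report
   */
  static boolean useFallback(int from, int to, int d, int l) {
    // Proposed guard: compare the recursion level l, not the byte position d.
    return to - from <= LENGTH_THRESHOLD || l >= LEVEL_THRESHOLD;
  }
}
```

With this guard, a large range at a deep byte position but a shallow recursion 
level keeps using radix selection, while a range that has reached the level 
threshold falls back regardless of `d`.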






[jira] [Comment Edited] (LUCENE-10265) IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge

2021-11-28 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449967#comment-17449967
 ] 

kkewwei edited comment on LUCENE-10265 at 11/28/21, 8:09 AM:
-

Yes, you are right, I read it wrong.

However, 10GB/s is not a universal ceiling. Slow disks (HDDs) are still very 
common and only support writes of a few hundred MB/s; if the merge rate is too 
high, it will affect indexing and search. Although the merge rate is adjusted 
dynamically, it can in fact exceed what a slow disk (HDD) can sustain.

Should we lower the value, or make it configurable from outside?
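
As a rough illustration of the second option, the sketch below shows a 
throttle whose ceiling can be set by the caller. `ConfigurableThrottleSketch` 
and all of its methods are hypothetical names invented for this example; 
nothing here is an existing Lucene API. The 20% increase and 10% decrease 
mirror the factors shown in the issue description.

```java
// Hypothetical sketch: a throttle whose ceiling can be changed by the caller.
final class ConfigurableThrottleSketch {
  private final double minMBPerSec;
  private double maxMBPerSec;     // ceiling, adjustable at runtime
  private double targetMBPerSec;  // current throttle rate

  ConfigurableThrottleSketch(double minMBPerSec, double maxMBPerSec, double startMBPerSec) {
    this.minMBPerSec = minMBPerSec;
    this.maxMBPerSec = maxMBPerSec;
    this.targetMBPerSec = Math.max(minMBPerSec, Math.min(maxMBPerSec, startMBPerSec));
  }

  /** Lets an operator lower (or raise) the ceiling, e.g. for slow disks. */
  synchronized void setMaxMBPerSec(double newMax) {
    if (newMax < minMBPerSec) {
      throw new IllegalArgumentException("ceiling must not be below the floor");
    }
    maxMBPerSec = newMax;
    // Pull the current rate down immediately if it is above the new ceiling.
    targetMBPerSec = Math.min(targetMBPerSec, newMax);
  }

  /** Raises the rate by 20% on new backlog, lowers it by 10% otherwise. */
  synchronized double onBacklogChange(boolean newBacklog) {
    double next = newBacklog ? targetMBPerSec * 1.20 : targetMBPerSec / 1.10;
    targetMBPerSec = Math.max(minMBPerSec, Math.min(maxMBPerSec, next));
    return targetMBPerSec;
  }

  synchronized double current() {
    return targetMBPerSec;
  }
}
```

A deployment on HDDs could then call `setMaxMBPerSec` with a value in the low 
hundreds, so that merges never outrun what the disk can sustain.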



> IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge
> -
>
> Key: LUCENE-10265
> URL: https://issues.apache.org/jira/browse/LUCENE-10265
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.6.2
>Reporter: kkewwei
>Priority: Major
>
> The merge IO write throttle rate is controlled by `targetMBPerSec` in 
> ConcurrentMergeScheduler, so it should not exceed the ceiling (1024MB/s).
> `targetMBPerSec` is shared by many merge threads and is updated as follows:
> {code:java}
> if (newBacklog) {
>   // This new merge adds to the backlog: increase IO throttle by 20%
>   targetMBPerSec *= 1.20; 
>   if (targetMBPerSec > MAX_MERGE_MB_PER_SEC) {
>     targetMBPerSec = MAX_MERGE_MB_PER_SEC;
>   }
>   ..
> } else {
>   // We are not falling behind: decrease IO throttle by 10%
>   targetMBPerSec /= 1.10;
>   if (targetMBPerSec < MIN_MERGE_MB_PER_SEC) {
>     targetMBPerSec = MIN_MERGE_MB_PER_SEC;
>   }
>  ..
> }
> {code}
> The update is not an atomic operation:
> # The first merge thread changes `targetMBPerSec` from 1024 to 1024*1.2.
> # Another merge thread reads the new value (1024*1.2).
> # The first merge thread then clamps the value back to 1024.
> In that case the second thread sees a rate above the ceiling.
> In production, we do see the IO write throttle rate exceed the ceiling 
> (1024MB/s) during merges:
> {code:java}
> [2021-11-26T15:27:19,861][TRACE][o.e.i.e.E.MS ] [data1] 
> [test1][25] elasticsearch[data1][refresh][T#5] MS: io throttle: current merge 
> backlog; leave IO rate at 3589.1 MB/sec
> [2021-11-26T15:27:20,304][TRACE][o.e.i.e.E.MS ] [data1] 
> [test1][13] elasticsearch[data1][write][T#3] MS: io throttle: current merge 
> backlog; leave IO rate at 192.4 MB/sec
> [2021-11-26T15:27:25,330][TRACE][o.e.i.e.E.MS ] [data1] 
> [test1][22] elasticsearch[data1][[test1][22]: Lucene Merge Thread #1026] MS: 
> io throttle: current merge backlog; leave IO rate at 96.3 MB/sec
> [2021-11-26T15:27:25,995][TRACE][o.e.i.e.E.MS ] [data1] 
> [test1][16] elasticsearch[data1][[test1][16]: Lucene Merge Thread #1063] MS: 
> io throttle: current merge backlog; leave IO rate at 419.2 MB/sec
> [2021-11-26T15:27:38,335][TRACE][o.e.i.e.E.MS ] [data1] 
> [test1][19] elasticsearch[data1][write][T#2] MS: io throttle: current merge 
> backlog; leave IO rate at 3051.5 MB/sec
> {code}
> Should we do the following?
> 1. Update it with an atomic operation.
> 2. Add the `volatile` modifier to `targetMBPerSec`.
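
The three steps in the quoted description can be replayed deterministically. 
The sketch below runs them in a single thread, in the order the report 
describes, to show which value the second reader would see. It assumes the 
update runs without any synchronization and that the ceiling is 1024, as the 
report states; `ThrottleRaceReplay` is a hypothetical name, not Lucene code.

```java
// Single-threaded replay of the interleaving described in the report.
final class ThrottleRaceReplay {
  static final double MAX_MERGE_MB_PER_SEC = 1024.0; // ceiling assumed by the report

  /** Returns the value a second thread would read between the multiply and the clamp. */
  static double observedBySecondThread() {
    double targetMBPerSec = 1024.0;

    // Step 1: the first thread scales the shared value up by 20%.
    targetMBPerSec *= 1.20;

    // Step 2: the second thread reads the shared value before it is clamped.
    double observed = targetMBPerSec;

    // Step 3: the first thread clamps the shared value back to the ceiling.
    if (targetMBPerSec > MAX_MERGE_MB_PER_SEC) {
      targetMBPerSec = MAX_MERGE_MB_PER_SEC;
    }
    return observed;
  }
}
```

In this replay the second reader sees roughly 1228.8, which is above the 
assumed ceiling even though the shared value ends up clamped.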






[jira] [Comment Edited] (LUCENE-10265) IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge

2021-11-28 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449967#comment-17449967
 ] 

kkewwei edited comment on LUCENE-10265 at 11/28/21, 8:00 AM:
-

Yes, you are right, I read it wrong.

Should we lower the value, or make it configurable from outside? IO is shared 
by indexing, search and merging. After all, slow disks (HDDs) are very common 
and only support writes of a few hundred MB/s; if the merge rate is too high, 
it will affect indexing and search.









[jira] [Commented] (LUCENE-10265) IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge

2021-11-28 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449967#comment-17449967
 ] 

kkewwei commented on LUCENE-10265:
--

Yes, you are right, I read it wrong.

Should we lower the value, or make it configurable from outside? IO is shared 
by indexing, search and merging. After all, slow disks (HDDs) are very common 
and only support writes of a few hundred MB/s; if the merge rate is too high, 
it will affect indexing and search.







[jira] [Updated] (LUCENE-10265) IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge

2021-11-26 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10265:
-
Description: 
The merge IO write throttle rate is controlled by `targetMBPerSec` in 
ConcurrentMergeScheduler, so it should not exceed the ceiling (1024MB/s).

`targetMBPerSec` is shared by many merge threads and is updated as follows:

{code:java}
if (newBacklog) {
  // This new merge adds to the backlog: increase IO throttle by 20%
  targetMBPerSec *= 1.20; 
  if (targetMBPerSec > MAX_MERGE_MB_PER_SEC) {
    targetMBPerSec = MAX_MERGE_MB_PER_SEC;
  }
  ..
} else {
  // We are not falling behind: decrease IO throttle by 10%
  targetMBPerSec /= 1.10;
  if (targetMBPerSec < MIN_MERGE_MB_PER_SEC) {
    targetMBPerSec = MIN_MERGE_MB_PER_SEC;
  }
 ..
}
{code}


The update is not an atomic operation:
# The first merge thread changes `targetMBPerSec` from 1024 to 1024*1.2.
# Another merge thread reads the new value (1024*1.2).
# The first merge thread then clamps the value back to 1024.

In that case the second thread sees a rate above the ceiling.

In production, we do see the IO write throttle rate exceed the ceiling 
(1024MB/s) during merges:

{code:java}
[2021-11-26T15:27:19,861][TRACE][o.e.i.e.E.MS ] [data1] [test1][25] 
elasticsearch[data1][refresh][T#5] MS: io throttle: current merge backlog; 
leave IO rate at 3589.1 MB/sec
[2021-11-26T15:27:20,304][TRACE][o.e.i.e.E.MS ] [data1] [test1][13] 
elasticsearch[data1][write][T#3] MS: io throttle: current merge backlog; leave 
IO rate at 192.4 MB/sec
[2021-11-26T15:27:25,330][TRACE][o.e.i.e.E.MS ] [data1] [test1][22] 
elasticsearch[data1][[test1][22]: Lucene Merge Thread #1026] MS: io throttle: 
current merge backlog; leave IO rate at 96.3 MB/sec
[2021-11-26T15:27:25,995][TRACE][o.e.i.e.E.MS ] [data1] [test1][16] 
elasticsearch[data1][[test1][16]: Lucene Merge Thread #1063] MS: io throttle: 
current merge backlog; leave IO rate at 419.2 MB/sec
[2021-11-26T15:27:38,335][TRACE][o.e.i.e.E.MS ] [data1] [test1][19] 
elasticsearch[data1][write][T#2] MS: io throttle: current merge backlog; leave 
IO rate at 3051.5 MB/sec
{code}


Should we do the following?
1. Update it with an atomic operation.
2. Add the `volatile` modifier to `targetMBPerSec`.
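
A minimal sketch of the first proposal, assuming the goal is that no reader 
can ever see an unclamped value. It stores the rate as raw `long` bits in an 
`AtomicLong` so that scaling and clamping happen in one atomic update. 
`AtomicThrottleSketch`, its methods, and the `MIN_MERGE_MB_PER_SEC` value are 
hypothetical; the 1024 ceiling is the one assumed in this report. This is not 
how ConcurrentMergeScheduler is written, only one way the idea could look.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: only clamped values are ever published to other threads.
final class AtomicThrottleSketch {
  static final double MIN_MERGE_MB_PER_SEC = 5.0;    // illustrative value
  static final double MAX_MERGE_MB_PER_SEC = 1024.0; // ceiling assumed by the report

  // The double is kept as raw long bits so AtomicLong can update it atomically.
  private final AtomicLong targetBits;

  AtomicThrottleSketch(double startMBPerSec) {
    targetBits = new AtomicLong(Double.doubleToLongBits(startMBPerSec));
  }

  /** Scales and clamps in a single atomic step; returns the new rate. */
  double update(boolean newBacklog) {
    long bits = targetBits.updateAndGet(old -> {
      double cur = Double.longBitsToDouble(old);
      // Increase by 20% on new backlog, decrease by 10% otherwise.
      double next = newBacklog ? cur * 1.20 : cur / 1.10;
      // Clamp before the value becomes visible to any other thread.
      next = Math.max(MIN_MERGE_MB_PER_SEC, Math.min(MAX_MERGE_MB_PER_SEC, next));
      return Double.doubleToLongBits(next);
    });
    return Double.longBitsToDouble(bits);
  }

  double get() {
    return Double.longBitsToDouble(targetBits.get());
  }
}
```

Computing the new rate in a local variable and assigning the field once, or 
guarding the whole update with a lock, would give the same guarantee; marking 
the field `volatile` alone would not make the multiply-then-clamp sequence 
atomic.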





> IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge
> -
>
> Key: LUCENE-10265
> URL: https://issues.apache.org/jira/browse/LUCENE-10265
> Project: Lucene - Core
>  Issue Type: Bug
>  

[jira] [Updated] (LUCENE-10265) IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge

2021-11-26 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10265:
-
Description: 
The merge IO write throttle rate is controlled by `targetMBPerSec` in 
ConcurrentMergeScheduler, so it should not exceed the ceiling (1024MB/s).

`targetMBPerSec` is shared by many merge threads and is updated as follows:

{code:java}
if (newBacklog) {
  // This new merge adds to the backlog: increase IO throttle by 20%
  targetMBPerSec *= 1.20; 
  if (targetMBPerSec > MAX_MERGE_MB_PER_SEC) {
    targetMBPerSec = MAX_MERGE_MB_PER_SEC;
  }
  ..
} else {
  // We are not falling behind: decrease IO throttle by 10%
  targetMBPerSec /= 1.10;
  if (targetMBPerSec < MIN_MERGE_MB_PER_SEC) {
    targetMBPerSec = MIN_MERGE_MB_PER_SEC;
  }
 ..
}
{code}


The update is not an atomic operation:
# The first merge thread changes `targetMBPerSec` from 1024 to 1024*1.2.
# Another merge thread reads the new value (1024*1.2).
# The first merge thread then clamps the value back to 1024.

In that case the second thread sees a rate above the ceiling.

In production, we do see the IO write throttle rate exceed the ceiling 
(1024MB/s) during merges:

{code:java}
[2021-11-26T15:27:19,861][TRACE][o.e.i.e.E.MS ] [data1] [test1][25] 
elasticsearch[data1][refresh][T#5] MS: io throttle: current merge backlog; 
leave IO rate at 3589.1 MB/sec
[2021-11-26T15:27:20,304][TRACE][o.e.i.e.E.MS ] [data1] [test1][13] 
elasticsearch[data1][write][T#3] MS: io throttle: current merge backlog; leave 
IO rate at 192.4 MB/sec
[2021-11-26T15:27:25,330][TRACE][o.e.i.e.E.MS ] [data1] [test1][22] 
elasticsearch[data1][[test1][22]: Lucene Merge Thread #1026] MS: io throttle: 
current merge backlog; leave IO rate at 96.3 MB/sec
[2021-11-26T15:27:25,995][TRACE][o.e.i.e.E.MS ] [data1] [test1][16] 
elasticsearch[data1][[test1][16]: Lucene Merge Thread #1063] MS: io throttle: 
current merge backlog; leave IO rate at 419.2 MB/sec
[2021-11-26T15:27:38,335][TRACE][o.e.i.e.E.MS ] [data1] [test1][19] 
elasticsearch[data1][write][T#2] MS: io throttle: current merge backlog; leave 
IO rate at 3051.5 MB/sec
{code}


Should we do the following?
1. Update it with an atomic operation.
2. Add the `volatile` modifier to `targetMBPerSec`.





> IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge
> -
>
> Key: LUCENE-10265
> URL: https://issues.apache.org/jira/browse/LUCENE-10265
> Project: Lucene - Core
>  Issue Type: Bug
>  

[jira] [Updated] (LUCENE-10265) IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge

2021-11-26 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10265:
-
Description: 
The merge IO write throttle rate is controlled by `targetMBPerSec` in 
ConcurrentMergeScheduler, so it should not exceed the ceiling (1024MB/s).

`targetMBPerSec` is shared by many merge threads and is updated as follows:

{code:java}
if (newBacklog) {
  // This new merge adds to the backlog: increase IO throttle by 20%
  targetMBPerSec *= 1.20; 
  if (targetMBPerSec > MAX_MERGE_MB_PER_SEC) {
    targetMBPerSec = MAX_MERGE_MB_PER_SEC;
  }
  ..
} else {
  // We are not falling behind: decrease IO throttle by 10%
  targetMBPerSec /= 1.10;
  if (targetMBPerSec < MIN_MERGE_MB_PER_SEC) {
    targetMBPerSec = MIN_MERGE_MB_PER_SEC;
  }
 ..
}
{code}


The update is not an atomic operation:
# The first merge thread changes `targetMBPerSec` from 1024 to 1024*1.2.
# Another merge thread reads the new value (1024*1.2).
# The first merge thread then clamps the value back to 1024.

In that case the second thread sees a rate above the ceiling.

In production, we do see the IO write throttle rate exceed the ceiling 
(1024MB/s) during merges:

{code:java}
[2021-11-26T15:27:19,861][TRACE][o.e.i.e.E.MS ] [data1] [test1][25] 
elasticsearch[data1][refresh][T#5] MS: io throttle: current merge backlog; 
leave IO rate at 3589.1 MB/sec
[2021-11-26T15:27:20,304][TRACE][o.e.i.e.E.MS ] [data1] [test1][13] 
elasticsearch[data1][write][T#3] MS: io throttle: current merge backlog; leave 
IO rate at 192.4 MB/sec
[2021-11-26T15:27:25,330][TRACE][o.e.i.e.E.MS ] [data1] [test1][22] 
elasticsearch[data1][[test1][22]: Lucene Merge Thread #1026] MS: io throttle: 
current merge backlog; leave IO rate at 96.3 MB/sec
[2021-11-26T15:27:25,995][TRACE][o.e.i.e.E.MS ] [data1] [test1][16] 
elasticsearch[data1][[test1][16]: Lucene Merge Thread #1063] MS: io throttle: 
current merge backlog; leave IO rate at 419.2 MB/sec
[2021-11-26T15:27:38,335][TRACE][o.e.i.e.E.MS ] [data1] [test1][19] 
elasticsearch[data1][write][T#2] MS: io throttle: current merge backlog; leave 
IO rate at 3051.5 MB/sec
{code}


Should we do the following?
1. Update it with an atomic operation.
2. Add the `volatile` modifier to `targetMBPerSec`.





> IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge
> -
>
> Key: LUCENE-10265
> URL: https://issues.apache.org/jira/browse/LUCENE-10265
> Project: Lucene - Core
>  Issue Type: Bug
>  

[jira] [Updated] (LUCENE-10265) IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge

2021-11-26 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10265:
-
Description: 
The merge IO write throttle rate is controlled by `targetMBPerSec` in 
ConcurrentMergeScheduler, so it should not exceed the ceiling (1024MB/s).

`targetMBPerSec` is shared by many merge threads and is updated as follows:

{code:java}
if (newBacklog) {
  // This new merge adds to the backlog: increase IO throttle by 20%
  targetMBPerSec *= 1.20; 
  if (targetMBPerSec > MAX_MERGE_MB_PER_SEC) {
    targetMBPerSec = MAX_MERGE_MB_PER_SEC;
  }
  ..
} else {
  // We are not falling behind: decrease IO throttle by 10%
  targetMBPerSec /= 1.10;
  if (targetMBPerSec < MIN_MERGE_MB_PER_SEC) {
    targetMBPerSec = MIN_MERGE_MB_PER_SEC;
  }
 ..
}
{code}


The update is not an atomic operation:
# The first merge thread changes `targetMBPerSec` from 1024 to 1024*1.2.
# Another merge thread reads the new value (1024*1.2).
# The first merge thread then clamps the value back to 1024.

In that case the second thread sees a rate above the ceiling.

In production, we do see the IO write throttle rate exceed the ceiling 
(1024MB/s) during merges:

{code:java}
[2021-11-26T15:27:19,861][TRACE][o.e.i.e.E.MS ] [data1] [test1][25] 
elasticsearch[data1][refresh][T#5] MS: io throttle: current merge backlog; 
leave IO rate at 3589.1 MB/sec
[2021-11-26T15:27:20,304][TRACE][o.e.i.e.E.MS ] [data1] [test1][13] 
elasticsearch[data1][write][T#3] MS: io throttle: current merge backlog; leave 
IO rate at 192.4 MB/sec
[2021-11-26T15:27:25,330][TRACE][o.e.i.e.E.MS ] [data1] [test1][22] 
elasticsearch[data1][[test1][22]: Lucene Merge Thread #1026] MS: io throttle: 
current merge backlog; leave IO rate at 96.3 MB/sec
[2021-11-26T15:27:25,995][TRACE][o.e.i.e.E.MS ] [data1] [test1][16] 
elasticsearch[data1][[test1][16]: Lucene Merge Thread #1063] MS: io throttle: 
current merge backlog; leave IO rate at 419.2 MB/sec
[2021-11-26T15:27:38,335][TRACE][o.e.i.e.E.MS ] [data1] [test1][19] 
elasticsearch[data1][write][T#2] MS: io throttle: current merge backlog; leave 
IO rate at 3051.5 MB/sec
{code}


Should we do the following?
1. Update it with an atomic operation.
2. Add the `volatile` modifier to `targetMBPerSec`.





> IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge
> -
>
> Key: LUCENE-10265
> URL: https://issues.apache.org/jira/browse/LUCENE-10265
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects 

[jira] [Updated] (LUCENE-10265) IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge

2021-11-26 Thread kkewwei (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kkewwei updated LUCENE-10265:
-
Description: 
The merge IO write throttle rate is controlled by `targetMBPerSec` in 
ConcurrentMergeScheduler, so it should not exceed the ceiling (1024MB/s).

`targetMBPerSec` is shared by many merge threads and is updated as follows:
```
if (newBacklog) {
  // This new merge adds to the backlog: increase IO throttle by 20%
  targetMBPerSec *= 1.20; 
  if (targetMBPerSec > MAX_MERGE_MB_PER_SEC) {
    targetMBPerSec = MAX_MERGE_MB_PER_SEC;
  }
  ..
} else {
  // We are not falling behind: decrease IO throttle by 10%
  targetMBPerSec /= 1.10;
  if (targetMBPerSec < MIN_MERGE_MB_PER_SEC) {
    targetMBPerSec = MIN_MERGE_MB_PER_SEC;
  }
 ..
}
```
The modification is not an atomic operation:
# The first merge thread changes `targetMBPerSec` from 1024 to 1024*1.2.
# Another merge thread reads the new value (1024*1.2).
# The first merge thread then limits the value back to 1024.
In that window the second thread has already seen a value above the ceiling.
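
The three steps above can be replayed in a single thread to make the window visible. This is only an illustrative sketch: the class `ThrottleRaceReplay` and the method `replay()` are hypothetical names, the ceiling of 1024 is the value quoted in this issue, and a real race depends on thread scheduling rather than on a fixed order.

```java
final class ThrottleRaceReplay {
  // Ceiling as quoted in this issue; an assumption of the sketch.
  static final double MAX_MERGE_MB_PER_SEC = 1024.0;

  // Plain shared field, updated in two separate steps as in the snippet above.
  static double targetMBPerSec = 1024.0;

  /** Replays steps 1-3 in order and returns the value the second thread would read. */
  static double replay() {
    targetMBPerSec = 1024.0;

    // Step 1: the first merge thread multiplies, but has not clamped yet.
    targetMBPerSec *= 1.20;

    // Step 2: another merge thread reads the field in between.
    double seenByOtherThread = targetMBPerSec;

    // Step 3: the first merge thread clamps the value back to the ceiling.
    if (targetMBPerSec > MAX_MERGE_MB_PER_SEC) {
      targetMBPerSec = MAX_MERGE_MB_PER_SEC;
    }
    return seenByOtherThread;
  }
}
```

After `replay()` the field is back at the ceiling, but the returned value is the unclamped one.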

In production, we do see the IO write throttle rate exceed the 
ceiling (1024 MB/s) during merges.
```
[2021-11-26T15:27:19,861][TRACE][o.e.i.e.E.MS ] [data1] [test1][25] 
elasticsearch[data1][refresh][T#5] MS: io throttle: current merge backlog; 
leave IO rate at 3589.1 MB/sec
[2021-11-26T15:27:20,304][TRACE][o.e.i.e.E.MS ] [data1] [test1][13] 
elasticsearch[data1][write][T#3] MS: io throttle: current merge backlog; leave 
IO rate at 192.4 MB/sec
[2021-11-26T15:27:25,330][TRACE][o.e.i.e.E.MS ] [data1] [test1][22] 
elasticsearch[data1][[test1][22]: Lucene Merge Thread #1026] MS: io throttle: 
current merge backlog; leave IO rate at 96.3 MB/sec
[2021-11-26T15:27:25,995][TRACE][o.e.i.e.E.MS ] [data1] [test1][16] 
elasticsearch[data1][[test1][16]: Lucene Merge Thread #1063] MS: io throttle: 
current merge backlog; leave IO rate at 419.2 MB/sec
[2021-11-26T15:27:38,335][TRACE][o.e.i.e.E.MS ] [data1] [test1][19] 
elasticsearch[data1][write][T#2] MS: io throttle: current merge backlog; leave 
IO rate at 3051.5 MB/sec
```

We should do the following:
1. Make the update an atomic operation.
2. Add the `volatile` modifier to `targetMBPerSec`.
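
One possible shape for the atomic update, as a minimal sketch rather than a patch against ConcurrentMergeScheduler: the scaling and the clamp are applied together inside a single `AtomicLong.updateAndGet` call, so other threads can only read a value that is already clamped. The class `AtomicThrottleRate` and its methods are hypothetical, the ceiling of 1024 is the value quoted in this issue, and the floor of 5 is just a placeholder.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical holder for the merge IO rate; not Lucene's actual class. */
final class AtomicThrottleRate {
  static final double MIN_MERGE_MB_PER_SEC = 5.0;    // placeholder floor (assumption)
  static final double MAX_MERGE_MB_PER_SEC = 1024.0; // ceiling as quoted in this issue

  // The double is stored as its raw long bits so AtomicLong can update it atomically.
  private final AtomicLong bits;

  AtomicThrottleRate(double initialMBPerSec) {
    bits = new AtomicLong(Double.doubleToLongBits(clamp(initialMBPerSec)));
  }

  private static double clamp(double v) {
    return Math.max(MIN_MERGE_MB_PER_SEC, Math.min(MAX_MERGE_MB_PER_SEC, v));
  }

  /** Applies step and clamp as one atomic read-modify-write; returns the new rate. */
  private double update(DoubleUnaryOperator step) {
    long next = bits.updateAndGet(
        b -> Double.doubleToLongBits(clamp(step.applyAsDouble(Double.longBitsToDouble(b)))));
    return Double.longBitsToDouble(next);
  }

  /** A new merge adds to the backlog: raise the rate by 20%, never above the ceiling. */
  double onBacklog() {
    return update(v -> v * 1.20);
  }

  /** Merges are keeping up: lower the rate by 10%, never below the floor. */
  double onCaughtUp() {
    return update(v -> v / 1.10);
  }

  double get() {
    return Double.longBitsToDouble(bits.get());
  }
}
```

Because `AtomicLong` reads and writes have volatile semantics, this variant also covers the visibility concern in point 2. A `synchronized` block around the existing multiply-and-clamp code would be a simpler alternative with the same effect.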





> IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge
> -
>
> Key: LUCENE-10265
> URL: https://issues.apache.org/jira/browse/LUCENE-10265
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/other
>Affects Versions: 8.6.2
>

[jira] [Created] (LUCENE-10265) IO write throttle rate will beyond the Ceiling(1024MB/s) in the merge

2021-11-26 Thread kkewwei (Jira)
kkewwei created LUCENE-10265:


 Summary: IO write throttle rate will beyond the Ceiling(1024MB/s) 
in the merge
 Key: LUCENE-10265
 URL: https://issues.apache.org/jira/browse/LUCENE-10265
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Affects Versions: 8.6.2
Reporter: kkewwei


The merge IO write throttle rate is controlled by `targetMBPerSec` in 
ConcurrentMergeScheduler, and it can exceed the ceiling (1024 MB/s).

`targetMBPerSec` is shared by many merge threads and is updated as follows:
```
if (newBacklog) {
  // This new merge adds to the backlog: increase IO throttle by 20%
  targetMBPerSec *= 1.20;
  if (targetMBPerSec > MAX_MERGE_MB_PER_SEC) {
    targetMBPerSec = MAX_MERGE_MB_PER_SEC;
  }
  ..
} else {
  // We are not falling behind: decrease IO throttle by 10%
  targetMBPerSec /= 1.10;
  if (targetMBPerSec < MIN_MERGE_MB_PER_SEC) {
    targetMBPerSec = MIN_MERGE_MB_PER_SEC;
  }
  ..
}
```
The modification is not an atomic operation:
1. The first merge thread changes `targetMBPerSec` from 1024 to 1024*1.2.
2. Another merge thread reads the new value (1024*1.2).
3. The first merge thread then limits the value back to 1024.
In that window the second thread has already seen a value above the ceiling.

In production, we do see the IO write throttle rate exceed the 
ceiling (1024 MB/s) during merges.
```
[2021-11-26T15:27:19,861][TRACE][o.e.i.e.E.MS ] [data1] [test1][25] 
elasticsearch[data1][refresh][T#5] MS: io throttle: current merge backlog; 
leave IO rate at 3589.1 MB/sec
[2021-11-26T15:27:20,304][TRACE][o.e.i.e.E.MS ] [data1] [test1][13] 
elasticsearch[data1][write][T#3] MS: io throttle: current merge backlog; leave 
IO rate at 192.4 MB/sec
[2021-11-26T15:27:25,330][TRACE][o.e.i.e.E.MS ] [data1] [test1][22] 
elasticsearch[data1][[test1][22]: Lucene Merge Thread #1026] MS: io throttle: 
current merge backlog; leave IO rate at 96.3 MB/sec
[2021-11-26T15:27:25,995][TRACE][o.e.i.e.E.MS ] [data1] [test1][16] 
elasticsearch[data1][[test1][16]: Lucene Merge Thread #1063] MS: io throttle: 
current merge backlog; leave IO rate at 419.2 MB/sec
[2021-11-26T15:27:38,335][TRACE][o.e.i.e.E.MS ] [data1] [test1][19] 
elasticsearch[data1][write][T#2] MS: io throttle: current merge backlog; leave 
IO rate at 3051.5 MB/sec
```

We should do the following:
1. Make the update an atomic operation.
2. Add the `volatile` modifier to `targetMBPerSec`.




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-09-05 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17410114#comment-17410114
 ] 

kkewwei edited comment on LUCENE-10004 at 9/5/21, 12:11 PM:


Yes, before a `bulk` merge, we should `flush` to leave the buffer clean. 

We should also reduce the frequency of `flush`: we can use a flag to record 
whether the last merge was a `bulk` merge or not, and if it's not a `bulk` 
merge, we needn't flush.

This would reduce the `flush` rate, but I don't know how useful it is.







> Delete unnecessary flush in 
> Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks
> -
>
> Key: LUCENE-10004
> URL: https://issues.apache.org/jira/browse/LUCENE-10004
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs
>Affects Versions: 8.8.2
>Reporter: kkewwei
>Priority: Major
>
> In CompressingStoredFieldsWriter.merge(), if the segment meets the following 
> conditions:
> {code:java}
> else if (matchingFieldsReader.getCompressionMode() == compressionMode &&
>          matchingFieldsReader.getChunkSize() == chunkSize &&
>          matchingFieldsReader.getPackedIntsVersion() == PackedInts.VERSION_CURRENT &&
>          liveDocs == null &&
>          !tooDirty(matchingFieldsReader)) {
>   ..
>   // flush any pending chunks
>   if (numBufferedDocs > 0) {
>     flush();
>     numDirtyChunks++; // incomplete: we had to force this flush
>   }
>   ..
> }
> {code}
> we copy all of its chunks to the new fdt file. Before copying the chunks, we 
> flush the buffered docs if numBufferedDocs > 0, but this flush is 
> unnecessary.
> The bufferedDocs in memory have nothing to do with copyChunk. We just need to 
> ensure that they are flushed at the end of the merge (in finish()).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-09-05 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17410114#comment-17410114
 ] 

kkewwei commented on LUCENE-10004:
--

Yes, before a `bulk` merge, we should `flush` to leave the buffer clean. 

We should also reduce the frequency of `flush`: we can use a flag to record 
whether the last merge was a `bulk` merge or not, and if it's not a `bulk` 
merge, we needn't flush.

This would reduce the `flush` rate, but I don't know how effective it is.






[jira] [Comment Edited] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-06-18 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365411#comment-17365411
 ] 

kkewwei edited comment on LUCENE-10004 at 6/18/21, 10:51 AM:
-

I read the merge logic, including DocIdMerger. In the new segment, it doesn't 
matter whether a chunk is flushed to the file earlier or later. 

The order of stored documents matters when we read them from the old segment. 
Once they have been read from the old segment into memory, the old order is no 
longer needed: the new order is determined by the global parameter `docBase`.






[jira] [Comment Edited] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-06-18 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365411#comment-17365411
 ] 

kkewwei edited comment on LUCENE-10004 at 6/18/21, 9:48 AM:


I read the merge logic, including DocIdMerger. In the new segment, it doesn't 
matter whether a chunk is flushed to the file earlier or later. 

The order of stored documents matters when we read them from the old segment. 
Once they have been read from the old segment into memory, the old order is no 
longer needed: the new order is determined by the global parameter `docBase`.






[jira] [Comment Edited] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-06-18 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365411#comment-17365411
 ] 

kkewwei edited comment on LUCENE-10004 at 6/18/21, 9:47 AM:


I read the merge logic, including DocIdMerger. In the new segment, it doesn't 
matter whether a chunk is flushed to the file earlier or later. 

The order of stored documents matters when we read them from the old segment. 
Once they have been read from the old segment into memory, the old order is no 
longer needed: the new order is determined by the global parameter `docBase`.






[jira] [Commented] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-06-18 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365411#comment-17365411
 ] 

kkewwei commented on LUCENE-10004:
--

I did read the merge logic, including DocIdMerger. In the new segment, it 
doesn't matter whether a chunk is flushed to the file earlier or later. 

The order of stored documents matters when we read them from the old segment. 
Once they have been read from the old segment into memory, the old order is no 
longer needed: the new order is determined by the global parameter `docBase`.




[jira] [Comment Edited] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-06-15 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363529#comment-17363529
 ] 

kkewwei edited comment on LUCENE-10004 at 6/15/21, 10:31 AM:
-

There seems to be no need to guarantee the order of stored documents in the 
merged segment. The order is mainly used for deletes, and the buffered docs 
will not be deleted at this point.

Can you describe a situation in which a different order would produce wrong 
results? I would appreciate it very much.






[jira] [Comment Edited] (LUCENE-10004) Delete unnecessary flush in Lucene90CompressingStoredFieldsWriter.copyChunks() to reduce dirty chunks

2021-06-15 Thread kkewwei (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-10004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363529#comment-17363529
 ] 

kkewwei edited comment on LUCENE-10004 at 6/15/21, 10:16 AM:
-

There seems to be no need to guarantee the order of stored documents in the 
merged segment. The order is mainly used for deletes, and the buffered docs 
will not be deleted at this point.

Can you describe a situation in which a different order would produce wrong 
results?


was (Author: kkewwei):
There seems to be no need to guarantee the order of stored documents in one 
segment. The order is mainly used for deletes, and the buffered docs will not 
be deleted at this point.

Can you describe a situation in which a different order would produce wrong 
results?



