[
https://issues.apache.org/jira/browse/HDFS-16949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17706593#comment-17706593
]
ASF GitHub Bot commented on HDFS-16949:
---------------------------------------
goiri commented on code in PR #5495:
URL: https://github.com/apache/hadoop/pull/5495#discussion_r1152443921
##########
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/util/TestSampleQuantiles.java:
##########
@@ -92,27 +93,68 @@ public void testClear() throws IOException {
public void testQuantileError() throws IOException {
final int count = 100000;
Random r = new Random(0xDEADDEAD);
Review Comment:
Where are we using this random?
##########
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/util/TestSampleQuantiles.java:
##########
@@ -92,27 +93,68 @@ public void testClear() throws IOException {
public void testQuantileError() throws IOException {
final int count = 100000;
Random r = new Random(0xDEADDEAD);
- Long[] values = new Long[count];
+ int[] values = new int[count];
for (int i = 0; i < count; i++) {
- values[i] = (long) (i + 1);
+ values[i] = i + 1;
}
+
// Do 10 shuffle/insert/check cycles
for (int i = 0; i < 10; i++) {
- System.out.println("Starting run " + i);
+
+ // Shuffle
Collections.shuffle(Arrays.asList(values), r);
estimator.clear();
+
+ // Insert
for (int j = 0; j < count; j++) {
estimator.insert(values[j]);
}
Map<Quantile, Long> snapshot;
snapshot = estimator.snapshot();
+
+ // Check
for (Quantile q : quantiles) {
long actual = (long) (q.quantile * count);
long error = (long) (q.error * count);
long estimate = snapshot.get(q);
- System.out
- .println(String.format("Expected %d with error %d, estimated %d",
- actual, error, estimate));
+ assertThat(estimate <= actual + error).isTrue();
+ assertThat(estimate >= actual - error).isTrue();
+ }
+ }
+ }
+
+ /**
+ * Correctness test that checks that absolute error of the estimate for inverse quantiles
+ * is within specified error bounds for some randomly permuted streams of items.
+ */
+ @Test
+ public void testInverseQuantiles() throws IOException {
+ SampleQuantiles inverseQuantilesEstimator = new SampleQuantiles(MutableInverseQuantiles.INVERSE_QUANTILES);
+ final int count = 100000;
+ Random r = new Random(0xDEADDEAD);
+ int[] values = new int[count];
+ for (int i = 0; i < count; i++) {
+ values[i] = i + 1;
+ }
+
+ // Do 10 shuffle/insert/check cycles
+ for (int i = 0; i < 10; i++) {
Review Comment:
Make 10 a named constant, such as NUM_REPEATS, so its meaning is clear.
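A minimal sketch of this suggestion, assuming the names `NUM_REPEATS` and `rnd` (the PR may settle on other names) and a hypothetical stand-in class `ShuffleCyclesSketch` instead of the real test class. It uses a boxed `List<Integer>` because `Collections.shuffle` operates on a `List`, and `Arrays.asList` applied to an `int[]` yields a one-element `List<int[]>`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the test class; not the actual Hadoop code.
class ShuffleCyclesSketch {
  // Named constant replacing the bare literal 10 in the loop header.
  static final int NUM_REPEATS = 10;

  // Builds the values 1..count, then shuffles them NUM_REPEATS times
  // with a seeded Random so that runs are reproducible.
  static List<Integer> runCycles(int count) {
    Random rnd = new Random(0xDEADDEAD);
    List<Integer> values = new ArrayList<>(count);
    for (int i = 0; i < count; i++) {
      values.add(i + 1);
    }
    for (int i = 0; i < NUM_REPEATS; i++) {
      Collections.shuffle(values, rnd);
      // The real test would clear the estimator, insert the values,
      // and check the snapshot at this point.
    }
    return values;
  }
}
```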
##########
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/util/TestSampleQuantiles.java:
##########
@@ -92,27 +93,68 @@ public void testClear() throws IOException {
public void testQuantileError() throws IOException {
final int count = 100000;
Random r = new Random(0xDEADDEAD);
Review Comment:
OK, it is used by the shuffle.
Single-letter variables are hard to search for; rename it to `rnd`.
##########
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/util/TestSampleQuantiles.java:
##########
@@ -92,27 +93,68 @@ public void testClear() throws IOException {
public void testQuantileError() throws IOException {
final int count = 100000;
Random r = new Random(0xDEADDEAD);
- Long[] values = new Long[count];
+ int[] values = new int[count];
for (int i = 0; i < count; i++) {
- values[i] = (long) (i + 1);
+ values[i] = i + 1;
}
+
// Do 10 shuffle/insert/check cycles
for (int i = 0; i < 10; i++) {
- System.out.println("Starting run " + i);
+
+ // Shuffle
Collections.shuffle(Arrays.asList(values), r);
estimator.clear();
+
+ // Insert
for (int j = 0; j < count; j++) {
estimator.insert(values[j]);
}
Map<Quantile, Long> snapshot;
snapshot = estimator.snapshot();
+
+ // Check
for (Quantile q : quantiles) {
long actual = (long) (q.quantile * count);
long error = (long) (q.error * count);
long estimate = snapshot.get(q);
- System.out
- .println(String.format("Expected %d with error %d, estimated %d",
- actual, error, estimate));
+ assertThat(estimate <= actual + error).isTrue();
+ assertThat(estimate >= actual - error).isTrue();
+ }
+ }
+ }
+
+ /**
+ * Correctness test that checks that absolute error of the estimate for inverse quantiles
+ * is within specified error bounds for some randomly permuted streams of items.
+ */
+ @Test
+ public void testInverseQuantiles() throws IOException {
+ SampleQuantiles inverseQuantilesEstimator = new SampleQuantiles(MutableInverseQuantiles.INVERSE_QUANTILES);
+ final int count = 100000;
+ Random r = new Random(0xDEADDEAD);
+ int[] values = new int[count];
+ for (int i = 0; i < count; i++) {
+ values[i] = i + 1;
+ }
+
+ // Do 10 shuffle/insert/check cycles
+ for (int i = 0; i < 10; i++) {
+ // Shuffle
+ Collections.shuffle(Arrays.asList(values), r);
+ inverseQuantilesEstimator.clear();
+
+ // Insert
+ for (int j = 0; j < count; j++) {
Review Comment:
```
for (int value : values) {
inverseQuantilesEstimator.insert(value);
}
```
##########
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/util/TestSampleQuantiles.java:
##########
@@ -92,27 +93,68 @@ public void testClear() throws IOException {
public void testQuantileError() throws IOException {
final int count = 100000;
Random r = new Random(0xDEADDEAD);
- Long[] values = new Long[count];
+ int[] values = new int[count];
for (int i = 0; i < count; i++) {
- values[i] = (long) (i + 1);
+ values[i] = i + 1;
}
+
// Do 10 shuffle/insert/check cycles
for (int i = 0; i < 10; i++) {
- System.out.println("Starting run " + i);
+
+ // Shuffle
Collections.shuffle(Arrays.asList(values), r);
estimator.clear();
+
+ // Insert
for (int j = 0; j < count; j++) {
Review Comment:
While we are cleaning up:
```
for (int value : values) {
estimator.insert(value);
}
```
##########
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/util/TestSampleQuantiles.java:
##########
@@ -118,4 +119,40 @@ public void testQuantileError() throws IOException {
}
}
}
+
+ /**
+ * Correctness test that checks that absolute error of the estimate for inverse quantiles
+ * is within specified error bounds for some randomly permuted streams of items.
+ */
+ @Test
+ public void testInverseQuantiles() throws IOException {
+ SampleQuantiles inverseQuantilesEstimator = new SampleQuantiles(MutableInverseQuantiles.INVERSE_QUANTILES);
+ final int count = 100000;
+ Random r = new Random(0xDEADDEAD);
Review Comment:
Rename it to `rnd`; a single-letter name is hard to search for.
> Update ReadTransferRate to ReadLatencyPerGB for effective percentile metrics
> ----------------------------------------------------------------------------
>
> Key: HDFS-16949
> URL: https://issues.apache.org/jira/browse/HDFS-16949
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: Ravindra Dingankar
> Assignee: Ravindra Dingankar
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.3.0, 3.4.0
>
>
> HDFS-16917 added ReadTransferRate quantiles to calculate the rate at which data
> is read per unit of time.
> With percentiles, the values are sorted in ascending order, so for the
> transfer rate p90 gives us the value below which 90 percent of the rates fall
> (lower rates are worse), and p99 gives us the value below which 99 percent fall.
> Note that value(p90) < value(p99), so p99 is a better transfer rate than
> p90.
> However, as the percentile increases the value should become worse, so that
> it tells us how good our system is.
> Hence, instead of calculating the data read transfer rate, we should calculate
> its inverse: the time taken for a GB of data to be
> read (seconds / GB).
> After this, p90 gives us the value below which 90 percent of the time-taken
> values fall, and similarly for p99 and others.
> Also value(p90) < value(p99), and here value(p99) is the worse value (more time
> per unit of data) compared to value(p90).
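The inversion described above can be illustrated with a small sketch. It assumes a hypothetical helper `secondsPerGb` and a naive nearest-rank percentile over a sorted copy of the samples; it is not the `SampleQuantiles` estimator Hadoop uses, and the class and method names are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration; not the DataNode metrics code.
class LatencyPerGbSketch {
  static final double BYTES_PER_GB = 1024.0 * 1024.0 * 1024.0;

  // Time taken to read one GB, in seconds, given the bytes read
  // and the elapsed time in nanoseconds.
  static double secondsPerGb(long bytes, long durationNanos) {
    double seconds = durationNanos / 1e9;
    return seconds / (bytes / BYTES_PER_GB);
  }

  // Naive nearest-rank percentile: sorts a copy in ascending order and
  // returns the sample at rank ceil(p * n), so a higher p never yields
  // a smaller value.
  static double percentile(double[] samples, double p) {
    double[] sorted = samples.clone();
    Arrays.sort(sorted);
    int rank = (int) Math.ceil(p * sorted.length);
    return sorted[Math.max(rank, 1) - 1];
  }
}
```

Because larger seconds-per-GB values mean slower reads, a higher percentile of this metric always points at a worse case, which is the property the transfer rate lacked.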