[GitHub] [druid] techdocsmith commented on a change in pull request #10985: Dynamic auto scale Kinesis-Stream ingest tasks

GitBox Mon, 09 Aug 2021 17:11:44 -0700


techdocsmith commented on a change in pull request #10985:
URL: https://github.com/apache/druid/pull/10985#discussion_r685595201




##########
File path: docs/development/extensions-core/kafka-ingestion.md
##########
@@ -146,7 +146,7 @@ A sample supervisor spec is shown below:
 |`lateMessageRejectionStartDateTime`|ISO8601 DateTime|Configure tasks to 
reject messages with timestamps earlier than this date time; for example if 
this is set to `2016-01-01T11:00Z` and the supervisor creates a task at 
*2016-01-01T12:00Z*, messages with timestamps earlier than *2016-01-01T11:00Z* 
will be dropped. This may help prevent concurrency issues if your data stream 
has late messages and you have multiple pipelines that need to operate on the 
same segments (e.g. a realtime and a nightly batch ingestion pipeline).|no 
(default == none)|
 |`lateMessageRejectionPeriod`|ISO8601 Period|Configure tasks to reject 
messages with timestamps earlier than this period before the task was created; 
for example if this is set to `PT1H` and the supervisor creates a task at 
*2016-01-01T12:00Z*, messages with timestamps earlier than *2016-01-01T11:00Z* 
will be dropped. This may help prevent concurrency issues if your data stream 
has late messages and you have multiple pipelines that need to operate on the 
same segments (e.g. a realtime and a nightly batch ingestion pipeline). Please 
note that only one of `lateMessageRejectionPeriod` or 
`lateMessageRejectionStartDateTime` can be specified.|no (default == none)|
 |`earlyMessageRejectionPeriod`|ISO8601 Period|Configure tasks to reject 
messages with timestamps later than this period after the task reached its 
taskDuration; for example if this is set to `PT1H`, the taskDuration is set to 
`PT1H` and the supervisor creates a task at *2016-01-01T12:00Z*, messages with 
timestamps later than *2016-01-01T14:00Z* will be dropped. **Note:** Tasks 
sometimes run past their task duration, for example, in cases of supervisor 
failover. Setting earlyMessageRejectionPeriod too low may cause messages to be 
dropped unexpectedly whenever a task runs past its originally configured task 
duration.|no (default == none)|
-|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kafka ingest tasks. ONLY supported for Kafka indexing as of now. See 
[Tasks Autoscaler Properties](#Task Autoscaler Properties) for details.|no 
(default == null)|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kafka ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|

Review comment:
       ```suggestion
   |`autoScalerConfig`|Object|Defines auto scaling behavior for Kafka ingest 
tasks. See [Tasks Autoscaler Properties](#Task Autoscaler Properties).|no 
(default == null)|
   ```

##########
File path: docs/development/extensions-core/kinesis-ingestion.md
##########
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
 |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional 
permissions.|no|
 |`awsExternalId`|String|The AWS external id to use for additional 
permissions.|no|
 |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. 
See below for details.|no|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kinesis ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|
+
+### Task Autoscaler Properties
+
+> Note that Task AutoScaler is currently designated as experimental.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `enableTaskAutoScaler` | Whether enable this feature or not. Set false or 
ignored here will disable `autoScaler` even though `autoScalerConfig` is not 
null| no (default == false) |

Review comment:
       ```suggestion
   | `enableTaskAutoScaler` | Enable or disable the auto scaler. When false or 
absent, Druid disables the `autoScaler` even when `autoScalerConfig` is not 
null| no (default == false) |
   ```

##########
File path: docs/development/extensions-core/kinesis-ingestion.md
##########
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
 |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional 
permissions.|no|
 |`awsExternalId`|String|The AWS external id to use for additional 
permissions.|no|
 |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. 
See below for details.|no|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kinesis ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|
+
+### Task Autoscaler Properties
+
+> Note that Task AutoScaler is currently designated as experimental.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `enableTaskAutoScaler` | Whether enable this feature or not. Set false or 
ignored here will disable `autoScaler` even though `autoScalerConfig` is not 
null| no (default == false) |
+| `taskCountMax` | Maximum value of task count. Make Sure `taskCountMax >= 
taskCountMin`. If `taskCountMax > {numKinesisShards}`, the maximum number of 
reading tasks would be equal to `{numKinesisShards}` and `taskCountMax` would 
be ignored.  | yes |
+| `taskCountMin` | Minimum value of task count. When enable autoscaler, the 
value of taskCount in `IOConfig` will be ignored, and `taskCountMin` will be 
the number of tasks that ingestion starts going up to `taskCountMax`| yes |

Review comment:
       ```suggestion
   | `taskCountMin` | Minimum number of Kinesis ingestion tasks. When you 
enable the auto scaler, Druid ignores the value of taskCount in `IOConfig` and 
uses `taskCountMin` for the initial number of tasks to launch.| yes |
   ```

##########
File path: docs/development/extensions-core/kinesis-ingestion.md
##########
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
 |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional 
permissions.|no|
 |`awsExternalId`|String|The AWS external id to use for additional 
permissions.|no|
 |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. 
See below for details.|no|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kinesis ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|
+
+### Task Autoscaler Properties
+
+> Note that Task AutoScaler is currently designated as experimental.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `enableTaskAutoScaler` | Whether enable this feature or not. Set false or 
ignored here will disable `autoScaler` even though `autoScalerConfig` is not 
null| no (default == false) |
+| `taskCountMax` | Maximum value of task count. Make Sure `taskCountMax >= 
taskCountMin`. If `taskCountMax > {numKinesisShards}`, the maximum number of 
reading tasks would be equal to `{numKinesisShards}` and `taskCountMax` would 
be ignored.  | yes |
+| `taskCountMin` | Minimum value of task count. When enable autoscaler, the 
value of taskCount in `IOConfig` will be ignored, and `taskCountMin` will be 
the number of tasks that ingestion starts going up to `taskCountMax`| yes |
+| `minTriggerScaleActionFrequencyMillis` | Minimum time interval between two 
scale actions | no (default == 600000) |
+| `autoScalerStrategy` | The algorithm of `autoScaler`. ONLY `lagBased` is 
supported for now. See [Lag Based AutoScaler Strategy Related Properties](#Lag 
Based AutoScaler Strategy Related Properties) for details.| no (default == 
`lagBased`) |
+
+### Lag Based AutoScaler Strategy Related Properties
+
+> Unlike the Kafka Indexing Service, Kinesis reports lag metrics measured in 
time milliseconds rather than message count.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `lagCollectionIntervalMillis` | Period of lag points collection.  | no 
(default == 30000) |
+| `lagCollectionRangeMillis` | The total time window of lag collection, Use 
with `lagCollectionIntervalMillis`，it means that in the recent 
`lagCollectionRangeMillis`, collect lag metric points every 
`lagCollectionIntervalMillis`. | no (default == 600000) |
+| `scaleOutThreshold` | The Threshold of scale out action | no (default == 
6000000) |
+| `triggerScaleOutFractionThreshold` | If `triggerScaleOutFractionThreshold` 
percent of lag points are higher than `scaleOutThreshold`, then do scale out 
action. | no (default == 0.3) |

Review comment:
       Are lag points defined somewhere? Maybe we need an example of how this 
works together with the `scaleOutThreshold` 
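
For illustration, here is a minimal sketch of how the table above reads to me, not the actual Druid implementation. It assumes a "lag point" is one lag measurement (in milliseconds for Kinesis) sampled every `lagCollectionIntervalMillis`, and that a scale out fires when the fraction of collected points above `scaleOutThreshold` reaches `triggerScaleOutFractionThreshold`. The function name and the `>=` comparison are my assumptions.

```python
# Hypothetical sketch of the lag-based scale-out check described in the table.
# Not Druid code: the function name and the ">=" comparison are assumptions.

def should_scale_out(lag_points,
                     scale_out_threshold=6_000_000,
                     trigger_scale_out_fraction_threshold=0.3):
    """Return True when enough collected lag points exceed the threshold."""
    if not lag_points:
        # No measurements collected yet, so there is nothing to act on.
        return False
    # Count the points whose lag is strictly higher than the threshold.
    above = sum(1 for lag in lag_points if lag > scale_out_threshold)
    # Compare the fraction of high-lag points with the trigger fraction.
    return above / len(lag_points) >= trigger_scale_out_fraction_threshold


# 20 points in the window, 7 of them above the default 6000000 ms threshold.
high_lag_window = [7_000_000] * 7 + [1_000_000] * 13
# Same window size, but only 5 points above the threshold.
low_lag_window = [7_000_000] * 5 + [1_000_000] * 13 + [2_000_000] * 2
```

Under these assumptions, `should_scale_out(high_lag_window)` is true (7 of 20 points is 0.35) and `should_scale_out(low_lag_window)` is false (5 of 20 points is 0.25), so a single high reading does not trigger a scale action on its own.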

##########
File path: docs/development/extensions-core/kinesis-ingestion.md
##########
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
 |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional 
permissions.|no|
 |`awsExternalId`|String|The AWS external id to use for additional 
permissions.|no|
 |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. 
See below for details.|no|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kinesis ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|
+
+### Task Autoscaler Properties
+
+> Note that Task AutoScaler is currently designated as experimental.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `enableTaskAutoScaler` | Whether enable this feature or not. Set false or 
ignored here will disable `autoScaler` even though `autoScalerConfig` is not 
null| no (default == false) |
+| `taskCountMax` | Maximum value of task count. Make Sure `taskCountMax >= 
taskCountMin`. If `taskCountMax > {numKinesisShards}`, the maximum number of 
reading tasks would be equal to `{numKinesisShards}` and `taskCountMax` would 
be ignored.  | yes |
+| `taskCountMin` | Minimum value of task count. When enable autoscaler, the 
value of taskCount in `IOConfig` will be ignored, and `taskCountMin` will be 
the number of tasks that ingestion starts going up to `taskCountMax`| yes |
+| `minTriggerScaleActionFrequencyMillis` | Minimum time interval between two 
scale actions | no (default == 600000) |
+| `autoScalerStrategy` | The algorithm of `autoScaler`. ONLY `lagBased` is 
supported for now. See [Lag Based AutoScaler Strategy Related Properties](#Lag 
Based AutoScaler Strategy Related Properties) for details.| no (default == 
`lagBased`) |
+
+### Lag Based AutoScaler Strategy Related Properties
+
+> Unlike the Kafka Indexing Service, Kinesis reports lag metrics measured in 
time milliseconds rather than message count.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `lagCollectionIntervalMillis` | Period of lag points collection.  | no 
(default == 30000) |
+| `lagCollectionRangeMillis` | The total time window of lag collection, Use 
with `lagCollectionIntervalMillis`，it means that in the recent 
`lagCollectionRangeMillis`, collect lag metric points every 
`lagCollectionIntervalMillis`. | no (default == 600000) |
+| `scaleOutThreshold` | The Threshold of scale out action | no (default == 
6000000) |
+| `triggerScaleOutFractionThreshold` | If `triggerScaleOutFractionThreshold` 
percent of lag points are higher than `scaleOutThreshold`, then do scale out 
action. | no (default == 0.3) |
+| `scaleInThreshold` | The Threshold of scale in action | no (default == 
1000000) |
+| `triggerScaleInFractionThreshold` | If `triggerScaleInFractionThreshold` 
percent of lag points are lower than `scaleOutThreshold`, then do scale in 
action. | no (default == 0.9) |
+| `scaleActionStartDelayMillis` | Number of milliseconds after supervisor 
starts when first check scale logic. | no (default == 300000) |
+| `scaleActionPeriodMillis` | The frequency of checking whether to do scale 
action in millis | no (default == 60000) |
+| `scaleInStep` | How many tasks to reduce at a time | no (default == 1) |
+| `scaleOutStep` | How many tasks to add at a time | no (default == 2) |
+
+A sample supervisor spec with `lagBased` autoScaler enabled is shown below:

Review comment:
       ```suggestion
   The following example demonstrates a supervisor spec with `lagBased` 
autoScaler enabled:
   ```

##########
File path: docs/development/extensions-core/kinesis-ingestion.md
##########
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
 |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional 
permissions.|no|
 |`awsExternalId`|String|The AWS external id to use for additional 
permissions.|no|
 |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. 
See below for details.|no|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kinesis ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|

Review comment:
       ```suggestion
   |`autoScalerConfig`|Object|Defines auto scaling behavior for Kinesis ingest 
tasks. See [Tasks Autoscaler Properties](#Task Autoscaler Properties).|no 
(default == null)|
   ```

##########
File path: docs/development/extensions-core/kinesis-ingestion.md
##########
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
 |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional 
permissions.|no|
 |`awsExternalId`|String|The AWS external id to use for additional 
permissions.|no|
 |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. 
See below for details.|no|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kinesis ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|
+
+### Task Autoscaler Properties
+
+> Note that Task AutoScaler is currently designated as experimental.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `enableTaskAutoScaler` | Whether enable this feature or not. Set false or 
ignored here will disable `autoScaler` even though `autoScalerConfig` is not 
null| no (default == false) |
+| `taskCountMax` | Maximum value of task count. Make Sure `taskCountMax >= 
taskCountMin`. If `taskCountMax > {numKinesisShards}`, the maximum number of 
reading tasks would be equal to `{numKinesisShards}` and `taskCountMax` would 
be ignored.  | yes |
+| `taskCountMin` | Minimum value of task count. When enable autoscaler, the 
value of taskCount in `IOConfig` will be ignored, and `taskCountMin` will be 
the number of tasks that ingestion starts going up to `taskCountMax`| yes |
+| `minTriggerScaleActionFrequencyMillis` | Minimum time interval between two 
scale actions | no (default == 600000) |
+| `autoScalerStrategy` | The algorithm of `autoScaler`. ONLY `lagBased` is 
supported for now. See [Lag Based AutoScaler Strategy Related Properties](#Lag 
Based AutoScaler Strategy Related Properties) for details.| no (default == 
`lagBased`) |
+
+### Lag Based AutoScaler Strategy Related Properties
+
+> Unlike the Kafka Indexing Service, Kinesis reports lag metrics measured in 
time milliseconds rather than message count.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `lagCollectionIntervalMillis` | Period of lag points collection.  | no 
(default == 30000) |
+| `lagCollectionRangeMillis` | The total time window of lag collection, Use 
with `lagCollectionIntervalMillis`，it means that in the recent 
`lagCollectionRangeMillis`, collect lag metric points every 
`lagCollectionIntervalMillis`. | no (default == 600000) |
+| `scaleOutThreshold` | The Threshold of scale out action | no (default == 
6000000) |
+| `triggerScaleOutFractionThreshold` | If `triggerScaleOutFractionThreshold` 
percent of lag points are higher than `scaleOutThreshold`, then do scale out 
action. | no (default == 0.3) |
+| `scaleInThreshold` | The Threshold of scale in action | no (default == 
1000000) |
+| `triggerScaleInFractionThreshold` | If `triggerScaleInFractionThreshold` 
percent of lag points are lower than `scaleOutThreshold`, then do scale in 
action. | no (default == 0.9) |
+| `scaleActionStartDelayMillis` | Number of milliseconds after supervisor 
starts when first check scale logic. | no (default == 300000) |

Review comment:
       ```suggestion
   | `scaleActionStartDelayMillis` | Number of milliseconds to delay after the 
supervisor starts before the first scale logic check. | no (default == 300000) |
   ```

##########
File path: docs/development/extensions-core/kinesis-ingestion.md
##########
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
 |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional 
permissions.|no|
 |`awsExternalId`|String|The AWS external id to use for additional 
permissions.|no|
 |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. 
See below for details.|no|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kinesis ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|
+
+### Task Autoscaler Properties
+
+> Note that Task AutoScaler is currently designated as experimental.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `enableTaskAutoScaler` | Whether enable this feature or not. Set false or 
ignored here will disable `autoScaler` even though `autoScalerConfig` is not 
null| no (default == false) |
+| `taskCountMax` | Maximum value of task count. Make Sure `taskCountMax >= 
taskCountMin`. If `taskCountMax > {numKinesisShards}`, the maximum number of 
reading tasks would be equal to `{numKinesisShards}` and `taskCountMax` would 
be ignored.  | yes |
+| `taskCountMin` | Minimum value of task count. When enable autoscaler, the 
value of taskCount in `IOConfig` will be ignored, and `taskCountMin` will be 
the number of tasks that ingestion starts going up to `taskCountMax`| yes |
+| `minTriggerScaleActionFrequencyMillis` | Minimum time interval between two 
scale actions | no (default == 600000) |
+| `autoScalerStrategy` | The algorithm of `autoScaler`. ONLY `lagBased` is 
supported for now. See [Lag Based AutoScaler Strategy Related Properties](#Lag 
Based AutoScaler Strategy Related Properties) for details.| no (default == 
`lagBased`) |
+
+### Lag Based AutoScaler Strategy Related Properties
+
+> Unlike the Kafka Indexing Service, Kinesis reports lag metrics measured in 
time milliseconds rather than message count.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `lagCollectionIntervalMillis` | Period of lag points collection.  | no 
(default == 30000) |

Review comment:
       ```suggestion
   | `lagCollectionIntervalMillis` | Time period between lag point collections. 
 | no (default == 30000) |
   ```
   Not sure if this is the time period between collections or if it relates to 
some sort of time lag.
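
If it is the former, the description of `lagCollectionRangeMillis` implies a simple relationship between the two settings. The sketch below assumes that reading: one lag point is collected every `lagCollectionIntervalMillis`, and only points from the most recent `lagCollectionRangeMillis` are kept. The function name is hypothetical.

```python
# Hypothetical helper: how many lag points fit in the collection window,
# assuming one point per interval and a sliding window of the given range.

def lag_points_in_window(lag_collection_range_millis=600_000,
                         lag_collection_interval_millis=30_000):
    """Return the number of lag points held for one scaling decision."""
    if lag_collection_interval_millis <= 0:
        raise ValueError("lag_collection_interval_millis must be positive")
    return lag_collection_range_millis // lag_collection_interval_millis


default_points = lag_points_in_window()
```

With the documented defaults (600000 and 30000), that reading gives 20 lag points per window.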

##########
File path: docs/development/extensions-core/kinesis-ingestion.md
##########
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
 |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional 
permissions.|no|
 |`awsExternalId`|String|The AWS external id to use for additional 
permissions.|no|
 |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. 
See below for details.|no|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kinesis ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|
+
+### Task Autoscaler Properties
+
+> Note that Task AutoScaler is currently designated as experimental.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `enableTaskAutoScaler` | Whether enable this feature or not. Set false or 
ignored here will disable `autoScaler` even though `autoScalerConfig` is not 
null| no (default == false) |
+| `taskCountMax` | Maximum value of task count. Make Sure `taskCountMax >= 
taskCountMin`. If `taskCountMax > {numKinesisShards}`, the maximum number of 
reading tasks would be equal to `{numKinesisShards}` and `taskCountMax` would 
be ignored.  | yes |

Review comment:
       ```suggestion
   | `taskCountMax` | Maximum number of Kinesis ingestion tasks. Must be 
greater than or equal to `taskCountMin`. If greater than `{numKinesisShards}`, 
the maximum number of reading tasks is `{numKinesisShards}` and `taskCountMax` 
is ignored.  | yes |
   ```

##########
File path: docs/development/extensions-core/kinesis-ingestion.md
##########
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
 |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional 
permissions.|no|
 |`awsExternalId`|String|The AWS external id to use for additional 
permissions.|no|
 |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. 
See below for details.|no|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kinesis ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|
+
+### Task Autoscaler Properties
+
+> Note that Task AutoScaler is currently designated as experimental.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `enableTaskAutoScaler` | Whether enable this feature or not. Set false or 
ignored here will disable `autoScaler` even though `autoScalerConfig` is not 
null| no (default == false) |
+| `taskCountMax` | Maximum value of task count. Make Sure `taskCountMax >= 
taskCountMin`. If `taskCountMax > {numKinesisShards}`, the maximum number of 
reading tasks would be equal to `{numKinesisShards}` and `taskCountMax` would 
be ignored.  | yes |
+| `taskCountMin` | Minimum value of task count. When enable autoscaler, the 
value of taskCount in `IOConfig` will be ignored, and `taskCountMin` will be 
the number of tasks that ingestion starts going up to `taskCountMax`| yes |
+| `minTriggerScaleActionFrequencyMillis` | Minimum time interval between two 
scale actions | no (default == 600000) |
+| `autoScalerStrategy` | The algorithm of `autoScaler`. ONLY `lagBased` is 
supported for now. See [Lag Based AutoScaler Strategy Related Properties](#Lag 
Based AutoScaler Strategy Related Properties) for details.| no (default == 
`lagBased`) |
+
+### Lag Based AutoScaler Strategy Related Properties
+
+> Unlike the Kafka Indexing Service, Kinesis reports lag metrics measured in 
time milliseconds rather than message count.

Review comment:
       ```suggestion
   The Kinesis indexing service reports lag metrics as time in milliseconds 
rather than as a message count, which is what Kafka uses.
   ```

##########
File path: docs/development/extensions-core/kinesis-ingestion.md
##########
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
 |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional 
permissions.|no|
 |`awsExternalId`|String|The AWS external id to use for additional 
permissions.|no|
 |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. 
See below for details.|no|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kinesis ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|
+
+### Task Autoscaler Properties
+
+> Note that Task AutoScaler is currently designated as experimental.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `enableTaskAutoScaler` | Whether enable this feature or not. Set false or 
ignored here will disable `autoScaler` even though `autoScalerConfig` is not 
null| no (default == false) |
+| `taskCountMax` | Maximum value of task count. Make Sure `taskCountMax >= 
taskCountMin`. If `taskCountMax > {numKinesisShards}`, the maximum number of 
reading tasks would be equal to `{numKinesisShards}` and `taskCountMax` would 
be ignored.  | yes |
+| `taskCountMin` | Minimum value of task count. When enable autoscaler, the 
value of taskCount in `IOConfig` will be ignored, and `taskCountMin` will be 
the number of tasks that ingestion starts going up to `taskCountMax`| yes |
+| `minTriggerScaleActionFrequencyMillis` | Minimum time interval between two 
scale actions | no (default == 600000) |
+| `autoScalerStrategy` | The algorithm of `autoScaler`. ONLY `lagBased` is 
supported for now. See [Lag Based AutoScaler Strategy Related Properties](#Lag 
Based AutoScaler Strategy Related Properties) for details.| no (default == 
`lagBased`) |
+
+### Lag Based AutoScaler Strategy Related Properties
+
+> Unlike the Kafka Indexing Service, Kinesis reports lag metrics measured in 
time milliseconds rather than message count.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `lagCollectionIntervalMillis` | Period of lag points collection.  | no 
(default == 30000) |
+| `lagCollectionRangeMillis` | The total time window of lag collection, Use 
with `lagCollectionIntervalMillis`，it means that in the recent 
`lagCollectionRangeMillis`, collect lag metric points every 
`lagCollectionIntervalMillis`. | no (default == 600000) |
+| `scaleOutThreshold` | The Threshold of scale out action | no (default == 
6000000) |
+| `triggerScaleOutFractionThreshold` | If `triggerScaleOutFractionThreshold` 
percent of lag points are higher than `scaleOutThreshold`, then do scale out 
action. | no (default == 0.3) |
+| `scaleInThreshold` | The Threshold of scale in action | no (default == 
1000000) |

Review comment:
       Same comments as for scale out.

##########
File path: docs/development/extensions-core/kinesis-ingestion.md
##########
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
 |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional 
permissions.|no|
 |`awsExternalId`|String|The AWS external id to use for additional 
permissions.|no|
 |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. 
See below for details.|no|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kinesis ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|
+
+### Task Autoscaler Properties
+
+> Note that Task AutoScaler is currently designated as experimental.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `enableTaskAutoScaler` | Whether enable this feature or not. Set false or 
ignored here will disable `autoScaler` even though `autoScalerConfig` is not 
null| no (default == false) |
+| `taskCountMax` | Maximum value of task count. Make Sure `taskCountMax >= 
taskCountMin`. If `taskCountMax > {numKinesisShards}`, the maximum number of 
reading tasks would be equal to `{numKinesisShards}` and `taskCountMax` would 
be ignored.  | yes |
+| `taskCountMin` | Minimum value of task count. When enable autoscaler, the 
value of taskCount in `IOConfig` will be ignored, and `taskCountMin` will be 
the number of tasks that ingestion starts going up to `taskCountMax`| yes |
+| `minTriggerScaleActionFrequencyMillis` | Minimum time interval between two 
scale actions | no (default == 600000) |
+| `autoScalerStrategy` | The algorithm of `autoScaler`. ONLY `lagBased` is 
supported for now. See [Lag Based AutoScaler Strategy Related Properties](#Lag 
Based AutoScaler Strategy Related Properties) for details.| no (default == 
`lagBased`) |
+
+### Lag Based AutoScaler Strategy Related Properties
+
+> Unlike the Kafka Indexing Service, Kinesis reports lag metrics measured in 
time milliseconds rather than message count.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `lagCollectionIntervalMillis` | Period of lag points collection.  | no 
(default == 30000) |
+| `lagCollectionRangeMillis` | The total time window of lag collection, Use 
with `lagCollectionIntervalMillis`，it means that in the recent 
`lagCollectionRangeMillis`, collect lag metric points every 
`lagCollectionIntervalMillis`. | no (default == 600000) |
+| `scaleOutThreshold` | The Threshold of scale out action | no (default == 
6000000) |
+| `triggerScaleOutFractionThreshold` | If `triggerScaleOutFractionThreshold` 
percent of lag points are higher than `scaleOutThreshold`, then do scale out 
action. | no (default == 0.3) |
+| `scaleInThreshold` | The Threshold of scale in action | no (default == 
1000000) |
+| `triggerScaleInFractionThreshold` | If `triggerScaleInFractionThreshold` 
percent of lag points are lower than `scaleOutThreshold`, then do scale in 
action. | no (default == 0.9) |
+| `scaleActionStartDelayMillis` | Number of milliseconds after supervisor 
starts when first check scale logic. | no (default == 300000) |
+| `scaleActionPeriodMillis` | The frequency of checking whether to do scale 
action in millis | no (default == 60000) |

Review comment:
       ```suggestion
   | `scaleActionPeriodMillis` | Frequency in milliseconds to check if a scale 
action is triggered | no (default == 60000) |
   ```

##########
File path: docs/development/extensions-core/kinesis-ingestion.md
##########
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
 |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional 
permissions.|no|
 |`awsExternalId`|String|The AWS external id to use for additional 
permissions.|no|
 |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. 
See below for details.|no|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kinesis ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|
+
+### Task Autoscaler Properties
+
+> Note that Task AutoScaler is currently designated as experimental.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `enableTaskAutoScaler` | Whether enable this feature or not. Set false or 
ignored here will disable `autoScaler` even though `autoScalerConfig` is not 
null| no (default == false) |
+| `taskCountMax` | Maximum value of task count. Make Sure `taskCountMax >= 
taskCountMin`. If `taskCountMax > {numKinesisShards}`, the maximum number of 
reading tasks would be equal to `{numKinesisShards}` and `taskCountMax` would 
be ignored.  | yes |
+| `taskCountMin` | Minimum value of task count. When enable autoscaler, the 
value of taskCount in `IOConfig` will be ignored, and `taskCountMin` will be 
the number of tasks that ingestion starts going up to `taskCountMax`| yes |
+| `minTriggerScaleActionFrequencyMillis` | Minimum time interval between two 
scale actions | no (default == 600000) |
+| `autoScalerStrategy` | The algorithm of `autoScaler`. ONLY `lagBased` is 
supported for now. See [Lag Based AutoScaler Strategy Related Properties](#Lag 
Based AutoScaler Strategy Related Properties) for details.| no (default == 
`lagBased`) |
+
+### Lag Based AutoScaler Strategy Related Properties
+
+> Unlike the Kafka Indexing Service, Kinesis reports lag metrics measured in 
time milliseconds rather than message count.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `lagCollectionIntervalMillis` | Period of lag points collection.  | no 
(default == 30000) |
+| `lagCollectionRangeMillis` | The total time window of lag collection, Use 
with `lagCollectionIntervalMillis`，it means that in the recent 
`lagCollectionRangeMillis`, collect lag metric points every 
`lagCollectionIntervalMillis`. | no (default == 600000) |
+| `scaleOutThreshold` | The Threshold of scale out action | no (default == 
6000000) |

Review comment:
       Does this mean that when the time lag reaches 6000000 ms (the default), 
the autoscaler launches another task?
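
To make the question concrete, here is a minimal Python sketch of how I read the lag-based rows of this table. It is not taken from the Druid implementation. The function name `decide_scale_action`, the `>=` comparisons, and the order of the checks are my assumptions. I also assume that the fraction thresholds are fractions between 0 and 1 rather than percentages, and that the scale-in check compares against `scaleInThreshold`, although the `triggerScaleInFractionThreshold` row names `scaleOutThreshold`.

```python
from typing import Sequence


def decide_scale_action(
    lag_points_ms: Sequence[int],
    scale_out_threshold: int = 6_000_000,      # scaleOutThreshold default
    trigger_scale_out_fraction: float = 0.3,   # triggerScaleOutFractionThreshold default
    scale_in_threshold: int = 1_000_000,       # scaleInThreshold default
    trigger_scale_in_fraction: float = 0.9,    # triggerScaleInFractionThreshold default
) -> str:
    """Return "scale_out", "scale_in", or "none" for one collection window.

    Hypothetical helper: the comparisons and their order are assumptions,
    not the behavior confirmed by the PR.
    """
    if not lag_points_ms:
        return "none"

    total = len(lag_points_ms)
    above = sum(1 for lag in lag_points_ms if lag > scale_out_threshold)
    below = sum(1 for lag in lag_points_ms if lag < scale_in_threshold)

    # Assumed: scale out is checked first, so it wins if both conditions hold.
    if above / total >= trigger_scale_out_fraction:
        return "scale_out"
    if below / total >= trigger_scale_in_fraction:
        return "scale_in"
    return "none"
```

Under this reading, a single lag point above 6000000 ms would not launch another task. With the defaults, a window holds about 20 points (`lagCollectionRangeMillis` / `lagCollectionIntervalMillis`), and roughly 30% of them would need to exceed the threshold. If that matches the intent, it would help to say so in the `scaleOutThreshold` description.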

##########
File path: docs/development/extensions-core/kinesis-ingestion.md
##########
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
 |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional 
permissions.|no|
 |`awsExternalId`|String|The AWS external id to use for additional 
permissions.|no|
 |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. 
See below for details.|no|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kinesis ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|
+
+### Task Autoscaler Properties
+
+> Note that Task AutoScaler is currently designated as experimental.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `enableTaskAutoScaler` | Whether enable this feature or not. Set false or 
ignored here will disable `autoScaler` even though `autoScalerConfig` is not 
null| no (default == false) |
+| `taskCountMax` | Maximum value of task count. Make Sure `taskCountMax >= 
taskCountMin`. If `taskCountMax > {numKinesisShards}`, the maximum number of 
reading tasks would be equal to `{numKinesisShards}` and `taskCountMax` would 
be ignored.  | yes |
+| `taskCountMin` | Minimum value of task count. When enable autoscaler, the 
value of taskCount in `IOConfig` will be ignored, and `taskCountMin` will be 
the number of tasks that ingestion starts going up to `taskCountMax`| yes |
+| `minTriggerScaleActionFrequencyMillis` | Minimum time interval between two 
scale actions | no (default == 600000) |
+| `autoScalerStrategy` | The algorithm of `autoScaler`. ONLY `lagBased` is 
supported for now. See [Lag Based AutoScaler Strategy Related Properties](#Lag 
Based AutoScaler Strategy Related Properties) for details.| no (default == 
`lagBased`) |
+
+### Lag Based AutoScaler Strategy Related Properties
+
+> Unlike the Kafka Indexing Service, Kinesis reports lag metrics measured in 
time milliseconds rather than message count.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `lagCollectionIntervalMillis` | Period of lag points collection.  | no 
(default == 30000) |
+| `lagCollectionRangeMillis` | The total time window of lag collection, Use 
with `lagCollectionIntervalMillis`，it means that in the recent 
`lagCollectionRangeMillis`, collect lag metric points every 
`lagCollectionIntervalMillis`. | no (default == 600000) |
+| `scaleOutThreshold` | The Threshold of scale out action | no (default == 
6000000) |
+| `triggerScaleOutFractionThreshold` | If `triggerScaleOutFractionThreshold` 
percent of lag points are higher than `scaleOutThreshold`, then do scale out 
action. | no (default == 0.3) |
+| `scaleInThreshold` | The Threshold of scale in action | no (default == 
1000000) |
+| `triggerScaleInFractionThreshold` | If `triggerScaleInFractionThreshold` 
percent of lag points are lower than `scaleOutThreshold`, then do scale in 
action. | no (default == 0.9) |
+| `scaleActionStartDelayMillis` | Number of milliseconds after supervisor 
starts when first check scale logic. | no (default == 300000) |
+| `scaleActionPeriodMillis` | The frequency of checking whether to do scale 
action in millis | no (default == 60000) |
+| `scaleInStep` | How many tasks to reduce at a time | no (default == 1) |
+| `scaleOutStep` | How many tasks to add at a time | no (default == 2) |

Review comment:
       ```suggestion
   | `scaleOutStep` | Number of tasks to add at a time when scaling out | no 
(default == 2) |
   ```

##########
File path: docs/development/extensions-core/kinesis-ingestion.md
##########
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
 |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional 
permissions.|no|
 |`awsExternalId`|String|The AWS external id to use for additional 
permissions.|no|
 |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. 
See below for details.|no|
+|`autoScalerConfig`|Object|`autoScalerConfig` to specify how to auto scale the 
number of Kinesis ingest tasks. See [Tasks Autoscaler Properties](#Task 
Autoscaler Properties) for details.|no (default == null)|
+
+### Task Autoscaler Properties
+
+> Note that Task AutoScaler is currently designated as experimental.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `enableTaskAutoScaler` | Whether enable this feature or not. Set false or 
ignored here will disable `autoScaler` even though `autoScalerConfig` is not 
null| no (default == false) |
+| `taskCountMax` | Maximum value of task count. Make Sure `taskCountMax >= 
taskCountMin`. If `taskCountMax > {numKinesisShards}`, the maximum number of 
reading tasks would be equal to `{numKinesisShards}` and `taskCountMax` would 
be ignored.  | yes |
+| `taskCountMin` | Minimum value of task count. When enable autoscaler, the 
value of taskCount in `IOConfig` will be ignored, and `taskCountMin` will be 
the number of tasks that ingestion starts going up to `taskCountMax`| yes |
+| `minTriggerScaleActionFrequencyMillis` | Minimum time interval between two 
scale actions | no (default == 600000) |
+| `autoScalerStrategy` | The algorithm of `autoScaler`. ONLY `lagBased` is 
supported for now. See [Lag Based AutoScaler Strategy Related Properties](#Lag 
Based AutoScaler Strategy Related Properties) for details.| no (default == 
`lagBased`) |
+
+### Lag Based AutoScaler Strategy Related Properties
+
+> Unlike the Kafka Indexing Service, Kinesis reports lag metrics measured in 
time milliseconds rather than message count.
+
+| Property | Description | Required |
+| ------------- | ------------- | ------------- |
+| `lagCollectionIntervalMillis` | Period of lag points collection.  | no 
(default == 30000) |
+| `lagCollectionRangeMillis` | The total time window of lag collection, Use 
with `lagCollectionIntervalMillis`，it means that in the recent 
`lagCollectionRangeMillis`, collect lag metric points every 
`lagCollectionIntervalMillis`. | no (default == 600000) |
+| `scaleOutThreshold` | The Threshold of scale out action | no (default == 
6000000) |
+| `triggerScaleOutFractionThreshold` | If `triggerScaleOutFractionThreshold` 
percent of lag points are higher than `scaleOutThreshold`, then do scale out 
action. | no (default == 0.3) |
+| `scaleInThreshold` | The Threshold of scale in action | no (default == 
1000000) |
+| `triggerScaleInFractionThreshold` | If `triggerScaleInFractionThreshold` 
percent of lag points are lower than `scaleOutThreshold`, then do scale in 
action. | no (default == 0.9) |
+| `scaleActionStartDelayMillis` | Number of milliseconds after supervisor 
starts when first check scale logic. | no (default == 300000) |
+| `scaleActionPeriodMillis` | The frequency of checking whether to do scale 
action in millis | no (default == 60000) |
+| `scaleInStep` | How many tasks to reduce at a time | no (default == 1) |

Review comment:
       ```suggestion
   | `scaleInStep` | Number of tasks to remove at a time when scaling in | no 
(default == 1) |
   ```
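
For context on both step suggestions, this is a short Python sketch of how I assume the steps interact with `taskCountMin`, `taskCountMax`, and the shard count, based only on the descriptions in the table. The helper `next_task_count` and its parameter names are hypothetical, and the clamping behavior is my interpretation rather than something the PR states.

```python
def next_task_count(
    current: int,
    action: str,                 # "scale_out", "scale_in", or anything else for no change
    task_count_min: int,
    task_count_max: int,
    num_kinesis_shards: int,
    scale_in_step: int = 1,      # scaleInStep default
    scale_out_step: int = 2,     # scaleOutStep default
) -> int:
    """Apply one scale action and clamp the result to the allowed range."""
    if task_count_max < task_count_min:
        raise ValueError("taskCountMax must be >= taskCountMin")

    # Per the taskCountMax row: more tasks than shards is not useful,
    # so the shard count caps the effective maximum.
    upper = min(task_count_max, num_kinesis_shards)
    # Assumed: the shard cap also bounds the minimum when shards < taskCountMin.
    lower = min(task_count_min, upper)

    if action == "scale_out":
        target = current + scale_out_step
    elif action == "scale_in":
        target = current - scale_in_step
    else:
        target = current

    return max(lower, min(target, upper))
```

For example, with `taskCountMin` 1, `taskCountMax` 6, and 4 shards, scaling out from 2 tasks gives 4, and a further scale out stays at 4. If the real behavior differs, the step descriptions should mention how the limits are applied.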






