gemini-code-assist[bot] commented on code in PR #37377:
URL: https://github.com/apache/beam/pull/37377#discussion_r3079455108
##########
sdks/go/pkg/beam/runners/dataflow/dataflow.go:
##########
@@ -61,9 +61,11 @@ var (
 	numWorkers           = flag.Int64("num_workers", 0, "Number of workers (optional).")
 	workerHarnessThreads = flag.Int64("number_of_worker_harness_threads", 0, "The number of threads per each worker harness process (optional).")
 	maxNumWorkers        = flag.Int64("max_num_workers", 0, "Maximum number of workers during scaling (optional).")
-	diskSizeGb           = flag.Int64("disk_size_gb", 0, "Size of root disk for VMs, in GB (optional).")
-	diskType             = flag.String("disk_type", "", "Type of root disk for VMs (optional).")
-	autoscalingAlgorithm = flag.String("autoscaling_algorithm", "", "Autoscaling mode to use (optional).")
+	diskSizeGb                     = flag.Int64("disk_size_gb", 0, "Size of root disk for VMs, in GB (optional).")
+	diskType                       = flag.String("disk_type", "", "Type of root disk for VMs (optional).")
+	diskProvisionedIops            = flag.Int64("disk_provisioned_iops", 0, "Provisioned IOPS for the worker disk (optional).")
+	diskProvisionedThroughputMibps = flag.Int64("disk_provisioned_throughput_mibps", 0, "Provisioned throughput in MiB/s for the worker disk (optional).")
Review Comment:

The Go implementation appears to be incomplete. While flags and `JobOptions`
fields have been added, there is no corresponding logic to translate these
values into the `WorkerPool` configuration within the Dataflow API request
(typically handled in `environment.go`). Without this translation, these flags
will have no effect on the provisioned workers.
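As a rough illustration of the missing step, the translation could map each flag-backed option onto the worker pool only when it is set, mirroring how `diskSizeGb` is guarded on the Java side. This is a minimal sketch, not Beam's actual code: `applyDiskOptions` is a hypothetical helper, and the `WorkerPool` struct below is a local stand-in for the generated `google.golang.org/api/dataflow/v1b3` type, whose field names for the two new settings are assumed here.

```go
package main

import "fmt"

// WorkerPool is a local stand-in for the generated Dataflow API type
// (google.golang.org/api/dataflow/v1b3.WorkerPool). The two provisioned-disk
// field names are assumptions for this sketch.
type WorkerPool struct {
	DiskSizeGb                     int64
	DiskType                       string
	DiskProvisionedIops            int64 // assumed field name
	DiskProvisionedThroughputMibps int64 // assumed field name
}

// applyDiskOptions sketches the missing translation step: copy flag-backed
// option values onto the WorkerPool only when they were explicitly set, so
// that zero values do not override service-side defaults.
func applyDiskOptions(pool *WorkerPool, sizeGb int64, diskType string, iops, throughputMibps int64) {
	if sizeGb > 0 {
		pool.DiskSizeGb = sizeGb
	}
	if diskType != "" {
		pool.DiskType = diskType
	}
	if iops > 0 {
		pool.DiskProvisionedIops = iops
	}
	if throughputMibps > 0 {
		pool.DiskProvisionedThroughputMibps = throughputMibps
	}
}

func main() {
	pool := &WorkerPool{}
	applyDiskOptions(pool, 100, "pd-ssd", 10000, 600)
	fmt.Println(pool.DiskProvisionedIops, pool.DiskProvisionedThroughputMibps) // prints "10000 600"
}
```

In the actual SDK this mapping would belong wherever the `WorkerPool` for the Dataflow API request is assembled (the comment above suggests `environment.go`).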
##########
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java:
##########
@@ -193,6 +193,20 @@ public String getAlgorithm() {
   void setWorkerDiskType(String value);
+  /** Provisioned IOPS for the worker disk. */
+  @Description("Provisioned IOPS for the worker disk.")
+  @Nullable
+  Integer getDiskProvisionedIops();
+
+  void setDiskProvisionedIops(Integer value);
Review Comment:

Consider using `Long` instead of `Integer` for these performance-related
fields. While `Integer` is likely sufficient for current IOPS limits, using
`Long` is more consistent with the underlying Dataflow API (which the
translator casts to `Long`) and the Go implementation in this PR. It also
future-proofs the options against very high performance disk specifications.
```suggestion
  Long getDiskProvisionedIops();

  void setDiskProvisionedIops(Long value);
```
##########
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java:
##########
@@ -489,6 +489,13 @@ public Job translate(List<DataflowPackage> packages) {
     if (options.getDiskSizeGb() > 0) {
       workerPool.setDiskSizeGb(options.getDiskSizeGb());
     }
+    if (options.getDiskProvisionedIops() != null) {
+      workerPool.setDiskProvisionedIops(Long.valueOf(options.getDiskProvisionedIops()));
+    }
+    if (options.getDiskProvisionedThroughputMibps() != null) {
+      workerPool.setDiskProvisionedThroughputMibps(
+          Long.valueOf(options.getDiskProvisionedThroughputMibps()));
+    }
Review Comment:

For consistency with how `diskSizeGb` is handled (lines 489-491), it is
recommended to also check that these values are greater than zero. This
provides basic client-side validation and ensures that a value of 0 (if
provided) doesn't inadvertently override service defaults in an unexpected way.
Additionally, the conversion to `Long` can be simplified.
```suggestion
    if (options.getDiskProvisionedIops() != null && options.getDiskProvisionedIops() > 0) {
      workerPool.setDiskProvisionedIops((long) options.getDiskProvisionedIops());
    }
    if (options.getDiskProvisionedThroughputMibps() != null
        && options.getDiskProvisionedThroughputMibps() > 0) {
      workerPool.setDiskProvisionedThroughputMibps(
          (long) options.getDiskProvisionedThroughputMibps());
    }
```
##########
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java:
##########
@@ -193,6 +193,20 @@ public String getAlgorithm() {
   void setWorkerDiskType(String value);
+  /** Provisioned IOPS for the worker disk. */
+  @Description("Provisioned IOPS for the worker disk.")
+  @Nullable
+  Integer getDiskProvisionedIops();
+
+  void setDiskProvisionedIops(Integer value);
+
+  /** Provisioned throughput in MiB/s for the worker disk. */
+  @Description("Provisioned throughput in MiB/s for the worker disk.")
+  @Nullable
+  Integer getDiskProvisionedThroughputMibps();
+
+  void setDiskProvisionedThroughputMibps(Integer value);
Review Comment:

Consider using `Long` for throughput as well, for consistency and
future-proofing.
```suggestion
  Long getDiskProvisionedThroughputMibps();

  void setDiskProvisionedThroughputMibps(Long value);
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]