snazy commented on code in PR #1208:
URL: https://github.com/apache/polaris/pull/1208#discussion_r2003527755


##########
benchmarks/build.gradle.kts:
##########
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+plugins {
+    id("scala")

Review Comment:
   Nit:
   ```suggestion
       scala
   ```



##########
benchmarks/build.gradle.kts:
##########
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+plugins {
+    id("scala")
+    id("io.gatling.gradle") version "3.13.4.1"
+    id("com.diffplug.spotless")
+}
+
+dependencies {
+    gatling("com.typesafe.play:play-json_2.13:2.9.4")
+}
+
+description = "Polaris Iceberg REST API performance tests"
+
+spotless {
+    scala {
+        // Use scalafmt for Scala formatting
+        scalafmt("3.9.3").configFile("../.scalafmt.conf")
+    }
+}

Review Comment:
   It's probably better to have [the spotless 
parts](https://github.com/apache/polaris/pull/1203/files#diff-8dfadd8fa1b8d3481c35056ba0b776f249e34b2337dd5d43f545cdf6910501dcR184-R191)
 in `polaris-java.gradle.kts`. Adding `polaris-server` as a plugin would then 
pull that in.



##########
.scalafmt.conf:
##########
@@ -0,0 +1,25 @@
+# Scalafmt is used to reformat Gatling benchmarks from the benchmarks/ 
directory

Review Comment:
   I suspect this fails the `rat` check (no (C) header).



##########
benchmarks/src/gatling/resources/logback-test.xml:
##########
@@ -0,0 +1,19 @@
+<?xml version="1.0" encoding="UTF-8"?>

Review Comment:
   (C) header missing



##########
gradle/libs.versions.toml:
##########
@@ -100,3 +100,4 @@ jandex = { id = "org.kordamp.gradle.jandex", version = 
"2.1.0" }
 openapi-generator = { id = "org.openapi.generator", version = "7.12.0" }
 quarkus = { id = "io.quarkus", version.ref = "quarkus" }
 rat = { id = "org.nosphere.apache.rat", version = "0.8.1" }
+gatling = { id = "io.gatling.gradle", version = "3.13.4.1" }

Review Comment:
   Please add a trailing linebreak



##########
benchmarks/src/gatling/scala/org/apache/polaris/benchmarks/actions/CatalogActions.scala:
##########
@@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.polaris.benchmarks.actions
+
+import io.gatling.core.Predef._
+import io.gatling.core.feeder.Feeder
+import io.gatling.core.structure.ChainBuilder
+import io.gatling.http.Predef._
+import org.apache.polaris.benchmarks.RetryOnHttpCodes.{
+  retryOnHttpStatus,
+  HttpRequestBuilderWithStatusSave
+}
+import org.apache.polaris.benchmarks.parameters.DatasetParameters
+import org.slf4j.LoggerFactory
+
+import java.util.concurrent.atomic.AtomicReference
+
+/**
+ * Actions for performance testing catalog operations in Apache Iceberg. This 
class provides methods
+ * to create and fetch catalogs.
+ *
+ * @param dp Dataset parameters controlling the dataset generation
+ * @param accessToken Reference to the authentication token for API requests
+ * @param maxRetries Maximum number of retry attempts for failed operations
+ * @param retryableHttpCodes HTTP status codes that should trigger a retry
+ */
+case class CatalogActions(
+    dp: DatasetParameters,
+    accessToken: AtomicReference[String],
+    maxRetries: Int = 10,
+    retryableHttpCodes: Set[Int] = Set(409, 500)
+) {
+  private val logger = LoggerFactory.getLogger(getClass)

Review Comment:
   Not sure about the usual Scala nomenclature - should `logger` be all upper 
case to indicate that it's a constant?



##########
benchmarks/src/gatling/scala/org/apache/polaris/benchmarks/DatasetVerificationSimulation.scala:
##########
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.polaris.benchmarks
+
+import 
org.apache.polaris.benchmarks.simulations.ReadUpdateTreeDatasetConcurrent
+
+/**
+ * This simulation is only kept for compatibility reasons. It is recommended 
to use the
+ * ReadTreeDataset class directly instead.
+ */
+class DatasetVerificationSimulation extends ReadUpdateTreeDatasetConcurrent {}

Review Comment:
   Same here (remove, nothing to be compatible with)



##########
benchmarks/docs/dataset-shape-1-1000-7.puml:
##########
@@ -0,0 +1,49 @@
+@startuml

Review Comment:
   Similar here, I suspect those files fail the `rat` check



##########
.scalafmt.conf:
##########
@@ -0,0 +1,25 @@
+# Scalafmt is used to reformat Gatling benchmarks from the benchmarks/ 
directory

Review Comment:
   spotless parts are also in #1203 (see also below). I don't mind to have this 
file as "the" formatting rule - but it should live in the `gradle/` or maybe 
better in the `codestyle/` folder.



##########
getting-started/assets/polaris/create-catalog.sh:
##########
@@ -50,7 +50,7 @@ curl -s -H "Authorization: Bearer ${token}" \
        "storageConfigInfo": {
          "storageType": "FILE",
          "allowedLocations": [
-           "file:///tmp"
+           "file:///tmp/polaris/"

Review Comment:
   Unrelated change?



##########
benchmarks/README.md:
##########
@@ -0,0 +1,177 @@
+# Polaris Benchmarks
+
+This repository contains benchmarks for the Polaris service using Gatling.
+
+## Available Benchmarks
+
+### Dataset Creation Benchmark
+
+The CreateTreeDataset benchmark creates a test dataset with a specific 
structure. It exists in two variants:
+
+- `org.apache.polaris.benchmarks.simulations.CreateTreeDatasetSequential`: 
Creates entities one at a time
+- `org.apache.polaris.benchmarks.simulations.CreateTreeDatasetConcurrent`: 
Creates up to 50 entities simultaneously
+
+These are write-only workloads designed to populate the system for subsequent 
benchmarks.
+
+### Read/Update Benchmark
+
+The ReadUpdateTreeDataset benchmark tests read and update operations on an 
existing dataset. It exists in two variants:
+
+- `org.apache.polaris.benchmarks.simulations.ReadUpdateTreeDatasetSequential`: 
Performs read/update operations one at a time
+- `org.apache.polaris.benchmarks.simulations.ReadUpdateTreeDatasetConcurrent`: 
Performs up to 20 read/update operations simultaneously
+
+These benchmarks can only be run after using CreateTreeDataset to populate the 
system.
+
+## Parameters
+
+### Dataset Structure Parameters
+
+These parameters must be consistent across all benchmarks:
+
+- `NUM_CATALOGS`: Number of catalogs to create (default: 1)
+- `NAMESPACE_WIDTH`: Width of the namespace tree (default: 2)
+- `NAMESPACE_DEPTH`: Depth of the namespace tree (default: 4)
+- `NUM_TABLES_PER_NAMESPACE`: Tables per namespace (default: 5)
+- `NUM_VIEWS_PER_NAMESPACE`: Views per namespace (default: 3)
+- `NUM_COLUMNS`: Columns per table (default: 10)
+- `DEFAULT_BASE_LOCATION`: Base location for datasets (default: 
file:///tmp/polaris)
+
+### Workload Parameters
+
+These parameters can vary between benchmarks:
+
+- `CLIENT_ID`: Required OAuth2 client ID
+- `CLIENT_SECRET`: Required OAuth2 client secret
+- `BASE_URL`: Service URL (default: http://localhost:8181)
+- `READ_WRITE_RATIO`: Ratio of read to write operations (for 
ReadUpdateTreeDataset only)
+
+## Running the Benchmarks
+
+Required environment variables:
+```bash
+export CLIENT_ID=your_client_id
+export CLIENT_SECRET=your_client_secret
+export BASE_URL=http://your-polaris-instance:8181
+```
+
+To run the sequential dataset creation benchmark:
+```bash
+./gradlew 
gatlingRun-org.apache.polaris.benchmarks.simulations.CreateTreeDatasetSequential
+```
+
+To run the concurrent dataset creation benchmark:
+```bash
+./gradlew 
gatlingRun-org.apache.polaris.benchmarks.simulations.CreateTreeDatasetConcurrent
+```
+
+To run the sequential read/update benchmark:
+```bash
+export READ_WRITE_RATIO=0.8  # 80% reads, 20% writes
+./gradlew 
gatlingRun-org.apache.polaris.benchmarks.simulations.ReadUpdateTreeDatasetSequential
+```
+
+To run the concurrent read/update benchmark:
+```bash
+export READ_WRITE_RATIO=0.8  # 80% reads, 20% writes
+./gradlew 
gatlingRun-org.apache.polaris.benchmarks.simulations.ReadUpdateTreeDatasetConcurrent
+```
+
+A message will show the location of the Gatling report:
+```
+Reports generated in: 
./benchmarks/build/reports/gatling/<simulation-name>/index.html
+```
+

Review Comment:
   ```suggestion
   
   ### Example Polaris server startup
   
   For repeated testing and benchmarking purposes it's convenient to have fixed 
client-ID + client-secret combinations. **The following example is ONLY for 
testing and benchmarking against an airgapped Polaris instance**
   
   ```bash
   # Start Polaris with the fixed client-ID/secret admin/admin
   # DO NEVER EVER USE THE FOLLOWING FOR ANY NON-AIRGAPPED POLARIS INSTANCE !!
   ./gradlew :polaris-quarkus-server:quarkusBuild &&  java \
     -Dpolaris.bootstrap.credentials=POLARIS,admin,admin \
     -Djava.security.manager=allow \
     -jar quarkus/server/build/quarkus-app/quarkus-run.jar
   
   With the above you can run the benchmarks with the environment variables 
`CLIENT_ID=admin` and `CLIENT_SECRET=admin` - meant only for convenience in a 
fully airgapped system.
   ```
   



##########
gradle/projects.main.properties:
##########
@@ -36,6 +36,7 @@ aggregated-license-report=aggregated-license-report
 polaris-immutables=tools/immutables
 polaris-container-spec-helper=tools/container-spec-helper
 polaris-version=tools/version
+polaris-benchmarks=benchmarks

Review Comment:
   Wonder if this work should go under `benchmarks/iceberg-rest` so we can have 
all benchmarks grouped together.



##########
benchmarks/src/gatling/scala/org/apache/polaris/benchmarks/DatasetCreationSimulation.scala:
##########
@@ -0,0 +1,28 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.polaris.benchmarks
+
+import org.apache.polaris.benchmarks.simulations.CreateTreeDatasetConcurrent
+
+/**
+ * This simulation is only kept for compatibility reasons. It is recommended 
to use the
+ * CreateTreeDataset class directly instead.

Review Comment:
   Let's remove this class - there's nothing to be compatible with ;) 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to