MonkeyCanCode commented on code in PR #3265:
URL: https://github.com/apache/polaris/pull/3265#discussion_r2617265072


##########
site/content/in-dev/unreleased/configuring-polaris-for-production/configuring-helm.md:
##########
@@ -0,0 +1,132 @@
+---
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+title: Configuring Helm for Production
+linkTitle: Configuring Helm
+type: docs
+weight: 601
+---
+
+This guide provides instructions for configuring the Polaris Helm chart for a production environment. For a full list of chart values, see the [main Helm chart documentation](../helm.md).
+
+The default Helm chart values are suitable for development and testing, but they are not recommended for production. The following are the key areas to consider for a production deployment.
+
+## Persistence
+
+By default, the Polaris Helm chart uses an `in-memory` metastore, which is not suitable for production. A persistent backend must be configured to ensure data is not lost when pods restart.
+
+To use a persistent backend, `persistence.type` must be set to `relational-jdbc`, and a Kubernetes secret containing the database connection details must be provided.
+
+```yaml
+persistence:
+  type: relational-jdbc
+  relationalJdbc:
+    secret:
+      name: "polaris-persistence-secret" # A secret containing db credentials
+      username: "username"
+      password: "password"
+      jdbcUrl: "jdbcUrl"
+```
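+
+If the referenced secret does not already exist, it can be created with a manifest like the following (a minimal sketch: the credentials and the JDBC URL are placeholder values, and the data keys must match those configured under `persistence.relationalJdbc.secret`):
+
+```yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: polaris-persistence-secret
+type: Opaque
+stringData:
+  username: "polaris"   # placeholder database user
+  password: "changeit"  # placeholder database password
+  jdbcUrl: "jdbc:postgresql://postgres.example.com:5432/polaris" # placeholder URL
+```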
+
+## Resource Management
+
+For a production environment, it is crucial to define resource requests and limits for the Polaris pods. Resource requests ensure that pods are allocated enough resources to run, while limits prevent them from consuming too many resources on the node.
+
+Resource requests and limits can be set in the `values.yaml` file:
+
+```yaml
+resources:
+  requests:
+    memory: "1Gi"
+    cpu: "500m"
+  limits:
+    memory: "2Gi"
+    cpu: "1"
+```
+
+Adjust these values based on expected workload and available cluster resources.
+
+## Authentication
+
+In a multi-replica production environment, all Polaris pods must share the same token signing keys. The default chart generates random keys for each pod, which will cause token validation failures.
+
+To use a shared set of keys, a Kubernetes secret containing an RSA key pair or a symmetric key must first be created.
+
+### RSA Key Pair
+
+```yaml
+authentication:
+  tokenBroker:
+    type: rsa-key-pair
+    secret:
+      name: "polaris-rsa-key-pair-secret" # A secret containing the RSA key pair
+      rsaKeyPair:
+        publicKey: "public.pem"
+        privateKey: "private.pem"
+```
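+
+The referenced secret can be created with a manifest like the following (a sketch: the PEM bodies are elided placeholders, and the data keys must match the `public.pem` and `private.pem` names configured above):
+
+```yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: polaris-rsa-key-pair-secret
+type: Opaque
+stringData:
+  public.pem: |
+    -----BEGIN PUBLIC KEY-----
+    ...
+    -----END PUBLIC KEY-----
+  private.pem: |
+    -----BEGIN PRIVATE KEY-----
+    ...
+    -----END PRIVATE KEY-----
+```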
+
+### Symmetric Key
+
+```yaml
+authentication:
+  tokenBroker:
+    type: symmetric-key
+    secret:
+      name: "polaris-symmetric-key-secret" # A secret containing the symmetric key
+      symmetricKey:
+        secretKey: "symmetric.key"
+```
+
+## Scaling
+
+For high availability, multiple replicas of the Polaris server can be run. This requires a persistent backend to be configured as described above.
+
+### Static Replicas
+
+`replicaCount` must be set to the desired number of pods.
+
+```yaml
+replicaCount: 3
+```
+
+### Autoscaling
+
+`autoscaling` can be enabled to define the minimum and maximum number of replicas, and CPU or memory utilization targets.
+
+```yaml
+autoscaling:
+  enabled: true
+  minReplicas: 2
+  maxReplicas: 5
+  targetCPUUtilizationPercentage: 80
+  targetMemoryUtilizationPercentage: 80
+```
+
+### Pod Topology Spreading
+
+For better fault tolerance, `topologySpreadConstraints` can be used to distribute pods across different nodes, racks, or availability zones. This helps prevent a single infrastructure failure from taking down all Polaris replicas.
+
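+For example, assuming the chart passes `topologySpreadConstraints` through to the pod spec and that the release's pods carry the `app.kubernetes.io/name: polaris` label (an assumption; check the labels the chart actually applies), the following sketch spreads replicas evenly across availability zones:
+
+```yaml
+topologySpreadConstraints:
+  - maxSkew: 1
+    topologyKey: topology.kubernetes.io/zone
+    whenUnsatisfiable: DoNotSchedule
+    labelSelector:
+      matchLabels:
+        app.kubernetes.io/name: polaris # assumption: the release's pod label
+```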

Review Comment:
   So that should happen automatically (there is also no option in the current chart to use a different persistent metastore). The same applies when we increase replica counts, where by default it depends on the underlying node scaler (e.g. Karpenter). Often, these node scalers need additional configuration to enforce pod spreading. With this addition, we basically tell the node scaler not to schedule pods blindly, but to respect the required constraints.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
