Copilot commented on code in PR #151:
URL: https://github.com/apache/gravitino-playground/pull/151#discussion_r2635106238


##########
playground.sh:
##########
@@ -223,11 +229,35 @@ stop() {
 
 case "$1" in
 start)
-  if [[ "$2" == "--enable-ranger" ]]; then
-    enableRanger=true
-  else
-    enableRanger=false
+  enableRanger=false
+  enableAuth=false
+
+  # Parse options
+  shift
+  while [[ $# -gt 0 ]]; do
+    case "$1" in
+      --enable-ranger)
+        enableRanger=true
+        shift
+        ;;
+      --enable-auth)
+        enableAuth=true
+        shift
+        ;;
+      *)
+        echo "Unknown option: $1"
+        echo "Usage: playground.sh start [--enable-ranger | --enable-auth]"

Review Comment:
   The usage message uses a pipe separator '|' which typically indicates 
either/or options, but this is inconsistent with the error message on line 257 
that says the options "cannot be used together". The pipe notation suggests 
they're alternatives rather than independent options that happen to be mutually 
exclusive. Consider using 'and' instead of '|' or clarifying the syntax as 
'[--enable-ranger] [--enable-auth]' to indicate they're separate optional flags.
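
As a sketch of that suggestion (hypothetical wording; the exact message and helper name are the maintainers' call), the error branch could show both flags as independent options and state the mutual-exclusion rule explicitly:

```shell
usage_error() {
  # Hypothetical helper: report the unknown option, then show the two flags
  # as separate optional switches, with the exclusion rule spelled out.
  echo "Unknown option: $1" >&2
  echo "Usage: playground.sh start [--enable-ranger] [--enable-auth]" >&2
  echo "Note: --enable-ranger and --enable-auth cannot be used together." >&2
}

usage_error "--bogus-flag"
```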



##########
README.md:
##########
@@ -334,6 +334,433 @@ You can run the command
 The demo is located in the `jupyter` folder, you can open the `gravitino-access-control-example.ipynb`
 demo via Jupyter Notebook by [http://localhost:18888](http://localhost:18888).
 
+### Using Gravitino Iceberg REST Server with Access Control
+
+Gravitino 1.1 introduced built-in access control for the Iceberg REST server, enabling fine-grained
+authorization for Iceberg tables without requiring external authorization services like Ranger.
+This feature allows you to manage user permissions through Gravitino's unified API with native
+access control enforcement at the REST API level.
+
+**Note on Authentication**: The Iceberg REST catalog uses HTTP Basic Authentication to pass the username through the `Authorization` header. The password is not verified by Gravitino currently, but the username is used to identify the user for access control enforcement.
+
+#### Demo Steps
+
+**Step 1: Start the Playground with Auth Enabled**
+
+```shell
+./playground.sh start --enable-auth
+```
+
+**Note**: The `--enable-auth` flag enables Gravitino's access control by removing the PassThroughAuthorizer, which allows proper privilege enforcement for the Iceberg REST catalog.
+
+**Step 2: Create Users**
+
+Create users through Gravitino's REST API:
+
+```shell
+# Add manager user
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+  -H "Content-Type: application/json" \
+  -d '{"name":"manager"}' \
+  http://localhost:8090/api/metalakes/metalake_demo/users
+
+# Add data_analyst user
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+  -H "Content-Type: application/json" \
+  -d '{"name":"data_analyst"}' \
+  http://localhost:8090/api/metalakes/metalake_demo/users
+
+# Set manager as owner of the metalake
+curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
+  -H "Content-Type: application/json" \
+  -d '{"name":"manager"}' \
+  http://localhost:8090/api/metalakes/metalake_demo/owners
+```
+
+**Step 3: Create Database and Table with Manager**
+
+Login to Spark container:
+
+```shell
+docker exec -it playground-spark bash
+```
+
+Start spark-sql as manager:
+
+```shell
+cd /opt/spark && /bin/bash bin/spark-sql --conf spark.sql.catalog.catalog_rest.rest.auth.type=manager --conf spark.sql.catalog.catalog_rest.rest.auth.basic.username=data_analyst --conf spark.sql.catalog.catalog_rest.rest.auth.basic.password=123
+```
+
+Create database and table:
+
+```sql
+USE catalog_rest;
+CREATE DATABASE IF NOT EXISTS demo_db;
+USE demo_db;
+
+CREATE TABLE IF NOT EXISTS employees (
+    employee_id INT,
+    name STRING,
+    department STRING,
+    salary DECIMAL(10,2)
+) USING iceberg;
+
+INSERT INTO employees VALUES
+  (1, 'Alice Johnson', 'Engineering', 95000.00),
+  (2, 'Bob Smith', 'Sales', 75000.00);
+
+SELECT * FROM employees;
+```
+
+**Step 4: Test Access Control Before Granting Privileges**
+
+
+Exit spark-sql and start a new session as data_analyst (without any privileges yet):
+
+```shell
+export HADOOP_USER_NAME=data_analyst
+cd /opt/spark
+/bin/bash bin/spark-sql --conf spark.sql.catalog.catalog_rest.rest.auth.type=basic --conf spark.sql.catalog.catalog_rest.rest.auth.basic.username=data_analyst --conf spark.sql.catalog.catalog_rest.rest.auth.basic.password=123
+```
+
+Try to query the table (this should FAIL):
+
+```sql
+USE catalog__rest.demo_db;

Review Comment:
   The SQL command references 'catalog__rest.demo_db' with double underscores, but it should be 'catalog_rest.demo_db' with a single underscore to match the catalog name used in other parts of the documentation.
   ```suggestion
   USE catalog_rest.demo_db;
   ```



##########
README.md:
##########
@@ -334,6 +334,433 @@ You can run the command
 The demo is located in the `jupyter` folder, you can open the `gravitino-access-control-example.ipynb`
 demo via Jupyter Notebook by [http://localhost:18888](http://localhost:18888).
 
+### Using Gravitino Iceberg REST Server with Access Control
+
+Gravitino 1.1 introduced built-in access control for the Iceberg REST server, enabling fine-grained
+authorization for Iceberg tables without requiring external authorization services like Ranger.
+This feature allows you to manage user permissions through Gravitino's unified API with native
+access control enforcement at the REST API level.
+
+**Note on Authentication**: The Iceberg REST catalog uses HTTP Basic Authentication to pass the username through the `Authorization` header. The password is not verified by Gravitino currently, but the username is used to identify the user for access control enforcement.
+

Review Comment:
   This section documents that the Iceberg REST catalog uses HTTP Basic 
Authentication but explicitly does **not** verify the password, and instead 
trusts only the username from the `Authorization` header for access control 
decisions. This effectively removes real authentication: any client that can 
reach the REST endpoint can impersonate any user simply by choosing their 
username in the header, which can lead to privilege escalation in any 
non-trivial deployment. This should be clearly limited to demo-only usage and a 
real authentication mechanism (or integration with an external auth layer that 
Gravitino actually validates) should be required before relying on these access 
controls.
   ```suggestion
   **Security note (authentication)**: The Iceberg REST catalog examples shown here use HTTP Basic Authentication only as a transport to pass the username through the `Authorization` header. Gravitino currently **does not verify the Basic Auth password** and instead fully trusts the username provided in the header for access control decisions. As a result, this mechanism **does not provide real authentication**: any client that can reach the REST endpoint could impersonate any user by choosing their username in the header.
   
   This behavior is intended **for local/demo use only** (such as when running the playground) and **must not be relied upon in production** or any environment exposed to untrusted clients. For secure deployments, you must front the Iceberg REST server with a real authentication mechanism (for example, an authenticating reverse proxy, API gateway, or other identity provider) and configure Gravitino to validate the authenticated identity, rather than trusting arbitrary usernames from the `Authorization` header.
   ```
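
To make the impersonation risk concrete, here is a hedged sketch (the endpoint URL is illustrative, not confirmed from the playground config) of how any client could mint a valid-looking header for an arbitrary user, since only the username is trusted:

```shell
# Sketch: because the password is ignored, a Basic Auth token built from any
# username grants that user's privileges. Only the username matters.
USERNAME=manager          # attacker-chosen identity
PASSWORD=anything         # not verified by Gravitino today
TOKEN=$(printf '%s:%s' "$USERNAME" "$PASSWORD" | base64)
echo "Authorization: Basic $TOKEN"

# A client could then send this header to the Iceberg REST endpoint,
# e.g. (illustrative URL, adjust to your deployment):
#   curl -H "Authorization: Basic $TOKEN" http://localhost:9001/iceberg/v1/config
```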



##########
README.md:
##########
@@ -334,6 +334,433 @@ You can run the command
 The demo is located in the `jupyter` folder, you can open the `gravitino-access-control-example.ipynb`
 demo via Jupyter Notebook by [http://localhost:18888](http://localhost:18888).
 
+### Using Gravitino Iceberg REST Server with Access Control
+
+Gravitino 1.1 introduced built-in access control for the Iceberg REST server, enabling fine-grained
+authorization for Iceberg tables without requiring external authorization services like Ranger.
+This feature allows you to manage user permissions through Gravitino's unified API with native
+access control enforcement at the REST API level.
+
+**Note on Authentication**: The Iceberg REST catalog uses HTTP Basic Authentication to pass the username through the `Authorization` header. The password is not verified by Gravitino currently, but the username is used to identify the user for access control enforcement.
+
+#### Demo Steps
+
+**Step 1: Start the Playground with Auth Enabled**
+
+```shell
+./playground.sh start --enable-auth
+```
+
+**Note**: The `--enable-auth` flag enables Gravitino's access control by removing the PassThroughAuthorizer, which allows proper privilege enforcement for the Iceberg REST catalog.
+
+**Step 2: Create Users**
+
+Create users through Gravitino's REST API:
+
+```shell
+# Add manager user
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+  -H "Content-Type: application/json" \
+  -d '{"name":"manager"}' \
+  http://localhost:8090/api/metalakes/metalake_demo/users
+
+# Add data_analyst user
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+  -H "Content-Type: application/json" \
+  -d '{"name":"data_analyst"}' \
+  http://localhost:8090/api/metalakes/metalake_demo/users
+
+# Set manager as owner of the metalake
+curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
+  -H "Content-Type: application/json" \
+  -d '{"name":"manager"}' \
+  http://localhost:8090/api/metalakes/metalake_demo/owners
+```
+
+**Step 3: Create Database and Table with Manager**
+
+Login to Spark container:
+
+```shell
+docker exec -it playground-spark bash
+```
+
+Start spark-sql as manager:
+
+```shell
+cd /opt/spark && /bin/bash bin/spark-sql --conf spark.sql.catalog.catalog_rest.rest.auth.type=manager --conf spark.sql.catalog.catalog_rest.rest.auth.basic.username=data_analyst --conf spark.sql.catalog.catalog_rest.rest.auth.basic.password=123

Review Comment:
   The spark-sql command uses incorrect configuration parameter names. The 
parameter 'spark.sql.catalog.catalog_rest.rest.auth.type' should be 'basic' not 
'manager', and there's a typo in line 391 where 'rest.auth.basic.username' is 
set to 'data_analyst' but the comment says we're logging in as 'manager'.
   ```suggestion
   cd /opt/spark && /bin/bash bin/spark-sql --conf spark.sql.catalog.catalog_rest.rest.auth.type=basic --conf spark.sql.catalog.catalog_rest.rest.auth.basic.username=manager --conf spark.sql.catalog.catalog_rest.rest.auth.basic.password=123
   ```



##########
init/gravitino/init.sh:
##########
@@ -27,6 +27,17 @@ cp /root/gravitino/catalogs/jdbc-mysql/libs/mysql-connector-java-8.0.27.jar /roo
 
 cp /tmp/gravitino/gravitino.conf /root/gravitino/conf
 
+# If auth is enabled, remove the PassThroughAuthorizer configuration
+if [ "${GRAVITINO_AUTH_ENABLE}" == "true" ]; then
+  echo "Auth is enabled, removing PassThroughAuthorizer configuration..."
+  sed -i '/gravitino.authorization.impl = org.apache.gravitino.server.authorization.PassThroughAuthorizer/d' /root/gravitino/conf/gravitino.conf
+fi
+
+apt-get install jq -y
+apt-get install host -y
+# Resolve hive hostname to IP address and add to /etc/hosts
+IP=$(host hive | awk '/has address/ {print $4; exit}')

Review Comment:
   The apt-get install commands should include update and use the -qq flag to suppress unnecessary output, or at minimum should check if the packages are already installed. Additionally, there's no error handling if the installation fails or if the 'host' command fails to resolve the hive hostname.
   ```suggestion
   # Ensure jq is installed; install quietly if missing
   if ! command -v jq >/dev/null 2>&1; then
     apt-get update -qq && apt-get install -y -qq jq || { echo "Failed to install jq" >&2; exit 1; }
   fi
   
   # Ensure host command is installed; install quietly if missing
   if ! command -v host >/dev/null 2>&1; then
     apt-get update -qq && apt-get install -y -qq host || { echo "Failed to install host package" >&2; exit 1; }
   fi
   
   # Resolve hive hostname to IP address and add to /etc/hosts
   IP=$(host hive 2>/dev/null | awk '/has address/ {print $4; exit}')
   if [ -z "$IP" ]; then
     echo "Failed to resolve hostname 'hive'" >&2
     exit 1
   fi
   ```



##########
README.md:
##########
@@ -334,6 +334,433 @@ You can run the command
 The demo is located in the `jupyter` folder, you can open the `gravitino-access-control-example.ipynb`
 demo via Jupyter Notebook by [http://localhost:18888](http://localhost:18888).
 
+### Using Gravitino Iceberg REST Server with Access Control
+
+Gravitino 1.1 introduced built-in access control for the Iceberg REST server, enabling fine-grained
+authorization for Iceberg tables without requiring external authorization services like Ranger.
+This feature allows you to manage user permissions through Gravitino's unified API with native
+access control enforcement at the REST API level.
+
+**Note on Authentication**: The Iceberg REST catalog uses HTTP Basic Authentication to pass the username through the `Authorization` header. The password is not verified by Gravitino currently, but the username is used to identify the user for access control enforcement.
+
+#### Demo Steps
+
+**Step 1: Start the Playground with Auth Enabled**
+
+```shell
+./playground.sh start --enable-auth
+```
+
+**Note**: The `--enable-auth` flag enables Gravitino's access control by removing the PassThroughAuthorizer, which allows proper privilege enforcement for the Iceberg REST catalog.
+
+**Step 2: Create Users**
+
+Create users through Gravitino's REST API:
+
+```shell
+# Add manager user
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+  -H "Content-Type: application/json" \
+  -d '{"name":"manager"}' \
+  http://localhost:8090/api/metalakes/metalake_demo/users
+
+# Add data_analyst user
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+  -H "Content-Type: application/json" \
+  -d '{"name":"data_analyst"}' \
+  http://localhost:8090/api/metalakes/metalake_demo/users
+
+# Set manager as owner of the metalake
+curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
+  -H "Content-Type: application/json" \
+  -d '{"name":"manager"}' \
+  http://localhost:8090/api/metalakes/metalake_demo/owners
+```
+
+**Step 3: Create Database and Table with Manager**
+
+Login to Spark container:
+
+```shell
+docker exec -it playground-spark bash
+```
+
+Start spark-sql as manager:
+
+```shell
+cd /opt/spark && /bin/bash bin/spark-sql --conf spark.sql.catalog.catalog_rest.rest.auth.type=manager --conf spark.sql.catalog.catalog_rest.rest.auth.basic.username=data_analyst --conf spark.sql.catalog.catalog_rest.rest.auth.basic.password=123
+```
+
+Create database and table:
+
+```sql
+USE catalog_rest;
+CREATE DATABASE IF NOT EXISTS demo_db;
+USE demo_db;
+
+CREATE TABLE IF NOT EXISTS employees (
+    employee_id INT,
+    name STRING,
+    department STRING,
+    salary DECIMAL(10,2)
+) USING iceberg;
+
+INSERT INTO employees VALUES
+  (1, 'Alice Johnson', 'Engineering', 95000.00),
+  (2, 'Bob Smith', 'Sales', 75000.00);
+
+SELECT * FROM employees;
+```
+
+**Step 4: Test Access Control Before Granting Privileges**
+
+
+Exit spark-sql and start a new session as data_analyst (without any privileges yet):
+
+```shell
+export HADOOP_USER_NAME=data_analyst
+cd /opt/spark
+/bin/bash bin/spark-sql --conf spark.sql.catalog.catalog_rest.rest.auth.type=basic --conf spark.sql.catalog.catalog_rest.rest.auth.basic.username=data_analyst --conf spark.sql.catalog.catalog_rest.rest.auth.basic.password=123
+```
+
+Try to query the table (this should FAIL):
+
+```sql
+USE catalog__rest.demo_db;
+
+-- This should FAIL - schema doesn't exist, because we don't have USE_SCHEMA privilege
+```
+
+**Step 5: Create Role with Privileges and Assign to User**
+
+Exit spark-sql and create a role with the necessary privileges:
+
+```shell
+# Create role with all required privileges
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "analyst_role",
+    "securableObjects": [
+      {
+        "fullName": "catalog_iceberg",
+        "type": "CATALOG",
+        "privileges": [
+          {"name": "USE_CATALOG", "condition": "ALLOW"}
+        ]
+      },
+      {
+        "fullName": "catalog_iceberg.demo_db",
+        "type": "SCHEMA",
+        "privileges": [
+          {"name": "USE_SCHEMA", "condition": "ALLOW"}
+        ]
+      },
+      {
+        "fullName": "catalog_iceberg.demo_db.employees",
+        "type": "TABLE",
+        "privileges": [
+          {"name": "SELECT_TABLE", "condition": "ALLOW"}
+        ]
+      }
+    ]
+  }' \
+  http://localhost:8090/api/metalakes/metalake_demo/roles
+
+# Assign role to user
+curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
+  -H "Content-Type: application/json" -d '{
+    "roleNames": ["analyst_role"]
+}' http://localhost:8090/api/metalakes/metalake_demo/permissions/users/data_analyst/grant
+```
+
+Start spark-sql as data_analyst again and test:
+
+```shell
+cd /opt/spark && /bin/bash bin/spark-sql \
+  --conf spark.sql.catalog.catalog_rest.rest.auth.type=basic \
+  --conf spark.sql.catalog.catalog_rest.rest.auth.basic.username=data_analyst \
+  --conf spark.sql.catalog.catalog_iceberg_rest.rest.auth.basic.password=123

Review Comment:
   There's a typo in the configuration parameter name. It should be 'spark.sql.catalog.catalog_rest.rest.auth.basic.password' but it's written as 'spark.sql.catalog.catalog_iceberg_rest.rest.auth.basic.password' (using 'catalog_iceberg_rest' instead of 'catalog_rest').
   ```suggestion
     --conf spark.sql.catalog.catalog_rest.rest.auth.basic.password=123
   ```



##########
playground.sh:
##########
@@ -237,7 +267,7 @@ stop)
   stop
   ;;
 *)
-  echo "Usage: $0 <start|status|stop> [--enable-ranger]"
+  echo "Usage: $0 <start|status|stop> [--enable-ranger | --enable-auth]"

Review Comment:
   The usage message format is inconsistent between line 249 and line 270. Line 
249 uses 'playground.sh start [--enable-ranger | --enable-auth]' while line 270 
uses '$0 <start|status|stop> [--enable-ranger | --enable-auth]'. Consider 
making them consistent.
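
As a sketch of one way to keep the two messages consistent (hypothetical refactor; the variable and function names are illustrative, not from the PR), both usage strings could be driven from a single option hint:

```shell
# Hypothetical refactor: define the option list once so the two usage
# messages in playground.sh cannot drift apart.
OPTIONS_HINT="[--enable-ranger] [--enable-auth]"

usage_start() {
  # Shown when 'start' receives an unknown option
  echo "Usage: playground.sh start ${OPTIONS_HINT}"
}

usage_top() {
  # Shown when the top-level command is unrecognized
  echo "Usage: $0 <start|status|stop> ${OPTIONS_HINT}"
}

usage_start
usage_top
```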



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
