richard-scott opened a new issue, #216:
URL: https://github.com/apache/bifromq/issues/216

   ### **Describe the bug**
   
   Custom Auth Provider plugin configured via FQN (Fully Qualified Name) is not 
being loaded or used by BifroMQ. Despite correct configuration in 
`standalone.yml` and successful plugin JAR deployment, BifroMQ appears to be 
using the Demo Auth Provider plugin instead. Our custom plugin code is never 
executed, as evidenced by deliberate failures and logging attempts that produce 
no output.
   
   **Expected behavior**: When `authProviderFQN` is configured with a custom 
plugin class name, BifroMQ should instantiate and use that plugin for 
authentication operations.
   
   **Actual behavior**: BifroMQ uses the Demo Auth Provider plugin 
(`DemoAuthProvider`) loaded via PF4J discovery, ignoring the FQN configuration.
   
   #### **Environment**
   
   - **Version**: `4.0.0-incubating-RC3`
   - **JVM Version**: OpenJDK 17 (Eclipse Temurin)
   - **Hardware Spec**: Docker containers (tested on various host systems)
   - **OS**: Linux (containerized deployment)
   - **Testing Tools**: `mosquitto_pub`, `mosquitto_sub`, custom Java plugin 
implementation
   - **Deployment**: Docker Compose cluster (3 BifroMQ nodes + HAProxy load 
balancer)
   
   #### **Reproducible Steps**
   
   **1. Setup Cluster Configuration**
   
   Create a BifroMQ configuration file (`standalone.yml` or cluster config) 
with FQN configuration:
   
   ```yaml
   authProviderFQN: "com.bifromq.plugin.authprovider.CustomAuthProvider"
   ```
   
   **2. Build Custom Auth Provider Plugin**
   
   Create a custom Auth Provider plugin implementing `IAuthProvider`.
   
   **How the plugin should work:**
   
   The plugin implements two main methods:
   
   1. **`auth(MQTT3AuthData authData)`** - Called when a client attempts to 
connect:
      - Receives username (format: `"tenant/user"`), password, and client ID
      - Parses username to extract tenant ID and user ID
      - Loads `/home/bifromq/conf/users.json` file
      - Verifies password matches the stored password for that tenant/user
      - Returns `MQTT3AuthResult` with tenant and user if authentication 
succeeds
      - Returns `null` or failed future if authentication fails
   
   2. **`check(ClientInfo clientInfo, MQTTAction action)`** - Called for 
authorization checks:
      - Receives client info (tenant, user) and action (topic, operation type)
      - Checks ACL rules to determine if operation is allowed
      - Returns `true` if allowed, `false` if denied
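
The credential handling described for `auth()` can be sketched in plain Java, independent of the BifroMQ API. This is a minimal illustration only: the class name `TenantUserCredentials`, its `parse` and `verify` methods, and the in-memory `Map` standing in for `users.json` are all hypothetical, since the file's schema is not part of this report.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; not part of BifroMQ or of the plugin in this report.
final class TenantUserCredentials {

    /** Tenant ID and user ID parsed from an MQTT username. */
    record TenantUser(String tenantId, String userId) {}

    /** Splits "tenant/user" at the first '/'; empty if either part is missing. */
    static Optional<TenantUser> parse(String username) {
        if (username == null) {
            return Optional.empty();
        }
        int slash = username.indexOf('/');
        if (slash <= 0 || slash == username.length() - 1) {
            return Optional.empty();
        }
        return Optional.of(new TenantUser(
            username.substring(0, slash), username.substring(slash + 1)));
    }

    /**
     * Checks the password against a map keyed by "tenant/user".
     * The map is an assumed stand-in for the contents of users.json.
     */
    static boolean verify(Map<String, String> users, String username, String password) {
        if (password == null) {
            return false;
        }
        byte[] supplied = password.getBytes(StandardCharsets.UTF_8);
        return parse(username)
            .map(tu -> users.get(tu.tenantId() + "/" + tu.userId()))
            // Constant-time comparison avoids leaking match length via timing.
            .filter(stored -> MessageDigest.isEqual(
                stored.getBytes(StandardCharsets.UTF_8), supplied))
            .isPresent();
    }
}
```

With a map containing `"tenant1/user1" -> "tenant1_user1_pass123"`, `verify` accepts the credentials used in the test command further down and rejects any other password or a username without a `/`.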
   
   For testing purposes, we add a deliberate failure to prove the plugin is 
being called:
   
   ```java
   package com.bifromq.plugin.authprovider;
   
   import org.apache.bifromq.plugin.authprovider.IAuthProvider;
   import org.apache.bifromq.plugin.authprovider.type.MQTT3AuthData;
   import org.apache.bifromq.plugin.authprovider.type.MQTT3AuthResult;
   import org.apache.bifromq.type.ClientInfo;
   import org.apache.bifromq.plugin.authprovider.type.MQTTAction;
   import org.pf4j.Extension;
   import org.pf4j.Plugin;
   
   import java.util.concurrent.CompletableFuture;
   
   @Extension
   public class CustomAuthProvider extends Plugin implements IAuthProvider {
   
       public CustomAuthProvider() {
           super(null);
       }
   
       @Override
       public CompletableFuture<MQTT3AuthResult> auth(MQTT3AuthData authData) {
           // DELIBERATE FAILURE FOR TESTING
           // This will cause all authentication to fail if our plugin is used.
           // In production, this would be replaced with actual authentication
           // logic that:
           // 1. Parses username (format: "tenant/user") to extract tenant ID
           //    and user ID
           // 2. Loads /home/bifromq/conf/users.json file
           // 3. Verifies password matches the stored password for that
           //    tenant/user combination
           // 4. Returns MQTT3AuthResult with tenant and user if authentication
           //    succeeds
           // 5. Returns null or failed future if authentication fails
           return CompletableFuture.failedFuture(
               new RuntimeException(
                   "DELIBERATE FAILURE: CustomAuthProvider.auth() is being called!")
           );
       }
   
       @Override
       public CompletableFuture<Boolean> check(ClientInfo clientInfo,
                                               MQTTAction action) {
           // ACL check - returns true to allow all operations for testing.
           // In production, this would check ACL rules and return true/false
           // based on permissions.
           return CompletableFuture.completedFuture(true);
       }
   }
   ```
   
   **Plugin POM Configuration:**
   
   ```xml
   <dependency>
       <groupId>org.apache.bifromq</groupId>
       <artifactId>bifromq-plugin-auth-provider</artifactId>
       <version>4.0.0-incubating</version>
   </dependency>
   ```
   
   **3. Deploy Plugin**
   
   - Build plugin JAR: `mvn clean package`
   - Copy JAR to `/home/bifromq/plugins/auth-provider-1.0.0.jar` in container
   - Verify JAR exists: `docker exec node1 ls -la /home/bifromq/plugins/auth-provider-1.0.0.jar`
   - Verify configuration: `docker exec node1 cat /home/bifromq/conf/standalone.yml | grep authProviderFQN`
   
   Should show:
   ```yaml
   authProviderFQN: "com.bifromq.plugin.authprovider.CustomAuthProvider"
   ```
   
   **4. Start BifroMQ Cluster**
   
   Start the BifroMQ cluster with the custom plugin deployed.
   
   **5. Test Authentication**
   
   **PUB Client Parameters:**
   - MQTT Connection:
     - Host: `localhost`
     - Port: `8883` (TLS)
     - ClientIdentifier: `test-client-1`
     - Username: `tenant1/user1`
     - Password: `tenant1_user1_pass123`
     - TLS: Enabled (use `--cafile` with CA certificate)
   - MQTT Pub:
     - Topic: `test/topic`
     - QoS: `1`
     - Retain: `false`
     - Payload: `hello world`
   
   **Test Command:**
   ```bash
   mosquitto_pub -h localhost -p 8883 --cafile haproxy.crt \
     -u "tenant1/user1" -P "tenant1_user1_pass123" \
     -i "test-client-1" \
     -t "test/topic" -m "hello world" -q 1
   ```
   
   **Expected Result**: Authentication should fail with error message 
containing "DELIBERATE FAILURE" if our custom plugin is being used.
   
   **Actual Result**: Authentication succeeds, proving our custom plugin is NOT 
being used.
   
   **6. Verify Plugin Not Being Called**
   
   Check for plugin method invocation logs:
   ```bash
   docker exec node1 cat /home/bifromq/logs/plugin-load.log
   ```
   
   **Expected**: Log entries showing `auth()` method being called.
   
   **Actual**: File doesn't exist (method never called).
   
   **7. Check Startup Logs**
   
   ```bash
   docker logs node1 | grep -E "plugin|AuthProvider"
   ```
   
   Observed output:
   ```
   2026-01-02 20:19:50.093  INFO 1 [main] --- ache.bifromq.starter.module.PluginModule [PluginModule.java:63] Loaded plugin: [email protected]
   2026-01-02 20:19:50.093  INFO 1 [main] --- che.bifromq.demo.plugin.DemoAuthProvider [DemoAuthProvider.java:43] No webhook url specified, the fallback behavior will reject all auth/check requests.
   ```
   
   **Note**: The Demo Auth Provider is loaded and logged, despite FQN 
configuration pointing to our custom plugin.
   
   #### **Additional Evidence**
   
   **Test 1: Logging in Plugin Methods**
   
   Added file logging to `auth()` method:
   
   ```java
   @Override
   public CompletableFuture<MQTT3AuthResult> auth(MQTT3AuthData authData) {
       try {
           java.io.FileWriter fw =
               new java.io.FileWriter("/home/bifromq/logs/plugin-load.log", true);
           java.time.LocalDateTime now = java.time.LocalDateTime.now();
           fw.write(String.format(
               "[%s] [CustomAuthProvider] auth() called for user: %s\n",
               now, authData != null ? authData.getUsername() : "null"));
           fw.close();
       } catch (Exception e) {
           // Fallback logging to /tmp/plugin-load.log
       }
       // ... rest of implementation
   }
   ```
   
   **Result**: Log file was never created, confirming the method is not being 
called.
   
   **Test 2: Constructor Logging**
   
   Added logging to plugin constructor:
   
   ```java
   public CustomAuthProvider() {
       super(null);
       System.err.println("[CustomAuthProvider] Plugin loaded");
       System.out.println("[CustomAuthProvider] Plugin loaded");
   }
   ```
   
   **Result**: No constructor output appeared in the container logs, suggesting 
the plugin is never instantiated.
   
   **Test 3: JAR Verification**
   
   Verified the deliberate failure code is present in the deployed JAR:
   
   ```bash
   docker exec node1 python3 -c "
   import zipfile
   z = zipfile.ZipFile('/home/bifromq/plugins/auth-provider-1.0.0.jar')
   content = z.read('com/bifromq/plugin/authprovider/CustomAuthProvider.class')
   print('Contains DELIBERATE:', b'DELIBERATE' in content)
   z.close()
   "
   ```
   
   **Result**: `Contains DELIBERATE: True` - confirming the failure code is 
present in the deployed JAR.
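
A Java equivalent of this check, extended to report the metadata PF4J normally looks for, could look like the sketch below. PF4J's annotation processor usually writes an extension index to `META-INF/extensions.idx`, and plugin descriptors are usually read from `plugin.properties` or from the `Plugin-Id`, `Plugin-Version`, and `Plugin-Class` manifest attributes. Whether BifroMQ's loader depends on these entries when `authProviderFQN` is set is an assumption, not something confirmed in this report. The class name `PluginJarInspector` is hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical diagnostic helper; not part of BifroMQ or PF4J.
final class PluginJarInspector {

    /** Returns one finding per line about the class and PF4J-related entries. */
    static List<String> inspect(String jarPath, String classFqn) throws IOException {
        List<String> findings = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            String classEntry = classFqn.replace('.', '/') + ".class";
            findings.add("class present: " + (jar.getEntry(classEntry) != null));
            // Extension index generated by PF4J's annotation processor.
            findings.add("META-INF/extensions.idx present: "
                + (jar.getEntry("META-INF/extensions.idx") != null));
            // Properties-based plugin descriptor.
            findings.add("plugin.properties present: "
                + (jar.getEntry("plugin.properties") != null));
            // Manifest-based plugin descriptor attributes.
            Manifest manifest = jar.getManifest();
            Attributes attrs = manifest == null ? null : manifest.getMainAttributes();
            for (String key : List.of("Plugin-Id", "Plugin-Version", "Plugin-Class")) {
                findings.add(key + ": " + (attrs == null ? null : attrs.getValue(key)));
            }
        }
        return findings;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PluginJarInspector <jar-path> <class-fqn>
        inspect(args[0], args[1]).forEach(System.out::println);
    }
}
```

Running it against `/home/bifromq/plugins/auth-provider-1.0.0.jar` with `com.bifromq.plugin.authprovider.CustomAuthProvider` would show whether the JAR carries any PF4J descriptor at all, which may help narrow down why the plugin is never instantiated.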
   
   Although our custom plugin is not being used, authentication succeeds regardless of the credentials supplied:
   - ✅ Authentication succeeds with valid credentials
   - ✅ Authentication succeeds with invalid credentials
   
   This suggests BifroMQ is using the Demo Auth Provider 
(`demo-plugin-4.0.0-incubating.jar`) which is loaded via PF4J discovery, 
despite the FQN configuration.
   
   #### **Questions**
   
   1. **Why is the Demo Auth Provider being used instead of the FQN-configured 
plugin?**
      - Is there a priority/precedence issue?
      - Does PF4J discovery override FQN configuration?
   
   2. **How should FQN-configured Auth Provider plugins be loaded?**
      - Should they be instantiated via constructors?
      - Is there a different loading mechanism?
   
   3. **Why does the Demo Auth Provider work despite logging "reject all 
requests"?**
      - What fallback behavior does it implement?
      - Is this expected behavior?
   
   4. **Is there additional configuration required for FQN plugins?**
      - Do we need to disable PF4J discovery?
      - Are there other configuration options?
   
   #### **Additional Context**
   
   - Plugin JAR is successfully built and deployed to `/home/bifromq/plugins/`
   - Configuration is correctly rendered in `standalone.yml` with 
`authProviderFQN` set
   - Cluster starts successfully without errors
   - Authentication tests pass (but using Demo plugin, not our custom one)
   - Build/redeploy cycle works correctly
   - Custom plugin JAR is present and contains expected code (verified)
   
   **Request**: We need clarification on how FQN-configured Auth Provider 
plugins should be loaded and why the Demo plugin is being used instead of our 
custom implementation.
   

