[ 
https://issues.apache.org/jira/browse/UNOMI-897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Serge Huber updated UNOMI-897:
------------------------------
    Description: 
This ticket documents *two separate but related issues* in the 
{code}GroovyActionDispatcher{code} that affect Unomi deployments using Groovy 
actions:

# *ISSUE #1: Data Contamination Race Condition* - Security vulnerability where 
user data can leak between different users
# *ISSUE #2: Performance and Memory Problems* - Scalability issues that prevent 
handling concurrent users effectively

*Business Impact*: These issues can lead to data integrity problems, customer 
trust issues, system reliability problems, and operational costs due to poor 
performance.

*Note*: These two issues are reported together because they both stem from the 
same architectural problems in the {code}GroovyActionDispatcher{code} and will 
require coordinated fixes in the same code area. Addressing one issue without 
considering the other could lead to incomplete solutions or introduce new 
problems.

----

h2. ISSUE #1: Data Contamination Race Condition

h3. 🚨 Security Vulnerability - User Data Leakage

*What's happening*: The system has a bug that can cause user data from one 
customer to appear in another customer's profile or event data. This is a 
*serious privacy and security problem*.

*Why it matters*: 
* Customer A's personal information could be visible to Customer B
* This can lead to loss of customer trust and data integrity issues
* Audit logs become unreliable
* System behavior becomes unpredictable

h4. Root Cause (Technical)
The implementation uses a *shared GroovyShell instance* across all threads in 
{code}GroovyActionDispatcher{code}, which means:
* When multiple users are processed simultaneously, their data can get mixed up
* Thread A sets variables for User A's action/event
* Thread B overwrites the same shared shell with User B's action/event data
* Both threads now see User B's data, causing contamination
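
The overwrite sequence above can be sketched in plain Java. This is an illustration under stated assumptions, not the Unomi code: a {{Map}} stands in for the shell's variable binding, the class and method names are hypothetical, and latches force the interleaving that in production would depend on timing. It contrasts one binding shared by both threads with one binding per call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Illustrative only: a plain Map stands in for the shell's variable binding.
class BindingRaceSketch {

    // Thread A writes its event, Thread B then writes its own, and A reads back.
    // The latches force this interleaving so the outcome is repeatable.
    static String eventSeenByA(Map<String, Object> bindingA, Map<String, Object> bindingB) {
        CountDownLatch aWrote = new CountDownLatch(1);
        CountDownLatch bWrote = new CountDownLatch(1);
        String[] seen = new String[1];

        Thread a = new Thread(() -> {
            bindingA.put("event", "event-a");
            aWrote.countDown();
            await(bWrote); // B runs while A is still "processing"
            seen[0] = (String) bindingA.get("event");
        });
        Thread b = new Thread(() -> {
            await(aWrote);
            bindingB.put("event", "event-b");
            bWrote.countDown();
        });
        a.start();
        b.start();
        join(a);
        join(b);
        return seen[0];
    }

    // Both threads use the same binding, as with a singleton shell.
    static String withSharedBinding() {
        Map<String, Object> shared = new ConcurrentHashMap<>();
        return eventSeenByA(shared, shared);
    }

    // Each thread gets its own binding, so B's write cannot reach A.
    static String withPerCallBindings() {
        return eventSeenByA(new HashMap<>(), new HashMap<>());
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared binding, A sees:    " + withSharedBinding());
        System.out.println("per-call bindings, A sees: " + withPerCallBindings());
    }
}
```

With the shared map, Thread A reads back {{event-b}}; with per-call maps it reads back its own {{event-a}}. Note that a thread-safe map does not help here: the problem is that two callers use the same variable slot, not that the map itself is corrupted.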

h4. Race Condition Scenario (Simplified)
# *User A* logs in and their profile is being processed
# *User B* logs in at the same time and their profile is also being processed
# Due to the shared system component, User A's profile gets updated with User 
B's information
# *Result*: User A now sees User B's data in their profile

h3. 🔒 Security Implications

h4. Data Integrity Violation
* *User profile data leaks* between different user profiles
* *Event data corruption* where event IDs get mixed up
* *Unauthorized data disclosure* - data from one user visible to another
* *Audit trail corruption* - incorrect data in logs
* *Data integrity compromise* - unreliable system behavior

h4. Example Data Contamination
{code}
User A's profile: { "profileId": "user-a", "actionId": "action-a", "eventId": "event-a" }
User B's profile: { "profileId": "user-b", "actionId": "action-b", "eventId": "event-b" }

After race condition:
User A's profile: { "profileId": "user-b", "actionId": "action-b", "eventId": "event-b" } // CONTAMINATED!
User B's profile: { "profileId": "user-b", "actionId": "action-b", "eventId": "event-b" }
{code}

h3. 📋 How to Reproduce the Data Contamination Issue

*⚠️ Important Note*: Race conditions are inherently difficult to reproduce 
synthetically because they depend on precise timing between threads. The 
following steps may not consistently reproduce the issue in controlled test 
environments, but the underlying problem exists and can manifest in production 
under the right conditions.

h4. Setup Environment
{code}
# Deploy Apache Unomi with Groovy actions extension enabled
# Ensure multi-threaded environment with concurrent user access
{code}

h4. Create Race Condition Sensitive Groovy Action
{code}
def execute(action, event) {
    // Read from shared variables multiple times to increase race condition probability
    def profile1 = event.getProfile()
    def action1 = action
    def event1 = event
    
    // Simulate some processing that takes time
    Thread.sleep(2)
    
    // Read again - this might now be different due to race condition!
    def profile2 = event.getProfile()
    def action2 = action
    def event2 = event
    
    def profileId1 = profile1.getProfileId()
    def profileId2 = profile2.getProfileId()
    def actionId1 = action1.getActionId()
    def actionId2 = action2.getActionId()
    def eventId1 = event1.getEventId()
    def eventId2 = event2.getEventId()
    
    // Return data for verification - this will show contamination if race condition occurs
    return [
        profileId: profileId1,
        profileId2: profileId2,
        actionId: actionId1,
        actionId2: actionId2,
        eventId: eventId1,
        eventId2: eventId2,
        timestamp: System.currentTimeMillis(),
        threadId: Thread.currentThread().getId()
    ]
}
{code}

h4. Generate Concurrent Load
{code}
# Create a load test that simulates 1000+ concurrent users
# Each user should trigger the Groovy action multiple times
# Use tools like Apache JMeter, Gatling, or custom load testing scripts
# Ensure high concurrency to increase race condition probability
{code}
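
For a custom load script, a minimal Java harness could look like the sketch below. It is an assumption-laden illustration: {{ConcurrentLoadSketch}} and its {{run}} method are hypothetical names, and the {{Runnable}} is a placeholder for whatever triggers the Groovy action (for example an HTTP request sending an event to Unomi). A start gate releases all workers at once to maximize overlap.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical load harness: runs a task from many threads at the same time.
class ConcurrentLoadSketch {

    // Runs `task` `iterations` times on each of `threads` workers and
    // returns how many executions completed.
    static int run(int threads, int iterations, Runnable task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch startGate = new CountDownLatch(1);
        AtomicInteger completed = new AtomicInteger();

        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    startGate.await(); // hold every worker until all are queued
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < iterations; i++) {
                    task.run();
                    completed.incrementAndGet();
                }
            });
        }

        startGate.countDown(); // release all workers together
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        // Placeholder task; replace with a call that triggers the Groovy action.
        int done = run(8, 50, () -> { });
        System.out.println("completed executions: " + done);
    }
}
```

Calling {{run(1000, 50, task)}} would match the 1,000 threads and 50 iterations per thread used in the benchmark reported in this ticket.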

h4. Observe Data Contamination
* Check for profileId, actionId, and eventId mismatches in action results
* Observe that data from one user appears in another user's result
* Verify audit logs show incorrect data associations
* Look for inconsistent data patterns that indicate cross-contamination
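
The mismatch check can be automated on the load-test side. The following is a minimal sketch, assuming the test client records the IDs it sent with each request and receives the action result as a map of strings; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical verifier: compares the IDs a caller sent with the IDs it got back.
class ContaminationCheckSketch {

    // Returns the names of all fields whose returned value differs from the
    // value the caller sent. An empty list means no contamination was seen.
    static List<String> mismatches(Map<String, String> sent, Map<String, String> returned) {
        List<String> contaminated = new ArrayList<>();
        for (Map.Entry<String, String> field : sent.entrySet()) {
            if (!field.getValue().equals(returned.get(field.getKey()))) {
                contaminated.add(field.getKey());
            }
        }
        return contaminated;
    }

    public static void main(String[] args) {
        Map<String, String> sent = Map.of("profileId", "user-a");
        Map<String, String> returned = Map.of("profileId", "user-b");
        System.out.println("contaminated fields: " + mismatches(sent, returned));
    }
}
```

Because the race is timing-dependent, a run with no mismatches does not prove the absence of the problem; it only means this run did not hit the interleaving.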

*Note*: The race condition may not be consistently reproducible in test 
environments due to timing dependencies. However, the architectural flaw exists 
and can cause data contamination in production environments with real user load 
patterns.

h3. 🔍 Technical Evidence for Data Contamination

h4. Code Location
* *File*: 
{code}src/main/java/org/apache/unomi/groovy/actions/GroovyActionDispatcher.java{code}
* *Lines*: 67-72 (race condition in execute method)
* *Method*: {code}execute(Action action, Event event, String actionName){code}
* *Shared Shell*: Retrieved via 
{code}groovyActionsService.getGroovyShell(){code}

h4. Detection Challenges
* *Timing-dependent*: Race conditions require specific thread interleaving 
patterns
* *Environment-specific*: May not manifest in development or test environments
* *Load-dependent*: More likely under high concurrent load conditions
* *Intermittent*: Can appear sporadically in production without consistent 
reproduction

h4. Service Implementation Details
* *Shared GroovyShell*: Retrieved from GroovyActionsService as singleton 
instance
* *Variable Setting*: action and event variables set in shared shell binding
* *Script Parsing*: Script parsed from shared shell instance
* *Concurrent Access*: Multiple threads access same shell simultaneously
* *Variable Overwriting*: Thread A's variables overwritten by Thread B's 
variables

h4. Race Condition Characteristics
* *Timing-dependent*: May not be consistently reproducible in all environments
* *Concurrency-sensitive*: More likely with higher thread counts
* *Data-sensitive*: Affects profileId, actionId, and eventId
* *Processing-sensitive*: More likely with longer script execution times
* *Difficult to reproduce synthetically*: Race conditions are inherently 
timing-dependent and may not manifest in controlled test environments

----

h2. ISSUE #2: Performance and Memory Problems

h3. 📊 Severe Performance Issues

*What's happening*: The system performs very poorly when multiple users are 
active at the same time, using excessive memory and processing very slowly. 
Scripts are recompiled on every execution instead of being cached.

*Why it matters*:
* Throughput in the benchmark below reached only 957 executions per second (very 
slow for production)
* Memory usage grows to over 5GB, causing system instability
* Performance gets worse as more users connect simultaneously
* High garbage collection activity slows down the entire system
* Script compilation overhead adds significant latency to every action execution

h4. Performance Problems
# *Poor Throughput*: Only 957 executions/second (unacceptable for production)
# *Thread Contention*: Shared GroovyShell causes blocking
# *Memory Leaks*: 5,155 MB peak memory usage (5,151 MB increase)
# *High GC Pressure*: 58 garbage collections, 2,294ms GC time
# *Poor Scalability*: Performance degrades with concurrency
# *No Script Caching*: Scripts are recompiled on every execution instead of 
being cached
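
To illustrate what compile-once caching means for the last point, here is a minimal sketch using only the JDK. It is not the Unomi implementation: {{compile}} is a hypothetical stand-in for an expensive script compilation, and keying the cache by action name is an assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical compile-once cache keyed by action name.
class ScriptCacheSketch {

    // Counts how often the (stand-in) compile step actually runs.
    static final AtomicInteger COMPILATIONS = new AtomicInteger();

    private static final Map<String, Function<String, String>> CACHE = new ConcurrentHashMap<>();

    // Stand-in for an expensive compilation of `source` into something executable.
    static Function<String, String> compile(String source) {
        COMPILATIONS.incrementAndGet();
        return input -> source + "(" + input + ")";
    }

    // Compiles on first use of an action name, then reuses the compiled form.
    static String execute(String actionName, String source, String input) {
        return CACHE.computeIfAbsent(actionName, name -> compile(source)).apply(input);
    }

    public static void main(String[] args) {
        execute("demoAction", "src", "x");
        execute("demoAction", "src", "y");
        System.out.println("compilations: " + COMPILATIONS.get());
    }
}
```

{{ConcurrentHashMap.computeIfAbsent}} applies the mapping function at most once per key, so concurrent first calls do not compile the same script twice. A real cache would also need invalidation when a script's source is updated, which this sketch omits.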

h3. 📊 Performance Benchmark Results

h4. Thread Safety Test Results (1,000 threads, 50 iterations each)
{code}
Test Configuration:
- Concurrent Threads: 1,000
- Executions per Thread: 50
- Total Executions: 50,000
- Test Duration: 52+ seconds

Results:
- Throughput: 957 executions/second
- Memory Peak: 5,155 MB
- Memory Increase: 5,151 MB
- GC Time: 2,294 ms
- GC Collections: 58
- Thread Safety: FAILED (timing-dependent)
{code}

h4. Memory Usage Analysis
{code}
Initial Memory State:
   Used: 4 MB, Free: 64 MB, Total: 68 MB, Max: 16384 MB

Peak Memory State:
   Used: 5155 MB, Free: 1860 MB, Total: 7015 MB, Max: 16384 MB

Memory Growth: 5,151 MB (128,775% increase)
{code}

h4. GC Activity Analysis
{code}
GC Statistics for Original Implementation:
   - GC Time: 2,294 ms
   - GC Count: 58 collections
   - Average GC Time: 39 ms per collection
   - GC Pressure: High (affects overall system performance)
{code}
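
The derived figures in these reports follow from the raw counters by simple arithmetic. The sketch below recomputes them and shows one way such counters might be read from a running JVM with standard JDK APIs; it is not the benchmark code used for this ticket, and the class and method names are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helpers for reading JVM counters and deriving report figures.
class BenchmarkMetricsSketch {

    // Duration implied by a total execution count and a throughput figure.
    static double impliedSeconds(long executions, long executionsPerSecond) {
        return (double) executions / executionsPerSecond;
    }

    // Relative growth between two memory readings, in percent.
    static double percentIncrease(long beforeMb, long afterMb) {
        return (afterMb - beforeMb) * 100.0 / beforeMb;
    }

    // Mean pause per collection.
    static double averageGcMillis(long totalGcMillis, long collections) {
        return (double) totalGcMillis / collections;
    }

    // Currently used heap in MB, as seen by the running JVM.
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }

    // Total collections across all collectors; -1 values mean "undefined" and are skipped.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.printf("implied duration: %.2f s%n", impliedSeconds(50_000, 957));
        System.out.printf("memory increase:  %.0f %%%n", percentIncrease(4, 5_155));
        System.out.printf("average GC time:  %.2f ms%n", averageGcMillis(2_294, 58));
        System.out.println("used heap now:    " + usedHeapMb() + " MB");
        System.out.println("GC count so far:  " + totalGcCount());
    }
}
```

With the reported numbers, 50,000 executions at 957 per second imply about 52.25 seconds, the growth from 4 MB to 5,155 MB is 128,775%, and 2,294 ms over 58 collections averages about 39.55 ms, which matches the reported 39 ms if that value was truncated.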

h3. 🏗️ Architecture Problems

h4. Current Architecture Issues
{code}
┌─────────────────────────────────────────────────────────────┐
│                    Current Architecture                     │
├─────────────────────────────────────────────────────────────┤
│ GroovyActionDispatcher                                      │
│ ├── Shared GroovyShell (SINGLETON) ← RACE CONDITION!       │
│ ├── No Script Caching ← PERFORMANCE ISSUE!                 │
│ ├── Script Recompilation on Every Execution                │
│ ├── Variable Overwriting in Shared Binding                 │
│ └── Thread-Unsafe Execution                                │
└─────────────────────────────────────────────────────────────┘
{code}

h4. Performance Issues Identified
* ❌ *Shared mutable state* across threads
* ❌ *Variable overwriting* in shared GroovyShell binding
* ❌ *No thread isolation* for script execution
* ❌ *Memory inefficiency* from shared instances
* ❌ *GC pressure* from object creation patterns
* ❌ *No script compilation caching* - scripts recompiled on every execution

----

h2. 🎯 Impact Assessment

h3. Business Impact
* *Data Integrity Issues*: Unreliable system behavior
* *Customer Trust*: Loss of confidence in data security
* *System Reliability*: Unpredictable data processing
* *Operational Costs*: High memory usage, poor performance

h3. Technical Impact
* *Scalability*: Cannot handle concurrent users effectively
* *Performance*: 957 executions/second (unacceptable)
* *Memory*: 5,155 MB peak usage, which risks OOM errors on smaller heaps
* *Reliability*: Data corruption and inconsistencies
* *GC Pressure*: 58 collections impact overall system performance

h2. 🚨 Urgency

These issues are *MAJOR PRIORITY* and require *prompt attention* because:

h3. Issue #1 - Data Contamination
# *Active Security Vulnerability*: Data leaks between users
# *Production Impact*: Affects all Unomi deployments with Groovy actions
# *Data Integrity Risk*: Compromises system reliability

h3. Issue #2 - Performance Problems
# *Performance Degradation*: Unacceptable performance in production
# *Scalability Block*: Prevents handling concurrent users
# *Memory Issues*: Unbounded memory growth leads to OOM errors
# *GC Pressure*: High garbage collection activity impacts system performance

h2. 📝 Additional Information

h3. Affected Components
* {code}GroovyActionDispatcher{code}
* {code}GroovyActionsService{code} (provides shared GroovyShell)
* All Groovy action executions
* Profile and event data processing

h3. Monitoring Recommendations
* Monitor for profile data inconsistencies
* Track memory usage patterns
* Monitor GC frequency and duration
* Alert on data contamination patterns
* Monitor script execution times

h3. Workarounds (Temporary)
* Reduce concurrent event processing
* Monitor for data contamination
* Implement data validation checks
* Consider disabling Groovy actions in production
* Increase memory allocation to handle GC pressure


  was:
This ticket documents *bold:two separate but related issues* in the 
{code}GroovyActionDispatcher{code} that affect Unomi deployments using Groovy 
actions:

# *bold:ISSUE #1: Data Contamination Race Condition* - Security vulnerability 
where user data can leak between different users
# *bold:ISSUE #2: Performance and Memory Problems* - Scalability issues that 
prevent handling concurrent users effectively

*bold:Business Impact*: These issues can lead to data integrity problems, 
customer trust issues, system reliability problems, and operational costs due 
to poor performance.

*bold:Note*: These two issues are reported together because they both stem from 
the same architectural problems in the {code}GroovyActionDispatcher{code} and 
will require coordinated fixes in the same code area. Addressing one issue 
without considering the other could lead to incomplete solutions or introduce 
new problems.

----

h2. ISSUE #1: Data Contamination Race Condition

h3. 🚨 Security Vulnerability - User Data Leakage

*bold:What's happening*: The system has a bug that can cause user data from one 
customer to appear in another customer's profile or event data. This is a 
*bold:serious privacy and security problem*.

*bold:Why it matters*: 
* Customer A's personal information could be visible to Customer B
* This can lead to loss of customer trust and data integrity issues
* Audit logs become unreliable
* System behavior becomes unpredictable

h4. Root Cause (Technical)
The implementation uses a *bold:shared GroovyShell instance* across all threads 
in {code}GroovyActionDispatcher{code}, which means:
* When multiple users are processed simultaneously, their data can get mixed up
* Thread A sets variables for User A's action/event
* Thread B overwrites the same shared shell with User B's action/event data
* Both threads now see User B's data, causing contamination

h4. Race Condition Scenario (Simplified)
# *bold:User A* logs in and their profile is being processed
# *bold:User B* logs in at the same time and their profile is also being 
processed
# Due to the shared system component, User A's profile gets updated with User 
B's information
# *bold:Result*: User A now sees User B's data in their profile

h3. 🔒 Security Implications

h4. Data Integrity Violation
* *bold:User profile data leaks* between different user profiles
* *bold:Event data corruption* where event IDs get mixed up
* *bold:Unauthorized data disclosure* - data from one user visible to another
* *bold:Audit trail corruption* - incorrect data in logs
* *bold:Data integrity compromise* - unreliable system behavior

h4. Example Data Contamination
{code}
User A's profile: { "profileId": "user-a", "actionId": "action-a", "eventId": 
"event-a" }
User B's profile: { "profileId": "user-b", "actionId": "action-b", "eventId": 
"event-b" }

After race condition:
User A's profile: { "profileId": "user-b", "actionId": "action-b", "eventId": 
"event-b" } // CONTAMINATED!
User B's profile: { "profileId": "user-b", "actionId": "action-b", "eventId": 
"event-b" }
{code}

h3. 📋 How to Reproduce the Data Contamination Issue

*bold:⚠️ Important Note*: Race conditions are inherently difficult to reproduce 
synthetically because they depend on precise timing between threads. The 
following steps may not consistently reproduce the issue in controlled test 
environments, but the underlying problem exists and can manifest in production 
under the right conditions.

h4. Setup Environment
{code}
# Deploy Apache Unomi with Groovy actions extension enabled
# Ensure multi-threaded environment with concurrent user access
{code}

h4. Create Race Condition Sensitive Groovy Action
{code}
def execute(action, event) {
    // Read from shared variables multiple times to increase race condition 
probability
    def profile1 = event.getProfile()
    def action1 = action
    def event1 = event
    
    // Simulate some processing that takes time
    Thread.sleep(2)
    
    // Read again - this might now be different due to race condition!
    def profile2 = event.getProfile()
    def action2 = action
    def event2 = event
    
    def profileId1 = profile1.getProfileId()
    def profileId2 = profile2.getProfileId()
    def actionId1 = action1.getActionId()
    def actionId2 = action2.getActionId()
    def eventId1 = event1.getEventId()
    def eventId2 = event2.getEventId()
    
    // Return data for verification - this will show contamination if race 
condition occurs
    return [
        profileId: profileId1,
        profileId2: profileId2,
        actionId: actionId1,
        actionId2: actionId2,
        eventId: eventId1,
        eventId2: eventId2,
        visitCount: newValue,
        timestamp: System.currentTimeMillis(),
        threadId: Thread.currentThread().getId()
    ]
}
{code}

h4. Generate Concurrent Load
{code}
# Create a load test that simulates 1000+ concurrent users
# Each user should trigger the Groovy action multiple times
# Use tools like Apache JMeter, Gatling, or custom load testing scripts
# Ensure high concurrency to increase race condition probability
{code}

h4. Observe Data Contamination
* Check for profileId, actionId, and eventId mismatches in action results
* Observe that data from one user appears in another user's result
* Verify audit logs show incorrect data associations
* Look for inconsistent data patterns that indicate cross-contamination

*bold:Note*: The race condition may not be consistently reproducible in test 
environments due to timing dependencies. However, the architectural flaw exists 
and can cause data contamination in production environments with real user load 
patterns.

h3. 🔍 Technical Evidence for Data Contamination

h4. Code Location
* *bold:File*: 
{code}src/main/java/org/apache/unomi/groovy/actions/GroovyActionDispatcher.java{code}
* *bold:Lines*: 67-72 (race condition in execute method)
* *bold:Method*: {code}execute(Action action, Event event, String 
actionName){code}
* *bold:Shared Shell*: Retrieved via 
{code}groovyActionsService.getGroovyShell(){code}

h4. Detection Challenges
* *bold:Timing-dependent*: Race conditions require specific thread interleaving 
patterns
* *bold:Environment-specific*: May not manifest in development or test 
environments
* *bold:Load-dependent*: More likely under high concurrent load conditions
* *bold:Intermittent*: Can appear sporadically in production without consistent 
reproduction

h4. Service Implementation Details
* *bold:Shared GroovyShell*: Retrieved from GroovyActionsService as singleton 
instance
* *bold:Variable Setting*: action and event variables set in shared shell 
binding
* *bold:Script Parsing*: Script parsed from shared shell instance
* *bold:Concurrent Access*: Multiple threads access same shell simultaneously
* *bold:Variable Overwriting*: Thread A's variables overwritten by Thread B's 
variables

h4. Race Condition Characteristics
* *bold:Timing-dependent*: May not be consistently reproducible in all 
environments
* *bold:Concurrency-sensitive*: More likely with higher thread counts
* *bold:Data-sensitive*: Affects profileId, actionId, and eventId
* *bold:Processing-sensitive*: More likely with longer script execution times
* *bold:Difficult to reproduce synthetically*: Race conditions are inherently 
timing-dependent and may not manifest in controlled test environments

----

h2. ISSUE #2: Performance and Memory Problems

h3. 📊 Severe Performance Issues

*bold:What's happening*: The system performs very poorly when multiple users 
are active at the same time, using excessive memory and processing very slowly. 
Scripts are recompiled on every execution instead of being cached.

*bold:Why it matters*:
* The system can only handle 957 operations per second (very slow for 
production)
* Memory usage grows to over 5GB, causing system instability
* Performance gets worse as more users connect simultaneously
* High garbage collection activity slows down the entire system
* Script compilation overhead adds significant latency to every action execution

h4. Performance Problems
# *bold:Poor Throughput*: Only 957 executions/second (unacceptable for 
production)
# *bold:Thread Contention*: Shared GroovyShell causes blocking
# *bold:Memory Leaks*: 5,151MB peak memory usage
# *bold:High GC Pressure*: 58 garbage collections, 2,294ms GC time
# *bold:Poor Scalability*: Performance degrades with concurrency
# *bold:No Script Caching*: Scripts are recompiled on every execution instead 
of being cached

h3. 📊 Performance Benchmark Results

h4. Thread Safety Test Results (1,000 threads, 50 iterations each)
{code}
Test Configuration:
- Concurrent Threads: 1,000
- Executions per Thread: 50
- Total Executions: 50,000
- Test Duration: 52+ seconds

Results:
- Throughput: 957 executions/second
- Memory Peak: 5,155 MB
- Memory Increase: 5,151 MB
- GC Time: 2,294 ms
- GC Collections: 58
- Thread Safety: FAILED (timing-dependent)
{code}

h4. Memory Usage Analysis
{code}
Initial Memory State:
   Used: 4 MB, Free: 64 MB, Total: 68 MB, Max: 16384 MB

Peak Memory State:
   Used: 5155 MB, Free: 1860 MB, Total: 7015 MB, Max: 16384 MB

Memory Growth: 5,151 MB (128,775% increase)
{code}

h4. GC Activity Analysis
{code}
GC Statistics for Original Implementation:
   - GC Time: 2,294 ms
   - GC Count: 58 collections
   - Average GC Time: 39 ms per collection
   - GC Pressure: High (affects overall system performance)
{code}

h3. 🏗️ Architecture Problems

h4. Current Architecture Issues
{code}
┌─────────────────────────────────────────────────────────────┐
│                    Current Architecture                     │
├─────────────────────────────────────────────────────────────┤
│ GroovyActionDispatcher                                      │
│ ├── Shared GroovyShell (SINGLETON) ← RACE CONDITION!       │
│ ├── No Script Caching ← PERFORMANCE ISSUE!                 │
│ ├── Script Recompilation on Every Execution                │
│ ├── Variable Overwriting in Shared Binding                 │
│ └── Thread-Unsafe Execution                                │
└─────────────────────────────────────────────────────────────┘
{code}

h4. Performance Issues Identified
* ❌ *bold:Shared mutable state* across threads
* ❌ *bold:Variable overwriting* in shared GroovyShell binding
* ❌ *bold:No thread isolation* for script execution
* ❌ *bold:Memory inefficiency* from shared instances
* ❌ *bold:GC pressure* from object creation patterns
* ❌ *bold:No script compilation caching* - scripts recompiled on every execution

----

h2. 🎯 Impact Assessment

h3. Business Impact
* *bold:Data Integrity Issues*: Unreliable system behavior
* *bold:Customer Trust*: Loss of confidence in data security
* *bold:System Reliability*: Unpredictable data processing
* *bold:Operational Costs*: High memory usage, poor performance

h3. Technical Impact
* *bold:Scalability*: Cannot handle concurrent users effectively
* *bold:Performance*: 957 executions/second (unacceptable)
* *bold:Memory*: 5,155MB peak usage leads to OOM errors
* *bold:Reliability*: Data corruption and inconsistencies
* *bold:GC Pressure*: 58 collections impact overall system performance

h2. 🚨 Urgency

These issues are *bold:MAJOR PRIORITY* and require *bold:prompt attention* 
because:

h3. Issue #1 - Data Contamination
# *bold:Active Security Vulnerability*: Data leaks between users
# *bold:Production Impact*: Affects all Unomi deployments with Groovy actions
# *bold:Data Integrity Risk*: Compromises system reliability

h3. Issue #2 - Performance Problems
# *bold:Performance Degradation*: Unacceptable performance in production
# *bold:Scalability Block*: Prevents handling concurrent users
# *bold:Memory Issues*: Unbounded memory growth leads to OOM errors
# *bold:GC Pressure*: High garbage collection activity impacts system 
performance

h2. 📝 Additional Information

h3. Affected Components
* {code}GroovyActionDispatcher{code}
* {code}GroovyActionsService{code} (provides shared GroovyShell)
* All Groovy action executions
* Profile and event data processing

h3. Monitoring Recommendations
* Monitor for profile data inconsistencies
* Track memory usage patterns
* Monitor GC frequency and duration
* Alert on data contamination patterns
* Monitor script execution times

h3. Workarounds (Temporary)
* Reduce concurrent event processing
* Monitor for data contamination
* Implement data validation checks
* Consider disabling Groovy actions in production
* Increase memory allocation to handle GC pressure



> Two separate issues in Groovy Actions implementation - Data contamination 
> race condition and severe performance problems
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: UNOMI-897
>                 URL: https://issues.apache.org/jira/browse/UNOMI-897
>             Project: Apache Unomi
>          Issue Type: Bug
>          Components: unomi(-core)
>    Affects Versions: unomi-2.4.0, unomi-2.5.0, unomi-2.6.0, unomi-2.6.1
>         Environment: - Apache Unomi 2.0.0+
> - Groovy actions extension enabled
> - Multi-threaded environments
> - Production deployments with concurrent users
> - High-concurrency scenarios (1000+ concurrent threads)
>            Reporter: Serge Huber
>            Priority: Major
>             Fix For: unomi-3.0.0, unomi-2.7.0
>
>
> This ticket documents *two separate but related issues* in the 
> {code}GroovyActionDispatcher{code} that affect Unomi deployments using Groovy 
> actions:
> # *ISSUE #1: Data Contamination Race Condition* - Security vulnerability 
> where user data can leak between different users
> # *ISSUE #2: Performance and Memory Problems* - Scalability issues that 
> prevent handling concurrent users effectively
> *Business Impact*: These issues can lead to data integrity problems, customer 
> trust issues, system reliability problems, and operational costs due to poor 
> performance.
> *Note*: These two issues are reported together because they both stem from 
> the same architectural problems in the {code}GroovyActionDispatcher{code} and 
> will require coordinated fixes in the same code area. Addressing one issue 
> without considering the other could lead to incomplete solutions or introduce 
> new problems.
> ----
> h2. ISSUE #1: Data Contamination Race Condition
> h3. 🚨 Security Vulnerability - User Data Leakage
> *What's happening*: The system has a bug that can cause user data from one 
> customer to appear in another customer's profile or event data. This is a 
> *serious privacy and security problem*.
> *Why it matters*: 
> * Customer A's personal information could be visible to Customer B
> * This can lead to loss of customer trust and data integrity issues
> * Audit logs become unreliable
> * System behavior becomes unpredictable
> h4. Root Cause (Technical)
> The implementation uses a *shared GroovyShell instance* across all threads in 
> {code}GroovyActionDispatcher{code}, which means:
> * When multiple users are processed simultaneously, their data can get mixed 
> up
> * Thread A sets variables for User A's action/event
> * Thread B overwrites the same shared shell with User B's action/event data
> * Both threads now see User B's data, causing contamination
> h4. Race Condition Scenario (Simplified)
> # *User A* logs in and their profile is being processed
> # *User B* logs in at the same time and their profile is also being processed
> # Due to the shared system component, User A's profile gets updated with User 
> B's information
> # *Result*: User A now sees User B's data in their profile
> h3. 🔒 Security Implications
> h4. Data Integrity Violation
> * *User profile data leaks* between different user profiles
> * *Event data corruption* where event IDs get mixed up
> * *Unauthorized data disclosure* - data from one user visible to another
> * *Audit trail corruption* - incorrect data in logs
> * *Data integrity compromise* - unreliable system behavior
> h4. Example Data Contamination
> {code}
> User A's profile: { "profileId": "user-a", "actionId": "action-a", "eventId": 
> "event-a" }
> User B's profile: { "profileId": "user-b", "actionId": "action-b", "eventId": 
> "event-b" }
> After race condition:
> User A's profile: { "profileId": "user-b", "actionId": "action-b", "eventId": 
> "event-b" } // CONTAMINATED!
> User B's profile: { "profileId": "user-b", "actionId": "action-b", "eventId": 
> "event-b" }
> {code}
> h3. 📋 How to Reproduce the Data Contamination Issue
> *⚠️ Important Note*: Race conditions are inherently difficult to reproduce 
> synthetically because they depend on precise timing between threads. The 
> following steps may not consistently reproduce the issue in controlled test 
> environments, but the underlying problem exists and can manifest in 
> production under the right conditions.
> h4. Setup Environment
> {code}
> # Deploy Apache Unomi with Groovy actions extension enabled
> # Ensure multi-threaded environment with concurrent user access
> {code}
> h4. Create Race Condition Sensitive Groovy Action
> {code}
> def execute(action, event) {
>     // Read from shared variables multiple times to increase race condition probability
>     def profile1 = event.getProfile()
>     def action1 = action
>     def event1 = event
>     
>     // Simulate some processing that takes time
>     Thread.sleep(2)
>     
>     // Read again - this might now be different due to race condition!
>     def profile2 = event.getProfile()
>     def action2 = action
>     def event2 = event
>     
>     def profileId1 = profile1.getProfileId()
>     def profileId2 = profile2.getProfileId()
>     def actionId1 = action1.getActionId()
>     def actionId2 = action2.getActionId()
>     def eventId1 = event1.getEventId()
>     def eventId2 = event2.getEventId()
>     
>     // Return data for verification - this will show contamination if race condition occurs
>     return [
>         profileId: profileId1,
>         profileId2: profileId2,
>         actionId: actionId1,
>         actionId2: actionId2,
>         eventId: eventId1,
>         eventId2: eventId2,
>         timestamp: System.currentTimeMillis(),
>         threadId: Thread.currentThread().getId()
>     ]
> }
> {code}
> h4. Generate Concurrent Load
> {code}
> # Create a load test that simulates 1000+ concurrent users
> # Each user should trigger the Groovy action multiple times
> # Use tools like Apache JMeter, Gatling, or custom load testing scripts
> # Ensure high concurrency to increase race condition probability
> {code}
> h4. Observe Data Contamination
> * Check for profileId, actionId, and eventId mismatches in action results
> * Observe that data from one user appears in another user's result
> * Verify audit logs show incorrect data associations
> * Look for inconsistent data patterns that indicate cross-contamination
> *Note*: The race condition may not be consistently reproducible in test 
> environments due to timing dependencies. However, the architectural flaw 
> exists and can cause data contamination in production environments with real 
> user load patterns.
> h3. 🔍 Technical Evidence for Data Contamination
> h4. Code Location
> * *File*: 
> {code}src/main/java/org/apache/unomi/groovy/actions/GroovyActionDispatcher.java{code}
> * *Lines*: 67-72 (race condition in execute method)
> * *Method*: {code}execute(Action action, Event event, String actionName){code}
> * *Shared Shell*: Retrieved via 
> {code}groovyActionsService.getGroovyShell(){code}
> h4. Detection Challenges
> * *Timing-dependent*: Race conditions require specific thread interleaving 
> patterns
> * *Environment-specific*: May not manifest in development or test environments
> * *Load-dependent*: More likely under high concurrent load conditions
> * *Intermittent*: Can appear sporadically in production without consistent 
> reproduction
> h4. Service Implementation Details
> * *Shared GroovyShell*: Retrieved from GroovyActionsService as singleton 
> instance
> * *Variable Setting*: action and event variables set in shared shell binding
> * *Script Parsing*: Script parsed from shared shell instance
> * *Concurrent Access*: Multiple threads access same shell simultaneously
> * *Variable Overwriting*: Thread A's variables overwritten by Thread B's 
> variables
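The overwrite sequence described above can be replayed deterministically. The sketch below is a single-threaded illustration under stated assumptions: plain `HashMap`s stand in for the shell's variable binding, no Groovy API is used, and `SharedBindingSketch` is a hypothetical name. It shows the interleaving, not the actual dispatcher code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical replay of the "Thread B overwrites Thread A" interleaving. */
final class SharedBindingSketch {

    /** One binding shared by both callers: A ends up reading B's value. */
    static String replayWithSharedBinding() {
        Map<String, Object> shared = new HashMap<>(); // stand-in for the shared shell's variables
        shared.put("profileId", "user-A");            // Thread A sets its variable
        shared.put("profileId", "user-B");            // Thread B overwrites it before A's script runs
        return (String) shared.get("profileId");      // what Thread A's script now sees
    }

    /** Same order of operations, but each caller owns its binding. */
    static String replayWithPerCallBinding() {
        Map<String, Object> bindingA = new HashMap<>();
        Map<String, Object> bindingB = new HashMap<>();
        bindingA.put("profileId", "user-A");
        bindingB.put("profileId", "user-B");          // cannot affect bindingA
        return (String) bindingA.get("profileId");    // what Thread A's script sees
    }

    public static void main(String[] args) {
        System.out.println("shared binding:   A sees " + replayWithSharedBinding());
        System.out.println("per-call binding: A sees " + replayWithPerCallBinding());
    }
}
```

Per-invocation state is one common way to remove this class of race; whether it is the right fix for the dispatcher depends on constraints not covered in this ticket.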
> h4. Race Condition Characteristics
> * *Timing-dependent*: May not be consistently reproducible in all environments
> * *Concurrency-sensitive*: More likely with higher thread counts
> * *Data-sensitive*: Affects profileId, actionId, and eventId
> * *Processing-sensitive*: More likely with longer script execution times
> * *Difficult to reproduce synthetically*: Race conditions are inherently 
> timing-dependent and may not manifest in controlled test environments
> ----
> h2. ISSUE #2: Performance and Memory Problems
> h3. 📊 Severe Performance Issues
> *What's happening*: The system performs very poorly when multiple users are 
> active at the same time, using excessive memory and processing very slowly. 
> Scripts are recompiled on every execution instead of being cached.
> *Why it matters*:
> * In the benchmark below, throughput reached only 957 executions per second 
> (too slow for production)
> * Memory usage grows to over 5GB, causing system instability
> * Performance gets worse as more users connect simultaneously
> * High garbage collection activity slows down the entire system
> * Script compilation overhead adds significant latency to every action 
> execution
> h4. Performance Problems
> # *Poor Throughput*: Only 957 executions/second (unacceptable for production)
> # *Thread Contention*: Shared GroovyShell causes blocking
> # *Memory Leaks*: 5,155MB peak memory usage (a 5,151MB increase)
> # *High GC Pressure*: 58 garbage collections, 2,294ms GC time
> # *Poor Scalability*: Performance degrades with concurrency
> # *No Script Caching*: Scripts are recompiled on every execution instead of 
> being cached
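For the last item, compile-once caching can be sketched with `ConcurrentHashMap.computeIfAbsent`, which applies the mapping function at most once per key. Everything in the sketch is an assumption made for illustration: `ScriptCacheSketch`, the `compile` stand-in (a real version would produce a compiled Groovy script class), and keying the cache by action name.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical compile-once cache keyed by action name. */
final class ScriptCacheSketch {

    private final ConcurrentMap<String, Function<String, String>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger compilations = new AtomicInteger();

    /** Stand-in for an expensive compile step; counts how often it runs. */
    private Function<String, String> compile(String source) {
        compilations.incrementAndGet();
        return input -> source + ":" + input;
    }

    /** Compiles the script on first use of an action name, then reuses the result. */
    String execute(String actionName, String source, String input) {
        Function<String, String> compiled =
                cache.computeIfAbsent(actionName, name -> compile(source));
        return compiled.apply(input);
    }

    int compilationCount() {
        return compilations.get();
    }

    public static void main(String[] args) {
        ScriptCacheSketch sketch = new ScriptCacheSketch();
        for (int i = 0; i < 1_000; i++) {
            sketch.execute("exampleAction", "script-source", "event-" + i);
        }
        System.out.println("compilations: " + sketch.compilationCount());
    }
}
```

A real cache would also need to invalidate an entry when the stored script is updated or removed; this sketch keeps the first compiled version forever.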
> h3. 📊 Performance Benchmark Results
> h4. Thread Safety Test Results (1,000 threads, 50 iterations each)
> {code}
> Test Configuration:
> - Concurrent Threads: 1,000
> - Executions per Thread: 50
> - Total Executions: 50,000
> - Test Duration: 52+ seconds
> Results:
> - Throughput: 957 executions/second
> - Memory Peak: 5,155 MB
> - Memory Increase: 5,151 MB
> - GC Time: 2,294 ms
> - GC Collections: 58
> - Thread Safety: FAILED (timing-dependent)
> {code}
> h4. Memory Usage Analysis
> {code}
> Initial Memory State:
>    Used: 4 MB, Free: 64 MB, Total: 68 MB, Max: 16384 MB
> Peak Memory State:
>    Used: 5155 MB, Free: 1860 MB, Total: 7015 MB, Max: 16384 MB
> Memory Growth: 5,151 MB (128,775% increase)
> {code}
> h4. GC Activity Analysis
> {code}
> GC Statistics for Original Implementation:
>    - GC Time: 2,294 ms
>    - GC Count: 58 collections
>    - Average GC Time: 39 ms per collection
>    - GC Pressure: High (affects overall system performance)
> {code}
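Figures like these can be collected from inside the JVM with the standard `java.lang.management` beans and `Runtime`. The sketch below is an assumption about how such numbers might be gathered and derived, not the code that produced the results above; `GcStatsSketch` and its method names are hypothetical. Note that the reported 39 ms average is consistent with integer division of 2,294 by 58.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helpers for collecting and deriving GC and memory figures. */
final class GcStatsSketch {

    /** Total collections across all collectors since JVM start. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if undefined for this collector
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Total accumulated collection time in milliseconds since JVM start. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if undefined for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Currently used heap in megabytes. */
    static long usedMegabytes() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }

    /** Average pause per collection, truncated to whole milliseconds. */
    static long averageMillis(long totalMillis, long collections) {
        return collections == 0 ? 0 : totalMillis / collections;
    }

    /** Growth from initial to peak usage as a whole percentage of the initial value. */
    static long growthPercent(long initialMb, long peakMb) {
        return (peakMb - initialMb) * 100 / initialMb;
    }

    public static void main(String[] args) {
        System.out.println("GC count so far: " + totalGcCount());
        System.out.println("GC time so far:  " + totalGcTimeMillis() + " ms");
        System.out.println("Used heap:       " + usedMegabytes() + " MB");
        System.out.println("Reported avg:    " + averageMillis(2_294, 58) + " ms");
        System.out.println("Reported growth: " + growthPercent(4, 5_155) + " %");
    }
}
```

To measure a single test run, sample the totals before and after the run and subtract, since the beans report cumulative values for the whole JVM lifetime.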
> h3. 🏗️ Architecture Problems
> h4. Current Architecture Issues
> {code}
> ┌─────────────────────────────────────────────────────────────┐
> │                    Current Architecture                     │
> ├─────────────────────────────────────────────────────────────┤
> │ GroovyActionDispatcher                                      │
> │ ├── Shared GroovyShell (SINGLETON) ← RACE CONDITION!       │
> │ ├── No Script Caching ← PERFORMANCE ISSUE!                 │
> │ ├── Script Recompilation on Every Execution                │
> │ ├── Variable Overwriting in Shared Binding                 │
> │ └── Thread-Unsafe Execution                                │
> └─────────────────────────────────────────────────────────────┘
> {code}
> h4. Performance Issues Identified
> * ❌ *Shared mutable state* across threads
> * ❌ *Variable overwriting* in shared GroovyShell binding
> * ❌ *No thread isolation* for script execution
> * ❌ *Memory inefficiency* from shared instances
> * ❌ *GC pressure* from object creation patterns
> * ❌ *No script compilation caching* - scripts recompiled on every execution
> ----
> h2. 🎯 Impact Assessment
> h3. Business Impact
> * *Data Integrity Issues*: Unreliable system behavior
> * *Customer Trust*: Loss of confidence in data security
> * *System Reliability*: Unpredictable data processing
> * *Operational Costs*: High memory usage, poor performance
> h3. Technical Impact
> * *Scalability*: Cannot handle concurrent users effectively
> * *Performance*: 957 executions/second (unacceptable)
> * *Memory*: 5,155MB peak usage risks OOM errors on smaller heaps
> * *Reliability*: Data corruption and inconsistencies
> * *GC Pressure*: 58 collections impact overall system performance
> h2. 🚨 Urgency
> These issues are *MAJOR PRIORITY* and require *prompt attention* because:
> h3. Issue #1 - Data Contamination
> # *Active Security Vulnerability*: Data leaks between users
> # *Production Impact*: Affects all Unomi deployments with Groovy actions
> # *Data Integrity Risk*: Compromises system reliability
> h3. Issue #2 - Performance Problems
> # *Performance Degradation*: Unacceptable performance in production
> # *Scalability Block*: Prevents handling concurrent users
> # *Memory Issues*: Unbounded memory growth leads to OOM errors
> # *GC Pressure*: High garbage collection activity impacts system performance
> h2. 📝 Additional Information
> h3. Affected Components
> * {code}GroovyActionDispatcher{code}
> * {code}GroovyActionsService{code} (provides shared GroovyShell)
> * All Groovy action executions
> * Profile and event data processing
> h3. Monitoring Recommendations
> * Monitor for profile data inconsistencies
> * Track memory usage patterns
> * Monitor GC frequency and duration
> * Alert on data contamination patterns
> * Monitor script execution times
> h3. Workarounds (Temporary)
> * Reduce concurrent event processing
> * Monitor for data contamination
> * Implement data validation checks
> * Consider disabling Groovy actions in production
> * Increase the JVM heap size to absorb the memory growth and reduce GC pressure



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
