Abhishek Kumar created RANGER-5621:
--------------------------------------

             Summary: CI build-17 times out due to oversized schema-registry 
plugin assembly after audit dispatcher distro dependencies
                 Key: RANGER-5621
                 URL: https://issues.apache.org/jira/browse/RANGER-5621
             Project: Ranger
          Issue Type: Bug
          Components: audit
            Reporter: Abhishek Kumar
            Assignee: Abhishek Kumar


After RANGER-5520, the GitHub Actions CI build-17 job started timing out during 
the final target-17 artifact upload.
 
The failure is not caused by a Maven test or build failure. The job reaches the 
Upload artifacts step, but upload-artifact reports hundreds of thousands of 
files (for example, 841,939 files), and the job hits the 60-minute timeout.
 
{*}Root cause{*}:
RANGER-5520 added new audit dispatcher / audit destination dependencies to 
distro/pom.xml without an explicit scope. Maven defaults these dependencies to 
compile scope, which also makes them runtime dependencies of the distro module.
 
The existing schema-registry plugin assembly is defined with:
 
<scope>runtime</scope>
<unpack>true</unpack>
 
Because this assembly runs from the distro module, the new audit-related 
runtime dependencies and their transitive dependency trees became eligible for 
unpacking into the schema-registry plugin assembly. This pulled in large 
dependency trees, including Hadoop/Hive/Kafka/AWS SDK generated classes.
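
For illustration, a dependency set of this shape in a Maven Assembly descriptor 
would behave as described above. This is a minimal sketch, not the actual 
Ranger descriptor: only the scope and unpack elements are taken from this 
issue, and the surrounding elements are assumed from the standard 
maven-assembly-plugin descriptor format.

```xml
<!-- Sketch only: NOT the real Ranger schema-registry plugin descriptor. -->
<dependencySets>
  <dependencySet>
    <!-- Selects the runtime classpath of the module that runs the assembly
         (here: distro). Compile-scoped dependencies are part of that
         classpath, and transitive dependencies are included by default. -->
    <scope>runtime</scope>
    <!-- Expands every selected jar into the assembly instead of copying it,
         so each class file becomes its own entry in the output jar. -->
    <unpack>true</unpack>
  </dependencySet>
</dependencySets>
```

Dependencies with provided scope are not part of the runtime classpath, so a 
set like this one does not select them.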
 
As a result, target/ranger-3.0.0-SNAPSHOT-schema-registry-plugin.jar became 
extremely large. In local verification it had:
 
size: 822 MB
zip entries: 497,417
 
coverage.sh then picked up this jar from the root target directory 
(target/ranger-*-schema-registry-plugin.jar), unzipped it into 
target/coverage-classes, and JaCoCo generated a huge target/coverage/all tree. 
As a result, the target-17 artifact upload included hundreds of thousands of 
generated coverage files and timed out.
 
{*}Fix{*}:
Mark the new audit-related distro dependencies as provided, matching the intent 
of the distro dependency list and the surrounding dependencies. These 
dependencies are needed for build/reactor ordering and assembly references, but 
should not become runtime inputs to broad distro dependency sets.
 
The fix adds <scope>provided</scope> for:
 
audit-dispatcher-hdfs
audit-dispatcher-solr
ranger-audit-dest-hdfs
ranger-audit-dest-solr
ranger-audit-dispatcher-common
ranger-audit-server-common
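
As a sketch, one of the entries listed above might look like this in 
distro/pom.xml after the change. The artifactId and the provided scope come 
from this issue; the groupId and version shown here are assumptions and may 
not match the actual entry.

```xml
<!-- distro/pom.xml (sketch): one of the six dependencies listed above. -->
<dependency>
  <!-- Assumption: groupId used by the Ranger reactor modules. -->
  <groupId>org.apache.ranger</groupId>
  <artifactId>ranger-audit-dest-hdfs</artifactId>
  <!-- Assumption: version inherited from the reactor build. -->
  <version>${project.version}</version>
  <!-- Without this element Maven defaults to compile scope, which puts the
       artifact and its transitive tree on the runtime classpath. -->
  <scope>provided</scope>
</dependency>
```

The other five artifacts would get the same scope element.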
 
{*}Verification{*}:
After adding provided scope and rebuilding distro:
 
target/ranger-3.0.0-SNAPSHOT-schema-registry-plugin.jar size: 8.1 KB
zip entries: 10
AWS/Hive/Kafka/HDFS dependency package entries: 0
 
Coverage verification with the same coverage.sh jar-selection behavior produced:
 
selected jars: 99
target/coverage-classes: 1,263 files
JaCoCo analyzed: 1,015 classes
target/coverage/all: 1,998 files, 26 MB
coverage totals: non-zero coverage reported
 
This restores the expected artifact size and prevents build-17 from timing out 
during artifact upload.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
