lhotari opened a new issue #9572:
URL: https://github.com/apache/pulsar/issues/9572


   **Is your enhancement request related to a problem? Please describe.**
   
   Currently, the Pulsar IO .nar files are very large. Their total size is 1952MB!
   Break down: https://gist.github.com/lhotari/810a543524e25457b521ac666913ad3c
   
   **Describe the solution you'd like**
   
   Exclude all Pulsar Functions Worker dependencies from the Pulsar IO .nar files.
   
   For example, 
   
   ```
   $ unzip -l ~/.m2/repository/org/apache/pulsar/pulsar-io-data-generator/2.8.0-SNAPSHOT/pulsar-io-data-generator-2.8.0-SNAPSHOT.nar | grep META-INF/bundled-dependencies | sort -k 4,4
           0  02-12-2021 07:04   META-INF/bundled-dependencies/
         183117  02-12-2021 07:04   META-INF/bundled-dependencies/aircompressor-0.16.jar
           4467  02-12-2021 07:04   META-INF/bundled-dependencies/aopalliance-1.0.jar
         449146  02-12-2021 07:04   META-INF/bundled-dependencies/async-http-client-2.12.1.jar
           9909  02-12-2021 07:04   META-INF/bundled-dependencies/async-http-client-netty-utils-2.12.1.jar
         566992  02-12-2021 07:04   META-INF/bundled-dependencies/avro-1.9.1.jar
          25683  02-12-2021 07:04   META-INF/bundled-dependencies/avro-protobuf-1.9.1.jar
         887800  02-12-2021 07:04   META-INF/bundled-dependencies/bcpkix-jdk15on-1.68.jar
        6031548  02-12-2021 07:04   META-INF/bundled-dependencies/bcprov-ext-jdk15on-1.68.jar
        5961178  02-12-2021 07:04   META-INF/bundled-dependencies/bcprov-jdk15on-1.68.jar
         146056  02-12-2021 07:04   META-INF/bundled-dependencies/bookkeeper-common-4.12.1.jar
          16852  02-12-2021 07:04   META-INF/bundled-dependencies/bookkeeper-common-allocator-4.12.1.jar
          19351  02-12-2021 07:04   META-INF/bundled-dependencies/bookkeeper-stats-api-4.12.1.jar
       11082557  02-12-2021 07:04   META-INF/bundled-dependencies/bouncy-castle-bc-2.8.0-SNAPSHOT-pkg.jar
         214381  02-12-2021 07:04   META-INF/bundled-dependencies/checker-qual-3.5.0.jar
          65366  02-12-2021 07:04   META-INF/bundled-dependencies/circe-checksum-4.12.1.jar
         284184  02-12-2021 07:04   META-INF/bundled-dependencies/commons-codec-1.10.jar
         615064  02-12-2021 07:04   META-INF/bundled-dependencies/commons-compress-1.19.jar
         362679  02-12-2021 07:04   META-INF/bundled-dependencies/commons-configuration-1.10.jar
         208700  02-12-2021 07:04   META-INF/bundled-dependencies/commons-io-2.5.jar
         284220  02-12-2021 07:04   META-INF/bundled-dependencies/commons-lang-2.6.jar
         494856  02-12-2021 07:04   META-INF/bundled-dependencies/commons-lang3-3.6.jar
          61829  02-12-2021 07:04   META-INF/bundled-dependencies/commons-logging-1.2.jar
        2213560  02-12-2021 07:04   META-INF/bundled-dependencies/commons-math3-3.6.1.jar
          23508  02-12-2021 07:04   META-INF/bundled-dependencies/cpu-affinity-4.12.1.jar
          13879  02-12-2021 07:04   META-INF/bundled-dependencies/error_prone_annotations-2.3.4.jar
           4617  02-12-2021 07:04   META-INF/bundled-dependencies/failureaccess-1.0.1.jar
         240255  02-12-2021 07:04   META-INF/bundled-dependencies/gson-2.8.6.jar
        2862361  02-12-2021 07:04   META-INF/bundled-dependencies/guava-30.1-jre.jar
         674028  02-12-2021 07:04   META-INF/bundled-dependencies/guice-4.1.0.jar
          42873  02-12-2021 07:04   META-INF/bundled-dependencies/guice-assistedinject-4.1.0.jar
          45012  02-12-2021 07:04   META-INF/bundled-dependencies/iban4j-3.2.1.jar
           8781  02-12-2021 07:04   META-INF/bundled-dependencies/j2objc-annotations-1.3.jar
          68167  02-12-2021 07:04   META-INF/bundled-dependencies/jackson-annotations-2.11.1.jar
         351575  02-12-2021 07:04   META-INF/bundled-dependencies/jackson-core-2.11.1.jar
        1419800  02-12-2021 07:04   META-INF/bundled-dependencies/jackson-databind-2.11.1.jar
          46983  02-12-2021 07:04   META-INF/bundled-dependencies/jackson-dataformat-yaml-2.11.1.jar
          79295  02-12-2021 07:04   META-INF/bundled-dependencies/jackson-module-jsonSchema-2.11.1.jar
         780265  02-12-2021 07:04   META-INF/bundled-dependencies/javassist-3.25.0-GA.jar
          78030  02-12-2021 07:04   META-INF/bundled-dependencies/javax.activation-1.2.0.jar
           2497  02-12-2021 07:04   META-INF/bundled-dependencies/javax.inject-1.jar
         127509  02-12-2021 07:04   META-INF/bundled-dependencies/javax.ws.rs-api-2.1.jar
           2254  02-12-2021 07:04   META-INF/bundled-dependencies/jcip-annotations-1.0.jar
         252020  02-12-2021 07:04   META-INF/bundled-dependencies/jctools-core-2.1.2.jar
         566323  02-12-2021 07:04   META-INF/bundled-dependencies/jetty-util-9.4.35.v20201120.jar
         273528  02-12-2021 07:04   META-INF/bundled-dependencies/jfairy-0.5.9.jar
         640724  02-12-2021 07:04   META-INF/bundled-dependencies/joda-time-2.10.1.jar
          19936  02-12-2021 07:04   META-INF/bundled-dependencies/jsr305-3.0.2.jar
           2199  02-12-2021 07:04   META-INF/bundled-dependencies/listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar
          24995  02-12-2021 07:04   META-INF/bundled-dependencies/memory-0.8.3.jar
         289921  02-12-2021 07:04   META-INF/bundled-dependencies/netty-buffer-4.1.51.Final.jar
         320174  02-12-2021 07:04   META-INF/bundled-dependencies/netty-codec-4.1.51.Final.jar
          61345  02-12-2021 07:04   META-INF/bundled-dependencies/netty-codec-dns-4.1.51.Final.jar
          36193  02-12-2021 07:04   META-INF/bundled-dependencies/netty-codec-haproxy-4.1.51.Final.jar
         617948  02-12-2021 07:04   META-INF/bundled-dependencies/netty-codec-http-4.1.51.Final.jar
         625057  02-12-2021 07:04   META-INF/bundled-dependencies/netty-common-4.1.51.Final.jar
         456702  02-12-2021 07:04   META-INF/bundled-dependencies/netty-handler-4.1.51.Final.jar
          21842  02-12-2021 07:04   META-INF/bundled-dependencies/netty-reactive-streams-2.0.4.jar
          33158  02-12-2021 07:04   META-INF/bundled-dependencies/netty-resolver-4.1.51.Final.jar
         151765  02-12-2021 07:04   META-INF/bundled-dependencies/netty-resolver-dns-4.1.51.Final.jar
        4017922  02-12-2021 07:04   META-INF/bundled-dependencies/netty-tcnative-boringssl-static-2.0.33.Final.jar
         473222  02-12-2021 07:04   META-INF/bundled-dependencies/netty-transport-4.1.51.Final.jar
         152317  02-12-2021 07:04   META-INF/bundled-dependencies/netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar
          33062  02-12-2021 07:04   META-INF/bundled-dependencies/netty-transport-native-unix-common-4.1.51.Final.jar
          56446  02-12-2021 07:04   META-INF/bundled-dependencies/netty-transport-native-unix-common-4.1.51.Final-linux-x86_64.jar
        1660960  02-12-2021 07:04   META-INF/bundled-dependencies/protobuf-java-3.11.4.jar
          73874  02-12-2021 07:04   META-INF/bundled-dependencies/protobuf-java-util-3.11.4.jar
          47021  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-client-admin-api-2.8.0-SNAPSHOT.jar
         141344  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-client-api-2.8.0-SNAPSHOT.jar
         657161  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-client-original-2.8.0-SNAPSHOT.jar
         877274  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-common-2.8.0-SNAPSHOT.jar
          38477  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-config-validation-2.8.0-SNAPSHOT.jar
          21681  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-functions-api-2.8.0-SNAPSHOT.jar
          23202  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-io-core-2.8.0-SNAPSHOT.jar
          28200  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-package-core-2.8.0-SNAPSHOT.jar
           9037  02-12-2021 07:04   META-INF/bundled-dependencies/pulsar-transaction-common-2.8.0-SNAPSHOT.jar
          11369  02-12-2021 07:04   META-INF/bundled-dependencies/reactive-streams-1.0.3.jar
         130999  02-12-2021 07:04   META-INF/bundled-dependencies/reflections-0.9.11.jar
         421509  02-12-2021 07:04   META-INF/bundled-dependencies/sketches-core-0.8.3.jar
          41203  02-12-2021 07:04   META-INF/bundled-dependencies/slf4j-api-1.7.25.jar
         284338  02-12-2021 07:04   META-INF/bundled-dependencies/snakeyaml-1.18.jar
          21782  02-12-2021 07:04   META-INF/bundled-dependencies/swagger-annotations-1.6.2.jar
          63777  02-12-2021 07:04   META-INF/bundled-dependencies/validation-api-1.1.0.Final.jar
   ``` 
   
   pulsar-io-data-generator has a single unique dependency, jfairy. This means that about 45MB of dependencies are redundant in each pulsar-io .nar file.
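To make the size estimate reproducible, here is a minimal Java sketch that sums the bundled jar sizes from `unzip -l` output while skipping the connector's own dependencies. The class name `NarRedundancy`, the method `redundantBytes`, and the prefix-based matching are hypothetical choices for illustration and are not part of Pulsar. It assumes the four-column `unzip -l` layout shown above and file names without spaces.

```java
import java.util.List;

public class NarRedundancy {
    private static final String PREFIX = "META-INF/bundled-dependencies/";

    /**
     * Sums the sizes of bundled jars in `unzip -l` output lines, skipping jars
     * whose file name starts with one of the given prefixes (the connector's
     * own, non-redundant dependencies).
     */
    public static long redundantBytes(List<String> listing, List<String> uniqueJarPrefixes) {
        long total = 0;
        for (String line : listing) {
            // Expected columns: size, date, time, entry name.
            String[] cols = line.trim().split("\\s+");
            if (cols.length != 4 || !cols[3].startsWith(PREFIX) || !cols[3].endsWith(".jar")) {
                continue; // header, footer, directory entry, or unrelated file
            }
            String jar = cols[3].substring(PREFIX.length());
            boolean unique = uniqueJarPrefixes.stream().anyMatch(jar::startsWith);
            if (!unique) {
                total += Long.parseLong(cols[0]);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Three lines copied from the listing above; jfairy is the unique dependency.
        List<String> sample = List.of(
            "   240255  02-12-2021 07:04   META-INF/bundled-dependencies/gson-2.8.6.jar",
            "  2862361  02-12-2021 07:04   META-INF/bundled-dependencies/guava-30.1-jre.jar",
            "   273528  02-12-2021 07:04   META-INF/bundled-dependencies/jfairy-0.5.9.jar");
        System.out.println(redundantBytes(sample, List.of("jfairy-"))); // prints 3102616
    }
}
```

Feeding the full listing into `redundantBytes` instead of the three sample lines gives the redundant total for a whole .nar file.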
   
   These files are never used for classloading. It is safe to remove every dependency that is already part of the Pulsar Functions Worker's system classloader, because classloaders use parent-first lookups (by default, and also in the Pulsar Functions Worker): a class that the parent can resolve is never loaded from the .nar file.
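The parent-first behaviour can be illustrated with a small, self-contained Java sketch. It uses only the standard `java.net.URLClassLoader` and is not Pulsar's actual NAR classloader. The class name `ParentFirstDemo` and the method `resolvesFromParent` are hypothetical. The child loader is given its own "copy" of a class that the parent already provides, yet the class still comes from the parent.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.security.CodeSource;

public class ParentFirstDemo {

    /** Returns true when the child loader hands back the parent's copy of this class. */
    public static boolean resolvesFromParent() {
        ClassLoader parent = ParentFirstDemo.class.getClassLoader();
        // Where this class was loaded from (directory or jar); may be absent in some launchers.
        CodeSource source = ParentFirstDemo.class.getProtectionDomain().getCodeSource();
        URL[] childUrls = (source == null || source.getLocation() == null)
                ? new URL[0]
                : new URL[] {source.getLocation()};
        // The child could load the class itself, but the default delegation asks the parent first.
        try (URLClassLoader child = new URLClassLoader(childUrls, parent)) {
            Class<?> viaChild = child.loadClass(ParentFirstDemo.class.getName());
            return viaChild == ParentFirstDemo.class;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("resolved from parent: " + resolvesFromParent());
    }
}
```

By analogy, a jar bundled in a .nar file that duplicates a class already visible to the worker's system classloader only adds to the file size, assuming the worker keeps the default delegation order described above.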
   
   **Additional context**
   
   Reducing the size of the Pulsar IO .nar files would also help reduce the pulsar-all Docker image size. The Pulsar (core) build would benefit as well, although PIP-62 covers moving the Pulsar IO connectors from the apache/pulsar repository to apache/pulsar-connectors.

