[ 
https://issues.apache.org/jira/browse/BEAM-10776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17182033#comment-17182033
 ] 

Luke Cwik edited comment on BEAM-10776 at 8/21/20, 5:22 PM:
------------------------------------------------------------

Typically all jars on the classpath are staged, since we have no way to know 
whether a given jar will be needed during execution.
Which Gradle configuration is being used for the classpath (./gradlew 
:path:to:project:dependencies)?
Which JDK version are you using? (If Java 11, is JPMS enabled?)
How is the JDK being launched?
Is it a separate process?
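
To help answer the questions above, something like the following could be 
run with the same JVM and classpath as the expansion service. It is only a 
diagnostic sketch: the class ClasspathReport and its entries method are 
hypothetical names, not part of Beam. It prints the Java version, java.home, 
the java.ext.dirs property (which is expected to be null on Java 9 and later) 
and each classpath entry.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper; not part of Beam.
class ClasspathReport {

  /** Splits a classpath string on the given separator, skipping empty entries. */
  static List<String> entries(String classpath, String separator) {
    List<String> result = new ArrayList<>();
    for (String entry : classpath.split(Pattern.quote(separator))) {
      if (!entry.isEmpty()) {
        result.add(entry);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Standard JVM system properties describing the running JVM.
    System.out.println("java.version  = " + System.getProperty("java.version"));
    System.out.println("java.home     = " + System.getProperty("java.home"));
    // Extension directories; this property may be null on Java 9 and later.
    System.out.println("java.ext.dirs = " + System.getProperty("java.ext.dirs"));

    String classpath = System.getProperty("java.class.path", "");
    for (String entry : entries(classpath, File.pathSeparator)) {
      System.out.println("classpath entry: " + entry);
    }
  }
}
```

If the unwanted jars do not show up as classpath entries but their directory 
is listed under java.ext.dirs, that would suggest they come from the JVM's 
extension directories rather than from the Gradle configuration.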

The default is controlled by 
[ClasspathScanningResourcesDetector|https://github.com/apache/beam/blob/6b472e1de8ba5769127f6c330a23cc7c0af80527/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/resources/ClasspathScanningResourcesDetector.java#L31]
 and is configurable through 
[PipelineResourcesOptions|https://github.com/apache/beam/blob/26f6dd58b9fe608476ccc33601b2e26fc0343080/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/resources/PipelineResourcesOptions.java#L29]
 if you need to change it.

I looked at some of the jars, and they don't appear to be the JDK itself but 
additional deps such as Nashorn (a JavaScript engine for the JDK).
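
One possible way to filter such jars is to drop every detected resource that 
lives under the JVM's home directory. The sketch below shows only that 
filtering step. JdkJarFilter and dropJarsUnder are hypothetical names, not 
Beam API, and how a custom detector would call it is left open. It also 
assumes the unwanted jars sit under java.home, so jars in system-wide 
extension directories elsewhere would not be caught.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of Beam.
class JdkJarFilter {

  /** Returns the resources that are NOT located under the given JDK home directory. */
  static List<String> dropJarsUnder(List<String> resources, String jdkHome) {
    Path home = Paths.get(jdkHome).toAbsolutePath().normalize();
    List<String> kept = new ArrayList<>();
    for (String resource : resources) {
      Path path = Paths.get(resource).toAbsolutePath().normalize();
      // Path.startsWith compares whole path components, so a sibling
      // directory such as "/opt/jdk/jre-extras" does not match "/opt/jdk/jre".
      if (!path.startsWith(home)) {
        kept.add(resource);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // Example run against the current JVM's own classpath and java.home.
    List<String> classpath =
        Arrays.asList(System.getProperty("java.class.path", "").split(File.pathSeparator));
    for (String kept : dropJarsUnder(classpath, System.getProperty("java.home"))) {
      System.out.println("would stage: " + kept);
    }
  }
}
```

The comparison is done on normalized absolute paths rather than on jar names, 
so it does not depend on a hard-coded list such as nashorn or jfxrt.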




> Unwanted JDK jars staged when running cross-language pipelines
> --------------------------------------------------------------
>
>                 Key: BEAM-10776
>                 URL: https://issues.apache.org/jira/browse/BEAM-10776
>             Project: Beam
>          Issue Type: Bug
>          Components: cross-language
>            Reporter: Chamikara Madhusanka Jayalath
>            Priority: P2
>
> When running cross-language Kafka on Dataflow I see the following jars being 
> staged.
> INFO:apache_beam.runners.dataflow.internal.apiclient:Starting GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/nashorn-BJZNQ7N8Lsfq-WSM0IMsRCwFMC3RIxBOEjrlB1YwKOw.jar...
> INFO:apache_beam.runners.dataflow.internal.apiclient:Completed GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/nashorn-BJZNQ7N8Lsfq-WSM0IMsRCwFMC3RIxBOEjrlB1YwKOw.jar
>  in 40 seconds.
> INFO:apache_beam.runners.dataflow.internal.apiclient:Starting GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/cldrdata-aZ6XIS6LfPilqVFbS_bWm1wMWGm3jxtjh0vjlRuqp5M.jar...
> INFO:apache_beam.runners.dataflow.internal.apiclient:Completed GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/cldrdata-aZ6XIS6LfPilqVFbS_bWm1wMWGm3jxtjh0vjlRuqp5M.jar
>  in 177 seconds.
> INFO:apache_beam.runners.dataflow.internal.apiclient:Starting GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/jfxrt-B2UJQqvuEI-15FPV1mcdw80YRUIDMg1Kr82FxWK_DZ8.jar...
> INFO:apache_beam.runners.dataflow.internal.apiclient:Completed GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/jfxrt-B2UJQqvuEI-15FPV1mcdw80YRUIDMg1Kr82FxWK_DZ8.jar
>  in 285 seconds.
> INFO:apache_beam.runners.dataflow.internal.apiclient:Starting GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/dnsns-zNxWyUaaHIkUFJRt-aNZudjc3eroySNUeRkxdxidGbY.jar...
> INFO:apache_beam.runners.dataflow.internal.apiclient:Completed GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/dnsns-zNxWyUaaHIkUFJRt-aNZudjc3eroySNUeRkxdxidGbY.jar
>  in 0 seconds.
> INFO:apache_beam.runners.dataflow.internal.apiclient:Starting GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/localedata-Wt0bN9j6XmIH4BaRLouHZX6p6iIoQsbZ2AkomxZTOYM.jar...
> INFO:apache_beam.runners.dataflow.internal.apiclient:Completed GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/localedata-Wt0bN9j6XmIH4BaRLouHZX6p6iIoQsbZ2AkomxZTOYM.jar
>  in 16 seconds.
> INFO:apache_beam.runners.dataflow.internal.apiclient:Starting GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/jaccess-5wlKULhaKWM_gmKVtH_QBwVqH4awlxxRdNNfz0z0Imw.jar...
> INFO:apache_beam.runners.dataflow.internal.apiclient:Completed GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/jaccess-5wlKULhaKWM_gmKVtH_QBwVqH4awlxxRdNNfz0z0Imw.jar
>  in 0 seconds.
> INFO:apache_beam.runners.dataflow.internal.apiclient:Starting GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/MRJToolkit-jU5qhDBc0cNjn7g3yrGHYO78BRC09T-sE8Syqo9mRjg.jar...
> INFO:apache_beam.runners.dataflow.internal.apiclient:Completed GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/MRJToolkit-jU5qhDBc0cNjn7g3yrGHYO78BRC09T-sE8Syqo9mRjg.jar
>  in 0 seconds.
> INFO:apache_beam.runners.dataflow.internal.apiclient:Starting GCS upload to 
> gs://clouddfe-chamikara/temp/kafka-taxi-20200820-132559.1597955225.717180/beam-sdks-java-io-expansion-service-2.24.0-SNAPSHOT-A94br32q87Prj7b_mG4_kPEdz9NSJ-0NwgHWEwwU4Qc.jar...
>  
> Out of these we just need 
> 'beam-sdks-java-io-expansion-service-2.24.0-SNAPSHOT-A94br32q87Prj7b_mG4_kPEdz9NSJ-0NwgHWEwwU4Qc.jar'.
>  The rest seem to be staged because we include all jars from the classpath 
> in the expansion service response.
>  
> [https://github.com/apache/beam/blob/master/sdks/java/expansion-service/src/main/java/org/apache/beam/sdk/expansion/service/ExpansionService.java#L407]
>  
> We should figure out a way to filter out these additional jars.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)