Ouch. Someone runs into this every few months. Beam SQL has this pattern in a few different places. I really wish we could fix this, but it is a hard problem. There was a nice thread describing why over on dev:
https://lists.apache.org/thread.html/b5a37ef32d892fdecd1fc0b16b24fdc934cd2b0c0c77c193431739e0@%3Cdev.beam.apache.org%3E

Andrew

On Mon, Dec 10, 2018 at 3:58 AM Matt Casters <[email protected]> wrote:

> So, on the off-chance someone else bumps into this.
> The actual fix for the error I got:
>
>   java.lang.IllegalArgumentException: No filesystem found for scheme
>
> was for me to set the correct ClassLoader for the current Thread:
>
>   ClassLoader oldContextClassLoader =
>       Thread.currentThread().getContextClassLoader();
>   try {
>     Thread.currentThread().setContextClassLoader( yourClassLoader );
>     // Build/Run Pipeline
>   } finally {
>     Thread.currentThread().setContextClassLoader( oldContextClassLoader );
>   }
>
> It allows org.apache.beam.sdk.io.FileSystems to pick up the correct
> classloader.
>
> HTH,
> Matt
> ---
> Matt Casters <[email protected]>
> Senior Solution Architect, Kettle Project Founder
>
> On Fri, Nov 30, 2018 at 12:51, Matt Casters <[email protected]> wrote:
>
>> I just wanted to thank you again. I split up my project into Beam core
>> stuff and my plugin. This got rid of a number of circular dependency
>> issues and lib conflicts.
>> I also gave the Dataflow PipelineOptions the list of files to stage.
>>
>> That has made things work, and much quicker than I anticipated, I must
>> admit.
>> I'm in awe of how clean and intuitive the Beam API is (once you get the
>> hang of it).
>> Thanks for everything!
>>
>> https://github.com/mattcasters/kettle-beam-core
>> https://github.com/mattcasters/kettle-beam
>>
>> Cheers,
>>
>> Matt
>> ---
>> Matt Casters <[email protected]>
>> Senior Solution Architect, Kettle Project Founder
>>
>> On Thu, Nov 29, 2018 at 19:03, Matt Casters <[email protected]> wrote:
>>
>>> Thanks a lot for the replies. The problem is not that the jar files
>>> aren't in the classloader, it's that something somewhere insists on using
>>> the parent classloader.
>>> I guess it makes sense, since I noticed that when running in IDEA, Beam
>>> copied all required runtime binaries into GCP Storage, so it must have an
>>> idea of what to pick up.
>>> I'm guessing it tries to pick up everything in the classpath.
>>>
>>> Throwing all the generated Maven jar files into the main classpath of
>>> Kettle in this case is a bit messy, so I'm going to look for an
>>> alternative, like an application alongside to communicate with.
>>>
>>> I'll report back once I get a bit further along.
>>>
>>> Cheers,
>>> Matt
>>>
>>> On Thu, Nov 29, 2018 at 17:10, Juan Carlos Garcia <[email protected]> wrote:
>>>
>>>> If you are using Gradle for packaging, make sure your final jar
>>>> (fat-jar) contains all the service files merged.
>>>>
>>>> With the Gradle shadowJar plugin, include the "mergeServiceFiles()"
>>>> instruction, like:
>>>>
>>>>   apply plugin: 'com.github.johnrengelman.shadow'
>>>>   shadowJar {
>>>>     mergeServiceFiles()
>>>>
>>>>     zip64 true
>>>>     classifier = 'bundled'
>>>>   }
>>>>
>>>> If you are using Maven then use the Shade plugin.
>>>>
>>>> On Thu, Nov 29, 2018 at 4:50 PM Robert Bradshaw <[email protected]>
>>>> wrote:
>>>>
>>>>> Beam Java uses com.google.auto.service.AutoService which, at the end of
>>>>> the day, is shorthand for Java's standard ServiceLoader mechanisms
>>>>> (e.g. see [1]). I'm not an expert on the details of how this works,
>>>>> but you'll probably have to make sure these filesystem dependencies
>>>>> are in your custom classloader's jar.
>>>>>
>>>>> [1]
>>>>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/LocalFileSystemRegistrar.java
>>>>>
>>>>> On Thu, Nov 29, 2018 at 3:57 PM Matt Casters <[email protected]>
>>>>> wrote:
>>>>> >
>>>>> > Hello Beam,
>>>>> >
>>>>> > I've been taking great steps forward in having Kettle generate Beam
>>>>> > pipelines and they actually execute just fine in unit testing in IntelliJ.
>>>>> > The problem starts when I collect all the libraries needed for Beam
>>>>> > and the Runners and throw them into the Kettle project as a plugin.
>>>>> >
>>>>> > Caused by: java.lang.IllegalArgumentException: No filesystem found for scheme gs
>>>>> >     at org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:456)
>>>>> >     at org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:526)
>>>>> >     at org.apache.beam.sdk.io.FileBasedSink.convertToFileResourceIfPossible(FileBasedSink.java:213)
>>>>> >     at org.apache.beam.sdk.io.TextIO$TypedWrite.to(TextIO.java:700)
>>>>> >     at org.apache.beam.sdk.io.TextIO$Write.to(TextIO.java:1028)
>>>>> >     at org.kettle.beam.core.transform.BeamOutputTransform.expand(BeamOutputTransform.java:87)
>>>>> >     ... 32 more
>>>>> >
>>>>> > This also happens for local file execution ("scheme file" in that
>>>>> > case).
>>>>> >
>>>>> > So the questions are: how is Beam bootstrapped? How does Beam
>>>>> > determine which libraries to use and what is the recommended way for
>>>>> > packaging things up properly?
>>>>> > The Beam plugin is running in a separate URLClassloader so I think
>>>>> > something is going awry there.
>>>>> >
>>>>> > Thanks a lot for any answers or tips you might have!
>>>>> >
>>>>> > Matt
>>>>> > ---
>>>>> > Matt Casters <[email protected]>
>>>>> > Senior Solution Architect, Kettle Project Founder
>>>>
>>>> --
>>>>
>>>> JC
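
For anyone who lands here from a search: Matt's try/finally fix quoted above can be wrapped in a small helper so the context class loader is always restored. This is only a sketch under assumptions. ContextClassLoaderScope and its call method are hypothetical names and not part of Beam, and the empty URLClassLoader in main is just a stand-in for a real plugin class loader. It relies on the standard behaviour that ServiceLoader.load(Class) looks services up through the thread's context class loader, which is presumably why the fix helps here.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Beam): runs code under a temporary context class loader. */
final class ContextClassLoaderScope {
  private ContextClassLoaderScope() {}

  /** Sets the given loader as the thread's context class loader, runs body, then restores the old one. */
  static <T> T call(ClassLoader loader, Supplier<T> body) {
    Thread thread = Thread.currentThread();
    ClassLoader previous = thread.getContextClassLoader();
    thread.setContextClassLoader(loader);
    try {
      // In the Kettle case, this is where the pipeline would be built and run.
      return body.get();
    } finally {
      // Always restore, even if body throws.
      thread.setContextClassLoader(previous);
    }
  }

  public static void main(String[] args) {
    ClassLoader before = Thread.currentThread().getContextClassLoader();
    // Stand-in for the plugin's class loader (assumption: a real one would list the plugin jars).
    ClassLoader pluginLoader = new URLClassLoader(new URL[0], before);

    ClassLoader seen =
        call(pluginLoader, () -> Thread.currentThread().getContextClassLoader());

    System.out.println("plugin loader active inside call: " + (seen == pluginLoader));
    System.out.println(
        "original loader restored afterwards: "
            + (Thread.currentThread().getContextClassLoader() == before));
  }
}
```

Usage would be something like ContextClassLoaderScope.call(yourClassLoader, () -> buildAndRunPipeline()), where buildAndRunPipeline is whatever method of yours constructs and runs the pipeline.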
