Hi Urs,

Thank you very much for your advice; I will look into excluding those files directly during the assembly.
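For reference, here is a minimal sketch of what that exclusion could look like with sbt-assembly's merge strategies (assuming the sbt 0.13-era "assemblyMergeStrategy in assembly" syntax; the pattern only targets signature files sitting directly under META-INF):

    // build.sbt: discard code-signing artifacts (.SF/.DSA/.RSA) that
    // break manifest verification in the assembled fat jar
    assemblyMergeStrategy in assembly := {
      case PathList("META-INF", file)
          if file.endsWith(".SF") || file.endsWith(".DSA") || file.endsWith(".RSA") =>
        MergeStrategy.discard
      case x =>
        // fall back to sbt-assembly's default strategy for everything else
        val oldStrategy = (assemblyMergeStrategy in assembly).value
        oldStrategy(x)
    }

With something like this in place, sbt assembly should produce a fat jar without the broken signing entries.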
2017-09-25 10:58 GMT+02:00 Urs Schoenenberger <urs.schoenenber...@tngtech.com>:

> Hi Federico,
>
> oh, I remember running into this problem some time ago. If I recall
> correctly, this is not a Flink issue, but an issue with technically
> incorrect jars from dependencies which prevent the verification of the
> manifest. I was using the maven-shade plugin back then and configured an
> exclusion for these file types. I assume that sbt/sbt-assembly has a
> similar option; this should be more stable than manually stripping the jar.
> Alternatively, you could try to find out which dependency puts the
> .SF/etc. files there and exclude that dependency altogether: it might be
> a transitive lib dependency that comes with hadoop anyway, or simply one
> that you don't need.
>
> Best,
> Urs
>
> On 25.09.2017 10:09, Federico D'Ambrosio wrote:
> > Hi Urs,
> >
> > Yes, the main class is set, just like you said.
> >
> > Still, I might have managed to get it working: during the assembly, some
> > .SF, .DSA and .RSA files are put inside the META-INF folder of the jar,
> > possibly coming from some of the new dependencies in the deps tree.
> > Apparently, this caused this weird issue. Using an appropriate pattern
> > for discarding the files during the assembly, or removing them via
> > zip -d, should be enough (I sure hope so, since this is one of the
> > worst issues I've come across).
> >
> > Federico D'Ambrosio
> >
> > On 25 Sep 2017 9:51 AM, "Urs Schoenenberger"
> > <urs.schoenenber...@tngtech.com> wrote:
> >
> >> Hi Federico,
> >>
> >> just guessing, but are you explicitly setting the Main-Class manifest
> >> attribute for the jar that you are building?
> >>
> >> Should be something like
> >>
> >> mainClass in (Compile, packageBin) :=
> >>   Some("org.yourorg.YourFlinkJobMainClass")
> >>
> >> Best,
> >> Urs
> >>
> >> On 23.09.2017 17:53, Federico D'Ambrosio wrote:
> >>> Hello everyone,
> >>>
> >>> I'd like to submit to you this weird issue I'm having, hoping you
> >>> could help me.
> >>> Premise: I'm using sbt 0.13.6 for building, Scala 2.11.8, and Flink
> >>> 1.3.2 compiled from sources against hadoop 2.7.3.2.6.1.0-129 (HDP 2.6).
> >>> I'm trying to implement a sink for Hive, so I added the following
> >>> dependency in my build.sbt:
> >>>
> >>> "org.apache.hive.hcatalog" % "hive-hcatalog-streaming" %
> >>>   "1.2.1000.2.6.1.0-129"
> >>>
> >>> in order to use Hive streaming capabilities.
> >>>
> >>> After adding this dependency, without even using it, if I try to
> >>> flink run the job I get
> >>>
> >>> org.apache.flink.client.program.ProgramInvocationException: The
> >>> program's entry point class 'package.MainObj' was not found in the
> >>> jar file.
> >>>
> >>> If I remove the dependency, everything goes back to normal.
> >>> What is weird is that if I try to use sbt run in order to run the
> >>> job, *it does find the Main class* and obviously crashes because of
> >>> the missing Flink core dependencies (AbstractStateBackend missing
> >>> and whatnot).
> >>>
> >>> Here are the complete dependencies of the project:
> >>>
> >>> "org.apache.flink" %% "flink-scala" % flinkVersion % "provided",
> >>> "org.apache.flink" %% "flink-streaming-scala" % flinkVersion % "provided",
> >>> "org.apache.flink" %% "flink-connector-kafka-0.10" % flinkVersion,
> >>> "org.apache.flink" %% "flink-cep-scala" % flinkVersion,
> >>> "org.apache.hive.hcatalog" % "hive-hcatalog-streaming" %
> >>>   "1.2.1000.2.6.1.0-129",
> >>> "org.joda" % "joda-convert" % "1.8.3",
> >>> "com.typesafe.play" %% "play-json" % "2.6.2",
> >>> "org.mongodb.mongo-hadoop" % "mongo-hadoop-core" % "2.0.2",
> >>> "org.scalactic" %% "scalactic" % "3.0.1",
> >>> "org.scalatest" %% "scalatest" % "3.0.1" % "test",
> >>> "de.javakaffee" % "kryo-serializers" % "0.42"
> >>>
> >>> Could it be an issue of dependency conflicts between the mongo-hadoop
> >>> and Hive hadoop versions (2.7.1 and 2.7.3.2.6.1.0-129 respectively,
> >>> even though there is no issue between mongo-hadoop and Flink)? I'm
> >>> even starting to think that Flink cannot handle big jars that well
> >>> when it comes to classpath loading (before the new dependency the
> >>> jar was 44M, afterwards it became 115M).
> >>>
> >>> Any help would be really appreciated,
> >>> Kind regards,
> >>> Federico
> >>
> >> --
> >> Urs Schönenberger - urs.schoenenber...@tngtech.com
> >>
> >> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
> >> Geschäftsführer: Henrik Klagges, Dr. Robert Dahlke, Gerhard Müller
> >> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>
> --
> Urs Schönenberger - urs.schoenenber...@tngtech.com
> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
> Geschäftsführer: Henrik Klagges, Dr. Robert Dahlke, Gerhard Müller
> Sitz: Unterföhring * Amtsgericht München * HRB 135082
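As for the alternative Urs mentioned, excluding the offending dependency altogether, a build.sbt sketch could look roughly like the following. This is hypothetical: "org.example" / "signed-lib" is only a placeholder, since the artifact actually shipping the signed entries was never identified in this thread.

    // hypothetical sketch: exclude a placeholder transitive dependency
    // that ships signed (.SF/.DSA/.RSA) jar entries
    "org.apache.hive.hcatalog" % "hive-hcatalog-streaming" % "1.2.1000.2.6.1.0-129" excludeAll (
      ExclusionRule(organization = "org.example", name = "signed-lib")
    )

Checking which dependency jars contain META-INF signature entries would reveal the real culprit to name in the exclusion rule.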