Hi Federico,

oh, I remember running into this problem some time ago. If I recall
correctly, this is not a Flink issue, but an issue with technically
incorrect jars from dependencies: their leftover signature files prevent
the verification of the manifest. I was using the maven-shade plugin
back then and configured an exclusion for these file types. I assume
that sbt-assembly has a similar option; that should be more robust than
manually stripping the jar.
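
In sbt-assembly that would be a merge strategy in build.sbt, roughly
like the following (a sketch from memory, untested, so please check it
against the docs of your sbt-assembly version):

assemblyMergeStrategy in assembly := {
  // drop leftover signature files so the merged jar passes verification
  case PathList("META-INF", xs @ _*)
      if xs.lastOption.exists(f =>
        f.endsWith(".SF") || f.endsWith(".DSA") || f.endsWith(".RSA")) =>
    MergeStrategy.discard
  case x =>
    // delegate everything else to the default strategy
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}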

Alternatively, you could try to find out which dependency puts the
.SF/.DSA/.RSA files there and exclude that dependency altogether; it
might be a transitive dependency that comes with Hadoop anyway, or
simply one that you don't need.
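
If you go that route, the sbt syntax is along these lines; note that
org.apache.hadoop below is only a hypothetical culprit, so check the
actual dependency tree first (e.g. with the sbt-dependency-graph
plugin) to see which module actually ships the signed jars:

libraryDependencies += ("org.apache.hive.hcatalog" %
  "hive-hcatalog-streaming" % "1.2.1000.2.6.1.0-129").excludeAll(
    // hypothetical: replace with whichever module ships the signed jars
    ExclusionRule(organization = "org.apache.hadoop"))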

Best,
Urs

On 25.09.2017 10:09, Federico D'Ambrosio wrote:
> Hi Urs,
> 
> Yes, the main class is set, just like you said.
> 
> Still, I might have managed to get it working: during the assembly, some
> .SF, .DSA and .RSA files are put inside the META-INF folder of the jar,
> possibly coming from some of the new dependencies in the dependency tree.
> Apparently, this caused this weird issue. Using an appropriate pattern to
> discard the files during the assembly, or removing them via zip -d, should
> be enough (I sure hope so, since this is one of the worst issues I've come
> across).
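> 
> For reference, the manual variant would be something like this (the jar
> path is just illustrative):
> 
> zip -d target/scala-2.11/myjob-assembly.jar 'META-INF/*.SF' 'META-INF/*.DSA' 'META-INF/*.RSA'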
> 
> 
> Federico D'Ambrosio
> 
> On 25 Sep 2017, 9:51 AM, "Urs Schoenenberger" <urs.schoenenber...@tngtech.com>
> wrote:
> 
>> Hi Federico,
>>
>> just guessing, but are you explicitly setting the Main-Class manifest
>> attribute for the jar that you are building?
>>
>> Should be something like
>>
>> mainClass in (Compile, packageBin) :=
>> Some("org.yourorg.YourFlinkJobMainClass")
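>>
>> To double-check that it took effect, you can inspect the manifest of
>> the built jar, e.g. with: unzip -p yourjob.jar META-INF/MANIFEST.MF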
>>
>> Best,
>> Urs
>>
>>
>> On 23.09.2017 17:53, Federico D'Ambrosio wrote:
>>> Hello everyone,
>>>
>>> I'd like to submit to you this weird issue I'm having, hoping you could
>>> help me.
>>> Premise: I'm using sbt 0.13.6 for building, Scala 2.11.8, and Flink 1.3.2
>>> compiled from sources against Hadoop 2.7.3.2.6.1.0-129 (HDP 2.6).
>>> I'm trying to implement a sink for Hive, so I added the following
>>> dependency in my build.sbt:
>>>
>>> "org.apache.hive.hcatalog" % "hive-hcatalog-streaming" %
>>> "1.2.1000.2.6.1.0-129"
>>>
>>> in order to use Hive streaming capabilities.
>>>
>>> After importing this dependency, without even using it, if I try to
>>> flink run the job I get:
>>>
>>> org.apache.flink.client.program.ProgramInvocationException: The
>>> program's entry point class 'package.MainObj' was not found in the
>>> jar file.
>>>
>>> If I remove the dependency, everything goes back to normal.
>>> What is weird is that if I use sbt run to run the job, *it does find
>>> the Main class* and then obviously crashes because of the missing
>>> Flink core dependencies (AbstractStateBackend missing and whatnot).
>>>
>>> Here are the complete dependencies of the project:
>>>
>>> "org.apache.flink" %% "flink-scala" % flinkVersion % "provided",
>>> "org.apache.flink" %% "flink-streaming-scala" % flinkVersion %
>> "provided",
>>> "org.apache.flink" %% "flink-connector-kafka-0.10" % flinkVersion,
>>> "org.apache.flink" %% "flink-cep-scala" % flinkVersion,
>>> "org.apache.hive.hcatalog" % "hive-hcatalog-streaming" %
>>> "1.2.1000.2.6.1.0-129",
>>> "org.joda" % "joda-convert" % "1.8.3",
>>> "com.typesafe.play" %% "play-json" % "2.6.2",
>>> "org.mongodb.mongo-hadoop" % "mongo-hadoop-core" % "2.0.2",
>>> "org.scalactic" %% "scalactic" % "3.0.1",
>>> "org.scalatest" %% "scalatest" % "3.0.1" % "test",
>>> "de.javakaffee" % "kryo-serializers" % "0.42"
>>>
>>> Could it be a dependency conflict between the mongo-hadoop and Hive
>>> Hadoop versions (2.7.1 and 2.7.3.2.6.1.0-129 respectively, even though
>>> mongo-hadoop alone caused no issue with Flink)? I'm even starting to
>>> think that Flink cannot handle big jars that well when it comes to
>>> classpath loading (before the new dependency the jar was 44M,
>>> afterwards it became 115M).
>>>
>>> Any help would be really appreciated,
>>> Kind regards,
>>> Federico
>>
>> --
>> Urs Schönenberger - urs.schoenenber...@tngtech.com
>>
>> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>> Geschäftsführer: Henrik Klagges, Dr. Robert Dahlke, Gerhard Müller
>> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>>

-- 
Urs Schönenberger - urs.schoenenber...@tngtech.com
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Dr. Robert Dahlke, Gerhard Müller
Sitz: Unterföhring * Amtsgericht München * HRB 135082
