[
https://issues.apache.org/jira/browse/MAHOUT-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258255#comment-16258255
]
Pat Ferrel edited comment on MAHOUT-2023 at 11/18/17 11:32 PM:
---------------------------------------------------------------
OK, now that MAHOUT-2020 is resolved, I looked at the scopt issue and found:
* All the correct scopt artifacts exist in remote repos for all Scala versions
and they are being found by the Mahout build.
* The artifact IDs etc. are correct, as per the above.
* I checked all the tagged versions of Mahout back to 0.12.0. I'm not sure when
the drivers stopped working, but there has been no change to any reference to
scopt in any POM. Since people have been using the drivers and asking questions
on the mailing list, I will assume that the drivers worked up until the last
build changes.
* The ViennaCL and Java-to-C bindings are in the assembly POM, so these classes
are getting to the Spark Executors.
* I've checked compute-classpath.sh and the mahout script where changes were
small and not relevant.
* I've looked at the contents of the mahout*dependency-reduced.jar, which
should have the things listed below, and it does not. It only had guava, Apache
Commons and fastutil. It is supposed to have:
<dependencySets>
  <dependencySet>
    <unpack>true</unpack>
    <unpackOptions>
      <!-- MAHOUT-1126 -->
      <excludes>
        <exclude>META-INF/LICENSE</exclude>
      </excludes>
    </unpackOptions>
    <scope>runtime</scope>
    <outputDirectory>/</outputDirectory>
    <useTransitiveFiltering>true</useTransitiveFiltering>
    <includes>
      <!-- guava only included to get Preconditions in mahout-math and
           mahout-hdfs -->
      <include>com.google.guava:guava</include>
      <include>com.github.scopt_${scala.compat.version}</include>
      <include>com.tdunning:t-digest</include>
      <include>org.apache.commons:commons-math3</include>
      <include>it.unimi.dsi:fastutil</include>
      <include>org.apache.mahout:mahout-native-viennacl_${scala.compat.version}</include>
      <include>org.apache.mahout:mahout-native-viennacl-omp_${scala.compat.version}</include>
      <include>org.bytedeco:javacpp</include>
    </includes>
  </dependencySet>
</dependencySets>
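For anyone who wants to repeat the jar check from the last bullet, here is a minimal sketch in Python. The function name `missing_libraries` and the package prefixes in `EXPECTED_PREFIXES` are my assumptions about where each library's classes usually live, not something taken from the Mahout build. The ViennaCL modules are left out because I am not sure of their package names.

```python
import sys
import zipfile

# Assumed top-level package paths for the libraries named in the assembly
# descriptor. These prefixes are illustrative guesses, not read from any POM.
EXPECTED_PREFIXES = {
    "guava": "com/google/common/",
    "scopt": "scopt/",
    "t-digest": "com/tdunning/math/stats/",
    "commons-math3": "org/apache/commons/math3/",
    "fastutil": "it/unimi/dsi/fastutil/",
    "javacpp": "org/bytedeco/javacpp/",
}


def missing_libraries(jar_path, expected=EXPECTED_PREFIXES):
    """Return the sorted names in `expected` that have no entries in the jar."""
    with zipfile.ZipFile(jar_path) as jar:
        entries = jar.namelist()
    return sorted(
        name
        for name, prefix in expected.items()
        if not any(entry.startswith(prefix) for entry in entries)
    )


if __name__ == "__main__":
    # Usage: python check_jar.py path/to/some-dependency-reduced.jar
    for jar_path in sys.argv[1:]:
        print(jar_path, "is missing:", missing_libraries(jar_path) or "nothing")
```

If scopt shows up in the "missing" list for the dependency-reduced jar, that matches what I saw by inspecting the jar by hand.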
This all leads me to believe that something in the build no longer makes that
dependency-reduced.jar available to the Java Driver code, since the other libs
in the assembly are probably all Hadoop or Spark Executor code, not needed in
the Mahout driver. This is likely a side effect of the build refactoring.
[~rawkintrevo_apache] does the "dependency-reduced.jar", which should contain
scopt, get its scala.compat.version resolved? The jar seems to be missing
everything that uses scala.compat.version, but this may be a red herring.
> Drivers broken, scopt classes not found
> ---------------------------------------
>
> Key: MAHOUT-2023
> URL: https://issues.apache.org/jira/browse/MAHOUT-2023
> Project: Mahout
> Issue Type: Bug
> Components: build
> Affects Versions: 0.13.1
> Environment: any
> Reporter: Pat Ferrel
> Assignee: Pat Ferrel
> Priority: Blocker
> Fix For: 0.13.1
>
>
> Type `mahout spark-itemsimilarity` after Mahout is installed properly and you
> get a fatal exception due to missing scopt classes.
> Probably a build issue related to incorrect versions of scopt being looked
> for.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)