[ 
https://issues.apache.org/jira/browse/MAHOUT-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17065539#comment-17065539
 ] 

Andrew Palumbo edited comment on MAHOUT-2099 at 3/24/20, 11:56 AM:
-------------------------------------------------------------------

[~tariqjawed83] It may be a question of your configuration. Could you please add 
the mahout-hdfs module to your classpath in the pom.xml?

Are you able to remove the kryo dependency (unless you need that version of 
kryo for something else in your project)? It may conflict with the kryo version 
brought in by Mahout v0.13.0; you can check which kryo version Mahout pulls in 
transitively with mvn dependency:tree.

The Mahout library's kryo version will be pinned to that of the cluster that it 
is built against.

pom.xml:

add: 
  
{code:java}
<!-- https://mvnrepository.com/artifact/org.apache.mahout/mahout-hdfs -->
<dependency>
    <groupId>org.apache.mahout</groupId>
    <artifactId>mahout-hdfs</artifactId>
    <version>0.13.0</version>
</dependency> 
{code}
 

 
remove:
{code:java}
<dependency>
    <groupId>com.esotericsoftware</groupId>
    <artifactId>kryo</artifactId>
    <version>5.0.0-RC5</version>
</dependency>
{code}
That dependency provides the org.apache.mahout.math.VectorWritable class, among others.
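
For reference, the "not serializable result: org.apache.mahout.math.DenseVector" 
error in the description is what you see when Spark falls back to Java 
serialization for Mahout's math types. mahoutSparkContext() normally sets the 
Kryo properties on the SparkConf for you; when wrapping an existing context with 
sc2sdc() they have to be set before the SparkContext is created. A rough sketch 
(the MahoutKryoRegistrator class name below is what mahout-spark 0.13.0 ships as 
far as I know; verify it against your jar):
{code}
// Sketch only: configure Kryo before the SparkContext is created, assuming
// org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator is present in
// your mahout-spark jar.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.mahout.sparkbindings._

val conf = new SparkConf()
  .setMaster("local")
  .setAppName("CooccurrenceDriver")
  // Mahout's DenseVector etc. are not Java-serializable; Kryo is required.
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrator",
    "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator")

val sc = new SparkContext(conf)
// Wrap the plain SparkContext in Mahout's distributed context.
implicit val msc: SparkDistributedContext = sc2sdc(sc)
{code}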



> Using Mahout as a Library in Spark Cluster
> ------------------------------------------
>
>                 Key: MAHOUT-2099
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-2099
>             Project: Mahout
>          Issue Type: Question
>          Components: cooccurrence, Math
>         Environment: Spark version 2.3.0.2.6.5.10-2
>  
> [EDIT] AP
>            Reporter: Tariq Jawed
>            Priority: Major
>
> I have a Spark cluster already set up. This environment is not under my 
> direct control, but they do allow fat JARs to be installed with the 
> dependencies. I packaged my Spark application with some Mahout code for 
> SimilarityAnalysis, added the Mahout libraries to the POM file, and it 
> packages successfully.
> The problem, however, is that I get the error below when using the existing 
> SparkContext to build a distributed Spark context for Mahout.
> [EDIT] AP:
> {code:xml}
> pom.xml
> {...}
> <dependency>
>     <groupId>org.apache.mahout</groupId>
>     <artifactId>mahout-math</artifactId>
>     <version>0.13.0</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.mahout</groupId>
>     <artifactId>mahout-math-scala_2.10</artifactId>
>     <version>0.13.0</version>
> </dependency>
> <dependency>
>     <groupId>org.apache.mahout</groupId>
>     <artifactId>mahout-spark_2.10</artifactId>
>     <version>0.13.0</version>
> </dependency>
> <dependency>
>     <groupId>com.esotericsoftware</groupId>
>     <artifactId>kryo</artifactId>
>     <version>5.0.0-RC5</version>
> </dependency>
> {code}
>  
> Code:
> {code}
> implicit val sc: SparkContext = sparkSession.sparkContext
> implicit val msc: SparkDistributedContext = sc2sdc(sc)
> {code}
> Error:
> {code}
> ERROR TaskSetManager: Task 7.0 in stage 10.0 (TID 58) had a not serializable 
> result: org.apache.mahout.math.DenseVector
> {code}
> And if I try to build the context using mahoutSparkContext() instead, it 
> gives me an error that MAHOUT_HOME is not found.
> Code:
> {code}
> implicit val msc = mahoutSparkContext(masterUrl = "local", appName = 
> "CooccurrenceDriver")
> {code}
> Error:
> {code}
> MAHOUT_HOME is required to spawn mahout-based spark jobs
> {code}
> My question is: how do I proceed in this situation? Do I have to ask the 
> administrators of the Spark environment to install the Mahout library, or is 
> there any way I can proceed with packaging my application as a fat JAR?


