[ https://issues.apache.org/jira/browse/MAHOUT-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385902#comment-14385902 ]
ASF GitHub Bot commented on MAHOUT-1655:
----------------------------------------
GitHub user pferrel opened a pull request:
https://github.com/apache/mahout/pull/86
MAHOUT-1655
Refactors mr-legacy into mahout-hdfs and mahout-mr
Compiles and passes unit tests, and spark-shell launches, but
spark-itemsimilarity fails with the following error:
15/03/29 12:22:12 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://[email protected]:52857/user/HeartbeatReceiver
15/03/29 12:22:12 WARN BlockManager: Putting block broadcast_0 failed
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
    at org.apache.spark.util.collection.OpenHashSet.org$apache$spark$util$collection$OpenHashSet$$hashcode(OpenHashSet.scala:261)
    at org.apache.spark.util.collection.OpenHashSet$mcI$sp.getPos$mcI$sp(OpenHashSet.scala:165)
    at org.apache.spark.util.collection.OpenHashSet$mcI$sp.contains$mcI$sp(OpenHashSet.scala:102)
    at org.apache.spark.util.SizeEstimator$$anonfun$visitArray$2.apply$mcVI$sp(SizeEstimator.scala:214)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
    at org.apache.spark.util.SizeEstimator$.visitArray(SizeEstimator.scala:210)
    at org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:169)
    at org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:161)
    at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:155)
    at org.apache.spark.util.collection.SizeTracker$class.takeSample(SizeTracker.scala:78)
    at org.apache.spark.util.collection.SizeTracker$class.afterUpdate(SizeTracker.scala:70)
    at org.apache.spark.util.collection.SizeTrackingVector.$plus$eq(SizeTrackingVector.scala:31)
    at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:236)
    at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:126)
    at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:104)
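The NoSuchMethodError above says that the HashFunction class found at runtime has no hashInt(int) method returning HashCode. One plausible reading, not confirmed in this report, is that a different Guava version than the one Spark was compiled against is winning on the classpath. A minimal sketch for checking this is below. The class name ClasspathProbe and its two helpers are hypothetical and not part of Mahout or Spark, and the sketch assumes it is run with the same classpath as the failing spark-itemsimilarity launch.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Mahout or Spark.
public class ClasspathProbe {

    // True if the class can be loaded and exposes a public method
    // with the given name and parameter types.
    public static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            cls.getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    // Reports the jar or directory the class was loaded from, if known.
    public static String locationOf(String className) {
        try {
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Class and method named in the NoSuchMethodError above.
        String cls = "com.google.common.hash.HashFunction";
        System.out.println(cls + " loaded from: " + locationOf(cls));
        System.out.println("hashInt(int) present: " + hasMethod(cls, "hashInt", int.class));
    }
}
```

If the reported location is a jar other than the one Spark is expected to use and hashInt(int) is absent, that would point to a dependency conflict rather than a problem in the refactored classes themselves.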
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/pferrel/mahout MAHOUT-1655
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/mahout/pull/86.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #86
----
commit c783c0a91a5f6f1f279b77de6f52ccb39292b5e9
Author: Andrew Musselman <[email protected]>
Date:   2015-03-26T00:38:56Z

    Moving mrlegacy directory to hdfs, starting minimal mr directory, repointing references to mrlegacy everywhere.

commit 1dc3662ac2f548d6ecc784aababda806aa5e7578
Author: Andrew Musselman <[email protected]>
Date:   2015-03-26T14:24:44Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/mahout into MAHOUT-1655

commit 943d982f9ab08c1a9478ac40b02c4938091ec095
Author: Andrew Musselman <[email protected]>
Date:   2015-03-26T17:00:12Z

    Merge branch 'master' into MAHOUT-1655

commit 1c65f2f441e4f504cdb6d215397097a3296eac15
Author: Andrew Musselman <[email protected]>
Date:   2015-03-26T17:13:17Z

    Moving contents of hdfs over to mr.

commit 5c8e964991c88813a4cb8abe367245c3c7838246
Author: pferrel <[email protected]>
Date:   2015-03-29T16:44:02Z

    Merge branch 'MAHOUT-1655' of https://github.com/andrewmusselman/mahout into MAHOUT-1655

commit 2d940a04ad06cdeca23731355ef61f1d25d9d2d5
Author: pferrel <[email protected]>
Date:   2015-03-29T18:12:21Z

    moved classes into mahout-hdfs and created a dependency in mahout-mr for the module

commit 7cda0918c5bef50f2e32b70c3439fc3656d804f8
Author: pferrel <[email protected]>
Date:   2015-03-29T18:13:25Z

    added junits for moved classes

commit c2b18eebb9009bf0676566e8e9a30332441c2331
Author: pferrel <[email protected]>
Date:   2015-03-29T19:20:11Z

    no need to pass mapreduce jar to Spark context now
----
> Refactor module dependencies
> ----------------------------
>
> Key: MAHOUT-1655
> URL: https://issues.apache.org/jira/browse/MAHOUT-1655
> Project: Mahout
> Issue Type: Improvement
> Components: mrlegacy
> Affects Versions: 0.9
> Reporter: Pat Ferrel
> Assignee: Andrew Musselman
> Priority: Critical
> Fix For: 0.10.0
>
>
> Create a new module called mahout-hadoop and move into it anything that is
> currently in mrlegacy but used by math-scala or spark. Remove the dependencies
> on mrlegacy altogether, if possible, by using other core classes.
> The goal is to have the math-scala and spark modules depend only on math and
> on a small module called mahout-hadoop (much smaller than mrlegacy).