problem for 0.10.1?

On 04/03/2015 08:32 PM, Suneel Marthi wrote:
We need to refactor some of the T-Digest stuff; the T-Digest that was
plumbed into Mahout 0.9 was a very early version that only had a TreeDigest.

We now have AVLTreeDigest and better methods available. It would be a good
idea to create a JIRA for that and redo the T-Digest stuff in Mahout.
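
For reference, a rough sketch of what the newer artifact looks like in use
(a hedged example, assuming the com.tdunning.math.stats.AVLTreeDigest
constructor takes a compression parameter and that the usual add/quantile/cdf
methods from the TDigest base class are available):

    import com.tdunning.math.stats.AVLTreeDigest

    // Sketch only: AVLTreeDigest from the newer t-digest artifact.
    object AvlDigestSketch extends App {
      val digest = new AVLTreeDigest(100.0) // compression: accuracy vs. size trade-off
      // Feed in some samples.
      (1 to 10000).foreach(_ => digest.add(scala.util.Random.nextGaussian()))
      println(s"median = ${digest.quantile(0.5)}")   // approximate median
      println(s"p99    = ${digest.quantile(0.99)}")  // approximate 99th percentile
      println(s"cdf(0) = ${digest.cdf(0.0)}")        // approximate P(x <= 0)
    }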


On Fri, Apr 3, 2015 at 8:25 PM, Andrew Palumbo <[email protected]> wrote:

Sorry, it's TreeDigest that will be missing, not OnlineSummarizer, which is
in org.apache.mahout.math.stats and imports:

com.tdunning.math.stats.TDigest
com.tdunning.math.stats.TreeDigest
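
For context, a minimal sketch of how that summarizer gets exercised (hedged:
this assumes OnlineSummarizer's usual add/getMedian/getQuartile interface, and
it only works if the t-digest classes it imports are on the runtime classpath):

    import org.apache.mahout.math.stats.OnlineSummarizer

    // Sketch only: the summarizer delegates quantile estimation to the
    // T-Digest classes it imports, so they must be on the classpath at runtime.
    object SummarizerSketch extends App {
      val summarizer = new OnlineSummarizer()
      (1 to 1000).foreach(i => summarizer.add(i.toDouble))
      println(s"median = ${summarizer.getMedian}")
      println(s"q1 = ${summarizer.getQuartile(1)}, q3 = ${summarizer.getQuartile(3)}")
    }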




On 04/03/2015 08:17 PM, Andrew Palumbo wrote:

The problem is (if I understand correctly) that Stream-lib has
TDigest.java but not the rest of the classes in the t-digest artifact, e.g.
OnlineSummarizer, which is used by the ResultAnalyzer class that I ported
over from MrLegacy to math-scala for the confusion matrix (also ported to
math-scala).

https://github.com/addthis/stream-lib/tree/master/src/main/java/com/clearspring/analytics/stream/quantile

I've added:

      <include>com.tdunning:t-digest</include>
      <include>org.apache.commons:commons-math3</include>

to spark/src/main/assembly/dependency-reduced.xml to include these jars in
the Spark naive Bayes CLI launcher.

I believe that the dependency-reduced jar was slimmed down from the
entire MrLegacy module, which included the t-digest artifact, to the few
dependencies that we have in it now.

Not including these in the dependency-reduced jar leads to this exception:

  Exception in thread "main" java.lang.NoClassDefFoundError: com/tdunning/math/stats/TDigest
        at org.apache.mahout.classifier.stats.ResultAnalyzer.<init>(ClassifierStats.scala:64)
        at org.apache.mahout.classifier.naivebayes.NaiveBayes$class.test(NaiveBayes.scala:303)
        at org.apache.mahout.classifier.naivebayes.NaiveBayes$.test(NaiveBayes.scala:336)
        at org.apache.mahout.drivers.TestNBDriver$.process(TestNBDriver.scala:105)
        at org.apache.mahout.drivers.TestNBDriver$$anonfun$main$1.apply(TestNBDriver.scala:77)
        at org.apache.mahout.drivers.TestNBDriver$$anonfun$main$1.apply(TestNBDriver.scala:75)
        at scala.Option.map(Option.scala:145)


I'm not sure that this is causing the exception below, but it does seem
possible.

On 04/03/2015 07:26 PM, Suneel Marthi wrote:

You shouldn't be adding T-Digest again to Spark modules (since Stream-lib
in Spark already has one).

T-Digest is needed for MrLegacy and should be added as a dependency.

On Fri, Apr 3, 2015 at 7:00 PM, Andrew Palumbo <[email protected]>
wrote:

  I'm wondering if it could be caused by the TDigest class from the
artifact that I added to the dependency-reduced jar conflicting with the
Spark TDigest class which, as you pointed out the other day, is on the
Spark classpath. The exception occurs right when the summarizer is
being used by the confusion matrix.
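
One way to check which jar is actually supplying a class at runtime (a
hypothetical diagnostic snippet, not anything in the driver code) is to print
its code source from both the front end and the executors:

    // Sketch only: print where the JVM loaded TDigest from, to see whether the
    // dependency-reduced jar or a copy already on the Spark classpath wins.
    val cls = Class.forName("com.tdunning.math.stats.TDigest")
    val source = Option(cls.getProtectionDomain.getCodeSource)
    println(source.map(_.getLocation.toString).getOrElse("no code source (bootstrap classloader?)"))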


On 04/03/2015 06:22 PM, Dmitriy Lyubimov wrote:

  I've seen a lot of these, some still bewildering, but they were all related
to non-local mode (different classpaths on the backend and front end).
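
A quick way to confirm that kind of mismatch (again a hypothetical diagnostic,
assuming you can run the same snippet on the driver and on an executor) is to
compare the serialVersionUID each side computes for the class in the message:

    import java.io.ObjectStreamClass

    // Sketch only: print the serialVersionUID the local JVM computes for
    // BlockManagerId. Differing values on the driver and executors mean the
    // two sides are loading different Spark builds, which is exactly what the
    // InvalidClassException below complains about.
    val cls = Class.forName("org.apache.spark.storage.BlockManagerId")
    println(s"local serialVersionUID = ${ObjectStreamClass.lookup(cls).getSerialVersionUID}")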



On Fri, Apr 3, 2015 at 1:39 PM, Andrew Palumbo <[email protected]>
wrote:

   Has anybody seen an exception like this when running a Spark job?

The job completes, but this exception is reported in the middle.

15/04/02 12:43:54 ERROR Remoting: org.apache.spark.storage.BlockManagerId; local class incompatible: stream classdesc serialVersionUID = 2439208141545036836, local class serialVersionUID = -7366074099953117729
java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId; local class incompatible: stream classdesc serialVersionUID = 2439208141545036836, local class serialVersionUID = -7366074099953117729
       at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:617)
       at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
       at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
       at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
       at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
       at akka.serialization.JavaSerializer$$anonfun$1.apply(Serializer.scala:136)
       at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
       at akka.serialization.JavaSerializer.fromBinary(Serializer.scala:136)
       at akka.serialization.Serialization$$anonfun$deserialize$1.apply(Serialization.scala:104)
       at scala.util.Try$.apply(Try.scala:161)
       at akka.serialization.Serialization.deserialize(Serialization.scala:98)
       at akka.remote.MessageSerializer$.deserialize(MessageSerializer.scala:23)
       at akka.remote.DefaultMessageDispatcher.payload$lzycompute$1(Endpoint.scala:55)
       at akka.remote.DefaultMessageDispatcher.payload$1(Endpoint.scala:55)
       at akka.remote.DefaultMessageDispatcher.dispatch(Endpoint.scala:73)
       at akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:764)
       at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
       at akka.actor.ActorCell.invoke(ActorCell.scala:456)
       at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
       at akka.dispatch.Mailbox.run(Mailbox.scala:219)
       at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
       at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
       at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
       at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
       at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)




