Thinking out loud here. If this is indeed a build error that you are seeing, a
better fix would be to exclude Hadoop's Guava 11 transitive dependency in the
pom, as opposed to having to downgrade Mahout code to be Guava 11 compatible.

If that indeed fixes the issue, we might have missed excluding Hadoop's Guava 11
jar during the recent patch for Hadoop 2 (the exclusion needs to be in place for
both the Hadoop 1 and Hadoop 2 profiles).
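
One way to check the suspicion above before touching the pom is to ask the JVM which jar a Guava class was actually loaded from. The following is only a rough diagnostic sketch using plain JDK calls; the class name GuavaLocator and the helper locationOf are made up for illustration and are not part of Mahout or Hadoop.

```java
import java.net.URL;
import java.security.CodeSource;

/** Diagnostic sketch (hypothetical helper): report where a class was loaded from. */
public class GuavaLocator {

    /** Describes the jar or directory the named class came from, if it can be found. */
    static String locationOf(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class<?> cls = Class.forName(className, false, GuavaLocator.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return className + " -> (bootstrap class path, no jar location)";
            }
            URL location = source.getLocation();
            return className + " -> " + location;
        } catch (ClassNotFoundException e) {
            return className + " -> not on the class path";
        }
    }

    public static void main(String[] args) {
        // The NoSuchMethodError in this thread came from Queues.newArrayDeque(),
        // so the jar that supplies Queues is the one to look at.
        System.out.println(locationOf("com.google.common.collect.Queues"));
        // The thread reports Closer as missing when building against Guava 11.0.2.
        System.out.println(locationOf("com.google.common.io.Closer"));
    }
}
```

Running this with the same class path as the failing job should show whether the Guava jar that wins is the one pulled in through Hadoop.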








On Sunday, March 9, 2014 2:14 PM, Bikash Gupta <[email protected]> wrote:
 
MAHOUT-1442 has been created. Will submit the patch too.


On Sun, Mar 9, 2014 at 9:03 PM, Ted Dunning <[email protected]> wrote:

> Can you file a JIRA and attach your patch?
>
>
> On Sun, Mar 9, 2014 at 8:03 AM, Bikash Gupta <[email protected]> wrote:
>
> > Info for everyone
> >
> > I have successfully forced Mahout to build with Guava 11.0.2. The errors
> > and fixes are listed below.
> >
> >
> > 1. Class: org.apache.mahout.math.stats.GroupTree
> > - Change line 171 to: stack = new ArrayDeque<GroupTree>();
> > - Import java.util.ArrayDeque
> >
> > 2. Class: org.apache.mahout.classifier.sgd.OnlineLogisticRegressionTest
> > - Guava 11.0.2 doesn't have Closer in com.google.common.io, so I have used
> >   try-with-resources
> > - Changed the Java version to 1.7
> > - Code changed as shown below
> >
> >     try (ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
> >          DataOutputStream dataOutputStream = new DataOutputStream(byteArrayOutputStream)) {
> >       PolymorphicWritable.write(dataOutputStream, lr);
> >       output = byteArrayOutputStream.toByteArray();
> >     }
> >
> >     OnlineLogisticRegression read;
> >
> >     try (ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(output);
> >          DataInputStream dataInputStream = new DataInputStream(byteArrayInputStream)) {
> >       read = PolymorphicWritable.read(dataInputStream, OnlineLogisticRegression.class);
> >     }
> >
> > 3. Class: org.apache.mahout.utils.vectors.lucene.LuceneIterableTest
> > - Iterators.advance is not present in 11.0.2, so I inlined the equivalent
> >   loop. Sample shown below:
> >
> >     int numberToAdvance = 1;
> >     int iterateNumberToAdvance;
> >     for (iterateNumberToAdvance = 0;
> >          iterateNumberToAdvance < numberToAdvance && iterator.hasNext();
> >          iterateNumberToAdvance++) {
> >       iterator.next();
> >     }
> >
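
For reference, the three workarounds listed above can be sketched with JDK-only calls. This is a minimal sketch, not the actual MAHOUT-1442 patch: the class GuavaFreeSketch and its helper names are hypothetical, and writeUTF/readUTF stand in for PolymorphicWritable so that the example runs without Mahout on the class path.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;

/** Hypothetical JDK-only stand-ins for the Guava calls discussed in this thread. */
public class GuavaFreeSketch {

    /** Fix 1: construct the deque directly instead of calling Queues.newArrayDeque(). */
    static <T> Deque<T> newStack() {
        return new ArrayDeque<T>();
    }

    /** Fix 3: skip up to n elements by hand; returns how many were actually skipped. */
    static int advance(Iterator<?> iterator, int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        int skipped = 0;
        while (skipped < n && iterator.hasNext()) {
            iterator.next();
            skipped++;
        }
        return skipped;
    }

    /** Fix 2: try-with-resources (Java 7+) closes the streams that Closer used to manage. */
    static String roundTrip(String value) throws IOException {
        byte[] bytes;
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(value);
            out.flush(); // make sure everything reaches the buffer before copying it
            bytes = buffer.toByteArray();
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        Deque<String> stack = newStack();
        stack.push("a");
        stack.push("b");
        System.out.println(stack.pop());

        Iterator<Integer> it = Arrays.asList(1, 2, 3).iterator();
        System.out.println(advance(it, 1) + " skipped, next is " + it.next());

        System.out.println(roundTrip("model"));
    }
}
```

Keeping the loop in one helper such as advance avoids repeating it in every test that used Iterators.advance, although excluding the old Guava jar would make all three workarounds unnecessary.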
> > If anyone has a better suggestion, please flag it.
> >
> > @Suneel,
> >
> > Going back to my original question: I was able to call ClusteringUtils
> > for KMeans; however, I cannot use ClusterQualitySummarizer because it
> > doesn't support WeightedPropertyVectorWritable.
> >
> >
> >
> > On Sun, Mar 9, 2014 at 6:28 PM, Bikash Gupta <[email protected]> wrote:
> >
> > > Just FYI... downgrading Guava to 11.0.2 has fixed the build error in
> > > mahout-math, as suggested by Ted; however, it is causing another build
> > > error in mahout-core
> > >
> > > [INFO] -------------------------------------------------------------
> > > [ERROR] /mahout-trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineLogisticRegressionTest.java:[24,28] cannot find symbol
> > >   symbol:   class Closer
> > >   location: package com.google.common.io
> > > [ERROR] /mahout-trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineLogisticRegressionTest.java:[289,5] cannot find symbol
> > >   symbol:   class Closer
> > >   location: class org.apache.mahout.classifier.sgd.OnlineLogisticRegressionTest
> > > [ERROR] /mahout-trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineLogisticRegressionTest.java:[289,21] cannot find symbol
> > >   symbol:   variable Closer
> > >   location: class org.apache.mahout.classifier.sgd.OnlineLogisticRegressionTest
> > >
> > >
> > > On Sun, Mar 9, 2014 at 3:45 PM, Suneel Marthi <[email protected]> wrote:
> > >
> > >> Darn. You are the second guy to report that this week. Change that line
> > >> to what Ted suggested. The issue is a Guava incompatibility with Hadoop's
> > >> antiquated Guava version.
> > >>
> > >> Sent from my iPhone
> > >>
> > >> On Mar 9, 2014, at 6:10 AM, Bikash Gupta <[email protected]> wrote:
> > >>
> > >> I am able to run ClusteringUtils on KMeans successfully (I still need to
> > >> check the scenario which you have mentioned). However, I am getting an
> > >> error from the TDigest class
> > >>
> > >> Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.collect.Queues.newArrayDeque()Ljava/util/ArrayDeque;
> > >>     at org.apache.mahout.math.stats.GroupTree$1.<init>(GroupTree.java:171)
> > >>     at org.apache.mahout.math.stats.GroupTree.iterator(GroupTree.java:169)
> > >>     at org.apache.mahout.math.stats.GroupTree.access$300(GroupTree.java:14)
> > >>     at org.apache.mahout.math.stats.GroupTree$2.iterator(GroupTree.java:317)
> > >>     at org.apache.mahout.math.stats.TDigest.add(TDigest.java:105)
> > >>     at org.apache.mahout.math.stats.TDigest.add(TDigest.java:88)
> > >>     at org.apache.mahout.math.stats.TDigest.add(TDigest.java:76)
> > >>     at org.apache.mahout.math.stats.OnlineSummarizer.add(OnlineSummarizer.java:57)
> > >>     at org.apache.mahout.clustering.ClusteringUtils.summarizeClusterDistances(ClusteringUtils.java:65)
> > >>
> > >> A few days ago I saw a post where a user hit a similar issue in the
> > >> TDigest class. Ted suggested replacing the line with the code below:
> > >>
> > >> stack = new ArrayDeque<GroupTree>();
> > >>
> > >> Let me know if I am correct.
> > >>
> > >>
> > >> On Sun, Mar 9, 2014 at 3:18 PM, Suneel Marthi <[email protected]> wrote:
> > >>
> > >>> You could call ClusterQualitySummarizer, which then calls
> > >>> ClusteringUtils to print the different metrics you specified.
> > >>> For an example, see the Streaming Kmeans section in
> > >>> examples/bin/cluster-reuters.sh.
> > >>>
> > >>> It calls 'qualcluster' with options -i <tf-idf vectors generated from
> > >>> seq2sparse> -c <output of Kmeans> -o <output file generated with the
> > >>> metrics>
> > >>>
> > >>>
> > >>> I have not tried this on KMeans, and since the output format of KMeans
> > >>> is different from that of Streaming KMeans, this might just fall flat.
> > >>> It may also fail to read some of the clusters if a cluster has only a
> > >>> single clustered point; this is due to the new TDigest summarizer, which
> > >>> expects at least 2 points in order to calculate max, quartiles, and mean.
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Sunday, March 9, 2014 4:19 AM, Bikash Gupta <[email protected]> wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>> I want to use ClusteringUtils on KMeans clusteredPoints to get
> > >>> summarizeClusterDistances, daviesBouldinIndex & dunnIndex.
> > >>>
> > >>> Is there any sample or example showing how to use these features?
> > >>> --
> > >>> Thanks & Regards
> > >>> Bikash Kumar Gupta
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >> Thanks & Regards
> > >> Bikash Kumar Gupta
> > >>
> > >>
> > >
> > >
> > > --
> > > Thanks & Regards
> > > Bikash Kumar Gupta
> > >
> >
> >
> >
> > --
> > Thanks & Regards
> > Bikash Kumar Gupta
> >
>



-- 
Thanks & Regards
Bikash Kumar Gupta
