Done. I've committed a change that checks the debug level before evaluating
expensive debug statements, here and throughout the code.
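
For anyone following along, here is a rough sketch of the kind of guard
meant here. It is not the committed patch. It uses java.util.logging so it
runs without extra dependencies; with SLF4J the equivalent check would be
log.isDebugEnabled(). The class name, the formatVector stand-in and the
formatCalls counter are hypothetical and exist only for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedDebugLogging {
    private static final Logger LOG =
        Logger.getLogger(GuardedDebugLogging.class.getName());

    // Counts how often the expensive formatter runs (demonstration only).
    static int formatCalls = 0;

    // Hypothetical stand-in for an expensive formatter such as
    // AbstractCluster.formatVector; not the real Mahout implementation.
    static String formatVector(double[] v) {
        formatCalls++;
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < v.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(String.format("%.3f", v[i]));
        }
        return sb.append(']').toString();
    }

    // Unguarded: the message string is built before the logger
    // gets a chance to check the level, so the cost is always paid.
    static void addPointUnguarded(double[] point) {
        LOG.fine("Added point: " + formatVector(point));
    }

    // Guarded: the formatter only runs when debug-level output is enabled.
    static void addPointGuarded(double[] point) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Added point: " + formatVector(point));
        }
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO); // debug-level (FINE) output is disabled
        double[] p = {1.0, 2.5, 3.0};

        addPointUnguarded(p);
        int unguarded = formatCalls;

        formatCalls = 0;
        addPointGuarded(p);

        System.out.println("unguarded calls: " + unguarded
            + ", guarded calls: " + formatCalls);
    }
}
```

With the level at INFO, the unguarded version still formats the vector once
per call, while the guarded version skips the formatter entirely.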

2011/11/7 WangRamon <[email protected]>:
> Hi All,
>
> I'm using CanopyClusterer. The input is vectors of type
> RandomAccessSparseVector, and each vector may have 1 to 99 attributes. When
> I run CanopyClusterer on Hadoop, it is very slow, so I took a stack trace
> of the map tasks and found the following output:
>
>        at org.apache.mahout.clustering.AbstractCluster.formatVector(AbstractCluster.java:301)
>        at org.apache.mahout.clustering.canopy.CanopyClusterer.addPointToCanopies(CanopyClusterer.java:161)
>
> Line 161 of CanopyClusterer is just a log output statement. It should be
> wrapped in something like "if(log.isDebugEnabled())" so that it does not
> run when the log level is not debug. That is not the root cause, though:
> the root cause in my case is that AbstractCluster.formatVector is very slow
> to complete. After I commented out the "AbstractCluster.formatVector" call,
> everything went well. Could anybody have a look at this? Thank you very
> much.
>
> Cheers,
> Ramon
