It seems that the empty-cluster exception is being caused by another exception that happens earlier: a FileAlreadyExistsException. The stack trace follows. Although I am using the HadoopUtil.overwrite method to clean up the output directory, the exception happens anyway.
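Since the trace shows LocalJobRunner working against a file:/ path, one belt-and-braces workaround is to recursively delete the stale clusters directory with plain Java before resubmitting the job. This is only a sketch of that idea, not Mahout's own cleanup API; the class and directory names here are made up for the demo:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class OutputDirCleaner {

    // Recursively delete a directory tree if it exists, so a re-run does not
    // trip FileOutputFormat.checkOutputSpecs on a leftover output directory.
    public static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Deepest entries first, so children are deleted before parents.
            List<Path> entries = walk.sorted(Comparator.reverseOrder())
                                     .collect(Collectors.toList());
            for (Path p : entries) {
                Files.delete(p);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a stale output directory from a previous run (names hypothetical).
        Path work = Files.createTempDirectory("clusters-demo");
        Path stale = Files.createDirectories(work.resolve("clusters").resolve("clusters-0"));
        Files.writeString(stale.resolve("part-00000"), "stale output");

        deleteRecursively(work.resolve("clusters"));
        System.out.println(Files.exists(stale)); // prints false: safe to resubmit
    }
}
```

On a real cluster (hdfs:// paths) the equivalent cleanup would have to go through Hadoop's FileSystem API instead, but the local case above is what the file:/ trace suggests is in play here.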
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory
file:/informatics/data/scratch/TMA/work/C104614-2010-04-06-21-29-57-CDE00FB3-2C58-4C89-AAC4-8E79083D9D12/clusters/clusters-0
already exists
        at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:111)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:772)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
        at org.apache.mahout.clustering.kmeans.KMeansDriver.runIteration(KMeansDriver.java:270)
        at org.apache.mahout.clustering.kmeans.KMeansDriver.runJob(KMeansDriver.java:213)

Again, this happens randomly.

On Thu, Apr 1, 2010 at 9:28 AM, Arshad Khan <khan.m.ars...@gmail.com> wrote:
> The data being used for clustering comes out of an index created on a
> bunch of PubMed abstracts. The index is passed through a TFDFMapper using
> the tf-idf weighting scheme, and a points file is generated using the
> LuceneIterable class. This file is the input file to the KMeansDriver
> program. The code to perform this is essentially the same as that given
> in the util.vectors.lucene.Driver class.
>
> Arshad
>
> On Thu, Apr 1, 2010 at 1:55 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>> Empty clusters are not that uncommon with k-means if you specify too
>> large a value for k.
>>
>> Arshad, can you say more about what data you are clustering?
>>
>> On Wed, Mar 31, 2010 at 6:29 AM, Grant Ingersoll <gsing...@apache.org> wrote:
>>> Can you share the parameters you used to get this? Does it happen
>>> every time?
>>>
>>> On Mar 29, 2010, at 11:53 PM, Arshad Khan wrote:
>>>> Hello All
>>>>
>>>> While using the Mahout 0.3 KMeansDriver I am encountering an exception
>>>> indicating an empty cluster. This happens sometimes when re-running
>>>> the clustering on the same data set. Is there a way to prevent this
>>>> error? The exception trace follows:
>>>>
>>>> java.lang.RuntimeException: Error in configuring object
>>>>         at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>>>         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>>>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>>>>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)
>>>> Caused by: java.lang.reflect.InvocationTargetException
>>>>         at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
>>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>>         at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>>>         ... 5 more
>>>> Caused by: java.lang.RuntimeException: Error in configuring object
>>>>         at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>>>         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>>>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>>>         at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>>>>         ... 9 more
>>>> Caused by: java.lang.reflect.InvocationTargetException
>>>>         at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
>>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>>         at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>>>         ... 12 more
>>>> Caused by: java.lang.IllegalStateException: Cluster is empty!
>>>>         at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.configure(KMeansClusterMapper.java:73)
>>>>         ... 16 more
>>>>
>>>> Thanks
>>>> Arshad
>>>
>>> --------------------------
>>> Grant Ingersoll
>>> http://www.lucidimagination.com/
>>>
>>> Search the Lucene ecosystem using Solr/Lucene:
>>> http://www.lucidimagination.com/search
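Ted's point about too large a value of k can be reproduced with a toy assignment pass in plain Java (no Mahout involved; all names below are made up for the demo). With fewer distinct points than centroids, at least one centroid necessarily attracts no points, which is exactly the state the "Cluster is empty!" check in KMeansClusterMapper guards against:

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyClusterDemo {

    // One k-means assignment pass over 1-D points: each point goes to its
    // nearest centroid; centroids that attract no points come back empty.
    static List<List<Double>> assign(double[] points, double[] centroids) {
        List<List<Double>> clusters = new ArrayList<>();
        for (int i = 0; i < centroids.length; i++) {
            clusters.add(new ArrayList<>());
        }
        for (double p : points) {
            int best = 0;
            for (int c = 1; c < centroids.length; c++) {
                if (Math.abs(p - centroids[c]) < Math.abs(p - centroids[best])) {
                    best = c;
                }
            }
            clusters.get(best).add(p);
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Three distinct point values but k = 5: at least two clusters
        // must come out empty on the assignment pass.
        double[] points = {0.0, 0.0, 1.0, 1.0, 2.0};
        double[] centroids = {0.0, 1.0, 2.0, 10.0, 20.0};
        List<List<Double>> clusters = assign(points, centroids);
        for (int i = 0; i < clusters.size(); i++) {
            System.out.println("cluster " + i + ": " + clusters.get(i).size() + " points");
        }
    }
}
```

That would be consistent with the error showing up only "sometimes": with random initial centroid selection over the same data set, some seeds leave a centroid with no nearby points and some do not. Reducing k, or re-running with different initial clusters, would be the usual way around it.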