Let me do some more investigation. It could be an issue with my code.

On Wed, Apr 7, 2010 at 9:44 AM, Robin Anil <robin.a...@gmail.com> wrote:
> Could you upload the dataset (if it's small) somewhere? I will take a
> look at it.
>
> Robin
>
> On Wed, Apr 7, 2010 at 7:11 AM, Arshad Khan <khan.m.ars...@gmail.com> wrote:
>
> > It seems that the empty cluster exception is being caused by another
> > exception happening earlier. It is the FileAlreadyExistsException. The
> > stack trace is as follows. Although I am using the HadoopUtil.overwrite
> > method to clean up the output dir, the exception happens anyway.
> >
> > org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory file:/informatics/data/scratch/TMA/work/C104614-2010-04-06-21-29-57-CDE00FB3-2C58-4C89-AAC4-8E79083D9D12/clusters/clusters-0 already exists
> >     at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:111)
> >     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:772)
> >     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
> >     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
> >     at org.apache.mahout.clustering.kmeans.KMeansDriver.runIteration(KMeansDriver.java:270)
> >     at org.apache.mahout.clustering.kmeans.KMeansDriver.runJob(KMeansDriver.java:213)
> >
> > Again, this happens randomly.
> >
> > On Thu, Apr 1, 2010 at 9:28 AM, Arshad Khan <khan.m.ars...@gmail.com> wrote:
> >
> > > The data being used for clustering is coming out of an index created
> > > on a bunch of PubMed abstracts. The index is passed through a
> > > TFDFMapper using the tf-idf weighting scheme, and a points file is
> > > generated using the LuceneIterable class. This file is the input file
> > > to the KMeansDriver program. The code to perform this is actually the
> > > same as the one given in the util.vectors.lucene.Driver class.
> > >
> > > Arshad
> > >
> > > On Thu, Apr 1, 2010 at 1:55 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> > >
> > >> Empty clusters are not that uncommon with k-means if you specify too
> > >> large a value for k.
> > >>
> > >> Arshad, can you say more about what data you are clustering?
> > >>
> > >> On Wed, Mar 31, 2010 at 6:29 AM, Grant Ingersoll <gsing...@apache.org> wrote:
> > >>
> > >> > Can you share the parameters you used to get this? Does it happen
> > >> > every time?
> > >> >
> > >> > On Mar 29, 2010, at 11:53 PM, Arshad Khan wrote:
> > >> >
> > >> > > Hello All
> > >> > >
> > >> > > While using Mahout 0.3 KMeansDriver I am encountering an exception
> > >> > > indicating an empty cluster. This happens sometimes while
> > >> > > re-running the clustering on the same data set. Is there a way to
> > >> > > prevent this error? The exception trace is as follows:
> > >> > >
> > >> > > java.lang.RuntimeException: Error in configuring object
> > >> > >     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> > >> > >     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> > >> > >     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > >> > >     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
> > >> > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> > >> > >     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)
> > >> > > Caused by: java.lang.reflect.InvocationTargetException
> > >> > >     at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
> > >> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >> > >     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> > >> > >     ... 5 more
> > >> > > Caused by: java.lang.RuntimeException: Error in configuring object
> > >> > >     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> > >> > >     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> > >> > >     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > >> > >     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
> > >> > >     ... 9 more
> > >> > > Caused by: java.lang.reflect.InvocationTargetException
> > >> > >     at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source)
> > >> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >> > >     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> > >> > >     ... 12 more
> > >> > > Caused by: java.lang.IllegalStateException: Cluster is empty!
> > >> > >     at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.configure(KMeansClusterMapper.java:73)
> > >> > >     ... 16 more
> > >> > >
> > >> > > Thanks
> > >> > > Arshad
> > >> >
> > >> > --------------------------
> > >> > Grant Ingersoll
> > >> > http://www.lucidimagination.com/
> > >> >
> > >> > Search the Lucene ecosystem using Solr/Lucene:
> > >> > http://www.lucidimagination.com/search
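
The FileAlreadyExistsException in the thread is raised for a local file: path, and the earlier trace runs under LocalJobRunner, so one way to rule out a leftover clusters-0 directory is to delete it and then confirm it is really gone before the job is submitted. The sketch below does this with plain java.nio rather than the Hadoop FileSystem API. The class and method names (OutputDirCheck, clearOrFail) are made up for illustration and are not part of Mahout or Hadoop, and this is a diagnostic aid under those assumptions, not a confirmed fix for the exception.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration only; not a Mahout or Hadoop class.
final class OutputDirCheck {

    // Deletes dir and everything under it, then fails loudly if it still exists.
    static void clearOrFail(Path dir) throws IOException {
        if (Files.exists(dir)) {
            List<Path> entries;
            try (Stream<Path> walk = Files.walk(dir)) {
                // Reverse order lists children before their parent directory.
                entries = walk.sorted(Comparator.reverseOrder())
                              .collect(Collectors.toList());
            }
            for (Path p : entries) {
                Files.delete(p);
            }
        }
        if (Files.exists(dir)) {
            throw new IOException("Output directory still exists: " + dir);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the per-run work directory; the layout mirrors clusters/clusters-0.
        Path work = Files.createTempDirectory("kmeans-work");
        Path out = work.resolve("clusters").resolve("clusters-0");
        Files.createDirectories(out);
        Files.write(out.resolve("part-00000"), new byte[] {1, 2, 3});

        clearOrFail(out);
        System.out.println("exists after cleanup: " + Files.exists(out));
    }
}
```

Calling something like clearOrFail right before each job submission would make a failed or skipped cleanup show up at the cleanup step instead of later as an "already exists" error. If the directory reappears between the check and the submit, that would point to two runs sharing the same work path.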