I can commit Joe's fix for the ".DS_Store" problem. It seems like a clear bug, so it is valid to change even in the quiet period. I will also commit a change that un-chains that second stack trace by one level. There is no need to have ExecutionException in there, and it obscures the cause. I don't know more about that.
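As a rough illustration of what un-chaining by one level could look like, here is a minimal sketch. It is not the actual Mahout change: the class UnwrapExample and the helper runUnwrapped are hypothetical names, and the only assumption is that the failure surfaces through Future.get(), as the quoted trace suggests.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class UnwrapExample {
  // Hypothetical helper (not Mahout code): runs a task on the pool and,
  // on failure, rethrows with the ExecutionException layer removed.
  static <T> T runUnwrapped(ExecutorService pool, Callable<T> task) {
    Future<T> future = pool.submit(task);
    try {
      return future.get();
    } catch (InterruptedException e) {
      // Restore the interrupt flag before reporting the problem.
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for task", e);
    } catch (ExecutionException e) {
      // e.getCause() is what the task itself threw; chaining it directly
      // keeps the real cause one level closer to the top of the trace.
      throw new IllegalStateException(e.getCause());
    }
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      runUnwrapped(pool, () -> {
        int[] a = new int[19];
        return a[19]; // index 19 is out of bounds for length 19
      });
    } catch (IllegalStateException e) {
      // The direct cause is now the array exception, not ExecutionException.
      System.out.println(e.getCause().getClass().getName());
    } finally {
      pool.shutdown();
    }
  }
}
```

With this shape the trace reads IllegalStateException caused by ArrayIndexOutOfBoundsException, with no ExecutionException in between.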
On Sun, Oct 10, 2010 at 5:25 AM, Joe Kumar <[email protected]> wrote:

> Ted,
>
> I just started testing TrainNewsGroups and am executing it through eclipse,
> passing the location of directory 20news-18828 to the program.
>
> I encountered an Exception when the code was trying to read the files inside
> the newsgroup directory
> using files.addAll(Arrays.asList(newsgroup.listFiles()));
> The directory of newgroup had a DS_Store file which made the above code
> throw an Exception. So I modified the code as
>
> if(newsgroup.isDirectory()){
>   files.addAll(Arrays.asList(newsgroup.listFiles()));
> }
>
> to fix it
>
> After fixing this, I get the below log and exception
>
> 18828 training files
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 1 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 2 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 3 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 4 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 6 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 8 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 10 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 12 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 15 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 20 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 25 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 30 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 40 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 50 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 60 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 70 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 80 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 100 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 120 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 140 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 150 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 200 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 250 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 300 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 400 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 500 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 600 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 700 0.000 0.00 none
> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 800 0.000 0.00 none
>
> Exception in thread "main" java.lang.IllegalStateException: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 19
>   at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.trainWithBufferedExamples(AdaptiveLogisticRegression.java:137)
>   at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.train(AdaptiveLogisticRegression.java:111)
>   at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.train(AdaptiveLogisticRegression.java:97)
>   at org.apache.mahout.classifier.sgd.TrainNewsGroups.main(TrainNewsGroups.java:164)
> Caused by: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 19
>   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>   at org.apache.mahout.ep.EvolutionaryProcess.parallelDo(EvolutionaryProcess.java:154)
>   at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.trainWithBufferedExamples(AdaptiveLogisticRegression.java:117)
>   ... 3 more
>
> I am not sure if I am doing something wrong. Thought I'll check with you and
> document the process of running this example and other details about SGD.
>
> reg,
> Joe.
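
On the ".DS_Store" failure in the quoted message: File.listFiles() returns null when the path is not a directory, and Arrays.asList then throws a NullPointerException on the null array, which is presumably the exception Joe hit. A minimal standalone sketch of the guarded loop follows. The class ListNewsgroupFiles and the method collectFiles are hypothetical names for illustration, not the TrainNewsGroups code itself.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListNewsgroupFiles {
  // Hypothetical helper: gathers the files inside each subdirectory of base,
  // skipping stray plain files such as .DS_Store at the top level.
  static List<File> collectFiles(File base) {
    List<File> files = new ArrayList<>();
    File[] entries = base.listFiles();
    if (entries == null) {
      return files; // base is not a directory or could not be read
    }
    for (File newsgroup : entries) {
      if (newsgroup.isDirectory()) {
        File[] docs = newsgroup.listFiles();
        if (docs != null) { // null is still possible on an I/O error
          files.addAll(Arrays.asList(docs));
        }
      }
    }
    return files;
  }

  public static void main(String[] args) {
    if (args.length != 1) {
      System.err.println("usage: ListNewsgroupFiles <directory>");
      return;
    }
    System.out.println(collectFiles(new File(args[0])).size() + " training files");
  }
}
```

The extra null check on the inner listFiles() call is a defensive choice. The isDirectory() guard alone is what the quoted fix does.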
