The ExecutionException is there because the code is heavily multithreaded. The vector encoder runs in the main thread, but the learning algorithms run in worker threads to saturate all available cores.
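To illustrate where the wrapper comes from, here is a minimal sketch, not the Mahout code itself. It assumes a plain ExecutorService; the class name UnwrapDemo and the methods badLookup and runAndUnwrap are made up for the example. A failure inside a worker thread reaches the caller of Future.get() wrapped in an ExecutionException, and getCause() recovers the original exception.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class UnwrapDemo {
    // Hypothetical worker: reads one slot past the end of a 19-element array,
    // loosely mirroring the "ArrayIndexOutOfBoundsException: 19" in the trace.
    static int badLookup() {
        int[] weights = new int[19];
        int index = weights.length; // 19 is out of range; valid indices are 0..18
        return weights[index];
    }

    // Runs the worker on a pool thread and returns the exception it threw,
    // or null if the worker completed normally.
    static Throwable runAndUnwrap() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<Integer> task = UnwrapDemo::badLookup;
            Future<Integer> result = pool.submit(task);
            result.get(); // the worker's failure surfaces here, wrapped
            return null;
        } catch (ExecutionException e) {
            return e.getCause(); // strip the wrapper to expose the real cause
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Throwable cause = runAndUnwrap();
        System.out.println("worker failed with: " + cause.getClass().getName());
    }
}
```

Rethrowing the cause instead of the ExecutionException shortens the chained trace by one level, at the price of hiding the fact that the failure happened on another thread.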
Committing the changes seems fine. I will take a more careful look when I get back to a network later today.

Sent from my iPhone

On Oct 10, 2010, at 2:56 AM, Sean Owen <[email protected]> wrote:

> I can commit Joe's fix for the ".DS_Store" problem -- seems like a
> clear bug so valid to change even in the quiet period. I will also
> commit a change that un-chains that second stack trace by one. There
> is no need to have ExecutionException in there and it obscures the
> cause. I don't know more about that.
>
> On Sun, Oct 10, 2010 at 5:25 AM, Joe Kumar <[email protected]> wrote:
>> Ted,
>>
>> I just started testing TrainNewsGroups and am executing it through eclipse,
>> passing the location of directory 20news-18828 to the program.
>>
>> I encountered an Exception when the code was trying to read the files inside
>> the newsgroup directory
>> using files.addAll(Arrays.asList(newsgroup.listFiles()));
>> The directory of newgroup had a DS_Store file which made the above code
>> throw an Exception.
>> So I modified the code as
>>
>> if(newsgroup.isDirectory()){
>>     files.addAll(Arrays.asList(newsgroup.listFiles()));
>> }
>>
>> to fix it
>>
>> After fixing this, I get the below log and exception
>>
>> 18828 training files
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 1 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 2 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 3 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 4 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 6 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 8 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 10 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 12 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 15 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 20 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 25 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 30 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 40 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 50 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 60 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 70 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 80 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 100 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 120 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 140 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 150 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 200 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 250 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 300 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 400 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 500 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 600 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 700 0.000 0.00 none
>> 0.00 0.00 0.00 0.00 0.00000000 0.00000000 800 0.000 0.00 none
>>
>> Exception in thread "main" java.lang.IllegalStateException:
>> java.util.concurrent.ExecutionException:
>> java.lang.ArrayIndexOutOfBoundsException: 19
>>     at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.trainWithBufferedExamples(AdaptiveLogisticRegression.java:137)
>>     at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.train(AdaptiveLogisticRegression.java:111)
>>     at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.train(AdaptiveLogisticRegression.java:97)
>>     at org.apache.mahout.classifier.sgd.TrainNewsGroups.main(TrainNewsGroups.java:164)
>> Caused by: java.util.concurrent.ExecutionException:
>> java.lang.ArrayIndexOutOfBoundsException: 19
>>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>>     at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>>     at org.apache.mahout.ep.EvolutionaryProcess.parallelDo(EvolutionaryProcess.java:154)
>>     at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.trainWithBufferedExamples(AdaptiveLogisticRegression.java:117)
>>     ... 3 more
>>
>> I am not sure if I am doing something wrong. Thought I'll check with you and
>> document the process of running this example and other details about SGD.
>>
>> reg,
>>
>> Joe.
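
For reference, the directory check Joe describes can be sketched as a standalone helper. This is a minimal sketch under assumptions, not the TrainNewsGroups source: the class name ListNewsgroupFiles and the method collectFiles are invented for the example. It relies on the documented behavior that File.listFiles() returns null when the path is not a directory, which is what happens for a stray .DS_Store file sitting next to the newsgroup folders.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListNewsgroupFiles {
    // Collects the files inside each newsgroup subdirectory of base,
    // skipping plain files such as .DS_Store at the top level.
    static List<File> collectFiles(File base) {
        List<File> files = new ArrayList<File>();
        File[] newsgroups = base.listFiles();
        if (newsgroups == null) {
            return files; // base is not a directory or could not be read
        }
        for (File newsgroup : newsgroups) {
            if (newsgroup.isDirectory()) {
                File[] children = newsgroup.listFiles();
                if (children != null) { // null is still possible on an I/O error
                    files.addAll(Arrays.asList(children));
                }
            }
        }
        return files;
    }

    public static void main(String[] args) {
        // Pass the 20news-18828 directory (or any similar layout) as the argument.
        File base = new File(args.length > 0 ? args[0] : ".");
        System.out.println(collectFiles(base).size() + " training files");
    }
}
```

Without the isDirectory() guard, listFiles() on the .DS_Store entry returns null and Arrays.asList(null array) throws a NullPointerException, which is consistent with the failure described above.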
