Ted,

I just started testing TrainNewsGroups and am running it from Eclipse,
passing the location of the 20news-18828 directory to the program.

I encountered an exception when the code tried to read the files inside
the newsgroup directory using
files.addAll(Arrays.asList(newsgroup.listFiles()));
The newsgroups directory contained a .DS_Store file, which made the above code
throw an exception. So I modified the code as

      if (newsgroup.isDirectory()) {
        files.addAll(Arrays.asList(newsgroup.listFiles()));
      }

to fix it.
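
For reference, a slightly more defensive version of that check could look like the sketch below. It is not the code in TrainNewsGroups. The class and method names (NewsgroupFileLister, listTrainingFiles) are made up for illustration, and it assumes the usual layout of one subdirectory per newsgroup with the message files inside. It relies only on the documented behaviour that File.listFiles() returns null when the path is not a directory, and it also skips dot-files such as .DS_Store inside each newsgroup directory.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Mahout: gathers training files while
// ignoring stray entries such as .DS_Store.
class NewsgroupFileLister {

    // Assumes 'base' contains one subdirectory per newsgroup.
    static List<File> listTrainingFiles(File base) {
        List<File> files = new ArrayList<File>();
        File[] newsgroups = base.listFiles();
        if (newsgroups == null) {
            // 'base' is not a directory or could not be read
            return files;
        }
        for (File newsgroup : newsgroups) {
            if (!newsgroup.isDirectory()) {
                continue; // skips plain files such as .DS_Store
            }
            File[] docs = newsgroup.listFiles();
            if (docs == null) {
                continue; // directory could not be read
            }
            for (File doc : docs) {
                // keep regular files only, and drop dot-files
                if (doc.isFile() && !doc.getName().startsWith(".")) {
                    files.add(doc);
                }
            }
        }
        return files;
    }
}
```

Calling NewsgroupFileLister.listTrainingFiles(new File(args[0])) would then return only the message files, whether or not the OS has dropped extra files into the tree.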

After fixing this, I get the following log and exception:


18828 training files
0.00 0.00 0.00 0.00 0.00000000 0.00000000 1 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 2 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 3 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 4 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 6 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 8 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 10 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 12 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 15 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 20 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 25 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 30 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 40 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 50 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 60 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 70 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 80 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 100 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 120 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 140 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 150 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 200 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 250 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 300 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 400 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 500 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 600 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 700 0.000 0.00 none
0.00 0.00 0.00 0.00 0.00000000 0.00000000 800 0.000 0.00 none

Exception in thread "main" java.lang.IllegalStateException: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 19
        at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.trainWithBufferedExamples(AdaptiveLogisticRegression.java:137)
        at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.train(AdaptiveLogisticRegression.java:111)
        at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.train(AdaptiveLogisticRegression.java:97)
        at org.apache.mahout.classifier.sgd.TrainNewsGroups.main(TrainNewsGroups.java:164)
Caused by: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: 19
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
        at java.util.concurrent.FutureTask.get(FutureTask.java:83)
        at org.apache.mahout.ep.EvolutionaryProcess.parallelDo(EvolutionaryProcess.java:154)
        at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.trainWithBufferedExamples(AdaptiveLogisticRegression.java:117)
        ... 3 more

I am not sure whether I am doing something wrong. I thought I would check with
you, and then document the process of running this example and other details
about SGD.

Regards,

Joe.
