AdaptiveLogisticRegression has a known problem: it cuts off the learning
rate too soon.  Right now the only workaround is to hand-tune an annealing
schedule and use OnlineLogisticRegression directly.
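
For what it's worth, here is a minimal sketch of what a hand-tuned annealing schedule looks like: a learning rate that starts at some initial value and decays as more training examples are seen. The function name, the formula, and the default values are assumptions for illustration only, not Mahout's code or recommended settings. If I remember the API correctly, OnlineLogisticRegression has setters along the lines of learningRate, alpha, decayExponent and stepOffset; please check the Javadoc rather than trusting the names here.

```python
def annealed_rate(step, mu0=1.0, alpha=1.0, decay_exponent=0.5, step_offset=1000):
    """Learning rate after `step` training examples (illustrative only).

    rate = mu0 * alpha**step * (step + step_offset)**(-decay_exponent)

    mu0            initial learning rate (hypothetical default)
    alpha          per-step exponential decay; 1.0 disables it
    decay_exponent power-law decay; larger values anneal faster
    step_offset    delays the power-law decay so early steps stay large
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    return mu0 * (alpha ** step) * (step + step_offset) ** (-decay_exponent)


if __name__ == "__main__":
    # 11314 is the number of training files reported in the quoted log.
    for step in (0, 1000, 5000, 11314):
        print(step, annealed_rate(step))
```

Tuning by hand then means trying a few values of the initial rate and the decay parameters and watching held-out accuracy, so that the rate is still large enough to keep learning late in the pass over the training data.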

On Wed, Jun 20, 2012 at 7:17 PM, 山鸡 <[email protected]> wrote:

> Hi,
>   I ran the Mahout example classify-20newsgroups.sh and got the
> following result:
> Summary
>
> -------------------------------------------------------
>
> Correctly Classified Instances : 5556 73.7653%
>
> Incorrectly Classified Instances : 1976 26.2347%
>
> Total Classified Instances : 7532
>
>
> The accuracy is no better than that of the Bayes algorithm. Did I run a
> bad case, or should I tune the classifier?
>
>
> I also tried tuning leaktype with different values, as discussed in
> "Mahout in Action", but the best result is still with the default
> value, "leaktype=0".
>
>
> From the confusion matrix, comp.graphics has a very low correct rate;
> its precision and recall are unacceptable.
>
> The following is the detailed output of my classify-20newsgroups.sh run:
>
>
>
> ok. You chose 3 and we'll use sgd
>
> creating work directory at /tmp/mahout-work-hadoop
>
> Training on /tmp/mahout-work-hadoop/20news-bydate/20news-bydate-train/
>
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> locally
>
> SLF4J: Class path contains multiple SLF4J bindings.
>
> SLF4J: Found binding in
>
> [jar:file:/home/hadoop/mahouttrunk0210/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>
> SLF4J: Found binding in
>
> [jar:file:/home/hadoop/mahouttrunk0210/examples/target/dependency/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>
> SLF4J: Found binding in
>
> [jar:file:/home/hadoop/mahouttrunk0210/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
>
> 12/06/19 15:42:54 WARN driver.MahoutDriver: No
> org.apache.mahout.classifier.sgd.TrainNewsGroups.props found on classpath,
> will use command-line arguments only
>
> 11314 training files
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 1 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 2 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 3 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 4 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 6 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 8 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 10 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 12 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 15 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 20 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 25 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 30 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 40 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 50 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 60 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 70 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 80 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 100 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 120 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 140 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 150 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 200 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 250 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 300 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 400 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 500 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 600 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 700 0.000 0.00 none
>
> 0.00 0.00 0.00 0.00 0.0000000 0.0000000 800 0.000 0.00 none
>
> 0.08 183475.00 56343.00 398.73 0.0019339992 1.0020036e-08 1000 -2.782 27.47 none
>
> 0.08 183475.00 56343.00 398.73 0.0019339992 1.0020036e-08 1200 -2.782 27.47 none
>
> 0.08 183475.00 56343.00 398.73 0.0019339992 1.0020036e-08 1400 -2.782 27.47 none
>
> 0.08 183475.00 56343.00 398.73 0.0019339992 1.0020036e-08 1500 -2.782 27.47 none
>
> 0.46 189651.00 58333.00 2484.18 3.3498795e-07 1.0040845e-08 2000 -2.210 49.44 none
>
> 0.69 181650.00 54327.00 3586.22 0.010714760 1.0018239e-08 2500 -1.729 64.92 none
>
> 0.69 181650.00 54327.00 3586.22 0.010714760 1.0018239e-08 3000 -1.729 64.92 none
>
> 0.73 185336.00 60036.00 3727.24 0.0013261933 1.0028959e-08 4000 -1.534 67.67 none
>
> 0.90 187635.00 64585.00 4477.70 0.00033254808 1.0042120e-08 5000 -1.332 71.66 none
>
> 0.99 186208.00 66901.00 5130.19 0.0020679036 1.0046007e-08 6000 -1.219 74.66 none
>
> 1.04 183031.00 67655.00 5167.49 0.0045271583 1.0039594e-08 7000 -1.104 76.65 none
>
> 1.05 185882.00 69604.00 5375.51 0.00068848119 1.0039337e-08 8000 -0.961 78.78 none
>
> 1.06 184877.00 71114.00 5513.33 0.00075111384 1.0000002e-08 10000 -0.862 81.65 none
>
> ============
>
> Model Dissection
>
> body=windows 2.1 comp.os.ms-windows.misc 13.0 -0.29398131728660715 4.0
> -0.2499250721404836
>
> body=sale 2.0 misc.forsale 10.0 -0.15975889783548441 3.0
> -0.1515710074915567
>
> body=car 1.9 rec.autos 18.0 -0.20029877810977753 12.0 -0.18749612087883083
>
> body=bike 1.7 rec.motorcycles 8.0 -0.2018142555747598 4.0
> -0.16814432275529953
>
> body=space 1.5 sci.space 17.0 -0.1399287995272109 13.0 -0.1351873061593967
>
> body=gun 1.5 talk.politics.guns 14.0 -0.17722128868019074 2.0
> -0.16841537898914227
>
> body=x 1.5 comp.windows.x 16.0 -0.2732296260391984 11.0
> -0.27083865833751686
>
> body=dod 1.4 rec.motorcycles 9.0 -0.148326492278381 1.0
> -0.12210142836873147
>
> body=mac 1.3 comp.sys.mac.hardware 11.0 -0.27876944749543625 10.0
> -0.23484329903813153
>
> body=baseball 1.3 rec.sport.baseball 3.0 -0.13485357144819868 18.0
> -0.1195852341059688
>
> body=hockey 1.3 rec.sport.hockey 7.0 -0.211150729043167 15.0
> -0.1966715818369525
>
> body=cars 1.3 rec.autos 15.0 -0.12568321719729983 1.0 -0.10812928535843122
>
> body=israeli 1.2 talk.politics.mideast 18.0 -0.18684490836644635 3.0
> -0.0904030523936303
>
> body=apple 1.2 comp.sys.mac.hardware 12.0 -0.13251751383798738 4.0
> -0.10422377507646176
>
> body=israel 1.2 talk.politics.mideast 9.0 -0.2647891626493792 18.0
> -0.19546607506057787
>
> body=clipper 1.2 sci.crypt 18.0 -0.1207256296109393 16.0
> -0.11162680043519771
>
> body=circuit 1.1 sci.electronics 17.0 -0.12572559996977076 14.0
> -0.11938666814673585
>
> body=bikes 1.1 rec.motorcycles 8.0 -0.09900225369428561 3.0
> -0.08717090925544266
>
> body=amused 1.1 rec.autos 12.0 -0.11524977990608262 16.0
> -0.09828204462587121
>
> body=erwin 1.1 comp.os.ms-windows.misc 9.0 0.14484323695215684 13.0
> -0.14410391056885355
>
> body=guns 1.1 talk.politics.guns 14.0 -0.14757759121069391 18.0
> -0.10553310545311635
>
> body=wernick 1.1 comp.os.ms-windows.misc 13.0 -0.16784055081374377 17.0
> -0.14926029272275243
>
> body=encryption 1.1 sci.crypt 10.0 -0.13935594947197455 12.0
> -0.09629501916867024
>
> body=window 1.1 comp.windows.x 12.0 -0.15454645190237665 2.0
> -0.14649163050320485
>
> body=3.8 1.0 comp.os.ms-windows.misc 13.0 -0.14832423324033525 17.0
> -0.12201479010717567
>
> body=ecs.comm.mot.com 1.0 sci.space 8.0 -0.14450035595002458 17.0
> -0.10305279969523698
>
> body=anglican 1.0 misc.forsale 9.0 -0.140766985802898 11.0
> -0.11617369735727251
>
> body=atheists 1.0 alt.atheism 10.0 -0.162728295309222 18.0
> -0.15128006555681522
>
> body=intergraph 1.0 rec.autos 18.0 -0.11254542913335626 10.0
> -0.11148706996441916
>
> body=recorders 1.0 misc.forsale 10.0 -0.10593295886655092 2.0
> -0.0998539389721292
>
> body=improving 1.0 comp.os.ms-windows.misc 15.0 -0.16610673507143098 13.0
> -0.16174677920493857
>
> body=orbit 1.0 sci.space 10.0 -0.09859822667708279 14.0
> -0.09613332813664953
>
> body=definately 1.0 misc.forsale 3.0 -0.15492703593518564 11.0
> -0.12445964707621754
>
> body=noel 1.0 comp.os.ms-windows.misc 13.0 -0.18404609344946038 4.0
> -0.1548940667164162
>
> body=millions 1.0 comp.os.ms-windows.misc 13.0 -0.18326546246712216 4.0
> -0.15798164289936817
>
> body=motif 1.0 comp.windows.x 8.0 -0.23666995978992933 12.0
> -0.12134500672351373
>
> body=horvath 1.0 misc.forsale 10.0 -0.10667273844295395 16.0
> -0.10607058873105142
>
> body=loan 1.0 comp.os.ms-windows.misc 13.0 -0.15106641686777386 4.0
> -0.1476604574647092
>
> body=1030 1.0 misc.forsale 10.0 -0.1338454344202623 9.0
> -0.10130515755826637
>
> body=warm 1.0 misc.forsale 10.0 -0.1406288955328403 9.0
> -0.09832001144605908
>
> body=vechorik 1.0 comp.os.ms-windows.misc 13.0 -0.1783419480163701 4.0
> -0.14260075857837795
>
> body=brilliance 1.0 sci.space 2.0 -0.11090242616862139 7.0
> -0.10065183568418715
>
> body=904.6 1.0 sci.space 13.0 0.17926821219563124 18.0 -0.12378772510084696
>
> body=23.02.54.12.1993.3063 1.0 misc.forsale 16.0 -0.1512719525186617 3.0
> -0.10652298738064216
>
> body=x11r5 1.0 comp.windows.x 8.0 -0.14515526955893973 13.0
> -0.1173816902643546
>
> body=1184.3 1.0 misc.forsale 3.0 -0.10097794557982731 16.0
> -0.09119361221244096
>
> body=empty 1.0 rec.autos 1.0 -0.1490078386225753 12.0 -0.14532920308685907
>
> body=decimated 1.0 misc.forsale 10.0 -0.11943217863706988 3.0
> -0.07828181544709865
>
> body=_47 1.0 rec.motorcycles 8.0 -0.1606201440666521 4.0
> -0.11573688978013585
>
> body=guided 1.0 sci.space 8.0 -0.10125942660604244 2.0 -0.09743609743149098
>
> body=chastity 0.9 sci.space 14.0 0.1866060745962246 17.0
> -0.10542294497423861
>
> body=amenities 0.9 rec.autos 18.0 -0.1660614692460346 14.0
> -0.14303947591080396
>
> body=resistor 0.9 sci.space 1.0 0.14759344989387738 10.0
> -0.1342420332099456
>
> body=bruce_linde 0.9 rec.motorcycles 8.0 -0.1166539055655064 4.0
> -0.11418384723730182
>
> body=getheseme 0.9 sci.space 17.0 -0.11013081896195905 16.0
> -0.10677532315297165
>
> body=misquotes 0.9 rec.motorcycles 8.0 -0.11431263598847516 4.0
> -0.09907891930382504
>
> body=ride 0.9 rec.motorcycles 7.0 -0.1227519596828087 14.0
> -0.12014095961171231
>
> body=2divbtm9 0.9 sci.space 14.0 0.12719320131363285 18.0
> -0.11825712162691612
>
> body=programer 0.9 rec.autos 18.0 -0.13601753781054654 12.0
> -0.1307800959062347
>
> body=god 0.9 soc.religion.christian 14.0 -0.4133266651201768 2.0
> 0.33311177035073325
>
> body=globally 0.9 misc.forsale 3.0 -0.19995580080798708 16.0
> -0.15047092543401747
>
> body=maynard 0.9 rec.autos 16.0 0.3316844119801673 18.0
> -0.20281453781050834
>
> body=keith 0.9 alt.atheism 16.0 0.17774219242622247 18.0
> -0.15261256748104562
>
> body=avql 0.9 rec.motorcycles 8.0 -0.10735437424279992 1.0
> -0.10013452743190238
>
> body=meat 0.9 rec.motorcycles 15.0 0.27811319990051087 8.0
> -0.15915073066561788
>
> body=ssl.berkeley.edu 0.9 rec.motorcycles 1.0 -0.11603283918761925 12.0
> -0.0884384975489251
>
> body=key 0.9 sci.crypt 13.0 -0.1653760681899731 3.0 -0.11829589048545573
>
> body=engine 0.9 rec.autos 15.0 -0.12142870454710157 17.0
> 0.11270079209502169
>
> body=quadra 0.9 comp.sys.mac.hardware 11.0 -0.1289212371761686 18.0
> -0.0964629641283687
>
> body=3.1 0.9 comp.os.ms-windows.misc 4.0 -0.10431467896570842 10.0
> -0.10391472825253333
>
> body=termcap.o 0.9 talk.politics.guns 18.0 -0.10601148563035288 14.0
> -0.09796166446170057
>
> body=riding 0.9 rec.motorcycles 14.0 -0.08371078673711305 18.0
> -0.0765989769900927
>
> body=widget 0.8 comp.windows.x 8.0 -0.29438520103866556 12.0
> -0.0702389135835321
>
> body=server 0.8 comp.windows.x 8.0 -0.2470589322602377 13.0
> -0.11122744522078573
>
> body=atheism 0.8 alt.atheism 14.0 -0.11597538848568334 5.0
> -0.09967769880625586
>
> body=msg 0.8 sci.med 8.0 -0.06971336969789471 10.0 -0.0681109211937129
>
> body=chip 0.8 sci.crypt 4.0 0.23854239752582898 17.0 -0.13512001384755057
>
> body=cramer 0.8 talk.politics.misc 10.0 -0.1352263446633181 5.0
> -0.0846139773282596
>
> body=minded 0.8 rec.sport.hockey 15.0 -0.13476741287979063 9.0
> -0.11218240200824414
>
> body=waco 0.8 talk.politics.guns 2.0 -0.14104251458658282 3.0
> -0.13479252162005856
>
> body=unscientific 0.8 comp.windows.x 4.0 -0.14386800107757686 11.0
> -0.13643227280627868
>
> body=proven 0.8 comp.windows.x 11.0 -0.18025300172022177 8.0
> -0.13943418338111344
>
> body=doctor 0.8 sci.med 8.0 -0.1414980332571893 7.0 -0.10005063016256435
>
> body=suspect 0.8 rec.motorcycles 10.0 -0.15261022789897558 8.0
> -0.12820570720159324
>
> body=serdar 0.8 talk.politics.mideast 16.0 -0.12904711019856027 1.0
> -0.10543264023404211
>
> body=fenholt 0.8 rec.sport.hockey 15.0 -0.1372443638799478 10.0
> -0.0886145437031035
>
> body=biker 0.8 rec.motorcycles 18.0 0.12374142052315541 2.0
> -0.10266019115223943
>
> body=cubs 0.8 rec.sport.baseball 1.0 -0.14013959394975284 9.0
> -0.11378203330389651
>
> body=crypto 0.7 sci.crypt 15.0 0.11528681174276662 17.0
> -0.11013426728037534
>
> body=diabolical 0.7 comp.windows.x 8.0 -0.16507306217582277 4.0
> -0.13749129633301096
>
> body=clammoring 0.7 rec.motorcycles 8.0 -0.1573347910328642 15.0
> -0.08009868203492393
>
> body=cats 0.7 rec.motorcycles 16.0 0.13122298930164777 7.0
> -0.1208281691839849
>
> body=occuring 0.7 rec.sport.hockey 7.0 -0.13008252552307786 9.0
> -0.1247005333165889
>
> body=treatment 0.7 sci.med 9.0 -0.15903632382328148 8.0 -0.1552145882481596
>
> body=weapons 0.7 talk.politics.guns 2.0 -0.12533412706852903 9.0
> 0.11927250248617388
>
> body=scsi 0.7 comp.sys.ibm.pc.hardware 13.0 0.5089016768181602 10.0
> -0.13095769178945824
>
> body=turkish 0.7 talk.politics.mideast 11.0 0.1421504500782991 7.0
> -0.13027619091731205
>
> body=jews 0.7 talk.politics.mideast 18.0 -0.14854832842454688 14.0
> -0.12337473788427047
>
> body=motorcycle 0.7 rec.motorcycles 2.0 -0.11980502893098964 12.0
> -0.10032194148702399
>
> body=glimpse 0.7 comp.sys.mac.hardware 10.0 -0.1965284407487402 11.0
> -0.15939598858118137
>
> exiting main
>
> Word counts
>
> 0 67141
>
> 1 48521
>
> 2 45961
>
> 3 26486
>
> 4 25802
>
> 5 18797
>
> 6 17895
>
> 7 11107
>
> 8 10338
>
> 9 10201
>
> 10 10038
>
> 11 9737
>
> 12 9312
>
> 13 9306
>
> 14 9254
>
> 15 8900
>
> 16 8824
>
> 17 8822
>
> 18 8633
>
> 19 8340
>
> 20 8213
>
> 21 8199
>
> 22 8130
>
> 23 8072
>
> 24 7967
>
> 25 7792
>
> 26 7451
>
> 27 7096
>
> 28 7075
>
> 29 6974
>
> 30 6683
>
> 31 6593
>
> 32 6554
>
> 33 6332
>
> 34 6290
>
> 35 6218
>
> 36 6163
>
> 37 6006
>
> 38 5956
>
> 39 5854
>
> 40 5828
>
> 41 5655
>
> 42 5612
>
> 43 5584
>
> 44 5548
>
> 45 5395
>
> 46 5382
>
> 47 5380
>
> 48 5283
>
> 49 5235
>
> 50 5217
>
> 51 5083
>
> 52 5031
>
> 53 4921
>
> 54 4898
>
> 55 4811
>
> 56 4699
>
> 57 4668
>
> 58 4657
>
> 59 4583
>
> 60 4565
>
> 61 4493
>
> 62 4475
>
> 63 4457
>
> 64 4415
>
> 65 4392
>
> 66 4333
>
> 67 4282
>
> 68 4168
>
> 69 4132
>
> 70 3924
>
> 71 3912
>
> 72 3803
>
> 73 3778
>
> 74 3682
>
> 75 3570
>
> 76 3567
>
> 77 3549
>
> 78 3509
>
> 79 3485
>
> 80 3475
>
> 81 3474
>
> 82 3400
>
> 83 3373
>
> 84 3358
>
> 85 3355
>
> 86 3244
>
> 87 3205
>
> 88 3200
>
> 89 3158
>
> 90 3151
>
> 91 3140
>
> 92 3108
>
> 93 3074
>
> 94 3036
>
> 95 3033
>
> 96 3014
>
> 97 3001
>
> 98 2992
>
> 99 2951
>
> 100 2929
>
> 101 2918
>
> 102 2901
>
> 103 2880
>
> 104 2864
>
> 105 2847
>
> 106 2818
>
> 107 2786
>
> 108 2774
>
> 109 2618
>
> 110 2618
>
> 111 2617
>
> 112 2591
>
> 113 2590
>
> 114 2585
>
> 115 2584
>
> 116 2579
>
> 117 2573
>
> 118 2566
>
> 119 2549
>
> 120 2537
>
> 121 2528
>
> 122 2525
>
> 123 2515
>
> 124 2514
>
> 125 2482
>
> 126 2462
>
> 127 2420
>
> 128 2417
>
> 129 2401
>
> 130 2374
>
> 131 2335
>
> 132 2324
>
> 133 2311
>
> 134 2255
>
> 135 2224
>
> 136 2205
>
> 137 2188
>
> 138 2177
>
> 139 2143
>
> 140 2133
>
> 141 2133
>
> 142 2128
>
> 143 2120
>
> 144 2092
>
> 145 2055
>
> 146 2042
>
> 147 2038
>
> 148 2031
>
> 149 1997
>
> 150 1988
>
> 151 1966
>
> 152 1966
>
> 153 1965
>
> 154 1959
>
> 155 1948
>
> 156 1925
>
> 157 1919
>
> 158 1918
>
> 159 1918
>
> 160 1914
>
> 161 1911
>
> 162 1886
>
> 163 1878
>
> 164 1877
>
> 165 1872
>
> 166 1862
>
> 167 1857
>
> 168 1851
>
> 169 1851
>
> 170 1850
>
> 171 1826
>
> 172 1824
>
> 173 1824
>
> 174 1807
>
> 175 1805
>
> 176 1799
>
> 177 1769
>
> 178 1768
>
> 179 1765
>
> 180 1763
>
> 181 1760
>
> 182 1738
>
> 183 1714
>
> 184 1699
>
> 185 1694
>
> 186 1685
>
> 187 1680
>
> 188 1679
>
> 189 1677
>
> 190 1659
>
> 191 1648
>
> 192 1648
>
> 193 1631
>
> 194 1608
>
> 195 1607
>
> 196 1602
>
> 197 1596
>
> 198 1585
>
> 199 1583
>
> 200 1570
>
> 201 1563
>
> 202 1558
>
> 203 1548
>
> 204 1536
>
> 205 1531
>
> 206 1531
>
> 207 1502
>
> 208 1499
>
> 209 1494
>
> 210 1488
>
> 211 1484
>
> 212 1484
>
> 213 1477
>
> 214 1470
>
> 215 1468
>
> 216 1464
>
> 217 1452
>
> 218 1451
>
> 219 1450
>
> 220 1442
>
> 221 1436
>
> 222 1420
>
> 223 1410
>
> 224 1410
>
> 225 1405
>
> 226 1388
>
> 227 1384
>
> 228 1380
>
> 229 1379
>
> 230 1377
>
> 231 1366
>
> 232 1364
>
> 233 1362
>
> 234 1360
>
> 235 1359
>
> 236 1356
>
> 237 1354
>
> 238 1354
>
> 239 1351
>
> 240 1350
>
> 241 1343
>
> 242 1334
>
> 243 1328
>
> 244 1319
>
> 245 1315
>
> 246 1315
>
> 247 1314
>
> 248 1308
>
> 249 1307
>
> 250 1302
>
> 251 1291
>
> 252 1287
>
> 253 1287
>
> 254 1284
>
> 255 1280
>
> 256 1278
>
> 257 1268
>
> 258 1268
>
> 259 1266
>
> 260 1260
>
> 261 1253
>
> 262 1249
>
> 263 1248
>
> 264 1244
>
> 265 1242
>
> 266 1242
>
> 267 1240
>
> 268 1238
>
> 269 1236
>
> 270 1235
>
> 271 1233
>
> 272 1232
>
> 273 1229
>
> 274 1225
>
> 275 1222
>
> 276 1219
>
> 277 1212
>
> 278 1204
>
> 279 1199
>
> 280 1196
>
> 281 1196
>
> 282 1192
>
> 283 1192
>
> 284 1191
>
> 285 1191
>
> 286 1177
>
> 287 1176
>
> 288 1170
>
> 289 1168
>
> 290 1167
>
> 291 1161
>
> 292 1153
>
> 293 1151
>
> 294 1149
>
> 295 1147
>
> 296 1141
>
> 297 1139
>
> 298 1136
>
> 299 1130
>
> 300 1122
>
> 301 1117
>
> 302 1108
>
> 303 1108
>
> 304 1105
>
> 305 1103
>
> 306 1099
>
> 307 1092
>
> 308 1091
>
> 309 1090
>
> 310 1089
>
> 311 1089
>
> 312 1087
>
> 313 1083
>
> 314 1076
>
> 315 1075
>
> 316 1073
>
> 317 1070
>
> 318 1069
>
> 319 1068
>
> 320 1068
>
> 321 1066
>
> 322 1064
>
> 323 1064
>
> 324 1055
>
> 325 1051
>
> 326 1047
>
> 327 1045
>
> 328 1044
>
> 329 1042
>
> 330 1042
>
> 331 1042
>
> 332 1038
>
> 333 1037
>
> 334 1034
>
> 335 1032
>
> 336 1028
>
> 337 1028
>
> 338 1025
>
> 339 1022
>
> 340 1019
>
> 341 1016
>
> 342 1015
>
> 343 1015
>
> 344 1015
>
> 345 1015
>
> 346 1014
>
> 347 1009
>
> 348 1007
>
> 349 1006
>
> 350 1003
>
> 351 1002
>
> 352 997
>
> 353 990
>
> 354 988
>
> 355 984
>
> 356 978
>
> 357 978
>
> 358 978
>
> 359 975
>
> 360 974
>
> 361 974
>
> 362 971
>
> 363 970
>
> 364 967
>
> 365 965
>
> 366 961
>
> 367 955
>
> 368 950
>
> 369 950
>
> 370 947
>
> 371 946
>
> 372 945
>
> 373 945
>
> 374 945
>
> 375 938
>
> 376 935
>
> 377 935
>
> 378 934
>
> 379 934
>
> 380 930
>
> 381 929
>
> 382 925
>
> 383 924
>
> 384 917
>
> 385 914
>
> 386 914
>
> 387 905
>
> 388 904
>
> 389 904
>
> 390 900
>
> 391 898
>
> 392 891
>
> 393 890
>
> 394 890
>
> 395 890
>
> 396 887
>
> 397 885
>
> 398 882
>
> 399 880
>
> 400 876
>
> 401 875
>
> 402 875
>
> 403 871
>
> 404 868
>
> 405 861
>
> 406 860
>
> 407 858
>
> 408 857
>
> 409 856
>
> 410 856
>
> 411 855
>
> 412 852
>
> 413 849
>
> 414 848
>
> 415 847
>
> 416 845
>
> 417 844
>
> 418 844
>
> 419 839
>
> 420 839
>
> 421 839
>
> 422 838
>
> 423 835
>
> 424 831
>
> 425 829
>
> 426 824
>
> 427 823
>
> 428 823
>
> 429 821
>
> 430 820
>
> 431 819
>
> 432 817
>
> 433 815
>
> 434 815
>
> 435 814
>
> 436 809
>
> 437 807
>
> 438 806
>
> 439 804
>
> 440 803
>
> 441 801
>
> 442 800
>
> 443 794
>
> 444 793
>
> 445 793
>
> 446 793
>
> 447 791
>
> 448 791
>
> 449 789
>
> 450 788
>
> 451 788
>
> 452 787
>
> 453 784
>
> 454 783
>
> 455 783
>
> 456 779
>
> 457 778
>
> 458 776
>
> 459 774
>
> 460 773
>
> 461 773
>
> 462 771
>
> 463 770
>
> 464 767
>
> 465 767
>
> 466 767
>
> 467 757
>
> 468 756
>
> 469 755
>
> 470 752
>
> 471 750
>
> 472 750
>
> 473 750
>
> 474 750
>
> 475 746
>
> 476 740
>
> 477 738
>
> 478 737
>
> 479 728
>
> 480 727
>
> 481 726
>
> 482 724
>
> 483 722
>
> 484 722
>
> 485 721
>
> 486 720
>
> 487 717
>
> 488 716
>
> 489 716
>
> 490 714
>
> 491 712
>
> 492 710
>
> 493 708
>
> 494 707
>
> 495 705
>
> 496 704
>
> 497 704
>
> 498 702
>
> 499 700
>
> 500 695
>
> 501 694
>
> 502 694
>
> 503 691
>
> 504 690
>
> 505 689
>
> 506 689
>
> 507 686
>
> 508 686
>
> 509 684
>
> 510 684
>
> 511 683
>
> 512 682
>
> 513 682
>
> 514 681
>
> 515 676
>
> 516 675
>
> 517 673
>
> 518 673
>
> 519 672
>
> 520 672
>
> 521 671
>
> 522 671
>
> 523 670
>
> 524 668
>
> 525 668
>
> 526 667
>
> 527 666
>
> 528 666
>
> 529 664
>
> 530 664
>
> 531 663
>
> 532 661
>
> 533 661
>
> 534 659
>
> 535 658
>
> 536 657
>
> 537 657
>
> 538 657
>
> 539 657
>
> 540 656
>
> 541 656
>
> 542 655
>
> 543 654
>
> 544 652
>
> 545 651
>
> 546 651
>
> 547 649
>
> 548 647
>
> 549 647
>
> 550 647
>
> 551 645
>
> 552 644
>
> 553 643
>
> 554 642
>
> 555 641
>
> 556 640
>
> 557 638
>
> 558 636
>
> 559 635
>
> 560 633
>
> 561 630
>
> 562 625
>
> 563 624
>
> 564 623
>
> 565 622
>
> 566 621
>
> 567 620
>
> 568 619
>
> 569 619
>
> 570 617
>
> 571 616
>
> 572 615
>
> 573 615
>
> 574 614
>
> 575 613
>
> 576 611
>
> 577 610
>
> 578 610
>
> 579 609
>
> 580 608
>
> 581 608
>
> 582 606
>
> 583 605
>
> 584 604
>
> 585 604
>
> 586 603
>
> 587 603
>
> 588 603
>
> 589 603
>
> 590 602
>
> 591 602
>
> 592 601
>
> 593 601
>
> 594 600
>
> 595 600
>
> 596 598
>
> 597 597
>
> 598 597
>
> 599 596
>
> 600 596
>
> 601 596
>
> 602 595
>
> 603 595
>
> 604 594
>
> 605 594
>
> 606 593
>
> 607 592
>
> 608 590
>
> 609 589
>
> 610 589
>
> 611 588
>
> 612 588
>
> 613 588
>
> 614 587
>
> 615 586
>
> 616 585
>
> 617 583
>
> 618 582
>
> 619 582
>
> 620 580
>
> 621 578
>
> 622 578
>
> 623 577
>
> 624 576
>
> 625 576
>
> 626 576
>
> 627 575
>
> 628 573
>
> 629 572
>
> 630 572
>
> 631 572
>
> 632 572
>
> 633 570
>
> 634 570
>
> 635 570
>
> 636 569
>
> 637 568
>
> 638 568
>
> 639 568
>
> 640 567
>
> 641 567
>
> 642 566
>
> 643 563
>
> 644 563
>
> 645 563
>
> 646 563
>
> 647 561
>
> 648 560
>
> 649 559
>
> 650 558
>
> 651 558
>
> 652 557
>
> 653 557
>
> 654 557
>
> 655 556
>
> 656 554
>
> 657 554
>
> 658 554
>
> 659 552
>
> 660 552
>
> 661 552
>
> 662 552
>
> 663 551
>
> 664 551
>
> 665 551
>
> 666 550
>
> 667 550
>
> 668 550
>
> 669 549
>
> 670 548
>
> 671 545
>
> 672 545
>
> 673 540
>
> 674 540
>
> 675 540
>
> 676 540
>
> 677 539
>
> 678 539
>
> 679 537
>
> 680 536
>
> 681 536
>
> 682 536
>
> 683 536
>
> 684 535
>
> 685 535
>
> 686 534
>
> 687 534
>
> 688 534
>
> 689 532
>
> 690 531
>
> 691 531
>
> 692 530
>
> 693 530
>
> 694 530
>
> 695 530
>
> 696 530
>
> 697 529
>
> 698 529
>
> 699 528
>
> 700 527
>
> 701 526
>
> 702 526
>
> 703 524
>
> 704 523
>
> 705 523
>
> 706 523
>
> 707 523
>
> 708 522
>
> 709 522
>
> 710 522
>
> 711 521
>
> 712 521
>
> 713 521
>
> 714 521
>
> 715 519
>
> 716 519
>
> 717 519
>
> 718 517
>
> 719 515
>
> 720 514
>
> 721 514
>
> 722 514
>
> 723 514
>
> 724 514
>
> 725 513
>
> 726 513
>
> 727 513
>
> 728 512
>
> 729 512
>
> 730 512
>
> 731 511
>
> 732 510
>
> 733 510
>
> 734 509
>
> 735 509
>
> 736 508
>
> 737 507
>
> 738 506
>
> 739 506
>
> 740 506
>
> 741 505
>
> 742 505
>
> 743 505
>
> 744 504
>
> 745 503
>
> 746 502
>
> 747 501
>
> 748 501
>
> 749 501
>
> 750 501
>
> 751 500
>
> 752 500
>
> 753 500
>
> 754 499
>
> 755 499
>
> 756 499
>
> 757 499
>
> 758 498
>
> 759 497
>
> 760 497
>
> 761 497
>
> 762 497
>
> 763 496
>
> 764 495
>
> 765 494
>
> 766 493
>
> 767 493
>
> 768 493
>
> 769 492
>
> 770 491
>
> 771 491
>
> 772 490
>
> 773 490
>
> 774 490
>
> 775 489
>
> 776 488
>
> 777 488
>
> 778 488
>
> 779 488
>
> 780 488
>
> 781 487
>
> 782 487
>
> 783 487
>
> 784 487
>
> 785 486
>
> 786 486
>
> 787 486
>
> 788 485
>
> 789 485
>
> 790 485
>
> 791 484
>
> 792 484
>
> 793 483
>
> 794 483
>
> 795 482
>
> 796 482
>
> 797 482
>
> 798 481
>
> 799 480
>
> 800 480
>
> 801 479
>
> 802 479
>
> 803 478
>
> 804 478
>
> 805 477
>
> 806 477
>
> 807 477
>
> 808 477
>
> 809 477
>
> 810 476
>
> 811 475
>
> 812 475
>
> 813 474
>
> 814 473
>
> 815 472
>
> 816 471
>
> 817 471
>
> 818 470
>
> 819 470
>
> 820 470
>
> 821 469
>
> 822 469
>
> 823 467
>
> 824 467
>
> 825 467
>
> 826 466
>
> 827 466
>
> 828 465
>
> 829 465
>
> 830 465
>
> 831 465
>
> 832 465
>
> 833 465
>
> 834 464
>
> 835 463
>
> 836 463
>
> 837 463
>
> 838 462
>
> 839 461
>
> 840 461
>
> 841 460
>
> 842 459
>
> 843 459
>
> 844 459
>
> 845 458
>
> 846 457
>
> 847 456
>
> 848 456
>
> 849 456
>
> 850 454
>
> 851 454
>
> 852 454
>
> 853 453
>
> 854 452
>
> 855 452
>
> 856 452
>
> 857 451
>
> 858 450
>
> 859 450
>
> 860 450
>
> 861 450
>
> 862 449
>
> 863 449
>
> 864 449
>
> 865 448
>
> 866 448
>
> 867 447
>
> 868 447
>
> 869 447
>
> 870 446
>
> 871 446
>
> 872 446
>
> 873 446
>
> 874 445
>
> 875 445
>
> 876 445
>
> 877 444
>
> 878 444
>
> 879 443
>
> 880 443
>
> 881 443
>
> 882 442
>
> 883 440
>
> 884 440
>
> 885 439
>
> 886 439
>
> 887 437
>
> 888 437
>
> 889 437
>
> 890 436
>
> 891 436
>
> 892 435
>
> 893 435
>
> 894 435
>
> 895 434
>
> 896 434
>
> 897 434
>
> 898 433
>
> 899 433
>
> 900 433
>
> 901 433
>
> 902 432
>
> 903 431
>
> 904 431
>
> 905 431
>
> 906 431
>
> 907 431
>
> 908 430
>
> 909 430
>
> 910 429
>
> 911 429
>
> 912 429
>
> 913 428
>
> 914 428
>
> 915 428
>
> 916 427
>
> 917 426
>
> 918 426
>
> 919 425
>
> 920 424
>
> 921 423
>
> 922 423
>
> 923 423
>
> 924 423
>
> 925 422
>
> 926 422
>
> 927 422
>
> 928 422
>
> 929 419
>
> 930 419
>
> 931 419
>
> 932 419
>
> 933 419
>
> 934 419
>
> 935 417
>
> 936 417
>
> 937 417
>
> 938 417
>
> 939 416
>
> 940 415
>
> 941 414
>
> 942 414
>
> 943 413
>
> 944 413
>
> 945 412
>
> 946 412
>
> 947 410
>
> 948 409
>
> 949 409
>
> 950 409
>
> 951 409
>
> 952 408
>
> 953 408
>
> 954 407
>
> 955 407
>
> 956 407
>
> 957 406
>
> 958 406
>
> 959 404
>
> 960 403
>
> 961 402
>
> 962 402
>
> 963 401
>
> 964 401
>
> 965 401
>
> 966 400
>
> 967 400
>
> 968 400
>
> 969 400
>
> 970 399
>
> 971 399
>
> 972 399
>
> 973 399
>
> 974 398
>
> 975 398
>
> 976 398
>
> 977 398
>
> 978 397
>
> 979 397
>
> 980 397
>
> 981 396
>
> 982 395
>
> 983 395
>
> 984 395
>
> 985 395
>
> 986 395
>
> 987 394
>
> 988 394
>
> 989 393
>
> 990 393
>
> 991 392
>
> 992 392
>
> 993 392
>
> 994 391
>
> 995 391
>
> 996 391
>
> 997 391
>
> 998 391
>
> 999 390
>
> 1000 389
>
> 12/06/19 16:01:03 INFO driver.MahoutDriver: Program took 1088343 ms
> (Minutes: 18.13905)
>
> Testing on /tmp/mahout-work-hadoop/20news-bydate/20news-bydate-test/ with
> model: /tmp/news-group.model
>
> hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running
> locally
>
> SLF4J: Class path contains multiple SLF4J bindings.
>
> SLF4J: Found binding in
>
> [jar:file:/home/hadoop/mahouttrunk0210/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>
> SLF4J: Found binding in
>
> [jar:file:/home/hadoop/mahouttrunk0210/examples/target/dependency/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>
> SLF4J: Found binding in
>
> [jar:file:/home/hadoop/mahouttrunk0210/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
>
> 12/06/19 16:01:05 WARN driver.MahoutDriver: No
> org.apache.mahout.classifier.sgd.TestNewsGroups.props found on classpath,
> will use command-line arguments only
>
> 7532 test files
>
> =======================================================
>
> Summary
>
> -------------------------------------------------------
>
> Correctly Classified Instances : 5556 73.7653%
>
> Incorrectly Classified Instances : 1976 26.2347%
>
> Total Classified Instances : 7532
>
> =======================================================
>
> Confusion Matrix
>
> -------------------------------------------------------
>
> a b c d e f g h i j k l m n o p q r s t <--Classified as
>
> 26 11 23 1 9 55 2 23 12 6 2 81 56 34 31 11 0 1 3 2 | 389 a = comp.graphics
>
> 2 270 2 29 2 4 3 6 5 2 44 5 3 4 2 6 0 3 0 6 | 398 b = soc.religion.christian
>
> 0 1 367 0 1 4 1 4 3 1 0 3 3 3 1 1 2 0 0 1 | 396 c = sci.crypt
>
> 0 34 4 203 3 0 5 2 2 4 42 1 1 0 1 7 4 0 1 5 | 319 d = alt.atheism
>
> 0 1 2 0 341 9 0 4 4 3 1 4 3 0 3 7 2 0 3 7 | 394 e = sci.space
>
> 0 5 20 1 11 254 0 9 6 10 1 4 13 18 26 3 0 3 8 1 | 393 f = sci.electronics
>
> 0 3 5 20 2 1 287 3 5 5 11 1 1 0 1 1 5 0 3 22 | 376 g = talk.politics.mideast
>
> 1 0 2 0 3 3 1 346 1 3 0 2 3 12 11 1 0 0 1 0 | 390 h = misc.forsale
>
> 0 0 0 1 0 2 1 3 374 2 0 1 2 0 1 1 0 8 0 1 | 397 i = rec.sport.baseball
>
> 0 2 3 1 2 11 0 16 7 328 0 1 2 2 2 2 3 0 12 2 | 396 j = rec.autos
>
> 0 33 0 43 4 1 1 2 6 5 124 1 5 0 3 7 11 0 1 4 | 251 k = talk.religion.misc
>
> 1 6 6 0 4 4 0 11 2 0 5 300 39 4 6 4 0 0 3 0 | 395 l = comp.windows.x
>
> 1 7 9 4 3 5 0 3 1 0 1 12 302 27 14 2 0 0 2 1 | 394 m = comp.os.ms-windows.misc
>
> 0 1 3 0 1 32 0 15 1 5 0 3 38 256 31 0 1 0 3 2 | 392 n = comp.sys.ibm.pc.hardware
>
> 1 2 4 0 3 16 0 10 0 2 0 0 14 27 299 2 0 0 1 4 | 385 o = comp.sys.mac.hardware
>
> 0 12 6 5 2 19 1 9 11 10 3 1 9 2 8 288 2 0 6 2 | 396 p = sci.med
>
> 0 1 10 0 3 2 3 1 3 2 10 2 3 0 1 2 303 0 2 16 | 364 q = talk.politics.guns
>
> 0 3 0 0 2 1 0 5 32 3 0 1 2 0 0 2 1 344 3 0 | 399 r = rec.sport.hockey
>
> 0 2 1 0 1 0 0 4 3 12 0 2 2 0 0 2 0 0 367 2 | 398 s = rec.motorcycles
>
> 0 3 5 5 3 1 3 1 2 2 8 1 0 0 0 8 89 0 2 177 | 310 t = talk.politics.misc
>
>  Avg. Log-likelihood: -1.1741557385160035 25%-ile: -1.7622842491676183
> 75%-ile: -0.7387671438450112
>
> 12/06/19 16:01:34 INFO driver.MahoutDriver: Program took 28124 ms (Minutes:
> 0.46873333333333334)
>
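
A quick way to check the comp.graphics numbers from the confusion matrix above is to compute precision and recall directly from its row and column. This is only a sketch: it assumes rows are true labels and columns are predicted labels (which the "<--Classified as" header and the row totals suggest), and the helper name is made up for illustration. The counts are copied from the quoted output.

```python
# Row a: documents whose true label is comp.graphics, split by predicted class.
row_a = [26, 11, 23, 1, 9, 55, 2, 23, 12, 6, 2, 81, 56, 34, 31, 11, 0, 1, 3, 2]
# Column a: documents predicted as comp.graphics, split by true class (rows a..t).
col_a = [26, 2, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0]


def precision_recall(true_positives, predicted_total, actual_total):
    """Return (precision, recall) for one class, guarding against empty totals."""
    precision = true_positives / predicted_total if predicted_total else 0.0
    recall = true_positives / actual_total if actual_total else 0.0
    return precision, recall


# The diagonal entry is the first element of both the row and the column.
precision, recall = precision_recall(row_a[0], sum(col_a), sum(row_a))
print(f"comp.graphics precision={precision:.4f} recall={recall:.4f}")
```

Under that reading of the matrix, recall is 26/389 (about 6.7%) while precision is 26/32 (about 81%), so for this class the weak number is recall: most comp.graphics documents are being assigned to other groups.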