Thanks for the answers, Ted.  I'll take a look inside the dissector.  I was just 
wondering because the results are quite a bit different from what's in the book 
(Listing 15.9).  Here are those results (where words have weights > 1):

body=space 2.1 sci.space
body=sale 1.9 misc.forsale
body=car 1.9 rec.autos
body=windows 1.8 comp.os.ms-windows.misc
body=mac 1.7 comp.sys.mac.hardware
body=bike 1.7 rec.motorcycles
body=apple 1.5 comp.sys.mac.hardware
body=gun 1.5 talk.politics.guns
body=baseball 1.5 rec.sport.baseball
body=graphics 1.5 comp.graphics


I guess I mostly want to understand what changed.  Again, I'll take a look at 
the dissector, because the results of the training look pretty good.
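
For reference, here is a rough sketch of the difference between the two sort
orders Ted describes below (absolute value descending vs. value descending).
It is plain Python, not Mahout's ModelDissector code; the function names are
made up for illustration, and the four weights are simply copied from the
dissect() output quoted further down to serve as sample numbers.

```python
# Hypothetical (feature, weight) pairs. The numbers are copied from the
# dissect() output quoted in this thread and serve only as sample data.
weights = [
    ("body=god", -0.1394994576714021),
    ("body=he", 0.07845100216340763),
    ("body=file", -0.05790809278204335),
    ("body=193", 0.05061582599095028),
]


def by_abs_desc(rows):
    """Largest magnitude first, so strong negative weights rank at the top."""
    return sorted(rows, key=lambda row: abs(row[1]), reverse=True)


def by_value_desc(rows):
    """Largest signed value first, so positive weights rank at the top."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    print("absolute value descending:")
    for feature, weight in by_abs_desc(weights):
        print(f"  {feature}\t{weight:.4f}")
    print("value descending:")
    for feature, weight in by_value_desc(weights):
        print(f"  {feature}\t{weight:.4f}")
```

With the absolute-value ordering a negative term such as body=god comes
first; with the signed ordering the positive terms come first, which is
closer to what the book listing shows.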

Good luck, hope things calm down for you.
Chris

On Dec 17, 2010, at 4:58 PM, Ted Dunning wrote:

> Sorry Chris, I am snowed under for the rest of the week.  Big sim analysis
> at work and finishing revisions on the book at night.
> 
> If you can ping me next week, I should be able to take a look again.  My
> basic expectation is that you don't have anything going wrong.  If you want
> to see positive only terms, you might try tweaking the ModelDissector to
> sort on value descending rather than absolute value descending.  Coefficient
> values of 0.1 are about right for a model like this and with 20 newsgroups,
> it isn't so surprising to see lots of negative weights.
> 
> On Fri, Dec 17, 2010 at 11:04 AM, Chris Schilling <[email protected]> wrote:
> 
>> Hey Ted,
>> 
>> Any word on this?  Is there something I can do to help?  I am just not really
>> sure which side of the problem I am on: dissector code or learning
>> algorithm.
>> 
>> 
>> On Dec 16, 2010, at 5:02 PM, Chris Schilling wrote:
>> 
>>> Hey Ted,
>>> 
>>> Okay.  I have tested in trunk and 0.4.  Pretty similar results.
>>> 
>>> 
>>> On Dec 16, 2010, at 4:43 PM, Ted Dunning wrote:
>>> 
>>>> I think that the confusion is that many of these have negative weights.
>>>> Thus god !=> sci.space, but windows => comp.windows.x.
>>>> 
>>>> Are you running from trunk or 0.4?
>>>> 
>>>> On Thu, Dec 16, 2010 at 4:24 PM, Chris Schilling <[email protected]> wrote:
>>>> 
>>>>> First few results of dissect()
>>>>> body=god         -0.1  sci.space                 4.0   -0.1394994576714021    5.0   -0.10322063352194852
>>>>> body=atheists    -0.1  comp.windows.x            5.0   -0.07383748917466922   1.0   -0.037205929610919175
>>>>> body=christian   -0.1  talk.politics.mideast     2.0   -0.029106552130967654  4.0   -0.0033808015660384875
>>>>> body=he          0.1   talk.politics.mideast     18.0  0.07845100216340763    5.0   -0.011218075788326903
>>>>> body=martin      -0.1  talk.politics.mideast     7.0   -0.019407188307985972  10.0  0.00782255718617942
>>>>> body=say         -0.1  comp.sys.ibm.pc.hardware  4.0   -0.0480512351042981    17.0  0.0037854045183534166
>>>>> body=windows     0.1   comp.windows.x            18.0  -0.06722265016470273   5.0   -0.009627757932247396
>>>>> body=file        -0.1  sci.med                   7.0   -0.05790809278204335   5.0   -0.050492324263356765
>>>>> body=government  0.1   talk.religion.misc        3.0   -0.06076111927305433   2.0   -0.052663471587524276
>>>>> body=sale        -0.1  talk.religion.misc        15.0  -0.03535708180324768   12.0  -0.03532746353789419
>>>>> body=atheism     -0.1  misc.forsale              8.0   -0.05941771751639946   1.0   -0.0500729187538798
>>>>> body=program     -0.1  sci.med                   16.0  -0.03820018259936702   7.0   -2.9675316187177843E-4
>>>>> body=193         0.1   talk.politics.mideast     5.0   0.05061582599095028    17.0  -0.032606809778589076
>>>>> body=his         -0.1  talk.politics.misc        12.0  0.05030942352260737    5.0   -0.04490996261214399
>>>>> 
>>>>> I am not adding any leaks (leakType = 0).
>>>>> 
>>>>> Any ideas here?
>>>>> 
>>> 
>> 
>> 
