They are not raw word counts. Instead, they are computed from the counts
using various formulae. I don't know where these are documented.
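
As an illustration of how integer counts can turn into fractional scores,
here is a minimal sketch of Dunning's log-likelihood ratio (LLR) for a
bigram. seq2sparse has a --minLLR option, so the n-gram values may well be
LLR scores, but that is an assumption on my part and not something confirmed
in this thread. The function names and the example counts are hypothetical
and are not taken from Mahout's code.

```python
import math


def x_log_x(x):
    # x * ln(x), with the usual convention that 0 * ln(0) = 0
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    # Unnormalized Shannon entropy of a set of counts
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    # 2x2 contingency table for a bigram "A B":
    #   k11: A followed by B        k12: A followed by something else
    #   k21: something else then B  k22: neither A first nor B second
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp at zero to absorb tiny negative rounding errors
    return max(0.0, 2.0 * (row + col - mat))


if __name__ == "__main__":
    # Hypothetical counts, purely for illustration
    print(log_likelihood_ratio(10, 0, 0, 10))
    print(log_likelihood_ratio(10, 10, 10, 10))
```

All four inputs are integers, yet the result is a real number because of the
logarithms. A table where the two words are independent scores about zero,
and a strongly associated pair scores much higher.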

On Sun, Mar 25, 2012 at 1:01 PM, Necati Demir <[email protected]> wrote:
> You are right. I asked my question the wrong way.
>
> What I meant to ask is why some values are fractional, such as 25.5. How
> can a word count have a 0.5 part? You can see part of the file below:
>
> Key: 108 1 1: Value: 241.7667508731829
> Key: 108 4: Value: 8.554995151411276
> Key: 108 4 during: Value: 25.260550610371865
> Key: 108 billion: Value: 20.98225432772597
> Key: 108 kg: Value: 24.666483410952424
> Key: 108 kg a4: Value: 44.2003664152453
>
>
>
>
> On 25 March 2012 02:59, Lance Norskog <[email protected]> wrote:
>
>> The counts are doubles. Vectors in Mahout are always doubles.
>>
>> On Fri, Mar 23, 2012 at 4:23 PM, Necati Demir <[email protected]> wrote:
>> > Hello,
>> >
>> > I am running the seq2sparse command with the parameter -ng 2.
>> > When I dump the wordcount/ngrams/part-r-00000 file, I see that the sum
>> > values are not integers. How are n-gram values calculated in Mahout?
>> >
>> >
>> > --
>> > Necati DEMİR
>> > --------------------
>>
>>
>>
>> --
>> Lance Norskog
>> [email protected]
>>
>
>
>
> --
> Necati DEMİR
> --------------------



-- 
Lance Norskog
[email protected]
