How to dump/interpret CVB output

keeyong han Wed, 06 Feb 2013 16:53:28 -0800

Hello there,

After some struggle, I managed to run CVB successfully, but dumping the 
output turned out to be no easier. I tried to dump the top keywords per 
cluster by running the following command:


mahout vectordump -i [final_state_output_directory_used_in_cvb_run] \
  -o [output_file_path] \
  --dictionary [dictionary_file_generated_in_vectorization] \
  --dictionaryType sequencefile --vectorSize 5 --sortVectors true --printKey true

When I opened the output file, it looked something like this:
0       
{�����:22.247111682871502,����:18.373163071757336,���:98.99212990547156,��:381.7630898807104,�:477.31989896222046}
10      
{�����:18.69052909454572,����:36.154751708278106,���:128.69867172165564,�:963.769624051711,U:8.647090616806189}
20      
{�����:17.571403244328565,����:85.64801880249307,���:78.07377559911669,��:347.51662400027806,�:871.9107248128981}
30      
{�����:22.330329037961235,����:35.7514504363204,���:93.79495229393099,��:101.67298391572345,�:560.0330529905118}
40      
{�����:7.139737125343593,����:46.70407309589953,���:105.44075086386623,��:350.2449503883152,�:903.5015132966541}
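
Each record appears to be a key followed by a {term:weight,...} map, so a dump 
like this could be read back with a small Python sketch such as the one below. 
The function names are hypothetical helpers (not part of Mahout), and the 
sketch assumes that the key and the map sit on one line and that terms contain 
no commas:

```python
def parse_dump_line(line):
    """Split one 'key {term:weight,...}' record into (key, [(term, weight), ...]).

    Hypothetical helper, not a Mahout API. Assumes the key and the map are on
    the same line and that no term contains a comma.
    """
    key, _, body = line.strip().partition("{")
    pairs = []
    for item in body.rstrip("}").split(","):
        if not item:
            continue  # empty map, e.g. "{}"
        # rpartition, so a ':' inside the term itself does not break the split
        term, _, weight = item.rpartition(":")
        pairs.append((term, float(weight)))
    return key.strip(), pairs


def top_terms(line, n=3):
    """Return the key and the n highest-weighted (term, weight) pairs."""
    key, pairs = parse_dump_line(line)
    return key, sorted(pairs, key=lambda p: p[1], reverse=True)[:n]
```

Calling top_terms on a record returns the key together with its terms ordered 
from highest to lowest weight, which makes it easier to compare records by 
hand.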

I guess some parameter was missing and/or a wrong value was assigned? Any help 
would be appreciated.

BTW I am using Mahout 0.8 on Hadoop 1.0.3.

Cheers,
-Keeyong
