Just saw this while skimming email...

On Aug 4, 2011, at 4:58 AM, Clément Notin wrote:

> Hello,
> 
> I'm tweaking and rewriting parts of the ARFF converter to meet my needs (I
> want to create NamedVectors with the first column as the name).
> I asked it to output the dictionary it used, but the file is always empty!
> (BTW, I'm using 0.6-SNAPSHOT)

A patch would be great, if you think it is general-purpose enough.

> 
> In the code, it is the label bindings that are written to the dictionary.
> 
> Am I right in thinking that:
> 
>   - the labels are the column names (@attribute) in the ARFF file?

Yes.

>   - the dictionary should be the mapping between strings and their long id

Yes.

>   ?
> 
> 
> Correct me if I'm wrong but when my clustering job (yes I want to do
> clustering ;) ) is finished it is useful to know what strings are behind
> numerical values.

Yes, that is usually helpful.
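
For what it's worth, here is a minimal sketch of the idea in plain Python. It is not Mahout's converter code; the class and method names are made up for illustration. Each distinct string gets the next free integer id the first time it is seen, and the reverse lookup lets you decode ids once the clustering job is done.

```python
class NominalDictionary:
    """Hypothetical string <-> id dictionary (illustration only, not a Mahout class)."""

    def __init__(self):
        self._id_by_string = {}   # string -> id
        self._string_by_id = []   # id -> string (the list index is the id)

    def id_for(self, value):
        """Return the id for value, assigning the next free id on first sight."""
        if value not in self._id_by_string:
            self._id_by_string[value] = len(self._string_by_id)
            self._string_by_id.append(value)
        return self._id_by_string[value]

    def string_for(self, identifier):
        """Reverse lookup: the string behind a numeric id."""
        return self._string_by_id[identifier]

    def dump(self):
        """One 'string<TAB>id' line per entry, e.g. for writing to a dictionary file."""
        return "\n".join(f"{s}\t{i}" for i, s in enumerate(self._string_by_id))


dictionary = NominalDictionary()
encoded = [dictionary.id_for(v) for v in ["red", "green", "red", "blue"]]
decoded = [dictionary.string_for(i) for i in encoded]
```

If the dictionary file comes out empty, the first thing I would check is whether anything is ever added to the string-to-id map, as opposed to only the label bindings.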

> Because the vector values are ordered, it's easy to know
> which value belongs to which column...
> 
> Thanks for your answers.
> 
> Regards.
> 
> -- 
> Clément Notin

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com
