Re: Why StringIndexer uses double instead of int for indexing?

2017-01-21 Thread Holden Karau
In downstream stages, the labels and features are generally expected to be doubles, so it's easier to produce the index as a double. On Sat, Jan 21, 2017 at 5:32 PM Shiyuan wrote: > Hi Spark, > StringIndex uses double instead of int for indexing > http://spark.apache.org/docs/latest/ml-features.html#stringindexe
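To illustrate the point about downstream stages expecting doubles, here is a minimal plain-Python sketch of frequency-ordered label indexing that emits double-valued indices. It does not use Spark and is not the StringIndexer implementation. The helper names `fit_string_index` and `transform` are hypothetical, and the alphabetical tie-break is an assumption made only to keep the sketch deterministic.

```python
from collections import Counter


def fit_string_index(values):
    """Map each distinct label to a double index, most frequent label first.

    Hypothetical helper: ties are broken alphabetically, which is an
    assumption of this sketch rather than a statement about Spark.
    """
    counts = Counter(values)
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: float(i) for i, label in enumerate(ordered)}


def transform(values, mapping):
    """Replace each label with its double index."""
    return [mapping[v] for v in values]


categories = ["a", "b", "c", "a", "a", "c"]
mapping = fit_string_index(categories)
indexed = transform(categories, mapping)

# The indices are doubles, so they can sit next to other numeric
# features or serve as a label column without a further cast.
print(mapping)
print(indexed)

# Small whole numbers are represented exactly as doubles, so converting
# back to int loses nothing if a later step needs integer indices.
as_ints = [int(x) for x in indexed]
print(as_ints)
```

Because every index is a whole number stored as a double, a stage that needs an integer (for example, to address a position in a sparse vector) can cast it back without loss.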

Why StringIndexer uses double instead of int for indexing?

2017-01-21 Thread Shiyuan
Hi Spark, StringIndexer uses double instead of int for indexing (http://spark.apache.org/docs/latest/ml-features.html#stringindexer). What's the rationale for using double as the index type? Would it be more appropriate to use int, which would be consistent with other places such as Vector.sparse? Shiyuan