GitHub user zhengruifeng commented on the issue:

    https://github.com/apache/spark/pull/21792
  
    @srowen I think we need to update the docs:
    1. The current doc in `StringIndexer` is somewhat misleading: "The indices are 
in `[0, numLabels)`, ordered by label frequencies, so the most frequent label 
gets index `0`." This is true only with the default ordering type.
    2. In `RFormula`, `stringOrderType` only affects feature columns, not the label 
column. This needs to be emphasised, since it is somewhat unexpected.
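
    To illustrate point 1, here is a rough plain-Python sketch of how the index assignment changes with the ordering type. It is not Spark's implementation: `index_labels` is a hypothetical helper, and the alphabetical tie-break for equal frequencies is an assumption made only to keep the output deterministic. The four option names are the documented `stringOrderType` values.

```python
from collections import Counter


def index_labels(values, string_order_type="frequencyDesc"):
    """Hypothetical helper: map each distinct label to an index in [0, numLabels)."""
    counts = Counter(values)
    if string_order_type == "frequencyDesc":
        # Most frequent label first; ties broken alphabetically (assumption).
        ordered = sorted(counts, key=lambda v: (-counts[v], v))
    elif string_order_type == "frequencyAsc":
        # Least frequent label first; ties broken alphabetically (assumption).
        ordered = sorted(counts, key=lambda v: (counts[v], v))
    elif string_order_type == "alphabetDesc":
        ordered = sorted(counts, reverse=True)
    elif string_order_type == "alphabetAsc":
        ordered = sorted(counts)
    else:
        raise ValueError(f"unsupported order type: {string_order_type}")
    return {label: idx for idx, label in enumerate(ordered)}


labels = ["b", "a", "b", "c", "b", "a"]
freq_desc = index_labels(labels)                  # default ordering
alpha_asc = index_labels(labels, "alphabetAsc")   # non-default ordering
```

    With the default ordering the most frequent label `b` gets index `0`, but with `alphabetAsc` it gets index `1`, so the quoted doc sentence only describes the default case.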
    
    @MLnick your thoughts?

