Github user WeichenXu123 commented on the issue:

    https://github.com/apache/spark/pull/19621
  
    I checked the failed tests in SparkR. The problem is in the failing `glm` 
SparkR tests.
    These tests compare the SparkR glm and R-lib glm results on the "iris" test 
data, but what string indexer order does R-lib glm use? I checked the "iris" 
dataset: the "Species" column has three values, "setosa", "versicolor", and 
"virginica", and **each has a frequency of 50**. The results only match R-lib 
glm when `RFormula` indexes them as "setosa"->2, "versicolor"->0, 
"virginica"->1. This is a strange indexer order.
    How can the string indexer order be set for R-lib glm?
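
The tie in frequencies matters because an ordering by descending frequency alone cannot decide which label gets which index. The following is a minimal pure-Python sketch of that effect, not Spark's `StringIndexer` or `RFormula` code: the function `index_by_frequency_desc` and its `tie_break` parameter are hypothetical names, and it assumes the indexer ranks labels by descending frequency.

```python
from collections import Counter


def index_by_frequency_desc(values, tie_break=None):
    """Map each label to an index, most frequent label first.

    tie_break is a hypothetical key function that orders labels with
    equal counts; without it, ties keep first-seen order.
    """
    counts = Counter(values)  # keys keep first-seen order
    labels = sorted(counts, key=tie_break) if tie_break else list(counts)
    # sorted() is stable, so equal-count labels keep the order chosen above
    ordered = sorted(labels, key=lambda label: -counts[label])
    return {label: i for i, label in enumerate(ordered)}


# Same label counts as the "Species" column described above: 50 each
species = ["setosa"] * 50 + ["versicolor"] * 50 + ["virginica"] * 50

first_seen = index_by_frequency_desc(species)
reversed_rows = index_by_frequency_desc(list(reversed(species)))
alphabetical = index_by_frequency_desc(species, tie_break=lambda s: s)

print(first_seen)
print(reversed_rows)
print(alphabetical)
```

With all three counts tied, the first and second calls assign different indices to the same data simply because the rows arrive in a different order, so the mapping that matches R-lib glm has to come from some rule other than frequency.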


