[GitHub] spark issue #16630: [SPARK-19270][ML] Add summary table to GLM summary

actuaryzhang Mon, 17 Jul 2017 15:49:33 -0700

Github user actuaryzhang commented on the issue:

    https://github.com/apache/spark/pull/16630
  
    @yanboliang Thanks for the suggestions. I have made a new commit that
    addresses your comments.
    In the new version, I used an array of tuples to represent the coefficient
    matrix. I used tuples because the rows mix string and double values (the
    feature names must be stored, since they also depend on whether the model
    has an intercept). I then wrote a `showString` function, similar to the one
    in the `Dataset` class, that compiles all the summary information into a
    string, and defined `show` methods to print the estimated model. The output
    is very similar to R's, except that it does not show the residuals or the
    significance levels. Please let me know your thoughts on this update.
    
    Below is an example of the call and the output:
    ```
    model.summary.show()
    +-----------+--------+--------+------+------+
    |    Feature|Estimate|StdError|TValue|PValue|
    +-----------+--------+--------+------+------+
    |(Intercept)|   0.790|   4.013| 0.197| 0.862|
    | features_0|   0.226|   2.115| 0.107| 0.925|
    | features_1|   0.468|   0.582| 0.804| 0.506|
    +-----------+--------+--------+------+------+
    
    (Dispersion parameter for gaussian family taken to be 14.516)
        Null deviance: 46.800 on 2 degrees of freedom
    Residual deviance: 29.032 on 2 degrees of freedom
    AIC: 30.984
    ```
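
    For readers who want to see the shape of the approach, here is a minimal
    sketch of formatting an array of tuples into a table like the one above.
    It is not the code from the commit: the object name `SummaryTableSketch`,
    the `Row` alias, the `digits` parameter, and `exampleRows` are all
    hypothetical, and the sample values are copied from the output shown
    above.

    ```scala
    import java.util.Locale

    // Hypothetical sketch, not the PR's implementation.
    object SummaryTableSketch {
      // One row of the coefficient matrix: a feature name plus four statistics.
      type Row = (String, Double, Double, Double, Double)

      private val header = Seq("Feature", "Estimate", "StdError", "TValue", "PValue")

      // Sample rows copied from the output in the comment above.
      val exampleRows: Array[Row] = Array(
        ("(Intercept)", 0.790, 4.013, 0.197, 0.862),
        ("features_0", 0.226, 2.115, 0.107, 0.925),
        ("features_1", 0.468, 0.582, 0.804, 0.506))

      /** Render the rows as a right-aligned text table. */
      def showString(rows: Array[Row], digits: Int = 3): String = {
        val fmt = s"%.${digits}f"
        // Convert every cell to text first so column widths can be measured.
        val cells: Seq[Seq[String]] = header +: rows.toSeq.map {
          case (name, est, se, t, p) =>
            name +: Seq(est, se, t, p).map(v => fmt.formatLocal(Locale.ROOT, v))
        }
        // Each column is as wide as its longest cell, header included.
        val widths = cells.transpose.map(_.map(_.length).max)
        val sep = widths.map(w => "-" * w).mkString("+", "+", "+")
        def line(row: Seq[String]): String =
          row.zip(widths)
            .map { case (cell, w) => " " * (w - cell.length) + cell }
            .mkString("|", "|", "|")
        val body = cells.tail.map(line)
        ((Seq(sep, line(cells.head), sep) ++ body) :+ sep).mkString("\n")
      }

      def main(args: Array[String]): Unit =
        println(showString(exampleRows))
    }
    ```

    Using tuples keeps the feature name and its statistics together in one
    row, which is why the column widths can be computed in a single pass over
    the stringified cells.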



