[GitHub] madlib issue #291: Feature: Vector to Columns

2018-07-17 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/291
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/569/



---


[GitHub] madlib issue #289: RF: Add impurity variable importance

2018-07-17 Thread fmcquillan99
Github user fmcquillan99 commented on the issue:

https://github.com/apache/madlib/pull/289
  
```
The model table produced by the training function contains the following columns:

gid                      INTEGER. Group id that uniquely identifies a set of grouping column values.
sample_id                INTEGER. The id of the bootstrap sample that this tree is a part of.
tree                     BYTEA8. Trained tree model stored in binary format (not human readable).
impurity_var_importance  DOUBLE PRECISION[]. The gini impurity importance score for the tree.
```

I don't think we need `impurity_var_importance` for each tree in the 
forest, since the final/averaged value is already in the grouping table.
We also don't put `oob_var_importance` here, so including it is inconsistent.
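
To illustrate the redundancy, here is a minimal sketch. It assumes the value in the grouping table is a plain element-wise mean of the per-tree vectors, which this thread does not confirm. The function name `average_importance` and the sample numbers are hypothetical and not part of MADlib.

```python
from typing import List


def average_importance(per_tree: List[List[float]]) -> List[float]:
    """Element-wise mean of per-tree importance vectors (assumed aggregation)."""
    if not per_tree:
        return []
    n_features = len(per_tree[0])
    if any(len(vec) != n_features for vec in per_tree):
        raise ValueError("all importance vectors must have the same length")
    n_trees = len(per_tree)
    return [sum(vec[j] for vec in per_tree) / n_trees for j in range(n_features)]


# Hypothetical per-tree values: three trees, two features.
per_tree_importance = [
    [0.50, 0.50],
    [0.25, 0.75],
    [0.75, 0.25],
]

# One vector per group, which is what the grouping table would hold.
forest_importance = average_importance(per_tree_importance)
```

If the aggregate really is derived this way, the per-tree column only duplicates information already summarized per group.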



---


[GitHub] madlib issue #291: Feature: Vector to Columns

2018-07-17 Thread asfgit
Github user asfgit commented on the issue:

https://github.com/apache/madlib/pull/291
  

Refer to this link for build results (access rights to CI server needed): 
https://builds.apache.org/job/madlib-pr-build/568/



---


[GitHub] madlib pull request #294: Pagerank: Remove duplicate entries from grouping o...

2018-07-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/madlib/pull/294


---