Hi James,

It's a good idea. A JSON format is more convenient for visualization, though a little inconvenient to read. How about a toJson() method? It might make the MLlib API inconsistent across models, though.
You should probably create a JIRA for this.

CC: dev list

-Manish

> On Aug 26, 2015, at 11:29 AM, Murphy, James <james.mur...@disney.com> wrote:
>
> Hey all,
>
> In working with the DecisionTree classifier, I found it difficult to extract
> rules that could easily facilitate visualization with libraries like D3.
>
> So for example, using : print(model.toDebugString()), I get the following
> result =
>
>   If (feature 0 <= -35.0)
>     If (feature 24 <= 176.0)
>       Predict: 2.1
>     If (feature 24 = 176.0)
>       Predict: 4.2
>     Else (feature 24 > 176.0)
>       Predict: 6.3
>   Else (feature 0 > -35.0)
>     If (feature 24 <= 11.0)
>       Predict: 4.5
>     Else (feature 24 > 11.0)
>       Predict: 10.2
>
> But ideally, I could see results in a more parseable format like JSON:
>
> {
>   "node": [
>     {
>       "name":"node1",
>       "rule":"feature 0 <= -35.0",
>       "children":[
>         {
>           "name":"node2",
>           "rule":"feature 24 <= 176.0",
>           "children":[
>             {
>               "name":"node4",
>               "rule":"feature 20 < 116.0",
>               "predict": 2.1
>             },
>             {
>               "name":"node5",
>               "rule":"feature 20 = 116.0",
>               "predict": 4.2
>             },
>             {
>               "name":"node5",
>               "rule":"feature 20 > 116.0",
>               "predict": 6.3
>             }
>           ]
>         },
>         {
>           "name":"node3",
>           "rule":"feature 0 > -35.0",
>           "children":[
>             {
>               "name":"node7",
>               "rule":"feature 3 <= 11.0",
>               "predict": 4.5
>             },
>             {
>               "name":"node8",
>               "rule":"feature 3 > 11.0",
>               "predict": 10.2
>             }
>           ]
>         }
>       ]
>     }
>   ]
> }
>
> Food for thought!
>
> Thanks,
>
> Jim
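
Until something like toJson() exists, one possible workaround is to parse the text returned by toDebugString() on the client side. The following is only a rough sketch, not part of MLlib: the function name debug_string_to_tree, the "root" wrapper node, the sequential node names, and the sample text (including its header line and one-space-per-level indentation) are all assumptions made for illustration. It relies on nested rules being indented further than their parents, so it would need adjusting if the real output is laid out differently.

```python
import json
import re

# Matches "If (...)" / "Else (...)" lines and captures the rule text.
RULE_RE = re.compile(r"^(?:If|Else)\s*\((.*)\)\s*$")
# Matches "Predict: <number>" lines and captures the number.
PREDICT_RE = re.compile(r"^Predict:\s*(\S+)")


def debug_string_to_tree(debug_string):
    """Turn indentation-based debug text into a nested dict (hypothetical helper)."""
    root = {"name": "root", "children": []}
    stack = [(-1, root)]  # (indentation, node) pairs for the current path
    counter = 0
    for raw in debug_string.splitlines():
        text = raw.strip()
        if not text:
            continue
        indent = len(raw) - len(raw.lstrip())
        rule = RULE_RE.match(text)
        predict = PREDICT_RE.match(text)
        if not rule and not predict:
            continue  # e.g. a header line; ignored
        # Drop nodes that are not ancestors of the current line.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        if rule:
            counter += 1
            node = {"name": "node%d" % counter, "rule": rule.group(1)}
            parent.setdefault("children", []).append(node)
            stack.append((indent, node))
        else:
            parent["predict"] = float(predict.group(1))
    return root


# Illustrative input only; the layout is assumed, not captured from a real model.
SAMPLE = """DecisionTreeModel regressor of depth 2 with 7 nodes
  If (feature 0 <= -35.0)
   If (feature 24 <= 176.0)
    Predict: 2.1
   Else (feature 24 > 176.0)
    Predict: 6.3
  Else (feature 0 > -35.0)
   If (feature 24 <= 11.0)
    Predict: 4.5
   Else (feature 24 > 11.0)
    Predict: 10.2
"""

if __name__ == "__main__":
    print(json.dumps(debug_string_to_tree(SAMPLE), indent=2))
```

The resulting dict has the same "name" / "rule" / "children" / "predict" shape as the JSON proposed in the quoted message, so it could be handed to D3 after json.dumps(). Parsing debug text is fragile, though, which is another argument for a JIRA requesting a proper method.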