GitHub user manishamde reopened a pull request:

    https://github.com/apache/spark/pull/475

    SPARK-1544 Add support for deep decision trees.

    @etrain and I came up with a PR for arbitrarily deep decision trees, at 
the cost of multiple passes over the data at deep tree levels. 
    
    To summarize:
    1) We take a parameter that indicates the amount of memory users want to 
reserve for computation on each worker (and 2x that at the driver).
    2) Using that information, we calculate two things - the maximum depth to 
which we train as usual (which is, implicitly, the maximum number of nodes we 
want to train in parallel), and the size of the groups we should use in the 
case where we exceed this depth.
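
As a rough illustration of the two quantities in point 2, here is a minimal sketch and not the code in this PR. It assumes each node's aggregate is `num_features * num_bins * stats_per_bin` doubles and that levels are numbered from 0 at the root; the function names, parameters, and the sizing formula are all hypothetical.

```python
BYTES_PER_DOUBLE = 8  # assumption: statistics are stored as 8-byte doubles


def plan_levels(max_memory_mb, num_features, num_bins, stats_per_bin=2):
    """Return (max nodes per group, deepest level trainable in one pass).

    Hypothetical sizing rule: one node needs
    num_features * num_bins * stats_per_bin doubles of aggregate storage.
    """
    bytes_per_node = num_features * num_bins * stats_per_bin * BYTES_PER_DOUBLE
    budget_bytes = int(max_memory_mb) * 1024 * 1024
    # Always allow at least one node, even if the budget is too small.
    max_nodes_per_group = max(budget_bytes // bytes_per_node, 1)
    # Level L holds 2**L nodes, so take floor(log2(max_nodes_per_group)).
    max_single_pass_level = max_nodes_per_group.bit_length() - 1
    return max_nodes_per_group, max_single_pass_level


def passes_at_level(level, max_single_pass_level):
    """Number of passes over the data needed to train one tree level."""
    if level <= max_single_pass_level:
        return 1  # the whole level fits in the memory budget
    # Deeper levels are split into groups of 2**max_single_pass_level nodes,
    # and each group costs one pass over the data.
    return 2 ** (level - max_single_pass_level)


if __name__ == "__main__":
    nodes, single_pass_level = plan_levels(128, num_features=100, num_bins=100)
    print(nodes, single_pass_level)
    print(passes_at_level(12, single_pass_level))
```

Under these assumptions the number of passes doubles with every level past the single-pass depth, which is the cost the description refers to.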
    
    cc: @atalwalkar, @hirakendu, @mengxr

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/manishamde/spark deep_tree

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/475.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #475
    
----
commit 50b143a4385f209fbc1793f3e03134cab3ab9583
Author: Manish Amde <[email protected]>
Date:   2014-04-20T20:33:03Z

    adding support for very deep trees

commit abc5a23bf80d792a345d723b44bff3ee217cd5ac
Author: Evan Sparks <[email protected]>
Date:   2014-04-22T01:41:36Z

    Parameterizing max memory.

commit 2f6072c12a1466d783da258d4aa1bde789e7e875
Author: manishamde <[email protected]>
Date:   2014-04-22T03:43:47Z

    Merge pull request #5 from etrain/deep_tree
    
    Parameterizing max memory.

commit 2f1e093c5187a1ed532f9c19b25f8a2a6a46e27a
Author: Manish Amde <[email protected]>
Date:   2014-04-22T03:49:46Z

    minor: added doc for maxMemory parameter

commit 02877721328a560f210a7906061108ce5dd4bbbe
Author: Evan Sparks <[email protected]>
Date:   2014-04-22T18:13:27Z

    Fixing scalastyle issue.

commit fecf89a51d6efc9e2ff06e700338ea944a4dd580
Author: manishamde <[email protected]>
Date:   2014-04-22T18:15:57Z

    Merge pull request #6 from etrain/deep_tree
    
    Fixing scalastyle issue.

commit 719d0098bb08b50e523cec3e388115d5a206512b
Author: Manish Amde <[email protected]>
Date:   2014-04-24T00:04:05Z

    updating user documentation

commit 9dbdabeeacc5fe5e0f1a729ce1ed8ab6ff399000
Author: Manish Amde <[email protected]>
Date:   2014-04-29T21:43:19Z

    merge from master

commit 15171550fe83e42fcb707744c9035ed540fb78d1
Author: Manish Amde <[email protected]>
Date:   2014-04-29T21:45:34Z

    updated documentation

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
