zhengruifeng commented on pull request #31090:
URL: https://github.com/apache/spark/pull/31090#issuecomment-760122648
I just created another RF model with 10 trees and 2,789,824 nodes in total:
```
scala> rfcm.trees.length
res3: Int = 10
scala> rfcm.trees.map(_.numNodes).sum
res4: Int = 2789824
scala> rfcm.save("/tmp/rfcm")
```
After saving it to disk, its size is 49M:
```
du -sh /tmp/rfcm
49M /tmp/rfcm
```
Since the model size is proportional to the number of nodes, what about determining the number of partitions with a formula like `numNodes / 1,000,000`? A rough sketch of that heuristic is below.
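
A minimal sketch of what that heuristic could look like; the names here (e.g. `nodesPerPartition`, `numDataPartitions`) are hypothetical, not existing Spark code or config.
```
// Hypothetical helper: roughly one partition per million nodes.
val nodesPerPartition = 1000000

def numDataPartitions(totalNumNodes: Long): Int =
  math.max(1, (totalNumNodes / nodesPerPartition).toInt)

// For the model above: 2,789,824 nodes -> 2 partitions, i.e. ~25M each on disk.
numDataPartitions(2789824L)  // res: Int = 2
```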