Yes, discarding such features is the best option that is currently supported.

On Wed, Jun 3, 2015 at 12:10 PM, Maheshakya Wijewardena <[email protected]> wrote:
> Spark's Decision tree does not accept datasets with a single value in a
> feature. It produces the following error:
>
>> requirement failed: DecisionTree Strategy given invalid
>> categoricalFeaturesInfo setting: feature 645 has 1 categories. The number
>> of categories should be >= 2
>
> This is not an uncommon scenario, since large datasets can contain features
> with only a single value (see the training data in [1] for an example). As
> this is a Spark error, there should be a way to handle such datasets
> externally.
>
> One possible solution is to allow users to discard features (columns), so
> that they can discard those features with single values before training a
> Decision tree. Please suggest if there are any other feasible solutions.
>
> Best regards,
>
> [1] https://www.kaggle.com/c/digit-recognizer
> --
> Pruthuvi Maheshakya Wijewardena
> Software Engineer
> WSO2 Lanka (Pvt) Ltd
> Email: [email protected]
> Mobile: +94711228855

--
Thanks & regards,
Nirmal

Associate Technical Lead - Data Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/
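
For illustration, the discard step discussed in this thread could look roughly like the sketch below. It is plain Python and not part of Spark or any WSO2 API. The function name `drop_single_valued_features`, the row layout (a list of numeric feature lists) and the sample data are all assumptions made for the example. It removes every column that holds only one distinct value and re-indexes a feature-index-to-category-count map so that the surviving entries still point at the right columns.

```python
from typing import Dict, List, Sequence, Tuple


def drop_single_valued_features(
    rows: Sequence[Sequence[float]],
    categorical_info: Dict[int, int],
) -> Tuple[List[List[float]], Dict[int, int], List[int]]:
    """Hypothetical helper: drop constant columns and re-index the categorical map.

    rows             -- feature vectors, all of the same length
    categorical_info -- maps feature index -> number of categories
    Returns (filtered rows, re-indexed categorical map, kept original indices).
    """
    if not rows:
        return [], {}, []

    n_features = len(rows[0])

    # Collect the distinct values observed in each column.
    distinct = [set() for _ in range(n_features)]
    for row in rows:
        for j, value in enumerate(row):
            distinct[j].add(value)

    # Keep only the columns with at least two distinct values.
    kept = [j for j in range(n_features) if len(distinct[j]) >= 2]

    # Old column index -> new column index after the drop.
    new_index = {old: new for new, old in enumerate(kept)}

    filtered = [[row[j] for j in kept] for row in rows]
    remapped = {
        new_index[old]: arity
        for old, arity in categorical_info.items()
        if old in new_index
    }
    return filtered, remapped, kept


# Made-up sample: column 1 is constant and is declared with 1 category.
sample_rows = [
    [0.0, 5.0, 1.0],
    [1.0, 5.0, 0.0],
    [0.0, 5.0, 2.0],
]
sample_info = {1: 1, 2: 3}

filtered_rows, remapped_info, kept_columns = drop_single_valued_features(
    sample_rows, sample_info
)
```

The list of kept column indices would need to be stored with the model, because the same columns have to be dropped from any data passed in for prediction. The re-indexed map is what would then be supplied as the categorical feature setting when training.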
_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
