Dear All, I am fine-tuning a Cubist model (see https://cran.r-project.org/web/packages/Cubist/index.html), and I am a bit puzzled by its output. On a data set that contains 275 cases, I get rules that are not mutually exclusive. E.g., in the output below, rules 2 and 3 together cover all 275 cases of the data set, and rule 1 partially overlaps with them. Am I misunderstanding something? Many thanks
Lorenzo

Cubist [Release 2.07 GPL Edition]  Thu Jan 12 23:10:40 2017
---------------------------------

    Target attribute `outcome'

Read 275 cases (21 attributes) from undefined.data

Model:

  Rule 1: [204 cases, mean 0.5393324, range 0 to 2.285714, est err 0.2598495]

    if
        home_copub_after_all <= 0.7142857
        host_copub_after_all <= 1.833333
    then
        outcome = 0.1666667 + 0.9 home_copub_after_all
                  + 0.11 home_copub_before_all

  Rule 2: [259 cases, mean 0.7445303, range 0 to 3.166667, est err 0.1866440]

    if
        host_copub_after_all <= 1.833333
    then
        outcome = 0.0433333 + 0.75 home_copub_after_all
                  + 0.33 host_copub_after_all + 0.37 top_10_after_all

  Rule 3: [16 cases, mean 4.4285712, range 2.142857 to 8.857142, est err 1.0346190]

    if
        host_copub_after_all > 1.833333
    then
        outcome = 1.595 + 1.03 top_10_after_all + 0.45 home_copub_after_all


Evaluation on training data (275 cases):

    Average  |error|          0.2678023
    Relative |error|               0.38
    Correlation coefficient        0.94


    Attribute usage:
      Conds  Model

      100%    54%    host_copub_after_all
       43%   100%    home_copub_after_all
               57%    top_10_after_all
               43%    home_copub_before_all


Time: 0.0 secs
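
To make the overlap concrete, the case counts printed in the output are enough on their own. The following is a minimal sketch in base R that uses only those counts; the variable names (`n_total`, `rule_cases`, `double_covered`) are illustrative and not part of Cubist.

```r
# Case counts copied from the Cubist output above
n_total    <- 275
rule_cases <- c(rule1 = 204, rule2 = 259, rule3 = 16)

# Rules 2 and 3 split on the same threshold of host_copub_after_all
# (<= 1.833333 versus > 1.833333), so together they cover every case.
rule_cases[["rule2"]] + rule_cases[["rule3"]] == n_total   # TRUE

# Rule 1 repeats the condition of rule 2 and adds one more, so every
# case it covers is also covered by rule 2. The surplus of rule
# memberships over cases therefore equals the size of rule 1.
double_covered <- sum(rule_cases) - n_total
double_covered                                             # 204
```

So 204 of the 275 cases satisfy the conditions of two rules (rules 1 and 2), while the remaining 71 satisfy exactly one.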