Thanks, Subutai. I only used a medium swarm; a large one would make more sense since I have several input fields. I'll try to run the large swarms overnight over the weekend.
Also, day of week is certainly a major factor when you plot the debris counts alone. It looks to me like there are many fewer calls over the weekend, and a typical spike in calls Mondays.

---------
Matt Taylor
OS Community Flag-Bearer
Numenta

On Fri, Oct 9, 2015 at 9:46 AM, Subutai Ahmad <[email protected]> wrote:
>
>> 1. How do I interpret the "Field Contributions"? How are those number
>> calculated?
>
> Those numbers are how much the error decreases (as a percent) if you include
> that field. Let's say you are using the MAPE error, which is the default. A
> field contribution of 30.16 means that if you include only that extra field
> (and no others), the error will go down to original_error * (1-30.16%).
>
> Without knowing the specifics, I'm not sure why wind speed didn't help. With
> streaming data often the field combination results are counterintuitive but
> true. I'll try to go over this point in my chalk talk next week.
>
> Also, did you plot the data to see if there is a large day of week
> contribution? Maybe that is indeed the biggest factor?
>
> BTW, did you use a large swarm? A medium swarm doesn't go beyond two-field
> combinations, I believe.
>
> --Subutai
>
> On Thu, Oct 8, 2015 at 5:49 PM, Matthew Taylor <[email protected]> wrote:
>>
>> Hello NuPIC,
>>
>> I've got weather data that looks like this [1] for every day for the
>> past several years. I'm trying to correlate this weather data with the
>> number of 311 calls made in the same area over time. I'm swarming over
>> a selection of weather input fields and the debris call count [2].
>> Weather certainly should contribute somehow to people calling for tree
>> debris pickup.
>>
>> So far, I have swarmed twice with the following results.
>>
>> #1 included "rain", "snow", "precip", and "max wind speed" and the
>> field contributions looked like this:
>>
>> Field Contributions:
>> {   u'debris': 30.163726239876382,
>>     u'maxwspd': -1.373108683713905,
>>     u'precip': 2.1176366006787224,
>>     u'rain': 0.0,
>>     u'snow': -3.0830847929189784,
>>     u'timestamp_dayOfWeek': 32.13034654690986,
>>     u'timestamp_timeOfDay': 3.9764609868384224,
>>     u'timestamp_weekend': 15.442651796208624}
>>
>> The best model params returned only encoded "debris" and day of week /
>> weekend. I expected "max wind speed" to contribute much more to debris
>> calls.
>>
>> #2 included "hail", "mean wind speed", "temperature variation", and
>> "precip". The field contributions after swarming looked like this:
>>
>> Field Contributions:
>> {   u'debris': 28.19563250430966,
>>     u'hail': 1.7711291936725424,
>>     u'meanwindspdm': -6.274956215526072,
>>     u'precip': 0.0,
>>     u'tempvariation': -6.395026451990224,
>>     u'timestamp_dayOfWeek': 30.21767519999757,
>>     u'timestamp_timeOfDay': 1.2703697906231544,
>>     u'timestamp_weekend': 13.05969551380973}
>>
>> Still, it seems that wind and temperature variation do not contribute
>> to better predictions of debris calls. You can see all my code and CSV
>> data I am swarming over here:
>> https://github.com/rhyolight/multivariate-example
>>
>> So, a couple of questions I have now are:
>>
>> 1. How do I interpret the "Field Contributions"? How are those number
>> calculated?
>> 2. What am I doing wrong? Weather certainly does contribute to 311
>> Tree Debris calls in the real world. Is my data not good enough?
>>
>> [1] https://gist.github.com/rhyolight/5631429c950529a7c947
>> [2]
>> https://github.com/rhyolight/multivariate-example/blob/master/weather_debris_data.csv
>>
>> Thanks in advance,
>> ---------
>> Matt Taylor
>> OS Community Flag-Bearer
>> Numenta
>>
>
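
For anyone following along, here is a rough sketch of the arithmetic in Subutai's explanation of field contributions, applied to the numbers from swarm #1. It only restates the formula quoted above, original_error * (1 - contribution%), and is not taken from the NuPIC source. The helper name error_with_field and the baseline MAPE of 100.0 are assumptions made for illustration; the thread does not give the actual baseline error.

```python
def error_with_field(baseline_error, contribution_pct):
    """Error after including one extra field, per the formula quoted above.

    contribution_pct is the "Field Contributions" value, in percent.
    A negative contribution therefore means the error goes up.
    """
    return baseline_error * (1.0 - contribution_pct / 100.0)


# Hypothetical baseline error; the real value is not stated in the thread.
BASELINE_MAPE = 100.0

# Field contributions reported by swarm #1 in the original message.
swarm_1 = {
    "debris": 30.163726239876382,
    "maxwspd": -1.373108683713905,
    "precip": 2.1176366006787224,
    "rain": 0.0,
    "snow": -3.0830847929189784,
    "timestamp_dayOfWeek": 32.13034654690986,
    "timestamp_timeOfDay": 3.9764609868384224,
    "timestamp_weekend": 15.442651796208624,
}

# List fields from most to least helpful.
ranked = sorted(swarm_1.items(), key=lambda kv: kv[1], reverse=True)
for field, pct in ranked:
    new_error = error_with_field(BASELINE_MAPE, pct)
    print("%-22s %8.2f%%  -> error %.2f" % (field, pct, new_error))
```

Read this way, the negative values for maxwspd and snow mean that including those fields made the error slightly worse than the baseline, while rain at 0.0 left it unchanged.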
