Re: Questions about known upcoming events and multiple 'locations'

Pascal Weinberger Tue, 27 Oct 2015 14:37:55 -0700

Hey,

Just a quick thought:
Maybe HTM isn't the best choice to encode this, especially given that you want 
to learn a sequence over a year (like the Xmas example) from 3 months of data. 
Overall I'm not even sure that you would get this with 3 years of data, as this 
is a really rare event and the decay of synapses would make it vanish. No decay, 
on the other hand, will make the learning more unstable and you lose a chunk of 
the online learning performance...


So why not assist the HTM with an additional layer:
You can of course get fancier than this with another stack of a classifier or a 
regression, but the simplest way to deal with it is to adjust the prediction 
using what you already know about the events (pseudo code):

if time == Xmas
  deliveries = predicted_deliveries * 0.5    (people cook at Xmas)
else if time == big football game
  deliveries = predicted_deliveries * 10     ( ;))
else
  deliveries = predicted_deliveries          (use the raw HTM prediction)
end
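
As a minimal sketch, the pseudo code above might look like this in Python. The names `EVENT_FACTORS`, `KNOWN_EVENTS` and `adjust_prediction` are hypothetical, the event calendar is an assumption, and the factors are simply the guesses from the pseudo code. Nothing here is part of NuPIC; it only post-processes whatever number the model predicts.

```python
from datetime import date

# Hypothetical scaling factors; the values are the guesses from the pseudo code.
EVENT_FACTORS = {
    "xmas": 0.5,
    "big_football_game": 10.0,
}

# Hypothetical calendar of known events, keyed by date (an assumption).
KNOWN_EVENTS = {
    date(2015, 12, 25): "xmas",
}


def adjust_prediction(day, predicted_deliveries):
    """Scale the raw HTM prediction if a known event falls on `day`."""
    event = KNOWN_EVENTS.get(day)            # None when no event is known
    factor = EVENT_FACTORS.get(event, 1.0)   # 1.0 keeps the raw prediction
    return predicted_deliveries * factor
```

Keeping the factors in a lookup table rather than in a chain of conditions means a new event is one more entry instead of one more branch.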

I guess you get the idea. 
You can now learn the scaling factors with multiple methods, or just estimate 
them. 
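
One simple way to estimate such a factor, sketched here under the assumption that you have at least one past occurrence of the event with both the actual counts and the raw predictions for the same hours, is the ratio of the two totals. The function name `estimate_factor` is hypothetical, and this is only one of many possible methods.

```python
def estimate_factor(actual, predicted):
    """Estimate a scaling factor for one event.

    `actual` and `predicted` are the observed and raw predicted delivery
    counts for the same hours of past occurrences of the event.
    """
    total_predicted = sum(predicted)
    if total_predicted == 0:
        # No basis for scaling; fall back to the raw prediction.
        return 1.0
    return sum(actual) / float(total_predicted)
```

With very few past occurrences the estimate will be noisy, so it may be worth capping it or blending it with a hand-picked value.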

In any case I think this is the easiest way to tweak the prediction.

If you try it, please let me know what you had to take care of in scaling ;D 
this is a cool social study :P 


Best

Pascal 
____________________________

BE THE CHANGE YOU WANT TO SEE IN THE WORLD ...


> On 27 Oct 2015, at 21:51, Alan Haverty <[email protected]> wrote:
> 
> Thank you Matthew,
> I'll experiment with the events.
> 
> No, this will actually be a component of my final year project (4th year 
> college, Ireland)
> 
> I missed the boat for this year's challenge, but I'll be sure to join in next 
> year!
> 
> Thanks again,
> Alan Haverty
> 
>> On Tue 27 Oct 2015 at 04:23 Matthew Taylor <[email protected]> wrote:
>> Hi Alan,
>> 
>> Here are my comments about your questions.
>> 
>> 1.a. This was an ad-hoc idea, but I haven't tried it. 
>> 
>> 1.a.i.-ii. Ideally, you would not want to include this field at all, you 
>> would just have years' worth of data and a learned model that has seen the 
>> patterns each holiday produced in the past. But since you don't have that 
>> kind of history, you'll need to experiment a little. Perhaps a simple 
>> countup isn't going to give you what you want... if a holiday like XMas is a 
>> big deal, maybe its value is higher and there is a longer countup to that 
>> date, rather than say St. Patrick's Day. Like I said, this was just an 
>> ad-hoc idea and I can't say for certain how it will work. You'll want to 
>> experiment with it. 
>> 
>> 2. If you have data for 15 locations, I would say that each location should 
>> have its own model. One model only makes predictions for one field, anyway. 
>> 
>> 2.a. You would only lose value if there are correlations between the 
>> locations, but I imagine this is not the case. The frequency of deliveries 
>> at one restaurant is probably not directly affected by the frequency of 
>> deliveries at another.
>> 
>> 2.b. No.
>> 
>> By the way, is this an HTM Challenge project?
>> 
>> Regards,
>> 
>> 
>> ---------
>> Matt Taylor
>> OS Community Flag-Bearer
>> Numenta
>> 
>>> On Mon, Oct 26, 2015 at 1:27 PM, Alan Haverty <[email protected]> wrote:
>>> Hello Nupic,
>>> 
>>> 
>>> 
>>> I have some questions about feeding in known events and also, how I should 
>>> handle multiple 'locations' that have similar properties but that may not 
>>> be directly related in reality.
>>> 
>>> 
>>> 
>>> Please let me know if I'm asking in the wrong mail list.
>>> 
>>> I'm also providing a brief description and example of the project.
>>> 
>>> Outline of Problem
>>> 
>>> Restaurants that offer food delivery are forced to hire drivers, pay for 
>>> insurance, pay for wages + predict how many drivers are needed in advance 
>>> and schedule their hours.
>>> 
>>> 
>>> 
>>> I propose to abstract this as a service where restaurants can simply use an 
>>> app to request a driver and let this service-business worry about drivers, 
>>> insurance, wages, roster scheduling etc.
>>> 
>>> 
>>> 
>>> To achieve this, the central ‘delivery system’ needs to predict how many 
>>> jobs are going to come from each area within a city to allow scheduling of 
>>> drivers days/weeks in advance.
>>> 
>>> 
>>> 
>>> I believe NuPIC is ideal to solve this problem, but I have a few questions 
>>> that I hope the mailing list can help with.
>>> 
>>> 
>>> 
>>> Assuming for this example:
>>> 
>>> That a city is divided into 15 geographical areas.
>>> That I have 3 months of known data with the amount of total deliveries that 
>>> came from each area per hour.
>>> That I need to predict the number_of_deliveries per hour (days/week in 
>>> advance, not too concerned with how far in advance yet.)
>>> 
>>> Example Data
>>> 
>>> Example of 3hrs of data for one of those 15 areas:
>>> 
>>> dttm                   number_of_deliveries
>>> datetime               int
>>> T
>>> 2015/08/01 00:00:00.0  178
>>> 2015/08/01 01:00:00.0  96
>>> 2015/08/01 02:00:00.0  52
>>>  
>>> 
>>> Questions
>>> 
>>> 1. If I want to incorporate event data for known upcoming events 
>>> such as a national holiday/football game/TV series finale airing; how 
>>> should this hourly event data be arranged?
>>> 
>>> a.       Matthew Taylor suggested to use a count down until the hours of 
>>> the event
>>> 
>>> i. How would this work if I wanted to weight certain events differently? 
>>> (e.g. A national bank holiday would be weighted higher than a television 
>>> series episode airing)
>>> 
>>> ii. While the event is occurring, how should the countdown be represented? 
>>> Should it be ..,5,4,3,2,1,1,1,1,1,…,1,20,19,.. (Red being the event 
>>> currently on for that hour(s) or in some cases the whole day(s))
>>> 
>>> 2. I need to do this for multiple locations, would a field to 
>>> specify each location be correct (Meaning there would be x15 {Saturday @ 
>>> 12:00}, one for each of the 15 locations) or should they be totally 
>>> separated?
>>> 
>>> a. If I separate locations completely, would you expect I lose value 
>>> in any way?
>>> 
>>> b. If I keep them together, could locations contaminate/affect each 
>>> other in ways that may not happen in reality?
>>> 
>>> c.       Apologies for this broad question, if anyone could even point me 
>>> to suggested reading, I would appreciate it.
>>> 
>>> Thank you for reading!
>>> Best regards,
>>> Alan Haverty
>>> [email protected]
