Pascal -

Your idea reminds me a bit of Banjo: http://ban.jo/

Banjo is a private corporation, but it's doing something somewhat similar - at
least in that they have divided the globe up into a giant grid, and within
each cell of that grid, they do anomaly detection. Except, instead of
geophysical data, they monitor social activity by observing geotagged
photos, tweets, posts, etc.

- Jeff

On Tue, Aug 4, 2015 at 3:27 PM Jared Casner <[email protected]> wrote:

> Hi Pascal,
>
> So, let me see if I understand correctly.  For now, you don't require any
> geo-encoding of data (but it sounds like that might be a useful feature in
> the future?)  Instead, you will create a list of regions / polygons that
> represent a geofenced area.  Within each region, you will have some set of
> sensors - air pressure, humidity, wind speed, seismic activity,
> temperature, etc.  Your goal is to generate anomaly scores for each of
> those sensors - which produce scalar data.  You then plan to do some
> additional logistic regression on top of the anomaly scores to predict the
> likelihood of a natural disaster (earthquake, meteorological, etc) in that
> region or nearby regions.  It would be up to the statistician to correlate
> regions in the short term, correct?  Also, if I've understood you
> correctly, the biggest issue researchers currently face with respect to
> this problem is that their per-sensor predictions aren't always accurate
> because of unexpected daily variations in the data?
>
> I hope I've now understood the problem, but please clarify if I've
> mis-stated anything.
>
> Assuming I have a basic understanding of the problem, I think you may be
> able to simplify the engineering task a little bit.  It seems to me that
> your primary objective isn't to have an easy-to-read user interface that
> displays data to an end user.  Instead, you want data available to
> researchers in a format that they can do the logistic regression on.  So,
> perhaps you can simplify your project by starting with HTMEngine directly.
> I'm sure by now you've seen Matt's demo [1] of HTMEngine - that may be a
> good place to start.  In his NYC Traffic demo [2], each road segment
> represents a geolocation and has a scalar metric (average speed) associated
> with it.  Assuming you have easy access to the data, you can probably use
> this as a good basis for getting started.  The output is available in both
> JSON and CSV formats, so it should be easily accessible to a researcher.
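A researcher could then pull that export straight into whatever modeling tool they use. A minimal sketch of parsing such a CSV, assuming the columns are named `timestamp`, `value`, and `anomaly_score` (check the actual export format of the HTMEngine instance you run - the column names here are an assumption):

```python
# Sketch of consuming an HTMEngine CSV export. The column layout
# (timestamp, value, anomaly_score) is assumed, not taken from the docs.
import csv
import io

def read_anomaly_scores(csv_text):
    """Return (timestamp, anomaly_score) pairs from a CSV export string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["timestamp"], float(row["anomaly_score"])) for row in reader]

# Tiny inline sample standing in for a real export file.
sample = """timestamp,value,anomaly_score
2015-08-04 00:00:00,41.3,0.02
2015-08-04 00:05:00,97.8,0.91
"""
scores = read_anomaly_scores(sample)
print(scores)
```

The resulting list of (timestamp, score) pairs is exactly the shape a downstream regression would consume.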
>
> To answer one of your original questions about Numenta engineers helping
> out on this project, they're all free to help in their off time!  One of
> our big objectives of opening access to NuPIC and the Numenta Apps was to
> provide a means for you - and those like you - to get in and do things that
> we just don't have the bandwidth to do internally.  I'm thrilled to see
> your excitement and hope that others in the community will want to get
> involved to help you out!
>
> Cheers,
>
> Jared
>
> [1] https://www.youtube.com/watch?v=lzJd_a6y6-E
> [2] https://github.com/nupic-community/htmengine-traffic-tutorial
>
>
>
>>
>> ---------- Forwarded message ----------
>> From: Pascal Weinberger <[email protected]>
>> To: "NuPIC general mailing list." <[email protected]>
>> Cc:
>> Date: Tue, 4 Aug 2015 12:13:04 +0200
>> Subject: Re: nostradamIQ Project help needed!
>> Matt,
>> That's true, but you do not need it at all:
>> Take the world, slice it into polygons (according to the density of data
>> available and the resolution needed); label your polygons, and fetch your
>> data for each polygon under a label of the form Where:What - where Where is
>> the label of the specific geo-area according to the scheme above, and What
>> is the kind of data you push (seismic, etc.). And there you have your data
>> format: label to scalar!
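The Where:What naming Pascal describes can be sketched in a few lines. This is only an illustration of the labeling idea - the 5-degree cell size and the `cell_RRR_CCC` naming are arbitrary choices, not part of his proposal:

```python
# Sketch of the Where:What label-to-scalar scheme: slice the globe into
# grid cells, label each cell, and name every metric "<where>:<what>".
def where_label(lat, lon, cell_deg=5.0):
    """Return a grid-cell label for a lat/lon point (cell size is arbitrary)."""
    row = int((lat + 90.0) // cell_deg)
    col = int((lon + 180.0) // cell_deg)
    return "cell_%03d_%03d" % (row, col)

def metric_name(lat, lon, what):
    """Combine the geo label (Where) and the sensor type (What)."""
    return "%s:%s" % (where_label(lat, lon), what)

# e.g. a seismic reading near San Francisco becomes one named scalar stream:
print(metric_name(37.77, -122.42, "seismic"))
```

Each such name then identifies one scalar stream to feed into htmengine as its own metric.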
>> Now htmengine outputs anomaly scores for each Where:What label, and you
>> take these to hierarchically (in a geo-hierarchy) build logistic
>> regression models, trained on the anomaly output plus a binary value for
>> whether a certain disaster happened there at some later time X or not.
>> (This needs some past data, which is why the highest priority is getting
>> the data polled and htmengine trained.) You go for logistic regression
>> because that is what the literature finds to perform best. Once that
>> works, you have your 'live' data stream and get predictions in the form
>> of probabilities of the disaster occurring X time in the future...
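The second stage of that pipeline can be sketched as follows. The training data here is synthetic and the hand-rolled gradient descent simply stands in for whatever regression library a researcher would actually use; the real inputs would be historical per-sensor anomaly scores from htmengine with binary disaster labels:

```python
# Minimal logistic-regression sketch: inputs are per-region anomaly
# scores, target is a binary "disaster within time X" flag.
import numpy as np

def train_logreg(X, y, lr=0.5, steps=2000):
    """Fit weights w and bias b by gradient descent on the log-loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        grad = p - y                            # dLoss/dlogit for log-loss
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def predict_proba(X, w, b):
    """Probability of a disaster given a vector of anomaly scores."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

# Synthetic example: two anomaly-score features; disasters followed high scores.
X = np.array([[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.7]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b = train_logreg(X, y)
print(predict_proba(X, w, b))
```

The output probabilities are exactly the "likelihood of disaster X time in the future" Pascal describes, one model per cell in the geo-hierarchy.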
>>
>> That was the basic idea... of course you will need to test it, refine
>> the architecture, etc. But you've got your work-around :)
>>
>> So htmengine is not supposed to do the entire job; it's more for feature
>> detection :) The problem researchers find when building log-reg models on
>> real data (the raw scalars from the sensors) is that they periodically
>> make wrong predictions because of daily and other periodic patterns. This
>> is what HTM should filter out ;)
>>
>> The point of using Taurus as a starting point, therefore, is that you
>> already have the basic infrastructure of companies (your geo-polygons)
>> and different metrics (the different sensor data in that region)..
>>
>> Does it make more sense now? :) Of course a geo-encoder and so on would
>> be nice in addition, to capture more of the patterns, but that is what I
>> would hope to achieve with the geo-hierarchy of log-reg models, so that
>> they capture the spatial relationships in their input weights (of course
>> only based on historical data)... I do not think the geoEncoder would get
>> this as well. When running the demo_app, you find that geoencoding with
>> radius=magnitude (or any exponential function thereof) makes HTM immune
>> to regions where at least one strong quake happened... and you don't want
>> that.
>>
>> But David, you may want to think about building an engine for Java as
>> well :) Just because it's faster ;D
>>
>> _______________________________________________
>> nupic mailing list
>> [email protected]
>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>
>>
>
