By deleting those variables and running the script nupic_forwarder.py again
(I eventually figured out I needed to add --prefix=/path/to/rrddb), the
model ran over the data and produced results for each metric. My goal is to
set this up temporarily with cron, so that the RRA measures are taken every
15 seconds and a script runs the nupic model every 15 seconds. I have
pretty beefy servers, so hopefully CPU- and memory-intensive processes will
not be an issue. Ultimately my goal is to integrate this with kafka and
spark, so I will post about it if I can accomplish this.
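
Standard cron only schedules at one-minute granularity, so a 15-second
cadence probably means a long-running loop rather than a cron entry. Below
is a minimal sketch of such a loop, borrowing the ".pos" bookkeeping idea
from the grok forwarder described further down. It assumes Python 3 and
the metrics being appended to a CSV file; forward_new_rows, run_forever and
handle_row are hypothetical names of mine, not part of nupic.rogue, and
handle_row stands in for whatever feeds a row to the model.

```python
import csv
import os
import time


def read_position(pos_path):
    """Return the byte offset saved by the previous pass, or 0 if none."""
    try:
        with open(pos_path) as f:
            return int(f.read().strip() or 0)
    except (OSError, ValueError):
        return 0


def write_position(pos_path, offset):
    """Write the offset to a temp file, then swap it into place."""
    tmp_path = pos_path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(str(offset))
    os.replace(tmp_path, pos_path)


def forward_new_rows(csv_path, pos_path, handle_row):
    """Pass rows appended since the last call to handle_row; return the count."""
    offset = read_position(pos_path)
    count = 0
    with open(csv_path, "rb") as f:
        f.seek(offset)
        for raw in iter(f.readline, b""):
            if not raw.endswith(b"\n"):
                break  # incomplete trailing line; pick it up on the next pass
            offset += len(raw)
            fields = next(csv.reader([raw.decode("utf-8")]))
            if fields:  # skip blank lines
                handle_row(fields)
                count += 1
    write_position(pos_path, offset)
    return count


def run_forever(csv_path, pos_path, handle_row, interval=15.0):
    """Hypothetical driver: forward new rows every `interval` seconds."""
    while True:
        forward_new_rows(csv_path, pos_path, handle_row)
        time.sleep(interval)
```

The offset is only advanced past complete lines, so a row that is still
being written when a pass runs is picked up on the next pass instead of
being split.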

On Thu, Jun 4, 2015 at 2:01 PM, Michael Parco <[email protected]>
wrote:

> It appears there may be a few more keys that were either renamed or
> deleted from model_params -- randomSP gave me an error (I just deleted it
> and re-ran), as did useHighTier (I also just deleted it and re-ran). If they
> have been renamed, let me know. I looked through the post talking about
> renames, but did not see these particular variables.
>
> On Thu, Jun 4, 2015 at 12:08 PM, Austin Marshall <[email protected]>
> wrote:
>
>> Ah, yes!  We renamed some of the keys in model params in
>> https://github.com/numenta/nupic/pull/1872
>>
>> "coincInputPoolPct" is now "potentialPct", for example.
>>
>> I've updated master in nupic.rogue with the updated params in
>> https://github.com/numenta/nupic.rogue/pull/2/files and you should be
>> able to pull in the latest to fix your specific problem.
>>
>> On Thu, Jun 4, 2015 at 7:52 AM, Michael Parco <
>> [email protected]> wrote:
>>
>>> Austin this was a great rundown on the ins and outs of nupic rogue. I've
>>> done a lot of work with rrdtool and used such agents previously to stream
>>> metrics data collected by ganglia agents to some streaming analytics.
>>> Although I think rrdtool is great for temporary local storage, I am looking
>>> to possibly replace it with a different backend that I can better
>>> communicate with.
>>>
>>> I think the issue that I have seen thus far is that the nupic forwarder
>>> has been giving me errors when I attempt to forward data to it.
>>> "RuntimeError: Unknown parameter 'coincInputPoolPct' for region 'SP' of
>>> type 'py.SPRegion'" and then gives me a list of valid parameters. This
>>> seems to be an error from nupic within python itself and not anything to do
>>> with nupic.rogue.
>>>
>>> On Wed, Jun 3, 2015 at 11:12 PM, Austin Marshall <[email protected]>
>>> wrote:
>>>
>>>> There are a few options, depending on what you’re trying to do.
>>>>
>>>> The application is structured as a set of agents (
>>>> https://github.com/numenta/nupic.rogue/blob/master/avogadro/__init__.py#L39-L60)
>>>> that run constantly in the background, periodically poll for metrics, and
>>>> save them to a local rrdtool database.  rrdtool is essentially a flat
>>>> file-based time series database with some interesting properties.  In the
>>>> case of nupic.rogue, it’s used as a buffer between the agents and either
>>>> a grok instance (
>>>> https://aws.amazon.com/marketplace/pp/B00I18SNQ6/ref=srh_res_product_title?ie=UTF8&sr=0-2&qid=1433386525659)
>>>> if using the grok forwarder, or a nupic model if using the nupic forwarder.
>>>> Then, the data is forwarded for analysis in a separate process.
>>>>
>>>> There is one major difference between the grok forwarder and nupic
>>>> forwarder: the grok forwarder is meant to be run regularly with cron.  The
>>>> grok forwarder maintains a set of “.pos” files to keep track of position
>>>> between runs so that it can send everything since the last run to a running
>>>> grok instance.  The nupic forwarder has no such bookkeeping and sends the
>>>> entire batch to a freshly created model in a one-off sort of way, and saves
>>>> the results to a csv file locally.
>>>>
>>>> If you want to set it up in a streaming fashion, imagine replacing the
>>>> rrdtool component with some sort of queue implementation (say, rabbitmq or
>>>> redis pubsub).  You could even do simple communication over a socket.  The
>>>> code is structured such that one back end can be swapped out for another
>>>> (there only happens to be one back end right now: “rrdtool”).  For
>>>> example, each of the agents is a subclass of AvogadroAgent, which itself
>>>> is a subclass of RRDToolClient.  You could create an alternate back end
>>>> implementation that writes to a queue, and change AvogadroAgent to be a
>>>> subclass of your new class rather than RRDToolClient.  If that’s what you’d
>>>> like to do, I suggest starting with a copy of
>>>> https://github.com/numenta/nupic.rogue/blob/master/avogadro/rrdtool.py,
>>>> removing the methods prefixed with “_”, and re-implementing __init__(),
>>>> createParams(), addParseOptions(), and store() to suit your needs.  Then,
>>>> you need only write a simple script which reads from the queue and feeds
>>>> the samples to a model you’ve created.
>>>>
>>>> You could also create a new forwarder which is sort of a hybrid between
>>>> the grok and nupic forwarders.  For example, use the “.pos” file approach
>>>> of grok forwarder, and keep the script running rather than scheduled by
>>>> cron periodically.
>>>>
>>>> On Jun 3, 2015, at 2:28 PM, Michael Parco <[email protected]>
>>>> wrote:
>>>>
>>>> I am attempting to set up a system similar to the way grok would operate,
>>>> using nupic algorithms. So far I have cloned the nupic.rogue repository
>>>> from GitHub and built nupic.rogue using the Python setup scripts. I have
>>>> also built nupic and run some of the test examples such as hot gym and
>>>> cpu predictions.
>>>>
>>>> I have the nupic.rogue agent running and collecting cpu, I/O, network,
>>>> and memory data as rrds, and I am able to execute rogue-export
>>>> --prefix=var/db to obtain the .csv conversions of the rrd files. My next
>>>> step is to feed the .csv files or .rrd files into a nupic (HTM) model to
>>>> produce predictions and anomaly scores. Ideally I would like to set this
>>>> up in a streaming fashion on a local box. I came across the
>>>> nupic_forwarder.py script within nupic.rogue, but I have been unable to
>>>> feed in the collected data... any ideas?
>>>>
>>>>
>>>>
>>>
>>
>
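
For reference, the queue-backed back end suggested in the quoted thread
(replace rrdtool with a queue, then have a small script read from the queue
and feed a model) might be sketched roughly as follows. This is only an
illustration under assumptions: QueueClient, consume, demo and the
store(value, ts) signature are hypothetical and not taken from
nupic.rogue, and Python's in-process queue.Queue stands in for rabbitmq or
redis pubsub. feed_model is a placeholder for whatever passes a sample to
the model.

```python
import queue
import threading
import time


class QueueClient(object):
    """Hypothetical stand-in for a back end; not part of nupic.rogue."""

    def __init__(self, name, sink):
        self.name = name  # metric name, e.g. "cpu"
        self.sink = sink  # any object with a put() method

    def store(self, value, ts=None):
        """Publish one sample to the queue instead of writing it to an rrd."""
        if ts is None:
            ts = time.time()
        self.sink.put((self.name, ts, value))


STOP = object()  # sentinel that tells the consumer to exit


def consume(source, feed_model):
    """Take samples off the queue and hand each one to feed_model."""
    while True:
        item = source.get()
        if item is STOP:
            break
        feed_model(*item)


def demo():
    """Push one sample through the queue and return what the consumer saw."""
    q = queue.Queue()
    seen = []
    worker = threading.Thread(
        target=consume,
        args=(q, lambda name, ts, value: seen.append((name, ts, value))),
    )
    worker.start()
    QueueClient("cpu", q).store(0.42, ts=1433386525)
    q.put(STOP)
    worker.join()
    return seen
```

In a real setup the agent and the consumer would be separate processes, so
the in-process queue would have to be swapped for a network-facing one.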
