There are a few options, depending on what you’re trying to do. 

The application is structured as a set of agents 
(https://github.com/numenta/nupic.rogue/blob/master/avogadro/__init__.py#L39-L60) 
that run constantly in the background, periodically poll for metrics, and 
save them to a local rrdtool database.  rrdtool is essentially a flat, 
file-based time series database with some interesting properties.  In the 
case of nupic.rogue, it’s used as a buffer in front of either a grok instance 
(https://aws.amazon.com/marketplace/pp/B00I18SNQ6/ref=srh_res_product_title?ie=UTF8&sr=0-2&qid=1433386525659) 
if using the grok forwarder, or a nupic model if using the nupic forwarder.  
The data is then forwarded for analysis in a separate process.
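To make the poll-and-store pattern concrete, here is a minimal sketch of the loop each agent runs.  The class and method names here are illustrative, not the real nupic.rogue interface (the real agents subclass AvogadroAgent and collect actual system metrics):

```python
import time


class MetricAgent:
    """Hypothetical sketch of a background polling agent.

    The real agents in nupic.rogue subclass AvogadroAgent and write to an
    rrdtool database; here the storage back end is just a callable.
    """

    def __init__(self, store, interval=5):
        self.store = store        # callable(timestamp, value) -> None
        self.interval = interval  # seconds between polls

    def collect(self):
        # A real agent would read CPU, memory, network, etc.
        # Return a dummy value for illustration.
        return 0.0

    def run(self, iterations):
        # Poll the metric and hand each sample to the storage back end.
        for _ in range(iterations):
            self.store(time.time(), self.collect())
            time.sleep(self.interval)
```

A real agent would run this loop indefinitely; `iterations` is only there to make the sketch testable.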

There is one major difference between the grok forwarder and the nupic 
forwarder: the grok forwarder is meant to be run regularly with cron.  It 
maintains a set of “.pos” files to keep track of its position between runs, 
so that each run sends everything since the last run to a running grok 
instance.  The nupic forwarder has no such bookkeeping; it sends the entire 
batch to a freshly created model in a one-off fashion and saves the results 
to a csv file locally.
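The “.pos” bookkeeping amounts to remembering how far into the data the previous run got.  A rough sketch of the idea (the function names and the file format here are guesses for illustration, not what the grok forwarder actually does internally):

```python
import os


def read_pos(pos_path):
    # Read the position saved by the previous run; 0 if this is the first run.
    if os.path.exists(pos_path):
        with open(pos_path) as f:
            return int(f.read().strip() or 0)
    return 0


def forward_new_rows(rows, pos_path, send):
    """Send only the rows that appeared since the last run, then record
    the new position so the next run picks up where this one left off."""
    start = read_pos(pos_path)
    for row in rows[start:]:
        send(row)
    with open(pos_path, "w") as f:
        f.write(str(len(rows)))
```

Run from cron, each invocation forwards only the new samples, which is what lets the grok forwarder feed a continuously running grok instance.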

If you want to set it up in a streaming fashion, imagine replacing the rrdtool 
component with some sort of queue implementation (say, rabbitmq or redis 
pubsub).  You could even do simple communication over a socket.  The code is 
structured such that one back end can be swapped out for another (there only 
happens to be one back end right now — “rrdtool”).  For example, each of the 
agents is a subclass of AvogadroAgent, which itself is a subclass of 
RRDToolClient.  You could create an alternate back end implementation that 
writes to a queue, and change AvogadroAgent to be a subclass of your new class 
rather than RRDToolClient.  If that’s what you’d like to do, I suggest starting 
with a copy of 
https://github.com/numenta/nupic.rogue/blob/master/avogadro/rrdtool.py, 
removing the methods prefixed with “_”, and re-implementing __init__(), 
createParams(), addParseOptions(), and store() to suit your needs.  Then, you 
need only write a simple script which reads from the queue and feeds the 
samples to a model you’ve created.
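As a starting point, here is a rough sketch of what a queue-backed replacement for RRDToolClient might look like.  The four method names mirror those in avogadro/rrdtool.py, but the signatures here are simplified guesses rather than the real interface, and an in-process queue stands in for rabbitmq/redis:

```python
import json
import queue


class QueueClient:
    """Hypothetical back end that pushes samples onto a queue instead of
    writing to an rrdtool database.  Signatures are illustrative only."""

    def __init__(self, options=None):
        self.options = options
        self.queue = queue.Queue()  # stand-in for rabbitmq/redis pubsub

    @classmethod
    def addParseOptions(cls, parser):
        # Register any command-line options your back end needs.
        pass

    def createParams(self):
        # rrdtool needs database-creation parameters; a queue needs none.
        return None

    def store(self, value):
        # Instead of an rrdtool update, serialize the sample onto the queue.
        self.queue.put(json.dumps({"value": value}))
```

The consumer side is then just a loop that pops samples off the queue and feeds them to the model, computing predictions and anomaly scores as they arrive.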

You could also create a new forwarder which is sort of a hybrid between the 
grok and nupic forwarders.  For example, use the “.pos” file approach of the 
grok forwarder, but keep the script running in a loop rather than scheduling 
it periodically with cron.
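The hybrid idea can be sketched as a single long-running loop.  Everything here is hypothetical scaffolding (the callbacks for fetching data, sending samples, and tracking position are placeholders you would wire up to rrdtool, your model, and a .pos file):

```python
import time


def run_forwarder(fetch_since, send, get_pos, set_pos,
                  interval=60, iterations=None):
    """Hypothetical hybrid forwarder: grok-style position tracking,
    but in a long-running loop instead of separate cron runs.

    fetch_since(pos) -> (samples, new_pos): new samples since `pos`.
    send(sample): hand one sample to the model or downstream service.
    get_pos()/set_pos(n): read/write the persisted position (.pos file).
    """
    n = 0
    while iterations is None or n < iterations:
        pos = get_pos()
        samples, new_pos = fetch_since(pos)
        for s in samples:
            send(s)
        set_pos(new_pos)
        time.sleep(interval)
        n += 1
```

With `iterations=None` the loop runs forever, which is the streaming behavior you’d want on a local box; the parameter exists only so the sketch can terminate in a test.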

> On Jun 3, 2015, at 2:28 PM, Michael Parco <[email protected]> 
> wrote:
> 
> I am attempting to setup a similar system to the way grok would operate using 
> nupic algorithms. Currently I git cloned the nupic.rogue github and built 
> nupic.rogue using the setup python scripts. I have also built nupic and run 
> some of the test examples such as hot gym and cpu predictions. 
> 
> I have the nupic.rogue agent running and collecting cpu I/O, network, memory 
> data as rrds and I am able to execute rogue-export --prefix=var/db to obtain 
> the .csv conversions of the rrd files. My next step is to feed the .csv files 
> or .rrd files in an nupic model running the HTM model to run for predictions 
> and anomaly scores. Ideally I would like to set this up in a streaming 
> fashion on a local box. I came across the nupic_forwarder.py script within 
> nupic.rogue, but I have been unable to feed in the collected data... any 
> ideas?
