Hi,

I've been thinking about implementing a RethinkDB processor because I need
one for my project. Right now my code runs inside an ExecuteScript
processor, so it opens a new database connection for every document it
inserts, which is rather inefficient (I believe). The best I can get is
about 90 inserted documents per second. Also, it seems that I can't
increase the number of concurrent tasks on this processor.

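Here's my test code for reference (Python):
import rethinkdb as r

# A new connection is opened on every run of the script
r.connect('<myhost>', 28015).repl()
r.table('tv_shows').insert({'name': 'Star Trek TNG'}).run(durability="soft", noreply=True)

flowFile = session.get()
# session.get() returns None when no flowfile is queued
if flowFile is not None:
    session.transfer(flowFile, REL_SUCCESS)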

I have been thinking of doing an implementation similar to PutMongo. I see
it has a method annotated with @OnScheduled that connects to the database.
Does that method run every time a flowfile arrives, or is it run less often
(for example, only once when the processor starts)? Also, instead of going
the long way and building a NAR, can I use InvokeScriptedProcessor together
with the @OnScheduled annotation?
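
To make the question concrete, this is roughly the connect-once behaviour
I'm hoping to get. It's only a sketch under my own assumptions:
LazyConnection is a name I made up (it is not a NiFi or RethinkDB class),
and the connect argument stands in for something like
lambda: r.connect('<myhost>', 28015).

```python
import threading


class LazyConnection(object):
    """Hypothetical helper: open a connection on first use, then reuse it."""

    def __init__(self, connect):
        # `connect` is any zero-argument callable that returns a connection
        # object with a close() method (an assumption for this sketch).
        self._connect = connect
        self._conn = None
        self._lock = threading.Lock()

    def get(self):
        # The lock keeps concurrent callers from each opening a connection.
        with self._lock:
            if self._conn is None:
                self._conn = self._connect()
            return self._conn

    def close(self):
        # Close and forget the connection so the next get() reconnects.
        with self._lock:
            if self._conn is not None:
                self._conn.close()
                self._conn = None
```

The idea would be to call get() for each flowfile and close() when the
processor stops, but I don't know yet where those hooks live in a scripted
processor.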

Finally, I'm running into PermGen space issues quite quickly. Is that
expected?

Thanks,
Stephane
