Re: Nifi to Titan... How?

Matt Burgess Fri, 20 May 2016 20:21:19 -0700

Pat,

No worries, this discussion is relevant to the devs group :) I
appreciate and share your interest in getting connected data into
graph databases in order to get more insight out of the data.

It may be possible to put the data directly into the backing store
(Cassandra, for example) via NiFi, but in my experience that can be a
fragile and possibly non-future-proof solution. IMHO you're on the
right track with respect to taking graph-ready data and putting it
directly into a graph DB like Titan.

To that end, I think we need a PutToGraphDatabase processor (some
email threads on this list and/or the users list already mention the
idea). In my mind, this processor takes GraphSON (or some other
supported format) and writes it to a TinkerPop-enabled graph DB
(Titan, Neo4j, etc.). Conversion of input data should be possible with
existing processors in NiFi, and such a Put processor would allow the
user to pick the destination DB (e.g. Titan, Neo4j, Sail, OrientDB).
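
To make the "graph-ready data" idea concrete, here is a minimal Python
sketch of the conversion step that would sit upstream of such a Put
processor. It is not NiFi or TinkerPop code and uses only the standard
library. The function name to_graphson_like, the "page" label, the
"linksTo" edge label, and the example URLs are all hypothetical, and
the output only loosely follows the adjacency-list shape of GraphSON
(one vertex per line with its outgoing edges); check the TinkerPop IO
documentation for the exact format before relying on it.

```python
import itertools
import json


def to_graphson_like(pages):
    """Turn {url: [linked urls]} into one JSON vertex per line.

    The layout (id / label / properties / outE) is an assumption that only
    approximates GraphSON's adjacency-list form; it is not a spec-exact writer.
    """
    # Give every URL an integer id, whether it is a source page or only a target.
    ids = {}
    for url, links in pages.items():
        for u in [url, *links]:
            ids.setdefault(u, len(ids) + 1)

    edge_ids = itertools.count(1)   # hypothetical edge id scheme
    prop_ids = itertools.count(1)   # hypothetical property id scheme
    lines = []
    for url, vid in ids.items():
        vertex = {
            "id": vid,
            "label": "page",
            "properties": {"url": [{"id": next(prop_ids), "value": url}]},
        }
        # Outgoing edges point at the id of the linked page.
        out_edges = [
            {"id": next(edge_ids), "inV": ids[target]}
            for target in pages.get(url, [])
        ]
        if out_edges:
            vertex["outE"] = {"linksTo": out_edges}
        lines.append(json.dumps(vertex, sort_keys=True))
    return lines


if __name__ == "__main__":
    scraped = {
        "http://a.example": ["http://b.example"],
        "http://b.example": [],
    }
    for line in to_graphson_like(scraped):
        print(line)
```

In a flow, each emitted line (or the whole batch) could become the
content of a flow file handed to the Put processor, which keeps the
conversion logic separate from the choice of destination DB.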

Querying existing graphs is a different animal; in fact, it's so
complex that a DSL like Gremlin is likely the best play (as mentioned
in the thread), but certainly a processor that hides the scaffolding
would be helpful. Maybe we can get a graph bundle with
PutToGraphDatabase and QueryGraphDatabase processors. If you agree
(partially or wholly), please feel free to log Jira(s) to add such
things :)

As it turns out, I may have some free time this weekend (thanks
Grandma for watching the kids!). I've wanted the PutToGraphDatabase
processor for a while, as well as a TinkerPop-enabled Site-to-Site
client that ingests NiFi flow files (containing graph-ready data) and
writes them to graph databases. Stay tuned to this list; if I get
something useful I will be sure to share it. I would also be very
appreciative of any guidance, suggestions, and review you'd like to
share :)

Cheers,
Matt

On Fri, May 20, 2016 at 10:02 PM, Pat Trainor <[email protected]> wrote:
> Joe Witt & Andre,
>
> You guys are very nice, but I thought this was the only mailing list for
> NiFi... I now realize that there is a users list, where this kind of
> question would be more appropriate.
>
> Thanks for not tazing me... :)
>
> As for the great replies:
>
> Joe,
>
> I want the output to be used in Titan for analysis (which is why I figured
> a NiFi->Titan connection was needed), but I didn't think of going right to
> Titan's backing DB store. I'm wondering now if doing so will still allow
> Titan to do its thing if graphs are created/modified in this fashion.
>
> I'm only into Titan/Cassandra for a month or so now, starting off on
> Hadoop. I will go on the user list and see if anyone has taken this
> approach, and I'll try to figure out which connector (processor) would
> update/insert into Cassandra.
>
> Very insightful!
>
> Andre,
>
> This is promising, but running scripts is a route I really don't want
> to go down. I like the speed/performance of code that stays in memory (like a
> daemon), instead of being loaded off the HD every time it's needed. That would
> seem to me to be a limiting factor for scaling.
>
> Also, I'm really trying to not add anything more to the zoo I have now. If
> it is inevitable, so be it. I can write code in Java that talks to the
> tinkerpop3 stack and I can see the data in cassandra, but again, I have to
> load up the java program each time it is run.
>
> If a Processor in NiFi could just convert to the needed format/syntax and
> act against either Titan, Cassandra, or an existing, running component
> directly, it would make the flow of data very fast (in my mind).
>
> I need to figure out what other users are doing, and I will do that on the
> user's list...
> Maybe like Joe said, I can go from Nifi processor directly to Cassandra...
> Sounds very interesting...
>
> Thanks again for putting up with a non-dev question, folks!
>
> pat
> :)
>
> Thanks!
>
> pat <http://about.me/PatTrainor>
> ( ͡° ͜ʖ ͡°)
>
> LinkedIn <http://LinkedIn.com/in/PatTrainor>
> Hobby/Fun Blog <http://blog.atcp.us>
> Sales Engineering University <http://seu.atcp.us>
>
>
>
> "A wise man can learn more from a foolish *question *than a fool can learn
> from a wise *answer*". ~ Bruce Lee.
>
> On Fri, May 20, 2016 at 12:22 AM, Joe Witt <[email protected]> wrote:
>
>> Pat
>>
>> It looks like Titan can be backed by Apache Cassandra or Apache HBase.
>> NiFi can deliver to both of those.  Would that take care of what
>> you're looking to do?
>>
>> Thanks
>> Joe
>>
>> On Thu, May 19, 2016 at 11:16 PM, Pat Trainor <[email protected]>
>> wrote:
>> > Thanks to everyone making this now open sourced tool awesome.
>> >
>> > I can't find anything that directly links the 2.
>> >
>> > I want to use Nifi to coordinate page scraping, parsing, and finally throw
>> > Titan data for graphs.
>> >
>> > Do I have to use Kafka or Spark for this?
>> >
>> > I'm looking at the output mechanisms (Processors), and I'm not a JSON
>> > expert, but I can write anything needed in Java...
>> >
>> > I kind of like the elegance of Titan on Cassandra, and am reticent to add
>> > more animals to my little ark!
>> >
>> > Just looking for pointers to the tech that would fit neatly...
>> >
>> > Thanks in advance for your insights!
>> >
>> > pat <http://about.me/PatTrainor>
>> > ( ͡° ͜ʖ ͡°)
>>
