Pat,

No worries, this discussion is relevant to the devs group :) I appreciate and share your interest in getting connected data into graph databases in order to get more insight out of the data.

It may be possible to put the data directly into the backing store (e.g. Cassandra) via NiFi, but in my experience that can be a fragile and possibly non-future-proof solution. IMHO you're on the right track with respect to taking graph-ready data and putting it directly into a graph DB like Titan.

To that end (and there are some email threads on this list and/or the users list that mention it), I think we need a PutToGraphDatabase processor. In my mind, this processor takes GraphSON (or some other supported format) and writes it to a Tinkerpop graph DB (to include Titan, Neo4J, etc.). Conversion of input data should be possible with existing processors in NiFi, and such a Put processor would allow the user to pick the destination DB (e.g. Titan, Neo4J, Sail, OrientDB).

Querying existing graphs is a different animal; in fact, it's so complex that a DSL like Gremlin is likely the best play (as mentioned in the thread), but certainly a processor that hides the scaffolding would be helpful. Maybe we can get a graph bundle with PutToGraphDatabase and QueryGraphDatabase processors. If you agree (partially or wholly), please feel free to log Jira(s) to add such things :)

As it turns out, I may have some free time this weekend (thanks Grandma for watching the kids!). I've wanted the PutToGraphDB processor for a while, as well as a Tinkerpop-enabled Site-to-Site client that ingests NiFi flow files (containing graph-ready data) and writes them to graph databases. Stay tuned to this list; if I get something useful I will be sure to share it. Also, I would be very appreciative of any guidance, suggestions, and review you'd like to share :)

Cheers,
Matt

On Fri, May 20, 2016 at 10:02 PM, Pat Trainor <[email protected]> wrote:

> Joe Witt & Andre,
>
> You guys are very nice, but I thought this was the only mailing list for
> Nifi... I now realize that there is a user's list, where this kind of
> question would be more appropriate.
>
> Thanks for not tazing me...
> :)
>
> As for the great replies:
>
> Joe,
>
> I want the output to be used in Titan for analysis (which is why I figured
> a Nifi->Titan connection was needed), but I didn't think of going right to
> Titan's DB store. I'm wondering now if doing so will allow Titan to
> do its thing if graphs are created/modified in this fashion.
>
> I'm only into Titan/Cassandra for a month or so now, starting off on
> Hadoop. I will go on the user list and see if anyone has taken this
> approach, and I'll try to figure out which connector (processor) would
> update/insert into Cassandra.
>
> Very insightful!
>
> Andre,
>
> This is promising, but running scripts is a route I really don't want
> to take. I like the speed/performance of code that stays in memory (like a
> daemon), instead of being loaded off the HD every time it's needed. That
> would seem to me to be a limiting factor for scaling.
>
> Also, I'm really trying to not add anything more to the zoo I have now. If
> it is inevitable, so be it. I can write code in Java that talks to the
> tinkerpop3 stack and I can see the data in cassandra, but again, I have to
> load up the java program each time it is run.
>
> If a Processor in Nifi could just convert to the needed format/syntax and
> act against either Titan, Cassandra, or an existing, running component
> directly, it would make the flow of data very fast (in my mind).
>
> I need to figure out what other users are doing, and I will do that on the
> user's list...
> Maybe like Joe said, I can go from Nifi processor directly to Cassandra...
> Sounds very interesting...
>
> Thanks again for putting up with a non-dev question, folks!
>
> pat
> :)
>
> Thanks!
>
> pat <http://about.me/PatTrainor>
> ( ͡° ͜ʖ ͡°)
>
> LinkedIn <http://LinkedIn.com/in/PatTrainor>
> Hobby/Fun Blog <http://blog.atcp.us>
> Sales Engineering University <http://seu.atcp.us>
>
> "A wise man can learn more from a foolish *question* than a fool can learn
> from a wise *answer*". ~ Bruce Lee.
>
> On Fri, May 20, 2016 at 12:22 AM, Joe Witt <[email protected]> wrote:
>
>> Pat
>>
>> It looks like Titan can be backed by Apache Cassandra or Apache HBase.
>> NiFi can deliver to both of those. Would that take care of what
>> you're looking to do?
>>
>> Thanks
>> Joe
>>
>> On Thu, May 19, 2016 at 11:16 PM, Pat Trainor <[email protected]>
>> wrote:
>> > Thanks to everyone making this now open sourced tool awesome.
>> >
>> > I can't find anything that directly links the 2.
>> >
>> > I want to use Nifi to coordinate page scraping, parsing, and finally
>> > throw Titan data for graphs.
>> >
>> > Do I have to use Kafka or Spark for this?
>> >
>> > I'm looking at the output mechanisms (Processors), and I'm not a JSON
>> > expert, but I can write anything needed in Java...
>> >
>> > I kind of like the elegance of Titan on Cassandra, and am reticent to add
>> > more animals to my little ark!
>> >
>> > Just looking for pointers to the tech that would fit neatly...
>> >
>> > Thanks in advance for your insights!
>> >
>> > pat <http://about.me/PatTrainor>
>> > ( ͡° ͜ʖ ͡°)
>>
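
To make the "graph-ready data" idea in this thread a bit more concrete, here is a minimal sketch of the conversion step that would sit between page scraping and a Put-style graph processor. Everything in it is an assumption for illustration: the function name `pages_to_graph`, the `page` and `links_to` labels, and the record layout are hypothetical, and the output is a simplified vertex/edge JSON document, not the actual GraphSON schema. It uses only the Python standard library and does not talk to NiFi, Titan, or Tinkerpop.

```python
import json


def pages_to_graph(pages):
    """Turn {url: [linked_url, ...]} into vertex and edge records.

    The record layout below is a made-up illustration of "graph-ready"
    data; it is NOT the GraphSON schema.
    """
    ids = {}  # url -> numeric vertex id, assigned in first-seen order
    vertices, edges = [], []

    def vertex_id(url):
        # Create the vertex the first time a URL is seen, then reuse its id.
        if url not in ids:
            ids[url] = len(ids) + 1
            vertices.append({"id": ids[url], "label": "page", "url": url})
        return ids[url]

    for source, targets in pages.items():
        src = vertex_id(source)
        for target in targets:
            # One directed edge per hyperlink found on the source page.
            edges.append(
                {"outV": src, "inV": vertex_id(target), "label": "links_to"}
            )
    return {"vertices": vertices, "edges": edges}


# Hypothetical scraper output: each page mapped to the pages it links to.
scraped = {
    "http://example.com/a": ["http://example.com/b", "http://example.com/c"],
    "http://example.com/b": ["http://example.com/a"],
}
graph = pages_to_graph(scraped)
print(json.dumps(graph, indent=2))
```

A document shaped like this could travel as the content of a flow file. A real processor would still need to map it onto whatever format the target graph DB actually accepts.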
