Sure - just link back to the article so folks can get at the full thing. Interesting stuff you're working on. Thanks for sharing.

On Wed, Oct 28, 2015 at 7:59 PM, Ran Magen <[email protected]> wrote:

> I love the logo :)
>
> Nice article. Stephen, is there any chance I could steal some of the content for the documentation? A lot of it is relevant to unipop.
>
> Thanks for taking the time to respond, I'll keep you guys updated. BTW if anyone wants to help they're welcome :)
>
> Cheers,
> Ran
>
> On Thu, 29 Oct 2015 at 0:54 Marko Rodriguez <[email protected]> wrote:
>
> > Also, if you don't like my logo pull request, you don't have to keep it :). It won't hurt my feelings.
> >
> > Marko.
> >
> > http://markorodriguez.com
> >
> > On Oct 28, 2015, at 4:50 PM, Marko Rodriguez <[email protected]> wrote:
> >
> > > Hello Ran,
> > >
> > > Thank you for detailing your work.
> > >
> > > I was just looking over Apache Drill. That looks like a really cool (complicated) project.
> > >
> > > It sounds like Unipop has its work cut out for it. However, as you say, if you can abstract away the database layer like Titan does, then you will be in luck.
> > >
> > > I understand why you didn't choose Sqlg. Sqlg is starting with a "blank" database and enforcing a graph schema on it. Unipop is starting with an existing database and allowing the user to query it like a graph.
> > >
> > > This could really help a lot of people in the area of master data management. I have worked on many projects that have MongoDB, SQLServer, Cassandra, Voldemort, etc. all under the same roof. If you could "just use Gremlin" to manipulate that data, life would be much easier.
> > >
> > > I have shared this before on this list, but perhaps you missed it. Stephen Mallette wrote this article a while back that may provide some inspiration.
> > >
> > > http://thinkaurelius.com/2013/02/04/polyglot-persistence-and-query-with-gremlin/
> > >
> > > Anywho, keep up the good work and when you have things working, we can help you promote your project.
> > >
> > > Take care,
> > > Marko.
> > >
> > > http://markorodriguez.com
> > >
> > > On Oct 28, 2015, at 3:43 PM, Ran Magen <[email protected]> wrote:
> > >
> > >> Awesome, thanks Marko!
> > >>
> > >> Good point, I'll try to explain my reasoning behind not using Sqlg (even though it seems like a great project on its own). I'd be happy to receive any feedback on it.
> > >>
> > >> First, a bit about the motivation behind Unipop...
> > >> Unipop is meant to be a DAL on top of any databases of your choice. The philosophy being that these days many organizations (like the one I work for) have a lot of different "kinds" of data, spread throughout many specialized data stores (RDBMS / DocumentStore, etc.)
> > >>
> > >> What we wanted to make is a DAL that'll enable us to query all our different data stores and hundreds of different schemas, including the relationships between the data, in one simple interface.
> > >>
> > >> There are some projects that try to do the same thing (Drill <http://drill.apache.org/>, Calcite <https://calcite.incubator.apache.org/>, Dremel <http://research.google.com/pubs/pub36632.html>), but they use SQL as the "unified" query language. We figured that in a schema with many connections, a property-graph representation would be better than a relational model (trying to avoid "JOIN hell"). So we decided to implement a Calcite-like application using Gremlin - Unipop.
> > >>
> > >> On the issue of using Sqlg, there were a few design decisions we made in Unipop that seemed to go against it:
> > >>
> > >> 1. The graph ontology should not be dependent on the underlying schemas.
> > >>    One could choose to represent a table in a database as a vertex, or as a vertex + edge (represented by some FK column). You might even choose to make a "virtual" vertex (let's say an 'email-address' vertex) that isn't represented anywhere physically, but is used as a connection-point between other vertices in our ontology (e.g. the user's posts, each stored as a document in elasticsearch). Basically, we shouldn't bind the design of our "user-facing" ontology to the design of our optimized data store schemas.
> > >>    - OTOH, in Sqlg the schema is (understandably) mapped directly to the graph ontology <http://umlg.org/sqlg.html> (take a look at the Architecture section.)
> > >> 2. We must be able to query multiple different data stores in the same traversal, and even in the same step. Practically that meant that instead of implementing the process package (Steps, Strategies, etc.) for each data-store, we made one implementation that coordinates the different Controllers (elastic, jdbc, etc).
> > >>    - Before starting the work on the jdbc package I scanned through the sqlg code, and (again, understandably) the code seemed heavily dependent on the process package.
> > >> 3. Translating Gremlin's in/out steps to JOIN statements is a big pain. It's probably the hardest part about creating an SQL implementation. We figured that for Unipop we'd just bypass that problem, create the JOINs we needed as views in the DB, and simply map those views to the vertices & edges to which they correspond in the graph ontology. (This explanation might not be too clear, I can expand on it if anyone's interested).
> > >>
> > >> The reason for going into these details is that I'd be happy to get a second opinion from you guys, about using Sqlg in particular, and about the design decisions in general.
> > >>
> > >> BTW, the same points are probably relevant in regards to using Titan's Cassandra/HBase/etc connectors.
> > >>
> > >> Thanks,
> > >> Ran
> > >>
> > >> On Tue, 27 Oct 2015 at 16:58 Marko Rodriguez <[email protected]> wrote:
> > >>
> > >>> Hi Ran,
> > >>>
> > >>> I just submitted a PR to your Unipop project.
> > >>>
> > >>> https://github.com/rmagen/unipop/pull/3
> > >>>
> > >>> However, while cruising around, I noticed your unipop-jdbc/ package. Why not just use Pieter Martin's Sqlg project for JDBC/TinkerPop?
> > >>>
> > >>> https://github.com/pietermartin/sqlg
> > >>>
> > >>> Perhaps I don't understand the purpose of your package… just a random thought.
> > >>>
> > >>> Thanks,
> > >>> Marko.
> > >>>
> > >>> http://markorodriguez.com
