Frankly, I'm not sure the author understands the real issues. Pretty much everything he says about the limits of SQL is not strictly true. The issue here is not what kinds of models you can create in a relational database or a graph database. If you know what you are doing, you can use either one to do the trick. The question, which he sort of hints at, is how efficiently you can make that model perform in a particular database. You can certainly create meta models and abstract models in relational databases (it's not necessarily easy, but unfortunately abstraction seems to be the main issue that separates the real modelers from people who may just never get it). However, if you succeed in getting such a model into a relational database, then what? The relational model requires many joins to get from the abstract model to the concrete implementations. Caching will solve that (for both technologies), but to quote Tim Bray quoting Phil Karlton: "There are only two hard things in Computer Science: cache invalidation and naming things." (Both are applicable here, since half the problem of abstraction is naming....) In the end, if this stuff were easy we wouldn't have jobs; you've got to know how to match the technology to the problems, but make sure the problem requirements drive the choice of technology, not the other way around.
More to the point, NoSQL is not going to displace relational, but NoSQL, and Neo4j in particular, will solve problems relational is not good at. Is traversal one of those areas? It depends on the model, but for large data sets with many loosely coupled attributes, graph databases are probably going to be the better match. Are large data sets with many loosely coupled attributes one of those areas in general? Not necessarily: there are many good technologies (not necessarily pure relational) for handling sparse data sets, so depending on the problem at hand they may be a better solution. If the problem isn't traversal, and maybe doesn't require near-real-time answers, then you may have a more standard data warehouse type of issue at hand, and maybe you should look at those technologies....

There's a whole slew of ways of attacking the issues raised in the article, and it seems to ignore most of them. It ends up seeming mostly an argument for NoSQL technologies, but the rationale lined up to support the argument just doesn't make it in my book, nor does the little rant against Neo4j: the "Node" is a way to access the graph, just like the "ResultSet" (pick your class) is the way to access the set (of sets) in a relational database. Separating that level of "physical" from the "logical" isn't the issue at the point where you are doing data access. As others have pointed out, you can layer on mapping tools if you want to do that, but then of course you've got to deal with the performance issues and the limitations of the mappings, and sometimes (usually, in my experience) that's a bigger problem than mixing a physical model with a logical model, at least if you contain that to some small portion of your application.

Peter Hunsberger

On Fri, Oct 14, 2011 at 7:48 AM, Tobias Ivarsson <[email protected]> wrote:
> We had an interesting discussion about this internally at Neo Technology today. We thought it might be of interest to the broader community.
> I don't think the discussion is over, so it would be interesting to continue it on the public mailing list.
>
> It regards the initial paragraphs of an article posted to dzone recently:
> http://www.dzone.com/links/rss/the_coming_sql_collapse.html
> It mentions Neo4j and how the author dislikes a common way of using Neo4j for building applications.
>
> It would be interesting to hear suggestions on how to improve this.
>
> Forwarded conversation follows:
>
> On Fri, Oct 14, 2011 at 10:13 AM, Tobias Ivarsson <[email protected]> wrote:
>
> > I found this while reading feeds in bed last night:
> >
> > *The Coming SQL Collapse*
> > http://www.dzone.com/links/rss/the_coming_sql_collapse.html
> >
> > (Sent from Flipboard <http://flipboard.com>)
> >
> > The things he says about SQL vs NoSQL are not very interesting, but I'd like to raise what he says about Neo4j:
> >
> >> I looked at neo4j briefly the other day, and quite predictably thought 'wow, this looks like a serious tinkertoy: it's basically a bunch of nodes where you just blob your attributes.' Worse than that, to wrap objects around it, you have to have them explicitly incorporate their node class, which is ugly, smelly, violates every law of separation of concerns and logical vs. physical models. On the plus side, as I started to look at it more, I realized that it was the perfect way to implement a backend for a bayesian inference engine (more on that later). Why? Because inference doesn't care particularly about all the droll requirements that are settled for you by SQL, and there are no real set operations to speak of.
> >
> > He attacks our pattern of building domain models with Neo4j, calling it "ugly", "smelly" and "in violation of every law of separation of concerns and logical vs. physical models". Is he right?
> > My feeling is that he is brainwashed with too many so-called "best practices", but Neo4j has been my main model for a long time now, so my perspective is likely skewed. I'd like to hear your thoughts.
> >
> > --
> > Tobias Ivarsson <[email protected]>
> >
> > On Fri, Oct 14, 2011 at 10:32 AM, Rickard Öberg <[email protected]> wrote:
> >
> > Well, I'd tend to agree with the author. Mixing persistence details with the domain model itself is really a bad idea. Infrastructure details should not pollute the domain logic, as they do with the currently "suggested" usage of Neo4j. But I think both Spring Data Graph and the Qi4j usage model fix this, as they allow you to keep many of those things outside of the domain code.
> >
> > /Rickard
>
> On Fri, Oct 14, 2011 at 11:45 AM, Tobias Ivarsson <tobias.ivarsson@neotechnology.com> wrote:
>
> > On Fri, Oct 14, 2011 at 11:21 AM, Rickard Öberg <[email protected]> wrote:
> >
> >> On 10/14/11 17:16 , Tobias Ivarsson wrote:
> >>
> >>> I was hoping for a bit more elaboration on why it is a bad idea.
> >>>
> >>> Spring Data Neo4j operates mainly in the same way (at least it did when I was part of the design process); it just hides the details of it.
> >>>
> >>> The model we suggest is not to mix infrastructure details (nodes, relationships, traversals) with the domain logic. We suggest the domain logic be a separate layer, acting on domain data objects (defined as a set of interfaces). What we do suggest, though, is for those domain data objects to be implemented as wrappers of nodes and relationships.
> >>
> >> That sounds like "transaction script+anemic domain model", which is an anti-pattern as well.
> >
> > I'm guessing the anemic domain model is what you are pointing out as the anti-pattern. I can see how transaction scripts and ADM usually come together, though.
> >
> > References:
> >
> > http://martinfowler.com/eaaCatalog/transactionScript.html
> > http://martinfowler.com/bliki/AnemicDomainModel.html
> >
> >> Domain logic should be in the domain objects, and so splitting them into several layers confuses things more than it helps.
> >
> > I agree with you, SDN "solves" this, and does so well.
> >
> >>> So is the bad part of it just the part of having the implementation of your domain data objects deal with the Neo4j API, instead of having DTOs and DAOs that you load from the database, operate on in memory, then store back to the db at explicit points? Or are there deeper issues than that?
> >>
> >> It relates to testability (you should be able to test domain objects without infrastructure), code cohesion (put domain logic in domain objects), and as the article author points out, separation of concerns in general.
> >
> > The testability aspect is interesting and important. I don't have enough SDN experience to know how it fares in that regard.
> >
> > "separation of concerns in general" - anything else to this? or would this (normally) be achieved when testability without infrastructure is achieved?
>
> On Fri, Oct 14, 2011 at 11:49 AM, Rickard Öberg <rickard.oberg@neotechnology.com> wrote:
>
> > On 10/14/11 17:45 , Tobias Ivarsson wrote:
> >
> >> I'm guessing the anemic domain model is what you are pointing out as the anti-pattern. I can see how transaction scripts and ADM usually come together, though.
> >
> > Yeah, it's when you use the two together (very common in the Hibernate world as well) that you run into problems, if you do anything but cruddy stuff.
> >
> >> The testability aspect is interesting and important. I don't have enough SDN experience to know how it fares in that regard.
> >>
> >> "separation of concerns in general" - anything else to this?
> >> or would this (normally) be achieved when testability without infrastructure is achieved?
> >
> > I suggest that you read up on SOLID:
> > http://en.wikipedia.org/wiki/SOLID_%28object-oriented_design%29
> >
> > And in particular SRP:
> > http://en.wikipedia.org/wiki/Single_responsibility_principle
> >
> > Separating the domain objects from the infrastructure is a good start ("required but not enough").
> >
> > /Rickard
>
> --
> Tobias Ivarsson <[email protected]>
> Hacker, Neo Technology
> www.neotechnology.com
> Cellphone: +46 706 534857 (Swe); +1 650 450 3806 (US)
> _______________________________________________
> Neo4j mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user
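
For readers who want the pattern debated in this thread in concrete form, here is a minimal Java sketch. It is not code from any of the participants, and it uses no Neo4j classes: `PropertyStore` is a hypothetical stand-in for the small part of a node that a wrapper would touch (its two methods are modeled on a node's `getProperty`/`setProperty`), and `Person`, `StoreBackedPerson` and `MapStore` are likewise made-up names. The idea it tries to show is the one Tobias describes (a domain interface implemented as a wrapper around storage), with the domain rule kept inside the domain object and an in-memory store substituted so the rule can be exercised without infrastructure, which is the testability concern Rickard raises.

```java
import java.util.HashMap;
import java.util.Map;

public class WrapperSketch {

    /** Hypothetical stand-in for the part of a graph node used by the wrapper. */
    interface PropertyStore {
        Object getProperty(String key);
        void setProperty(String key, Object value);
    }

    /** Domain interface: no persistence types appear in its signature. */
    interface Person {
        String name();
        void rename(String newName);
    }

    /** Domain object implemented as a wrapper; its state lives in the store. */
    static final class StoreBackedPerson implements Person {
        private final PropertyStore store;

        StoreBackedPerson(PropertyStore store) {
            this.store = store;
        }

        @Override
        public String name() {
            return (String) store.getProperty("name");
        }

        @Override
        public void rename(String newName) {
            // The domain rule lives in the domain object, not in a separate script layer.
            if (newName == null || newName.trim().isEmpty()) {
                throw new IllegalArgumentException("name must not be blank");
            }
            store.setProperty("name", newName.trim());
        }
    }

    /** In-memory store, so the rule above can be tested without a database. */
    static final class MapStore implements PropertyStore {
        private final Map<String, Object> props = new HashMap<>();

        @Override
        public Object getProperty(String key) {
            return props.get(key);
        }

        @Override
        public void setProperty(String key, Object value) {
            props.put(key, value);
        }
    }

    public static void main(String[] args) {
        Person person = new StoreBackedPerson(new MapStore());
        person.rename("  Ada ");
        System.out.println(person.name());
    }
}
```

Whether the wrapper still counts as "mixing physical and logical" is exactly what the thread disagrees on; the sketch only shows that the coupling can be confined to one constructor argument, under the assumption that the wrapper needs nothing from the store beyond property access.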

