Frankly, I'm not sure the author understands the real issues. Pretty much
everything he says about the limits of SQL is not strictly true.  The whole
issue here is not what kinds of models you can create in a relational
database or graph database.  If you know what you are doing you can use
either database to do the trick.  The question, which he sort of hints at,
is how efficiently you can make that model perform in a particular
database.  You can certainly create meta models and abstract models in
relational databases (it's not necessarily easy, but unfortunately
abstraction seems to be the main issue that separates the real modelers from
people who may just never get it).  However, if you succeed in getting such
a model into a relational database, then what?  The relational model
requires many joins to get from the abstract model to the concrete
implementations.  Caching will solve that (for both technologies), but to
quote Tim Bray quoting Phil Karlton, "There are only two hard things in
Computer Science: cache invalidation and naming things."  (Both are
applicable here, since half the problem of abstraction is naming....)  In
the end, if this stuff were easy we wouldn't have jobs; you've got to know
how to match the technology to the problems, but make sure the problem
requirements drive the choice of technology, not the other way around.
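
To make the join cost concrete, here is a minimal sketch, assuming a
hypothetical entity-attribute-value schema as the "abstract model" (the
table, column, and attribute names are made up for illustration and come
from neither the article nor this thread).  Getting back to a concrete row
takes one join per attribute:

```python
import sqlite3

# Hypothetical entity-attribute-value (EAV) schema: one generic way to store
# an abstract "meta model" in a relational database. All names are invented
# for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity (id INTEGER PRIMARY KEY, type TEXT NOT NULL);
    CREATE TABLE attribute (
        entity_id INTEGER NOT NULL REFERENCES entity(id),
        name      TEXT NOT NULL,
        value     TEXT,
        PRIMARY KEY (entity_id, name)
    );
    INSERT INTO entity VALUES (1, 'person'), (2, 'person');
    INSERT INTO attribute VALUES
        (1, 'name', 'Ada'), (1, 'city', 'London'),
        (2, 'name', 'Bob');
""")


def concrete_rows(conn, entity_type, attribute_names):
    """Rebuild concrete rows from the abstract model: one join per attribute."""
    selects = ["e.id"]
    joins = []
    params = []
    for i, name in enumerate(attribute_names):
        alias = f"a{i}"
        selects.append(f"{alias}.value")
        # LEFT JOIN so entities missing an attribute still appear (as NULL).
        joins.append(
            f"LEFT JOIN attribute {alias} "
            f"ON {alias}.entity_id = e.id AND {alias}.name = ?"
        )
        params.append(name)
    sql = (
        f"SELECT {', '.join(selects)} FROM entity e {' '.join(joins)} "
        "WHERE e.type = ? ORDER BY e.id"
    )
    params.append(entity_type)
    return conn.execute(sql, params).fetchall()


people = concrete_rows(conn, "person", ["name", "city"])
```

Each additional attribute adds another join to the generated query, which
is where the performance question (and the temptation to cache) comes from.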

More to the point, NoSQL is not going to displace relational, but NoSQL, and
Neo4j in particular, will solve problems relational is not good at.  Is
traversal one of those areas?  Depends on the model, but for large data sets
with many loosely coupled attributes, graph databases are probably going to
be the better match.  Are large data sets with many loosely coupled
attributes one of those areas in general?  Not necessarily; there are many
good technologies (not necessarily pure relational) for handling sparse data
sets, so depending on the problem at hand they may be a better solution.  If
the problem isn't traversal, and maybe doesn't require near-real-time
answers, then you may have a more standard data warehouse type of problem at
hand, and maybe you should look at those technologies....

There's a whole slew of ways of attacking the issues raised in the article,
and it seems to ignore most of them.  It ends up seeming mostly an argument
for NoSQL technologies, but the rationale lined up to support the argument
just doesn't make it in my book, nor does the little rant against Neo4j:  The
"Node" is a way to access the graph, just like the "ResultSet" (pick your
class) is the way to access the set (of sets) in a relational database.
Separating that level of "physical" from the "logical" isn't the issue at
the point you are doing data access.  As others have pointed out, you can
layer on mapping tools if you want to do that, but then of course you've got
to deal with the performance issues and the limitations of the mappings, and
sometimes (usually, in my experience) that's a bigger problem than mixing a
physical model with a logical model, at least if you contain that mixing to
some small portion of your application.

Peter Hunsberger


On Fri, Oct 14, 2011 at 7:48 AM, Tobias Ivarsson <
[email protected]> wrote:

> We had an interesting discussion about this internally at Neo Technology
> today. We thought it might be of interest to the broader community. I don't
> think the discussion is over, so it would be interesting to continue it on
> the public mailing list.
>
> It regards the initial paragraphs of an article posted to dzone recently:
> http://www.dzone.com/links/rss/the_coming_sql_collapse.html
> It mentions Neo4j and how the author dislikes a common way of using Neo4j
> for building applications.
>
> It would be interesting to hear suggestions on how to improve this.
>
> Forwarded conversation follows:
>
> On Fri, Oct 14, 2011 at 10:13 AM, Tobias Ivarsson <
> [email protected]> wrote:
>
> > I found this while reading feeds in bed last night:
> >
> > *The Coming SQL Collapse*
> > http://www.dzone.com/links/rss/the_coming_sql_collapse.html
> >
> > (Sent from Flipboard <http://flipboard.com>)
> >
> >
> > The things he says about SQL vs NoSQL are not very interesting, but I'd
> > like to raise what he says about Neo4j:
> >
> >> I looked at neo4j briefly the other day, and quite predictably thought
> >> 'wow, this looks like a serious tinkertoy: it's basically a bunch of nodes
> >> where you just blob your attributes.' Worse than that, to wrap objects
> >> around it, you have to have them explicitly incorporate their node class,
> >> which is ugly, smelly, violates every law of separation of concerns and
> >> logical vs. physical models. On the plus side, as I started to look at it
> >> more, I realized that it was the perfect way to implement a backend for a
> >> bayesian inference engine (more on that later). Why? Because inference
> >> doesn't care particularly about all the droll requirements that are settled
> >> for you by SQL, and there are no real set operations to speak of.
> >
> >
> > He attacks our pattern of building domain models with Neo4j, calling it
> > "ugly", "smelly" and "in violation of every law of separation of concerns
> > and logical vs. physical models". Is he right? My feeling is that he is
> > brainwashed with too many so-called "best practices", but since Neo4j has
> > been my main model for a long time now, my perspective is likely skewed.
> > I'd like to hear your thoughts.
> >
> > --
> > Tobias Ivarsson <[email protected]>
> >
>
> On Fri, Oct 14, 2011 at 10:32 AM, Rickard Öberg <
> [email protected]> wrote:
> >
> >
> > Well, I'd tend to agree with the author. Mixing persistence details with
> > the domain model itself is really a bad idea. Infrastructure details
> > should not pollute the domain logic as they do with the currently
> > "suggested" usage of Neo4j. But I think both Spring Data Graph and the
> > Qi4j usage model fix this, as they allow you to keep many of those things
> > outside of the domain code.
> >
> > /Rickard
>
>
> On Fri, Oct 14, 2011 at 11:45 AM, Tobias Ivarsson
> <tobias.ivarsson@neotechnology.com> wrote:
>
> > On Fri, Oct 14, 2011 at 11:21 AM, Rickard Öberg <
> > [email protected]> wrote:
> >
> >> On 10/14/11 17:16 , Tobias Ivarsson wrote:
> >>
> >>> I was hoping for a bit more elaboration, of why it is a bad idea.
> >>>
> >>> Spring Data Neo4j operates mainly in the same way (at least it did when
> >>> I was part of the design process), it just hides the details of it.
> >>>
> >>> The model we suggest is not to mix infrastructure details (nodes,
> >>> relationships, traversals) with the domain logic. We suggest the domain
> >>> logic be a separate layer, acting on domain data objects (defined as a
> >>> set of interfaces). What we do suggest though is for those domain data
> >>> objects to be implemented as wrappers of nodes and relationships.
> >>>
> >>
> >> That sounds like "transaction script+anemic domain model", which is an
> >> anti-pattern as well.
> >
> >
> > I'm guessing the anemic domain model is what you are pointing out as the
> > anti-pattern. I can see how transaction scripts and ADM usually come
> > together though.
> >
> > References:
> >
> > http://martinfowler.com/eaaCatalog/transactionScript.html
> > http://martinfowler.com/bliki/AnemicDomainModel.html
> >
> >
> >
> >> Domain logic should be in the domain objects, and so splitting them into
> >> several layers confuses things more than it helps.
> >
> >
> > I agree with you, SDN "solves" this, and does so well.
> >
> >
> >>> So is the bad part of it just the part of having the implementation of
> >>> your domain data objects deal with the Neo4j API, instead of having DTOs
> >>> and DAOs that you load from the database, operate on in memory, then
> >>> store back to the db at explicit points? Or are there deeper issues than
> >>> that?
> >>>
> >>
> >> It relates to testability (you should be able to test domain objects
> >> without infrastructure), code cohesion (put domain logic in domain
> >> objects), and as the article author points out, separation of concerns
> >> in general.
> >
> >
> > The testability aspect is interesting and important. I don't have enough
> > SDN experience to know how it fares in that regard.
> >
> > "separation of concerns in general" - anything else to this? or would
> > this (normally) be achieved when testability without infrastructure is
> > achieved?
> >
>
> On Fri, Oct 14, 2011 at 11:49 AM, Rickard Öberg
> <rickard.oberg@neotechnology.com> wrote:
>
> > On 10/14/11 17:45 , Tobias Ivarsson wrote:
> >
> >> I'm guessing the anemic domain model is what you are pointing out as the
> >> anti-pattern. I can see how transaction scripts and ADM usually come
> >> together though.
> >>
> >
> > Yeah, it's when you use the two together (very common in Hibernate world
> > as well) that you run into problems, if you do anything but cruddy stuff.
> >
> >
> >> The testability aspect is interesting and important. I don't have enough
> >> SDN experience to know how it fares in that regard.
> >>
> >> "separation of concerns in general" - anything else to this? or would
> >> this (normally) be achieved when testability without infrastructure is
> >> achieved?
> >>
> >
> > I suggest that you read up on SOLID:
> > http://en.wikipedia.org/wiki/SOLID_%28object-oriented_design%29
> >
> > And in particular SRP:
> > http://en.wikipedia.org/wiki/Single_responsibility_principle
> >
> > Separating the domain objects from the infrastructure is a good start
> > ("required but not enough").
> >
> >
> > /Rickard
> >
> --
> Tobias Ivarsson <[email protected]>
> Hacker, Neo Technology
> www.neotechnology.com
> Cellphone: +46 706 534857 (Swe); +1 650 450 3806 (US)
> _______________________________________________
> Neo4j mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user
>