On Thu, May 28, 2015 at 5:20 PM, daniela florescu <[email protected]> wrote:
> The NoSQL industry is extremely successful, used everywhere, and
> considered by many the child prodigy of the database industry.
>

I could have sworn it is the unacknowledged, hip but bastard grandchild of the network and hierarchical databases of the '60s and '70s... so correct me where I am wrong in what follows.

> They are proud of themselves because they satisfy user needs, aka: they
> store data:
> (a) which is not in 1st normal form (aka nested, pre-aggregated)
> (b) without schema
>
> …to the practical benefit of:
> (a) the application getting the data out of the database exactly as the
> application needs it, and not altered through a normalization phase.

Which can give you blazing fast performance IFF the queries follow the nesting. But to take an example from my movie project: we have stored movie reviews by critic. You pull up a page for the critic and get all the movie reviews he has ever written. Then the client suddenly turns around (as mine did) and says, "I want to pull up the movie and get all the reviews the different critics have written." That query isn't going to be fast, and if you are not working with a proper query language it might not be straightforward to write. So not only do you not get a free lunch on the performance, you might end up with a double whammy.

In that sense nothing has changed from the DBs of the '60s and '70s. Enter relational and you had (after normalisation) a database design that was neutral to the queries that were to be run, as there was no nesting. In addition you got a proper relational query language and something very important: query optimisation (in theory at least) for free.

> (b) the lack of fixed schema helps with data flexibility… things change
> extremely quickly inside an application these days (fields being added,
> deleted, changed, etc)

How much data independence does that afford you?

> So far so good, and I think until here they are all right.
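The movie-review example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the critics, movies, review texts, and function names are all made up, a plain dict stands in for a document store nested by critic, and the standard-library `sqlite3` module stands in for a relational database.

```python
import sqlite3

# Hypothetical sample data, nested the way the first page needs it:
# critic -> list of reviews.
reviews_by_critic = {
    "Critic A": [
        {"movie": "Movie X", "text": "Loved it"},
        {"movie": "Movie Y", "text": "Too long"},
    ],
    "Critic B": [
        {"movie": "Movie X", "text": "Overrated"},
    ],
}


def reviews_for_critic(critic):
    # The access path the nesting was designed for: one key lookup.
    return reviews_by_critic[critic]


def reviews_for_movie(movie):
    # The access path nobody planned for: visit every critic and every review.
    return [
        (critic, review["text"])
        for critic, reviews in reviews_by_critic.items()
        for review in reviews
        if review["movie"] == movie
    ]


# The same facts as one flat relation, neutral to both questions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE review (critic TEXT, movie TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO review VALUES (?, ?, ?)",
    [
        (critic, review["movie"], review["text"])
        for critic, reviews in reviews_by_critic.items()
        for review in reviews
    ],
)
# Indexes are a physical tuning choice; the query text below stays the same
# whether or not they exist.
conn.execute("CREATE INDEX review_by_critic ON review (critic)")
conn.execute("CREATE INDEX review_by_movie ON review (movie)")


def sql_reviews_for_movie(movie):
    rows = conn.execute(
        "SELECT critic, text FROM review WHERE movie = ? ORDER BY critic",
        (movie,),
    )
    return rows.fetchall()


print(reviews_for_movie("Movie X"))
print(sql_reviews_for_movie("Movie X"))
```

In the nested version the "by movie" question needs a hand-written walk over every critic. In the flat version it is the same kind of query as the "by critic" one, and whether an index is used is left to the engine.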
>
> [[ One may think that this looks a little bit like … XML, but hey, they
> don’t like XML. Fine. ]]
>
> The problem comes when they try to QUERY this data.
>
> The NoSQL industry is re-inventing the wheel from scratch, and in a very
> chaotic and ad-hoc manner.
>
> Just look at the sad state of affairs in terms of query languages and
> their semantics.
>
> <snipped/>
>
> ==============
>
> Now I can spot several mistakes here:
>
> 1. None of those query languages has a clearly designed, mathematical data
> model. In the absence of such a data model, one that describes the input,
> the output and the intermediate results of a query, how can we define a
> clean semantics?
>
> 2. All of them have a hacky semantics -- “let’s run it and we’ll see what
> the result is” kind of thing. The semantics in most corner cases -- and
> by definition semi-structured data is ONLY corner cases -- is not defined.
>
> 3. Some try to piggyback on the SQL semantics, ignoring the fact that
> SQL was designed to work on relations, and JSON (or in general, nested
> data) has nothing to do with relations. SQL semantics cannot be
> “ported”… just because we reuse the same keywords.

A big reason why people in Analytics who know what they are talking about are keen to use SQL is that you get query optimisation for free.

> 4. None attempted to define a type system (even a basic one for atomic
> types like dates, and arithmetic on them…) and a schema language.
>
> Now maybe it’s clear why I am so sad that the XQuery community, instead of
> trying to help the younger and naive NoSQL community, which still believes
> that SQL is “good enough”, and that using the SELECT-FROM-WHERE keywords is
> the magic bullet to define the semantics of any kind of query language, the
> XQuery community is still looking at their own navel, and marveling, like
> the well known CEO: "we can handle flexible data" !!!
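Points 2 and 4 above (undefined corner cases, no type system) can be made concrete with a small Python sketch. The documents, field names, and both filter functions are hypothetical, and the "explicit rules" shown are just one possible choice for illustration, not the semantics of any particular query language.

```python
# Hypothetical schema-less documents: same field name, different shapes.
docs = [
    {"title": "A", "year": 1999},
    {"title": "B", "year": "1999"},        # number stored as a string
    {"title": "C", "year": None},          # explicit null
    {"title": "D"},                        # field absent
    {"title": "E", "year": [1999, 2001]},  # nested value
]


def naive_filter(documents, minimum):
    # "Let's run it and see": no rule for strings, nulls, missing fields
    # or nested values.
    return [d["title"] for d in documents if d["year"] > minimum]


def filter_with_explicit_rules(documents, minimum):
    # One possible set of rules (an assumption, not a standard):
    # only integers are comparable; everything else means "no match".
    result = []
    for d in documents:
        year = d.get("year")
        if isinstance(year, int) and not isinstance(year, bool) and year > minimum:
            result.append(d["title"])
    return result


try:
    print(naive_filter(docs, 1990))
except (TypeError, KeyError) as exc:
    print("naive filter failed:", type(exc).__name__)

print(filter_with_explicit_rules(docs, 1990))
```

Python happens to raise an error on the second document; another engine could just as well coerce the string, skip the document, or return null. That choice is exactly what a data model and a type system are supposed to pin down.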
>
> Just compare those languages I listed above with the work that has been
> done in the past 16 years in XQuery, and the correctness and the complexity
> of the result vs. the hacky solutions above.
>
> P.S. And yes, that work from XQuery was used 100% in the design of JSONiq,
> which was designed with the dual goal in mind:
> (a) reuse 100% of the experience of design and implementation of XQuery and
> (b) provide a query language that is syntactically and semantically
> acceptable for the JSON community.

It's called Javascript. Also known as Python.

> If we succeeded or not, that’s another story, but I am not aware of any
> other solution that even comes CLOSE to that goal.

They don't share that goal.
_______________________________________________
[email protected]
http://x-query.com/mailman/listinfo/talk
