[ https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157218#comment-14157218 ]

Jonathan Ellis commented on CASSANDRA-7970:
-------------------------------------------

bq. With SELECT, the number and type of selected columns changes, which is 
somewhat quirky.

I guess I don't see how this is more quirky than the number and type of 
selected columns changing in a non-JSON SELECT.
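
To make the comparison concrete (a sketch against the users table from the 
description below; the single-JSON-column result shape is just the proposal as 
I read it, not settled syntax):

{code}
-- proposed form: the result is a single column holding a JSON document
SELECT JSON name, addresses FROM users;
-- e.g. one text column per row: {"name": "jbellis", "addresses": {...}}

-- plain SELECT: the result columns already vary with the select clause
SELECT name, addresses FROM users;
-- two columns per row: name text, addresses map<text, address>
{code}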

bq. It's not possible to select both normal columns and a JSON document that 
contains some of the columns.

True, but that seems like a reasonable constraint for the design goals here.

bq. Functions cannot be applied to the JSON document. (For example, concat(), 
if we consider aggregation functions.)

Also a reasonable constraint for the design goals.

Here's my problem with the row_to_json approach: it inherently requires 
special-casing the function machinery, which is Bad.  

What I mean by this is, all other function calls should have the same behavior 
whether called on a literal or a column with the same value.  Thus, sin(2) and 
sin(a) where a=2 are semantically equivalent.  But row_to_json(a) has to cheat 
and generate {code}{'a': 2}{code}.

(Unless you are saying that row_to_json(a) would generate just the literal 2, 
in which case I have to say that having to spell out all the json fields is 
insufficiently usable to be a viable candidate.)
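
To spell out the distinction (a sketch of the two readings, not working 
syntax for either):

{code}
-- ordinary functions are pure functions of the value; with a column a = 2:
SELECT sin(2) FROM t;          -- 0.909...
SELECT sin(a) FROM t;          -- 0.909...; same answer for literal or column

-- the two possible readings of row_to_json on a single column:
SELECT row_to_json(a) FROM t;  -- {"a": 2}: needs the column *name*, i.e. special-casing
SELECT row_to_json(a) FROM t;  -- 2: a pure function of the value, but callers must spell out every field
{code}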

Postgresql gets around this with subselects and the concept of a row 
constructor: 
http://hashrocket.com/blog/posts/faster-json-generation-with-postgresql
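
For reference, the pattern from that post looks roughly like this (PostgreSQL; 
the subselect alias is what supplies the field names):

{code}
-- a bare row constructor loses the column names:
SELECT row_to_json(row(id, name)) FROM users;               -- {"f1": ..., "f2": ...}

-- so the idiom is a subselect, with row_to_json applied to its alias:
SELECT row_to_json(u) FROM (SELECT id, name FROM users) u;  -- {"id": ..., "name": ...}
{code}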

I'm okay with the postgresql approach in a lot of respects, but it introduces a 
ton of complexity that I don't really want to block for.  Nor am I sure that we 
want to open up the subquery can of worms for this, which will inevitably lead 
to people pushing for them to be generalized.

> JSON support for CQL
> --------------------
>
>                 Key: CASSANDRA-7970
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API
>            Reporter: Jonathan Ellis
>            Assignee: Tyler Hobbs
>             Fix For: 3.0
>
>
> JSON is popular enough that not supporting it is becoming a competitive 
> weakness.  We can add JSON support in a way that is compatible with our 
> performance goals by *mapping* JSON to an existing schema: one JSON document 
> maps to one CQL row.
> Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
> [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
> document from a service or a user into a CQL row, with all the validation 
> that entails.
> Since we are not looking to support schemaless documents, we will not be 
> adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
> map the JSON to UDT, collections, and primitive CQL types.
> Here's how this might look:
> {code}
> CREATE TYPE address (
>   street text,
>   city text,
>   zip_code int,
>   phones set<text>
> );
> CREATE TABLE users (
>   id uuid PRIMARY KEY,
>   name text,
>   addresses map<text, address>
> );
> INSERT INTO users JSON
> {"id": "4b856557-7153",
>  "name": "jbellis",
>  "addresses": {"home": {"street": "123 Cassandra Dr",
>                         "city": "Austin",
>                         "zip_code": 78747,
>                         "phones": ["2101234567"]}}};
> SELECT JSON id, addresses FROM users;
> {code}
> (We would also want to_json and from_json functions to allow mapping a single 
> column's worth of data.  These would not require extra syntax.)
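> (A sketch of what that could look like with the users table above; the 
> to_json/from_json spellings are the proposal, not final syntax:)
> {code}
> -- encode one column's value as a JSON string
> SELECT id, to_json(addresses) FROM users;
> 
> -- parse a JSON string into one column's declared type
> UPDATE users SET addresses = from_json('{"home": {"street": "123 Cassandra Dr", "city": "Austin", "zip_code": 78747, "phones": ["2101234567"]}}')
>   WHERE id = 4b856557-7153;
> {code}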
> [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
> [2] https://blog.compose.io/schema-less-is-usually-a-lie/
> [3] http://dl.acm.org/citation.cfm?id=2481247


