[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

Tyler Hobbs (JIRA) Fri, 06 Mar 2015 11:19:41 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350751#comment-14350751
 ]


Tyler Hobbs commented on CASSANDRA-7970:
----------------------------------------

Okay, I've updated the 
[branch|https://github.com/apache/cassandra/compare/trunk...thobbs:7970-trunk-v2]
 to address your changes, with roughly one commit per comment to make it easier 
to follow.  Some responses:

bq. It's not entirely this patch fault but I don't understand why we need 
{{FunctionName.equalsNativeFunction}}

It's just to assume the system keyspace if a keyspace isn't set on the 
function.  I've modified the function to make that more correct and a little 
clearer.  Let me know if that makes sense to you.

bq. If we made fromJSONObject return a Term, we'd avoid serializing collection 
to re-deserialize them right away.

I too would like to avoid the unnecessary serializations and deserializations. 
We have this problem in other parts of the code as well (IN value lists, 
multi-column relations, and collections in column conditions, iirc).  However, 
I'm not clear on what you're suggesting.  Can you elaborate?

bq. {{FromJsonFct}} should probably use the new {{ExecutionException}}

I'm not sure about this.  Errors in {{FromJsonFct}} are type errors of a sort 
(at least if you consider JSON's native type formatting) or incorrect CQL 
literals (which also result in an IRE).  I think sticking with an IRE for type 
errors with native functions would be more consistent from a user's point of 
view.  Plus, the purpose of {{ExecutionException}} is pretty clear right now: 
it's for errors that occur while executing a user-defined function.  We would 
lose that if we use it here.

bq. For INSERT JSON, I think having a ParsedInsertJson separate from 
ParsedInsert makes things a bit cleaner. \[...\] How do you feel about it?

I definitely agree that the separate Insert classes are an improvement.  The 
Json Term refactoring is less of a win, because it still has some odd parts 
(classes that are both Raw and Terminal), but the old version had odd parts 
too.  I'm fine with the new setup, so I made those changes with a few things 
renamed for clarity (e.g. SimplePrepared -> PreparedLiteral).

bq. Any particular reason for having DecimalType and IntegerType quote their 
values?

Well, I had some concerns around loss of precision and overflows if the 
client-side JSON decoder attempted to store them as, say, Java doubles and 
longs.  However, I think it would be reasonable to say that the client should 
ensure their JSON decoder handles these issues.  So for now, I've switched them 
to non-string representations.  Let me know if you thing the client-side 
decoding issues warrant keeping them as strings.

bq. Out of curiosity, was there a profound reason to make Term.Terminal take 
the protocol version only?

This was part of what I was attempting to address with the 
CollectionSerializer.Format code (now reverted).  Some parts of the code need 
to specify a format/protocol version without any QueryOptions present (or it 
doesn't make sense to use QueryOptions).  It's a bit weird to pass around a 
protocol version when we're not actually dealing with the native protocol, 
which is why I looked at switching to a format enum.

I understood why we were using QueryOptions to begin with, but this seemed like 
the least invasive change.

> JSON support for CQL
> --------------------
>
>                 Key: CASSANDRA-7970
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API
>            Reporter: Jonathan Ellis
>            Assignee: Tyler Hobbs
>              Labels: client-impacting, cql3.3, docs-impacting
>             Fix For: 3.0
>
>         Attachments: 7970-trunk-v1.txt
>
>
> JSON is popular enough that not supporting it is becoming a competitive 
> weakness.  We can add JSON support in a way that is compatible with our 
> performance goals by *mapping* JSON to an existing schema: one JSON documents 
> maps to one CQL row.
> Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
> [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
> document from a service or a user into a CQL row, with all the validation 
> that entails.
> Since we are not looking to support schemaless documents, we will not be 
> adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
> map the JSON to UDT, collections, and primitive CQL types.
> Here's how this might look:
> {code}
> CREATE TYPE address (
>   street text,
>   city text,
>   zip_code int,
>   phones set<text>
> );
> CREATE TABLE users (
>   id uuid PRIMARY KEY,
>   name text,
>   addresses map<text, address>
> );
> INSERT INTO users JSON
> {‘id’: 4b856557-7153,
>    ‘name’: ‘jbellis’,
>    ‘address’: {“home”: {“street”: “123 Cassandra Dr”,
>                         “city”: “Austin”,
>                         “zip_code”: 78747,
>                         “phones”: [2101234567]}}};
> SELECT JSON id, address FROM users;
> {code}
> (We would also want to_json and from_json functions to allow mapping a single 
> column's worth of data.  These would not require extra syntax.)
> [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
> [2] https://blog.compose.io/schema-less-is-usually-a-lie/
> [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

Reply via email to