[
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350751#comment-14350751
]
Tyler Hobbs commented on CASSANDRA-7970:
----------------------------------------
Okay, I've updated the
[branch|https://github.com/apache/cassandra/compare/trunk...thobbs:7970-trunk-v2]
to address your changes, with roughly one commit per comment to make it easier
to follow. Some responses:
bq. It's not entirely this patch fault but I don't understand why we need
{{FunctionName.equalsNativeFunction}}
It's just to assume the system keyspace if a keyspace isn't set on the
function. I've modified the function to make that more correct and a little
clearer. Let me know if that makes sense to you.
bq. If we made fromJSONObject return a Term, we'd avoid serializing collection
to re-deserialize them right away.
I too would like to avoid the unnecessary serializations and deserializations.
We have this problem in other parts of the code as well (IN value lists,
multi-column relations, and collections in column conditions, iirc). However,
I'm not clear on what you're suggesting. Can you elaborate?
bq. {{FromJsonFct}} should probably use the new {{ExecutionException}}
I'm not sure about this. Errors in {{FromJsonFct}} are type errors of a sort
(at least if you consider JSON's native type formatting) or incorrect CQL
literals (which also result in an IRE). I think sticking with an IRE for type
errors with native functions would be more consistent from a user's point of
view. Plus, the purpose of {{ExecutionException}} is pretty clear right now:
it's for errors that occur while executing a user-defined function. We would
lose that if we use it here.
bq. For INSERT JSON, I think having a ParsedInsertJson separate from
ParsedInsert makes things a bit cleaner. \[...\] How do you feel about it?
I definitely agree that the separate Insert classes are an improvement. The
Json Term refactoring is less of a win, because it still has some odd parts
(classes that are both Raw and Terminal), but the old version had odd parts
too. I'm fine with the new setup, so I made those changes with a few things
renamed for clarity (e.g. SimplePrepared -> PreparedLiteral).
bq. Any particular reason for having DecimalType and IntegerType quote their
values?
Well, I had some concerns around loss of precision and overflows if the
client-side JSON decoder attempted to store them as, say, Java doubles and
longs. However, I think it would be reasonable to say that the client should
ensure their JSON decoder handles these issues. So for now, I've switched them
to non-string representations. Let me know if you thing the client-side
decoding issues warrant keeping them as strings.
bq. Out of curiosity, was there a profound reason to make Term.Terminal take
the protocol version only?
This was part of what I was attempting to address with the
CollectionSerializer.Format code (now reverted). Some parts of the code need
to specify a format/protocol version without any QueryOptions present (or it
doesn't make sense to use QueryOptions). It's a bit weird to pass around a
protocol version when we're not actually dealing with the native protocol,
which is why I looked at switching to a format enum.
I understood why we were using QueryOptions to begin with, but this seemed like
the least invasive change.
> JSON support for CQL
> --------------------
>
> Key: CASSANDRA-7970
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
> Project: Cassandra
> Issue Type: New Feature
> Components: API
> Reporter: Jonathan Ellis
> Assignee: Tyler Hobbs
> Labels: client-impacting, cql3.3, docs-impacting
> Fix For: 3.0
>
> Attachments: 7970-trunk-v1.txt
>
>
> JSON is popular enough that not supporting it is becoming a competitive
> weakness. We can add JSON support in a way that is compatible with our
> performance goals by *mapping* JSON to an existing schema: one JSON documents
> maps to one CQL row.
> Thus, it is NOT a goal to support schemaless documents, which is a misfeature
> [1] [2] [3]. Rather, it is to allow a convenient way to easily turn a JSON
> document from a service or a user into a CQL row, with all the validation
> that entails.
> Since we are not looking to support schemaless documents, we will not be
> adding a JSON data type (CASSANDRA-6833) a la postgresql. Rather, we will
> map the JSON to UDT, collections, and primitive CQL types.
> Here's how this might look:
> {code}
> CREATE TYPE address (
> street text,
> city text,
> zip_code int,
> phones set<text>
> );
> CREATE TABLE users (
> id uuid PRIMARY KEY,
> name text,
> addresses map<text, address>
> );
> INSERT INTO users JSON
> {‘id’: 4b856557-7153,
> ‘name’: ‘jbellis’,
> ‘address’: {“home”: {“street”: “123 Cassandra Dr”,
> “city”: “Austin”,
> “zip_code”: 78747,
> “phones”: [2101234567]}}};
> SELECT JSON id, address FROM users;
> {code}
> (We would also want to_json and from_json functions to allow mapping a single
> column's worth of data. These would not require extra syntax.)
> [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
> [2] https://blog.compose.io/schema-less-is-usually-a-lie/
> [3] http://dl.acm.org/citation.cfm?id=2481247
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)