[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-04-01 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14390820#comment-14390820
 ] 

Sylvain Lebresne commented on CASSANDRA-7970:
-

Alright, +1, thanks.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
  Labels: client-impacting, cql3.3, docs-impacting
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt, 7970-trunk-v2.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-03-26 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382768#comment-14382768
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


I've pushed some new commits to my branch to address your comments.  I also 
merged in the latest trunk and added support for the new date and time types.

bq. So, AbstractType.fromJSONObject would return a Term

Done (over a few commits).  The only hangup was that with collections and 
tuples, we need to avoid serializing elements in {{fromJSONObject}} because 
this can happen at prepare-time when we don't know the protocol version.  
Accordingly, Those classes return a DelayedValue instead of a terminal Value.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
  Labels: client-impacting, cql3.3, docs-impacting
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-03-19 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14369330#comment-14369330
 ] 

Sylvain Lebresne commented on CASSANDRA-7970:
-

bq. It's just to assume the system keyspace if a keyspace isn't set on the 
function.

Ok. I still think the whole dealing with keyspace in {{FunctionName}} is 
fragile but it's somewhat outside this ticket so I've created CASSANDRA-8994.

bq. However, I'm not clear on what you're suggesting. Can you elaborate?

So, {{AbstractType.fromJSONObject}} would return a {{Term}}. For basic types, 
it would be a {{Constants.Value}} but for say a list, it would be a 
{{Lists.Value}} (containing the unserialized collection). Then 
{{Json.ColumnValue}} would just be a {{Term.Raw}} (not a {{Term.Terminal}}) and 
it's {{prepare}} would return the result of {{fromJSONObject}}. The end result 
being that {{Lists.Appender.doAppend}} would always get a {{Lists.Value}} and 
we won't need {{Lists.getElementsFromValue}}.

bq. We have this problem in other parts of the code as well

Right, and we should fix those some day, but that's another story :)

bq. Plus, the purpose of {{ExecutionException}} is pretty clear right now: it's 
for errors that occur while executing a user-defined function.

It's meant for errors while executing functions in general, not necessarily 
user-defined ones and I'm absolutely convinced we'll have to use it in native 
functions sooner or later. And functions in general can be used in select 
clauses, in which case they'll error out during execution, not during 
validation, so we shouldn't use IRE for them (as mentioned by Jonathan in [that 
comment|https://issues.apache.org/jira/browse/CASSANDRA-5910?focusedCommentId=13746044page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13746044]).
  Now, you could say that {{FromJsonFct}} cannot currently be used in select 
clauses, but 1) I'm hoping we can fix that at some point and 2) I'd really 
prefer consider those JSON functions as normal functions as much as possible, 
so I do prefer saying those errors are errors during the execution of a 
function rather than type errors of a sort.

Lastly, a micro nits: you could make {{JSON_IDENTIFIER}} public in 
{{Json.java}} and use it in {{Selection}}.


 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
  Labels: client-impacting, cql3.3, docs-impacting
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-03-06 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14350767#comment-14350767
 ] 

Robert Stupp commented on CASSANDRA-7970:
-

bq. DecimalType and IntegerType quote their values

IMO we should leave it as non-quoted as Sylvain suggests. With Java there 
shouldn't be any problem with (modern) JSON libs. Not sure about other 
languages' libs.

The only issue I can imagine is accidentally passing a decimal for an integer 
(e.g. {{5.0}} instead of {{5}} or {{5E+03}} instead of {{5000}}) - json spec 
does only know numbers and does not distinguish between decimals and integers. 
That might be something we should ensure (i.e. allow a decimal value for 
integer values) - not sure whether the patch supports that. If it does, forgive 
me for not knowing ;)

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
  Labels: client-impacting, cql3.3, docs-impacting
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-03-06 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14350804#comment-14350804
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


The current patch will accept integers for float, double, and decimal types.  
(Thanks for the reminder, though: I needed to disallow floats and overflow for 
int and bigint types.  I pushed a commit to do this.)

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
  Labels: client-impacting, cql3.3, docs-impacting
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-03-06 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14350751#comment-14350751
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


Okay, I've updated the 
[branch|https://github.com/apache/cassandra/compare/trunk...thobbs:7970-trunk-v2]
 to address your changes, with roughly one commit per comment to make it easier 
to follow.  Some responses:

bq. It's not entirely this patch fault but I don't understand why we need 
{{FunctionName.equalsNativeFunction}}

It's just to assume the system keyspace if a keyspace isn't set on the 
function.  I've modified the function to make that more correct and a little 
clearer.  Let me know if that makes sense to you.

bq. If we made fromJSONObject return a Term, we'd avoid serializing collection 
to re-deserialize them right away.

I too would like to avoid the unnecessary serializations and deserializations. 
We have this problem in other parts of the code as well (IN value lists, 
multi-column relations, and collections in column conditions, iirc).  However, 
I'm not clear on what you're suggesting.  Can you elaborate?

bq. {{FromJsonFct}} should probably use the new {{ExecutionException}}

I'm not sure about this.  Errors in {{FromJsonFct}} are type errors of a sort 
(at least if you consider JSON's native type formatting) or incorrect CQL 
literals (which also result in an IRE).  I think sticking with an IRE for type 
errors with native functions would be more consistent from a user's point of 
view.  Plus, the purpose of {{ExecutionException}} is pretty clear right now: 
it's for errors that occur while executing a user-defined function.  We would 
lose that if we use it here.

bq. For INSERT JSON, I think having a ParsedInsertJson separate from 
ParsedInsert makes things a bit cleaner. \[...\] How do you feel about it?

I definitely agree that the separate Insert classes are an improvement.  The 
Json Term refactoring is less of a win, because it still has some odd parts 
(classes that are both Raw and Terminal), but the old version had odd parts 
too.  I'm fine with the new setup, so I made those changes with a few things 
renamed for clarity (e.g. SimplePrepared - PreparedLiteral).

bq. Any particular reason for having DecimalType and IntegerType quote their 
values?

Well, I had some concerns around loss of precision and overflows if the 
client-side JSON decoder attempted to store them as, say, Java doubles and 
longs.  However, I think it would be reasonable to say that the client should 
ensure their JSON decoder handles these issues.  So for now, I've switched them 
to non-string representations.  Let me know if you thing the client-side 
decoding issues warrant keeping them as strings.

bq. Out of curiosity, was there a profound reason to make Term.Terminal take 
the protocol version only?

This was part of what I was attempting to address with the 
CollectionSerializer.Format code (now reverted).  Some parts of the code need 
to specify a format/protocol version without any QueryOptions present (or it 
doesn't make sense to use QueryOptions).  It's a bit weird to pass around a 
protocol version when we're not actually dealing with the native protocol, 
which is why I looked at switching to a format enum.

I understood why we were using QueryOptions to begin with, but this seemed like 
the least invasive change.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
  Labels: client-impacting, cql3.3, docs-impacting
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: 

[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-23 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14333620#comment-14333620
 ] 

Sylvain Lebresne commented on CASSANDRA-7970:
-

bq. I attempted to improve the handling of collection serialization formats 
\[...\]. I have mixed feelings about the results.

Me too :). I do think some clarification around this can be done, but I'm not 
entirely sure having a special version for collection serialization is 
necessarily the best way to go. Overall, I'd prefer leaving such refactor out 
of this ticket.

Other remarks on the patch:
* For the textile doc, I'd split out a new JSON section and only link to that 
from the {{INSERT}} and {{SELECT}} doc. Documenting JSON in the middle of the 
statements feels slightly confusing, it would feel clearer to me to concentrate 
it in a separate section, one that also start by stating what JSON support is 
all about (i.e. really just a convenience for getting data from/to a JSON 
form).  Also, the doc is fully up to date with the patch (mention that 
{{VALUES}} for {{INSERT}} is optional when using JSON for instance) - 
{{cql3handling.py}} is not up to date on the {{INSERT}} syntax.
* When doing {{INSERT INTO foo JSON ?}}, I'd maybe use {{\[json\]}} as receiver 
name to be consistent with what we do for {{LIMIT}}, {{TIMESTAMP}} and {{TTL}}. 
Also, UpdateStatement.ParsedInsert.jsonReceiverSpec is unused. Same for 
{{SELECT}} (for consistency with {{\[applied\]}}).
* Should make {{FromJsonFct/toJsonFct}} extends {{NativeScalarFunction}} 
directly.
* It's not entirely this patch fault but I don't understand why we need 
{{FunctionName.equalsNativeFunction}} (why not just use equals)?  The only 
difference is that if {{this}} doesn't have a keyspace, it doesn't check the 
other name doesn't either, but that just feels wrong to me tbh.
* For UDT, I think we need to care about case sensitivity in {{fromJSONObject}}.
* In TupleType/UDTType.fromJSONObject, we should force the format to V3. It's 
also a tad weird to use CollectionSerializer.writeValue to write the UDT. Why 
not use the TupleValue.buildValue() function? Or my next suggestion.
* If we made {{fromJSONObject}} return a {{Term}}, we'd avoid serializing 
collection to re-deserialize them right away. We'd also leave the serialization 
of UDT/Tuple to UserTypes/TupleTypes.DelayedValues (doesn't hurt to concentrate 
it in one place as much as possible) and save a bunch of changes (in 
ColumnCondition, Lists, Sets and Maps).
* I'd prefer limiting the json pollution in {{Selection}}. We (mostly) really 
need to bother about JSON when writing the ResultSet, so we could simply pass 
the isJson to the {{resultSetBuilder}} method (somehow it makes more sense to 
me to call {{rowToJson}} in the builder anyway).  The one other place we need 
it is {{getResultMetadata}}, but we could also pass a {{isJson}} flag there and 
either build the metadata if it's json (not a big deal) or find a simple way to 
cache the json metadata per-table (since it's always the same thing except for 
the keyspace and table name).
* {{FromJsonFct}} should probably use the new {{ExecutionException}} from 
CASSANDRA-8528 rather than {{InvalidRequestException}}.
* For {{INSERT JSON}}, I think having a ParsedInsertJson separate from 
ParsedInsert makes things a bit cleaner. I also feel that Json.java could 
encapsulate a bit better the necessary specificity of {{INSERT JSON}}. As I'm 
not sure how to express clearly what I mean (and because I wasn't sure that 
what I had in mind would really work), I've push a commit at 
[here|https://github.com/pcmanus/cassandra/commits/7970] shows what I have in 
mind. It's not really different from the initial approach but it feels cleaner 
overall. How do you feel about it?
* Any particular reason for having {{DecimalType}} and {{IntegerType}} quote 
their values (they use the default {{toJSONString}}). I would have expected 
them to translate to JSON numbers since as far as I can tell JSON doesn't 
impose any size limit on its numbers.
* Maybe we could abstract slighty our use of jackson (put the helpers we need 
in Json.java maybe?), so that 1) we have only one place to change if we upgrade 
jackson and the API change (or we want to change of library) and 2) we save 
creating multiple ObjectMapper or JsonStringEncoder objects.

Also two minor nits:
* The patch makes ResultSet.COUNT_COLUMN public for no reason it seems.
* Out of curiosity, was there a profound reason to make Term.Terminal take the 
protocol version only (instead of the full QueryOptions)? If it's just because 
it's the only thing used, that's fine, but just fyi passing QueryOptions was 
done so we don't have to change everything again if a future Term needs 
something else than the protocol version.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: 

[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-19 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14328097#comment-14328097
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


bq. I'd actually have a preference for not allowing the column name declaration 
at all as it doesn't buy us anything

There is a minor difference in behavior.  If the JSON value map does not 
contain an entry for a column listed in the column names, null will be inserted 
for that column.  Omitting the column names is currently equivalent to listing 
them all.  An example of the current behavior may clarify this:

{code:sql}
 CREATE TABLE foo (a int, b int, c int, d int, PRIMARY KEY (a, b));
 INSERT INTO foo JSON '{a: 0, b: 0, c: 0, d: 0}';
 SELECT * FROM foo;

 a | b | c | d
---+---+---+---
 0 | 0 | 0 | 0

 INSERT INTO foo JSON '{a: 0, b: 0, c: 0}';
 SELECT * FROM foo;

 a | b | c | d
---+---+---+--
 0 | 0 | 0 | null
{code}

Since column {{d}} didn't have a value in the json map, null was inserted.

However, if we explicitly list the column names (and leave out {{d}}), we can 
avoid inserting null:

{code:sql}
 INSERT INTO foo JSON '{a: 0, b: 0, c: 0, d: 0}';
 INSERT INTO foo (a, b, c) JSON '{a: 0, b: 0, c: 0}';
 SELECT * FROM foo;

 a | b | c | d
---+---+---+---
 0 | 0 | 0 | 0
{code}

bq. Even if some of our literals can be of multiple types (typically numeric 
literals), they will always translate to the same thing in JSON anyway so that 
shouldn't be a problem. As for bind markers, we can do what we do for other 
functions when their is an ambiguity and require the user to provide a 
type-cast. Is it just that it's not convenient to do with the current code, or 
is there something more fundamental I'm missing?

Hmm, I think what you suggest should be possible, although it will probably 
require a fair amount of code changes, especially for handling literals (which 
would probably bypass the whole AbstractType system).  Let me play around with 
this and get back to you.  Would you like to get support for this and the 
{{fromJson()}} changes in this patch, or would you be willing to push this to 
another ticket?

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
  Labels: client-impacting, cql3.3, docs-impacting
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-19 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14328152#comment-14328152
 ] 

Sylvain Lebresne commented on CASSANDRA-7970:
-

bq. There is a minor difference in behavior.

I see. But the thing is that we don't really have a way to implement such 
difference except at the root level. Meaning that if you have:
{noformat}
INSERT INTO foo JSON '{a: 0, b : { c : 1, d : 2}}'
{noformat}
where {{b}} is a non-frozen UDT (which we don't have yet but will eventually) 
with more than columns {{c}} and {{d}}, we'll have to choose the behavior once 
and for all (either it replace {{b}} completely, setting it's other column to 
null, which is what I think it should do, or it let those other column 
untouched). Hence I'm not entirely sure having a special case for the root 
level is all that consistent/useful.

Further, while C* doesn't strongly enforce it, we've tried to make it clear 
that {{INSERT}} should be used for actual inserts and you should use {{UPDATE}} 
for updates. While if you care about the difference between those two variants, 
it feels to me that you're using an {{INSERT}} where you should be using an 
{{UPDATE}}.  So I still would feel more confortable removing the 'with column 
name' variant from the initial version, we can always revisit later if it feels 
truly useful.

bq. would you be willing to push this to another ticket?

That. My suggestion is indeed a bit involved and is to some extend orthogonal 
to this issue (it's more about improving the type system). Let's indeed stick 
to simple rejection when we don't know how to handle it for now.  We can 
scratch my itch later :)


 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
  Labels: client-impacting, cql3.3, docs-impacting
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-19 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14328357#comment-14328357
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


bq. So I still would feel more confortable removing the 'with column name' 
variant from the initial version, we can always revisit later if it feels truly 
useful.

That's reasonable.  I've updated my branch to remove support for column names + 
JSON.

bq. Let's indeed stick to simple rejection when we don't know how to handle it 
for now.

Okay, I've opened CASSANDRA-8837 for a more complete solution.  In the 
meantime, I pushed a (slightly hacky) commit to make {{fromJson()}} results 
always weakly assignable, forcing the user to typecast when it's used as the 
argument of an overloaded function.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
  Labels: client-impacting, cql3.3, docs-impacting
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-16 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14322549#comment-14322549
 ] 

Sylvain Lebresne commented on CASSANDRA-7970:
-

Sorry for the lack of communication. I'll have a good look at this but that 
might only be next week if that's ok (got a long flight next Sunday so I'll 
review it then). That said a couple of early remarks (that might be somewhat 
off since I haven't checked the patch) based on the comments on this ticket so 
far.

bq. I've made the column name declaration optional with doing INSERT JSON

I'd actually have a preference for not allowing the column name declaration at 
all as it doesn't buy us anything and imo having 2 forms is more confusing than 
anything. Even if we later want to allow both {{VALUES}} and {{JSON}} (which 
I'm actually kind of against but we can argue later since we've at least agreed 
on postoning that option), we can introduce back the names declaration later.

bq. toJson() can only be used in the selection clause of a SELECT statement, 
because it can accept any type and the exact argument type must be known.

Not 100% sure I see where the problem is on this one, at least in theory. Even 
if some of our literals can be of multiple types (typically numeric literals), 
they will always translate to the same thing in JSON anyway so that shouldn't 
be a problem. As for bind markers, we can do what we do for other functions 
when their is an ambiguity and require the user to provide a type-cast. Is it 
just that it's not convenient to do with the current code, or is there 
something more fundamental I'm missing?

bq. fromJson() can only be used in INSERT/UPDATE/DELETE statements because the 
receiving type must be known in order to parse the JSON correctly.

That one I understand, but I'm not sure a per-statement restriction is 
necessary the most appropriate because I suppose there is a problem with 
functions too since we allow overloading (namely, we can have 2 {{foo}} method, 
one taking a {{listtext}} as argument, and the other taking a {{int}}, so 
{{foo(fromJson(z))}} would be problematic). So the most logical way to handle 
this for me would be to generalize slightly the notion of some type that we 
already have due to bind marker.  Typically, both a bind marker type and 
{{fromJson}} return type would be some type, and when the type checker 
encounter one and can't resolve it to a single type, it would reject it asking 
the user to type-cast explicitely. Similarly, {{toJon()}} argument could be 
some type. Again, we already do this for bind markers, it's a just a bit 
adhoc so it would just be a matter of generalizing it a bit. 

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
  Labels: client-impacting, cql3.3, docs-impacting
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-14 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14321365#comment-14321365
 ] 

Robert Stupp commented on CASSANDRA-7970:
-

Latest contents on your branch LGTM

/cc [~slebresne]

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
  Labels: client-impacting, cql3.3
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-13 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319732#comment-14319732
 ] 

Robert Stupp commented on CASSANDRA-7970:
-

bq. JsonTest.testCaseSensitivity()
An INSERT with an upper-case 'K' would be fine IMO

Hm - and the CQL doc is missing ({{doc/cql3/CQL.textile}}). I've started using 
the label {{cql3.3}} for tickets that adds changes to CQL syntax.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
  Labels: client-impacting, cql3.3
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-13 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320441#comment-14320441
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


Okay, I've added a test case and updated the CQL syntax docs on my branch.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
  Labels: client-impacting, cql3.3
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-11 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316308#comment-14316308
 ] 

Robert Stupp commented on CASSANDRA-7970:
-

Beside some last points from my side. I’ve compared your branch against it’s 
origin in trunk, since it does not merge without conflicts against current 
trunk - but that’s fine for a ”quick look”.

* {{ColumnCondition.CollectionBound#valueAppliesTo}}, 
{{Lists.getElementsFromValue}}, {{Sets.getElementsFromValue}} and 
{{Maps.Putter}} do similar things. Any chance to consolidate everything/most 
into a single function/few functions?
* in {{Json#handleCaseSensitivity}} you can remove the two ArrayLists and the 
Map.put/remove sequences when you iterate iterate over {{for (String mapKey : 
new HashSet(valueMap.keySet()))}}
* not sure, if you already have, but a unit test to test all combinations in 
{{Json#handleCaseSensitivity}} would be great
* Can you remove {{Selection#validateSelectors}}? It’s unused.
* Can you extract a method for the two switch-statements at the end of 
{{ParsedInsert#prepareInternal}}?
* In {{SetType.fromJSONObject}} you could add a simple {{if (!buffers.add(…)) 
throw SomeMeaningfulDuplicateElementException}} (and remove the size-check 
after the loop).
* in {{UserType.fromJSONObject}} you can safely replace {{buffers}} with an 
ordinary array

Regarding the JSON library: I’ve created CASSANDRA-8785 (software not 
maintained for many years does feel right).

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-11 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316339#comment-14316339
 ] 

Robert Stupp commented on CASSANDRA-7970:
-

Heh - CASSANDRA-8785 is no longer a problem.

The JSON library used in C* (json-simple 1.1) is quite aged: 1.1 is dated 
08/2009 and the newest version 1.1.1 is dated 03/2012.
That library uses very old Java APIs (e.g. StringBuffer). On [this 
page|http://eclipsesource.com/blogs/2013/04/18/minimal-json-parser-for-java/] 
is some JSON parser performance comparison.
IMO Jackson feels like a nice alternative to json-simple for this one (maybe a 
follow-up ticket) so we could completely get rid of that json-simple.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-11 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14316950#comment-14316950
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


The branch has been updated to address your comments and merge in the latest 
trunk.

bq. not sure, if you already have, but a unit test to test all combinations in 
Json#handleCaseSensitivity would be great

There is a unit test: {{JsonTest.testCaseSensitivity()}}.  I believe that 
covers all combinations; let me know if I've missed something.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-03 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14304148#comment-14304148
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


bq. But I can imagine situations where some values are collected via JSON and 
enriched by the application code \[...\]

That's a fair point.  If users ask for it, I would probably be okay adding it, 
but let's take the wait and see approach for now.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-03 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14304159#comment-14304159
 ] 

Robert Stupp commented on CASSANDRA-7970:
-

bq.  both allows VALUES and JSON ... let's take the wait and see approach for 
now.

Agree

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-03 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14304029#comment-14304029
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


Thanks for the comments, [~snazy].  I've updated the branch with one commit per 
comment.  A few responses:

bq. Can you add a INSERT JSON variant that tackles tuples and types?

I assume you mean a unit test.  If so, that's done.

bq.  Was wondering if we really need to mention the column names since they are 
contained in the JSON

Good point, I've made the column name declaration optional with doing INSERT 
JSON.

bq. The Java Driver tries to contact a coordinator that owns the target 
partition \[...\]

We currently have the same problem if any functions are used for the 
insert/update.  I don't think it's too big of a deal, and I'm not sure that we 
have any good solutions right now.

bq. Maybe we can discuss a follow-up that both allows {{VALUES}} and {{JSON}}

I would personally be -1 on that.  I don't think it's worth making the query 
language more complicated for a networking-related optimization.  Plus, if 
folks are splitting up JSON to do that anyway, they might as well just use 
VALUES all the way.

bq. would be nicer (and eliminate one String instance) to replace 
sb.append(JSONValue.escape(columnName)) with something like 
JSONValue.escape(columnName, sb)

I would like to do that as well, but JSONValue doesn't support anything like 
that.  (It's from the JSON.simple library, not our code.)  We could always 
implement our own escape method, but I'm not sure that's worth it.

bq. ReversedType : not sure whether the impl is correct - doesn’t it need to 
reverse the binary representation?

It doesn't.  The only thing ReversedType changes is the compare() method.  See 
fromString(), for example.

bq. FromJsonFct + ToJsonFct : would be nicer to have a CHM instead of 
synchronized HashMap.

Done.  We should probably also do this for the CollectionType subclasses.

bq. Also these maps in these classes have AbstractType as the key and are never 
evicted - means: a changed user type would remain forever in these maps and if 
a UDT

Hmm, that's a good point.  It's a pretty minor concern, so I would be in favor 
of a follow-up ticket (especially considering possible AbstractType changes).

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-03 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14304072#comment-14304072
 ] 

Robert Stupp commented on CASSANDRA-7970:
-

Heh - yes, I meant a unit test.

bq. follow-up that both allows VALUES and JSON

Understand your point. Sure - it would be too much syntactic foo just for a 
partition key. But I can imagine situations where some values are collected via 
JSON and enriched by the application code. Maybe some IoT data from sensors 
that can be passed to the table as is in JSON plus some enrichment done by 
the application (e.g. adding location of the sensors). For example sensors that 
emit {{wind_direction}} and {{wind_speed}} in JSON format - the collecting 
application could enrich it with {{sample_timestamp}} and {{gps_coordinates}} 
using conventional {{VALUES}} syntax.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-02-01 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300120#comment-14300120
 ] 

Robert Stupp commented on CASSANDRA-7970:
-

I took a short look at the patch and played around a little bit with this.

Some comments:
* A test with null values in {{testFromJsonFct()}} and {{testToJsonFct()}} 
would be nice (toJson handles null correctly, fromJson throws an NPE for 
{{INSERT INTO tojson (k, asciival ) VALUES ( 0, fromJson(null) );}})
* Can you add a {{INSERT JSON}} variant that tackles tuples and types?
* I tried {{insert into some_table (k, asciival ) JSON \{k: 0, asciival: 
foobar\};}} in cqlsh - it complains with a syntax error (obviously my bad). 
Was wondering if we really need to mention the column names since they are 
contained in the JSON. Couldn't a {{insert into some_table JSON ?;}} with json 
{{\{k: 0, asciival: foobar\}}} also work?
* The Java Driver tries to contact a coordinator that owns the target 
partition. This gets complicated with {{INSERT JSON}} since the driver would 
have to parse the JSON before it actually knows the correct node. Maybe we 
can discuss a follow-up that both allows {{VALUES}} and {{JSON}} - e.g. 
{{INSERT INTO some_table (partitionKey, clusteringKey) VALUES (1, 2) JSON ?}}, 
where the {{JSON ?}} part contains more columns.
* missing license header in {{Json.java}}
* In {{FunctionCall}} you exchanged „fun“ with „fun.name()“ eliminating the 
function signature in the exception message
* {{Selection.rowToJson}} : use Collections.singleton instead of 
{{Arrays.asList}}
* {{Selection.rowToJson}} : would be nicer (and eliminate one String instance) 
to replace sb.append(JSONValue.escape(columnName)) with something like 
JSONValue.escape(columnName, sb); (make escape write to sb)
* Similar for {{AbstractType.toJSONString}} - let the implementation write to 
the string builder
* {{ReversedType}} : not sure whether the impl is correct - doesn’t it need to 
reverse the binary representation?
* {{FromJsonFct}} + {{ToJsonFct}} : would be nicer to have a CHM instead of 
synchronized HashMap. Also these maps in these classes have {{AbstractType}} as 
the key and are never evicted - means: a changed user type would remain forever 
in these maps and if a UDT. This might get irrelevant when {{AbstractType}} is 
removed - so not sure, whether we should fix this now - maybe open a 
follow-up-ticket for this? A simple fix would be to 'statically cache' function 
instances for primitive types but always create a new instance of these 
functions for tuples + UDTs ; then you could pre-create the function instances 
completely eliminating the need for CHM or synchronized.

Altogether: really nice work :)


 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0

 Attachments: 7970-trunk-v1.txt


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-01-20 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284484#comment-14284484
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


[~lhartzman] that's a fair point.  We had similar debates about supporting 
other (Cassandra) types that JSON doesn't support natively, like UUIDs.  The 
decision was that if we didn't support those types, users would end up using 
strings instead.  The downside to this is a lack of type safety, validation, 
and correct sorting.  Additionally, functions become incompatible and drivers 
(including cqlsh) no longer have proper type information.

I think the same argument can be applied here.  Users will want to use maps 
keyed by ints, uuids, frozen sets, etc.  If we don't support that, they'll 
simply declare the keys to be strings instead, and the same problems around 
losing type information will come up.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-01-07 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267409#comment-14267409
 ] 

Sylvain Lebresne commented on CASSANDRA-7970:
-

I think it might be worth clarifying what we're talking about here because it 
appears I don't have the same understanding as some of you guys.

I'm not at all in favor of making CQL weakly typed. We've remove the accept 
string for everything in CQL early on and I'm still convinced it was a good 
thing.

My understanding of Tyler's suggestion is that it's only related to JSON 
support. It's to say that if we have the table:
{noformat}
CREATE TABLE foo (
   c1 int PRIMARY KEY,
   c2 float,
   c3 mapint, timestamp,
)
{noformat}
then we'll agree to map to it a json literal like
{noformat}
{
  'c1' : '3',
  'c2' : '4.2',
  'c3' : { '4', '2011-02-03 04:05'}
}
{noformat}
and this even though the JSON uses strings everywhere. And I think it's ok to 
do this because JSON is not really a typed thing in the first place: it has 
different kind of literals, but it's not typed per se, and it doesn't support 
all the literals that CQL supports anyway (typically uuid literals, which is 
why we will have to at least accept string for uuids). But I think it's ok to 
do this for the JSON mapping (which again, I just see as considering JSON as 
untyped/weakly typed) without going as far as making CQL itself weakly typed. 
But if we disagree on that part, and you guys think it would be horribly 
inconsistent to accept that kind of thing in the JSON translation without 
weakening the CQL typing, then count me as a *strong* -1 on the whole thing.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-01-07 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267412#comment-14267412
 ] 

Sylvain Lebresne commented on CASSANDRA-7970:
-

Or to clarify, if Tyler's everywhere means everywhere in CQL, then I'm a 
strong -1. I had understood it as systematically in the JSON translation, 
which I'm fine with. If I misunderstood, my bad.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-01-07 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267422#comment-14267422
 ] 

Robert Stupp commented on CASSANDRA-7970:
-

-1 on accepting strings for anything _outside_ of JSON *map keys* (object 
names).

So don't allow it on JSON object values - just on [JSON object 
names|http://www.json.org/]
{noformat}
{
  'c1' : 3,
  'c2' : 4.2,
  'c3' : { '4', '2011-02-03 04:05'}
}
{noformat}


 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-01-07 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267611#comment-14267611
 ] 

Aleksey Yeschenko commented on CASSANDRA-7970:
--

I'm sure Tyler's 'everywhere' meant 'everywhere in JSON', in which case I'm 
okay with it - we already have to accept it b/c a JSON key can only be a 
string, and we'll need to do type-casting there anyway.

The alternative would be to limit tojson/fromjson to the subset of schemas w/ 
types strictly matching to JSON literals' types - when keys are strings, and 
values are strings/numbers/booleans/lists/UDTs. I will not fight for that 
option anymore, though.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-01-07 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268341#comment-14268341
 ] 

Robert Stupp commented on CASSANDRA-7970:
-

[~thobbs] I was a bit confused about the discussion because I thought there 
were plans to allow string encoded numbers everywhere in CQL. Guess, now I got 
it :)

TLDR -  I'm okay with accepting numbers in strings in JSON:
I think that we should only use string representation where the JSON spec gives 
us no other option (that's on 'object names' / 'map keys'). OK, it's a 
different thing whether we _accept_ numbers in strings or whether we _produce_. 
IMO we should produce numbers (and not numbers as strings) for JSON object 
values. Unfortunately I've no better argument than follow the spec. But OTOH 
it does not matter whether number parsing in antlr fails or whether a 
Float.parseFloat fails and JSON spec does not force us to not encode numbers 
in strings.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-01-07 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267969#comment-14267969
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


Yes, by everywhere, I meant everywhere in JSON.  It sounds like everybody 
is roughly in agreement with accepting string representations of ints and 
floats anywhere in a JSON string (which I think is reasonable, given that we 
accept string representations of other types within JSON anyway).

The only object is [~snazy].  Can you expand on why you think it's a bad idea?

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-01-07 Thread Les Hartzman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267990#comment-14267990
 ] 

Les Hartzman commented on CASSANDRA-7970:
-

Reading through the proposals, I can understand the idea of providing some 
flexibility. But I think that it would be wrong to allow for numeric values to 
be implicitly converted to strings. It seems that this came up from the idea of 
converting a mapint, text structure into JSON. But what was the purpose of 
this enhancement to begin with? It seemed to me that the purpose of this was to 
allow for the storage and retrieval of JSON in a columnar format as opposed to 
a CLOB. If the focus remains solely on the reading and writing of 'normal' 
JSON, there is no issue.

I think that this issue has expanded beyond what was intended:  it is to allow 
a convenient way to easily turn a JSON document from a service or a user into a 
CQL row, with all the validation that entails.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-01-06 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266973#comment-14266973
 ] 

Aleksey Yeschenko commented on CASSANDRA-7970:
--

Also, sqlite.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-01-06 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14266964#comment-14266964
 ] 

Jonathan Ellis commented on CASSANDRA-7970:
---

I could go either way.  On the one hand, my experience with weak typing (js, 
perl) leads me to think it is a misfeature, so we should limit its scope to the 
json context where the limited spec forces us to do so.  But I'm not sure if 
that objection really applies to a non-procedural language like SQL.  Are there 
other weakly typed SQL dialects for comparison?



 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-01-06 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267037#comment-14267037
 ] 

Jonathan Ellis commented on CASSANDRA-7970:
---

I'm a firm -1 on going as far as sqlite:

{code}
sqlite select 'foo' + 0;
0
{code}

But only accepting strings if they are reasonable number-ish representations 
seems at least *more* reasonable. (I *think* this is the mysql approach.)

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-01-06 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267160#comment-14267160
 ] 

Jeremiah Jordan commented on CASSANDRA-7970:


The idea of allowing this in a prepared statement seems very wrong to me, as 
long as we are thinking this only happens for unprepared statements, and we 
don't do best guess at the formatting. Abc 123 and 123 abc both get 
rejected out right, I guess I can see it being useful. Though I really think 
doing the str to int client side is the right way to go.  I can't really see 
why you need it outside of json.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2015-01-05 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14264342#comment-14264342
 ] 

Sylvain Lebresne commented on CASSANDRA-7970:
-

bq.  are you okay with accepting strings in place of ints/floats/etc everywhere 
(for the sake of consistency)?

I'm no Jonathan, but it would make sense to me to do that.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2014-12-31 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14262451#comment-14262451
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


{quote}
I think we should apply Postel's principle here. If we have a mapint, text 
and someone gives us
{'1': 'foo'}

we should turn it into an integer rather than rejecting it. Especially since, 
as you point out, there's really no other option absent transform functions 
that we're not implementing initially.
{quote}

[~jbellis] to check, if we do that, are you okay with accepting strings in 
place of ints/floats/etc everywhere (for the sake of consistency)?

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2014-11-06 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14200974#comment-14200974
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


There's one problem that I forgot about: the JSON spec only accepts strings for 
map keys.

I think that leaves us with two options:
# Accept a string version of anything (including ints, floats, etc).  For 
example, if we're expecting a float when decoding JSON and the JSON decoder 
returns a String, we should try to create a float from it.  For {{toJson()}}, 
we would convert non-strings to strings when they're map keys.
# Don't allow {{toJson()}} and {{fromJson()}} to be used on maps with 
non-string keys.  If we have transform functions at some point, those could be 
used to support these kinds of maps.

Option 2 would be my preference.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2014-11-06 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14200982#comment-14200982
 ] 

Aleksey Yeschenko commented on CASSANDRA-7970:
--

bq. Option 2 would be my preference.

Likewise.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2014-11-06 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14200981#comment-14200981
 ] 

Joshua McKenzie commented on CASSANDRA-7970:


Option 2 seems consistent from a JSON ecosystem perspective as well.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2014-11-06 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14200992#comment-14200992
 ] 

Jonathan Ellis commented on CASSANDRA-7970:
---

I think we should apply Postel's principle here.  If we have a mapint, text 
and someone gives us {'1': 'foo'} we should turn it into an integer rather than 
rejecting it.  Especially since, as you point out, there's really no other 
option absent transform functions that we're not implementing initially.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2014-10-10 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14166841#comment-14166841
 ] 

Aleksey Yeschenko commented on CASSANDRA-7970:
--

+1 as well

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2014-10-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165276#comment-14165276
 ] 

Sylvain Lebresne commented on CASSANDRA-7970:
-

I'll note that there is imo 2 slightly separate case to handle. Not only do we 
want to translate full JSON objects to CQL rows and vice-versa as in the 
example of the description. But we should also allow to convert CQL values to 
JSON and vice-versa, because well, the row to/from json will really just be a 
little extension of the cql value to/from json. Concretely, we'd want something 
along the lines of:
{noformat}
UPDATE users SET addresses = addresses + fromJson($${work : { ... }}$$) WHERE 
id = ...
{noformat}
and I don't see why we wouldn't allow stuffs like:
{noformat}
SELECT id, toJson(addresses) FROM users;
{noformat}
since again, we will have to implement such functions internally. For those 
cases, I do think the functional syntax work better than a specific syntax 
because well, those are really plain old functions.

That said, I'd be personally be fine with adding toJson/fromJson functions for 
the conversion of CQL value to and from json, but having specific syntax like 
the one in the description to apply the conversion to full rows (I do agree 
that the functional syntax is somewhat inconsistent with normal functions 
when it comes to applying it to row).


 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2014-10-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14166189#comment-14166189
 ] 

Jonathan Ellis commented on CASSANDRA-7970:
---

bq.  I'd be personally be fine with adding toJson/fromJson functions for the 
conversion of CQL value to and from json, but having specific syntax like the 
one in the description to apply the conversion to full rows

+1

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2014-10-02 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14157097#comment-14157097
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


We had some discussion about implementing this as a function instead of a 
syntax extension, I would like to document my thoughts on the two choices.

h3. Syntax Approach

Inserts:
{{INSERT INTO users (id, name, address) JSON $$\{'id': ...\}$$;}}

Selects:
{{SELECT JSON id, address FROM users;}}
or
{{SELECT id, address FROM users WITH JSON FORMATTING;}}

Possible future enhancements such as default values and transformation 
functions can be handled with syntax:

* {{INSERT INTO users JSON ? WITH JSON DEFAULTS = \{field: default\}}}
* {{INSERT INTO users JSON ? WITH JSON TRANSFORMS = \{field: fn\}}}

Problems:
* With SELECT, the number and type of selected columns changes, which is 
somewhat quirky.
* It's not possible to select both normal columns and a JSON document that 
contains some of the columns.
* Functions cannot be applied to the JSON document.  (For example, 
{{concat()}}, if we consider aggregation functions.)

h3. Functional Approach

Inserting:
{{INSERT INTO users (id, name, address) VALUES jsonToRow($${'id': ...$$);}}

Selecting:
{{SELECT rowToJson(id, address) FROM users;}}

Possible future enhancements such as default values and transformation 
functions can be handled with arguments:

{{INSERT INTO users VALUES jsonToRow(..., \{field: default\}, \{field: 
fn\})}}

Problems:
* When selecting, field names could get strange with function calls and 
subfields (e.g. {{SELECT rowToJson(k, sin(v)) ...}}). We might eventually need 
to offer a way to change field names.  This could be handled with another 
optional argument or perhaps {{AS}} syntax inside the function call.


Although I think the syntax option is somewhat cleaner, the function approach 
is more powerful, doesn't require expanding the language, and is probably a bit 
more consistent with the rest of CQL. For those reasons, I think we should go 
with json functions.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2014-10-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14157218#comment-14157218
 ] 

Jonathan Ellis commented on CASSANDRA-7970:
---

bq. With SELECT, the number and type of selected columns changes, which is 
somewhat quirky.

I guess I don't see how this is more quirky than number and type of selected 
columns changing in a non-json select.

bq. It's not possible to select both normal columns and a JSON document that 
contains some of the columns.

True, but that seems like a reasonable constraint for the design goals here.

bq. Functions cannot be applied to the JSON document. (For example, concat(), 
if we consider aggregation functions.)

Also a reasonable constraint for the design goals.

Here's my problem with the row_to_json approach: it inherently requires special 
casing the function machinery, which is Bad.  

What I mean by this is, all other function calls should have the same behavior 
whether called on a literal or a column with the same value.  Thus, sin(2) and 
sin(a) where a=2 are semantically equivalent.  But row_to_json(a) has to cheat 
and generate {code}{'a': 2}{code}.

(Unless you are saying that row_to_json(a) would generate just the literal 2, 
in which case I have to say that having to spell out all the json fields is 
insufficiently usable to be a viable candidate.)

Postgresql gets around this with subselects and the concept of a row 
constructor: 
http://hashrocket.com/blog/posts/faster-json-generation-with-postgresql

I'm okay with the postgresql approach in a lot of respects, but it introduces a 
ton of complexity that I don't really want to block for.  Nor am I sure that we 
want to open up the subquery can of worms for this, which will inevitably lead 
to people pushing for them to be generalized.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2014-10-02 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14157224#comment-14157224
 ] 

Jonathan Ellis commented on CASSANDRA-7970:
---

One nit on the special syntax: since it's already special, we wouldn't need 
to require string delimiters on the json literal.  Could just be implicit.

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7970) JSON support for CQL

2014-10-02 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14157292#comment-14157292
 ] 

Tyler Hobbs commented on CASSANDRA-7970:


bq. I guess I don't see how this is more quirky than number and type of 
selected columns changing in a non-json select.

We don't normally do that in a non-json select, though.  (The one exception I 
can think of is selecting duplicate columns, like {{SELECT a, b, a FROM}}.)  
With {{SELECT JSON a, b, c}}, you get back one column.  I'm not saying this is 
terrible behavior, but it's definitely an exception to the way things normally 
work.

bq. What I mean by this is, all other function calls should have the same 
behavior whether called on a literal or a column with the same value. Thus, 
sin(2) and sin(a) where a=2 are semantically equivalent. But row_to_json(a) has 
to cheat and generate...

Currently we don't allow selecting constants or function calls, so technically 
that's not an issue yet, but I guess it's only a matter of time until we do 
support that.  {{rowToJson()}} would use the natural identifiers of selections 
as field names for where possible (columns, subfields, etc) and use the 
constant or function call as the field name otherwise.  So for example, 
{{SELECT rowToJson(2, now(), a) FROM ...}} would return \{'2': 2, 'now()': 
1234567, 'a': 'foo'\}.  This could be overridden with additional arguments or 
{{AS}} syntax.

bq. (Unless you are saying that row_to_json(a) would generate just the literal 
2, in which case I have to say that having to spell out all the json fields is 
insufficiently usable to be a viable candidate.)

Nope, agreed.

bq. Postgresql gets around this with subselects and the concept of a row 
constructor

I agree that subselects are not something we want to introduce.  Row 
constructors might internally simplify some of the typing issues, but from an 
API perspective they're roughly equivalent to my proposal (unless I'm missing 
some advantage).

 JSON support for CQL
 

 Key: CASSANDRA-7970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970
 Project: Cassandra
  Issue Type: Bug
  Components: API
Reporter: Jonathan Ellis
Assignee: Tyler Hobbs
 Fix For: 3.0


 JSON is popular enough that not supporting it is becoming a competitive 
 weakness.  We can add JSON support in a way that is compatible with our 
 performance goals by *mapping* JSON to an existing schema: one JSON documents 
 maps to one CQL row.
 Thus, it is NOT a goal to support schemaless documents, which is a misfeature 
 [1] [2] [3].  Rather, it is to allow a convenient way to easily turn a JSON 
 document from a service or a user into a CQL row, with all the validation 
 that entails.
 Since we are not looking to support schemaless documents, we will not be 
 adding a JSON data type (CASSANDRA-6833) a la postgresql.  Rather, we will 
 map the JSON to UDT, collections, and primitive CQL types.
 Here's how this might look:
 {code}
 CREATE TYPE address (
   street text,
   city text,
   zip_code int,
   phones settext
 );
 CREATE TABLE users (
   id uuid PRIMARY KEY,
   name text,
   addresses maptext, address
 );
 INSERT INTO users JSON
 {‘id’: 4b856557-7153,
‘name’: ‘jbellis’,
‘address’: {“home”: {“street”: “123 Cassandra Dr”,
 “city”: “Austin”,
 “zip_code”: 78747,
 “phones”: [2101234567]}}};
 SELECT JSON id, address FROM users;
 {code}
 (We would also want to_json and from_json functions to allow mapping a single 
 column's worth of data.  These would not require extra syntax.)
 [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/
 [2] https://blog.compose.io/schema-less-is-usually-a-lie/
 [3] http://dl.acm.org/citation.cfm?id=2481247



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)