I'm actually doing almost the same thing. I serialize my objects into
byte[] using Jackson's SMILE format, compress the result with Snappy, then store
the byte[] in Cassandra. I created a simple Cassandra Type for this
but I hit a wall with cassandra-cli:
Sounds awesome Drew. Mind sharing your custom type? I just wrote a basic
JSON type and did the validation the same way you did, but I don't have any
SMILE support yet. It seems that if your type were committed to the
Cassandra codebase then the issue you ran into of the CLI only supporting
Is there a reason you would prefer a JSONType over CASSANDRA-3647? It
would seem the only thing a JSON type offers you is validation. 3647 takes
it much further by deconstructing a JSON document using composite columns
to flatten the document out, with the ability to access and update portions
Could you explain further how I would use CASSANDRA-3647? There's still
very little documentation on composite columns and it was not clear to me
whether they could be used to store document oriented data. Say for
example that I had a document like:
user: {
firstName: 'ben',
skills:
Ben,
You can create a materialized path for each field in the document:
{
[user, firstName]: ben,
[user, skills, TimeUUID]: java,
[user, skills, TimeUUID]: javascript,
[user, skills, TimeUUID]: html,
[user, education, school]: cmu,
[user, education, major]: computer science
}
This way each
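A minimal sketch of the flattening described above, in Python. The sample document is reconstructed from the paths listed in the message (the original document is cut off), `flatten` is a hypothetical helper, and `uuid.uuid1()` is used as a stand-in for a TimeUUID component. It only builds the column names in memory and does not talk to Cassandra.

```python
import uuid

def flatten(doc, prefix=()):
    """Yield (path, value) pairs, one per leaf field of a nested document."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from flatten(value, prefix + (key,))
    elif isinstance(doc, list):
        for item in doc:
            # uuid1() is a time-based (version 1) UUID, standing in for the
            # TimeUUID component of the composite column name.
            yield from flatten(item, prefix + (uuid.uuid1(),))
    else:
        yield prefix, doc

# Assumed document shape, reconstructed from the listed paths.
document = {
    "user": {
        "firstName": "ben",
        "skills": ["java", "javascript", "html"],
        "education": {"school": "cmu", "major": "computer science"},
    }
}

columns = dict(flatten(document))
for path, value in columns.items():
    print(path, "->", value)
```

Each tuple plays the role of one composite column name, so a single field can be read or rewritten without touching the rest of the document.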
It's not clear what 3647 actually is; there is no code attached, and no real
example in it.
Aside from that, the reason this would be useful to me (if we could get
indexing of attributes working) is that I already have my data in
JSON/Thrift/Protobuf; depending on how large the data is, it isn't
Would there be interest in adding a JsonType?
What about checking that data inserted into a JsonType is valid JSON? How
would you do it, and would the overhead be something we are concerned
about, especially if the JSON string is large?
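One way to get a feel for the validation question is to treat a full parse as the validity check and time it. The sketch below does that in Python with the standard `json` module. `is_valid_json` is a hypothetical helper, the synthetic document is arbitrary, and the timing says nothing about what a Java-side validator inside Cassandra would cost.

```python
import json
import time

def is_valid_json(text):
    """Return True if text parses as JSON; the full parse is the validation."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False

# Synthetic large document; its size and shape are arbitrary assumptions.
large = json.dumps({"items": [{"id": i, "name": "x" * 20} for i in range(100_000)]})

start = time.perf_counter()
ok = is_valid_json(large)
elapsed = time.perf_counter() - start
print(f"valid={ok} chars={len(large)} elapsed={elapsed:.3f}s")
```

The cost grows with document size because every byte has to be parsed, which is the overhead the question is asking about.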
Creating materialized paths may well be a possible solution. If that were
the solution the community were to agree upon then I would like it to be a
standardized and well-documented best practice. I asked how to store a
list of values on the user
The issue with these super complex types is that to do anything useful with
them you would need either scanners or coprocessors. As it stands
right now, complex data like JSON is fairly opaque to Cassandra.
Getting Cassandra to natively speak protobufs or whatever flavor of
the week serialization
On Thu, Mar 29, 2012 at 9:57 AM, Jeremiah Jordan
jeremiah.jor...@morningstar.com wrote:
It's not clear what 3647 actually is, there is no code attached, and no real
example in it.
Aside from that, the reason this would be useful to me (if we could get
indexing of attributes working), is that
But it isn't special case logic. The current AbstractType and the indexing of
AbstractTypes would, for the most part, already support this. Someone just has
to write the code for JSONType or ProtoBuffType.
The problem isn't writing the code to break objects up, the problem is
encode/decode time.
Hi Ben,
Sure, there's nothing really to it, but I'll email it to you. As for why I'm
using Snappy on the type instead of sstable_compression: when you set
sstable_compression, the compression happens on the Cassandra nodes, and I see
two advantages with my approach:
1. Saving extra
I agree with Edward here, the simpler we keep the core the better. I think all
the ser/deser and conversions should happen on the client side.
-- Drew
On Mar 29, 2012, at 8:36 AM, Edward Capriolo wrote:
The issue with these super complex types is to do anything useful with
them you would
I think this is a much better approach because that gives you the
ability to update or retrieve just parts of objects efficiently,
rather than making column values just blobs with a bunch of special
case logic to introspect them. Which feels like a big step backwards
to me.
Unless your
On Thu, Mar 29, 2012 at 1:11 PM, Drew Kutcharian d...@venarc.com wrote:
I think this is a much better approach because that gives you the
ability to update or retrieve just parts of objects efficiently,
rather than making column values just blobs with a bunch of special
case logic to
Yes, I meant the row header index. What I'm doing is storing an
object (e.g. UserProfile) that is read or written as a whole (a user updates
their user details in a single page in the UI). So I serialize that object into
binary JSON using the SMILE format. I then compress it using
Jonathan, I asked Brian about his REST API
(https://groups.google.com/forum/?fromgroups#!topic/virgil-users/oncBas9C8Us)
and he said he does not take the JSON objects and split them because the
client libraries do not agree on implementations. This was exactly my
concern as well with this solution.
On Thu, Mar 29, 2012 at 2:06 PM, Ben McCann b...@benmccann.com wrote:
As far as I can tell, Cassandra
doesn't support maps and lists in a standardized way today, which is the
root of my problem.
I'm pretty serious about adding those for 1.2, for what that's worth.
(If you want to jump in and
Jonathan,
I was actually going to take this up with Nate McCall a few weeks back. I
think it might make sense to get the client development community together
(Netflix w/ Astyanax, Hector, Pycassa, Virgil, etc.)
I agree whole-heartedly that it shouldn't go into the database for all the
reasons
Thanks Jonathan. The only reason I suggested JSON was because it already
has support for lists. Native support for lists in Cassandra would more
than satisfy me. Are there any existing proposals or a bug I can follow?
I'm not familiar with the Cassandra codebase, so I'm not entirely sure how
Jonathan,
We store JSON as our column values. I'd love to see support for maps and
lists. If I get some time this weekend, I'll take a look to see what is
required. It doesn't seem like it would be that hard.
-brian
Brian O'Neill
Lead Architect, Software Development
Health Market
I kind of hijacked
https://issues.apache.org/jira/browse/CASSANDRA-3647 (Sylvain
suggests we start with (non-nested) lists, maps, and sets. I agree
that this is a great 80/20 approach to the problem) but we could
split it out to another ticket.
On Thu, Mar 29, 2012 at 2:24 PM, Ben McCann
Cool. How were you thinking we should store the data? As a standardized
composite column (e.g. potentially a list as [fieldName, TimeUUID]:
fieldValue and a set as [fieldName, fieldValue]:)? Or as a new
column type?
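The two composite layouts floated above can be sketched as follows. This is only an in-memory illustration in Python. `list_columns`, `set_columns`, and `read_list` are hypothetical names, `uuid.uuid1()` stands in for a TimeUUID, and none of this reflects how Cassandra itself ended up implementing collections.

```python
import uuid

def list_columns(field_name, values):
    """List layout: [fieldName, TimeUUID] -> fieldValue, one column per element."""
    return {(field_name, uuid.uuid1()): value for value in values}

def set_columns(field_name, values):
    """Set layout: [fieldName, fieldValue] -> empty value; the name holds the data."""
    return {(field_name, value): b"" for value in values}

def read_list(columns, field_name):
    """Rebuild a list by ordering its columns on the UUID's embedded timestamp."""
    items = [(name[1].time, value)
             for name, value in columns.items() if name[0] == field_name]
    return [value for _, value in sorted(items)]

skills_list = list_columns("skills", ["java", "javascript", "html"])
skills_set = set_columns("skills", ["java", "java", "html"])

print(read_list(skills_list, "skills"))
print(sorted(name[1] for name in skills_set))
```

In the list layout duplicates are allowed and order comes from the time component. In the set layout a repeated value maps to the same column name, so it is stored once.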
On Thu, Mar 29, 2012 at 12:35 PM, Jonathan Ellis jbel...@gmail.com wrote: