Re: Storing values of mixed types in a list

2014-06-25 Thread Tuukka Mustonen
Unfortunately, I need to query by list items. That's why I'm running
Cassandra 2.1rc1 (it offers secondary indexes on collections).

I'm also studying Dynamo; it seems to be somewhat more dynamic by nature
and allows mixed-type lists. As I understand it, Cassandra also supports
dynamic schemas, but only through the Thrift protocol. Also, I don't think
that changes the fact that collections need to be strongly typed in
Cassandra, no matter which protocol is used?

Tuukka



On Tue, Jun 24, 2014 at 9:41 PM, DuyHai Doan doanduy...@gmail.com wrote:

 "Jeremy, with blob field (ByteBuffer), I can query exact matches (just
 encode the value in query), but greater/less than queries would not work.
 Any sort of serialization kills native ways to query data"

 Not necessarily. You still use normal types (uuid, string, timestamp, ...) for
 clustering columns and use them for querying. For the cells where you store
 values, use the blob type.




 On Tue, Jun 24, 2014 at 8:21 PM, Tuukka Mustonen 
 tuukka.musto...@gmail.com wrote:

 What if I need to query by list items?

 1. Jeremy, with a blob field (ByteBuffer), I can query exact matches (just
 encode the value in the query), but greater/less than queries would not work.
 Any sort of serialization kills the native ways to query data.
 2. Even with user-defined types, I would need to define separate fields
 for each value. Running queries would be cumbersome (something like WHERE
 items CONTAINS {'text_value': 'foobar'} or WHERE items CONTAINS
 {'int_value': 3}). Pavel, is this what you meant?

 I'm running 2.1rc1 with python driver 2.0.2.

 Tuukka


 On Tue, Jun 24, 2014 at 4:39 PM, Pavel Kogan pavel.ko...@cortica.com
 wrote:

 1) You can use a list of strings that hold serialized JSON, or use a
 ByteBuffer with your own serialization, as Jeremy suggested.
 2) Use Cassandra 2.1 (not officially released yet), where there is a new
 feature: user-defined types.

 Pavel




 On Tue, Jun 24, 2014 at 9:18 AM, Jeremy Jongsma jer...@barchart.com
 wrote:

 Use a ByteBuffer value type with your own serialization (we use
 protobuf for complex value structures)
  On Jun 24, 2014 5:30 AM, Tuukka Mustonen tuukka.musto...@gmail.com
 wrote:

 Hello,

 I need to store a list of mixed types in Cassandra. The list may
 contain numbers, strings and booleans. So I would need something like
 list<?>.

 Is this possible in Cassandra and if not, what workaround would you
 suggest for storing a list of mixed type items? I sketched a few (using a
 list per type, using list of user types in Cassandra 2.1, etc.), but I get
 a bad feeling about each.

 Couldn't find an exact answer to this through searches...
 Regards,
 Tuukka

 P.S. I first asked this at SO before realizing the traffic there is
 very low:
 http://stackoverflow.com/questions/24380158/storing-a-list-of-mixed-types-in-cassandra







Re: Storing values of mixed types in a list

2014-06-25 Thread Sylvain Lebresne
On Wed, Jun 25, 2014 at 8:49 AM, Tuukka Mustonen tuukka.musto...@gmail.com
wrote:

 Unfortunately, I need to query per list items. That's why I'm running
 Cassandra 2.1rc1 (offers secondary indexes for collections).


Using a list of blobs does not in any way prevent you from doing that.
Types are constraints on what values C* will accept and using blob is
simply asking C* to not reject any value. Doing so does not in any way
limit the kind of queries you can do.
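
A minimal sketch of what the client-side encoding could look like, assuming Python 3 and a made-up one-byte type tag in front of each value. The tags and the function names encode_item / decode_item are hypothetical; they do not come from this thread or from any driver.

```python
import struct

# Hypothetical one-byte type tags; any stable, distinct values would do.
TAG_BOOL, TAG_INT, TAG_FLOAT, TAG_TEXT = b"b", b"i", b"f", b"t"


def encode_item(value):
    """Serialize a bool, int, float or str into tagged bytes."""
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return TAG_BOOL + (b"\x01" if value else b"\x00")
    if isinstance(value, int):
        return TAG_INT + struct.pack(">q", value)  # 64-bit signed, big-endian
    if isinstance(value, float):
        return TAG_FLOAT + struct.pack(">d", value)  # 64-bit IEEE 754 double
    if isinstance(value, str):
        return TAG_TEXT + value.encode("utf-8")
    raise TypeError("unsupported item type: %r" % type(value))


def decode_item(data):
    """Inverse of encode_item: read the tag, then decode the payload."""
    tag, payload = data[:1], data[1:]
    if tag == TAG_BOOL:
        return payload == b"\x01"
    if tag == TAG_INT:
        return struct.unpack(">q", payload)[0]
    if tag == TAG_FLOAT:
        return struct.unpack(">d", payload)[0]
    if tag == TAG_TEXT:
        return payload.decode("utf-8")
    raise ValueError("unknown type tag: %r" % tag)


items = [3, "foobar", True, 2.5]
encoded = [encode_item(v) for v in items]   # what a list of blobs would hold
decoded = [decode_item(b) for b in encoded]
```

Because the encoding is deterministic, equal values always produce equal bytes, which is what an exact-match lookup needs: the client encodes the search value the same way before putting it in the query.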

The small downside of using blobs is that you'll have to
serialize/deserialize your value manually client-side, but that's not a
huge deal either. That said, if you really only have 3 types of values to
store and if you don't particularly care about the order of items in the
collection (i.e. if you said you want a list but could really do with a
set), then storing 3 different sets can be a viable solution too (as in,
there is no strong downside to doing it as far as C* is concerned and it
may be simpler to deal with client side (or not, it depends a bit on what
your client side code does exactly)).
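
The three-sets alternative could be prepared on the client roughly as follows. This is only a sketch in Python 3; the names split_by_type, "numbers", "texts" and "booleans" are hypothetical stand-ins for whatever three set columns the table would actually have.

```python
def split_by_type(items):
    """Partition a mixed list into one set per supported type."""
    numbers, texts, booleans = set(), set(), set()
    for value in items:
        # bool is checked first: in Python, True and False are also ints.
        if isinstance(value, bool):
            booleans.add(value)
        elif isinstance(value, (int, float)):
            numbers.add(value)
        elif isinstance(value, str):
            texts.add(value)
        else:
            raise TypeError("unsupported item type: %r" % type(value))
    return {"numbers": numbers, "texts": texts, "booleans": booleans}


columns = split_by_type([3, "foobar", True, 2.5, "foobar"])
```

Order and duplicates are lost here, which is exactly the trade-off described above: this only fits if a set would have done in the first place.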



 As I understood it, also Cassandra supports dynamic schemas, but only
 through Thrift protocol.


"Dynamic schemas" is a terribly imprecise term that means different things
to different people, but in general that statement is incorrect: you can do
the same things with CQL and with Thrift.


 Also, I don't think it changes the fact that collections need to be
 strongly-typed in Cassandra, no matter what protocol is used?


Well, yes, since you do have to provide a type for the elements in the
collection, but as said previously, that does not in any way prevent you from
having collections of anything, since you can use a blob type.

--
Sylvain











Re: Storing values of mixed types in a list

2014-06-25 Thread Tuukka Mustonen
Sorry for the confusion, I should have outlined my requirements better in the
first place. Let me try to summarize:

- I can use list<blob> and query against it using secondary indexes and by
encoding my data on the client side. However, *this only allows exact
matches, not greater/less than*, for numbers at least (not sure I need that,
but maybe). Please correct me if I got it wrong. I'm not very familiar with
working with binary data.
- My supported list of types is indeed very limited, and the order doesn't
matter, so I could use a separate list for each type. However, that makes
working with the data somewhat cumbersome, and I would then need multiple
clauses in queries, one for each type.
- I could use user-defined types, but I would still have to define a separate
field for each value, and queries would again be cumbersome.
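
To illustrate the first point: a small Python 3 sketch of why a straightforward binary encoding supports exact matches but not numeric ordering. It assumes the encoded values would be compared byte by byte, and the helper encode_int is hypothetical, chosen only for this example.

```python
import struct


def encode_int(n):
    """Hypothetical encoding: 64-bit signed integer, big-endian."""
    return struct.pack(">q", n)


values = [-1, 1, 256]

# Ordering by numeric value versus ordering by the encoded bytes.
by_value = sorted(values)                  # [-1, 1, 256]
by_bytes = sorted(values, key=encode_int)  # [1, 256, -1]

# Exact matches still work: the same number always encodes to the same bytes.
same = encode_int(256) == encode_int(256)
```

The negative number sorts last by bytes because its two's-complement form starts with 0xff, so a byte-wise "greater than" would not agree with the numeric one.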

Let's forget about dynamic schemas, as I'm a Cassandra newbie and
definitely need to study more before opening that chest of wonders.
Thanks for correcting me.

I just wish there was an easy way to define a list as list<?> and to run
queries against it. But it sounds like there isn't (and nobody sees a need
for it), so I think I'll just take one of the suggested workarounds...

Tuukka




Re: Storing values of mixed types in a list

2014-06-25 Thread Tuukka Mustonen
Actually, come to think of it, of course I cannot run greater/less than
queries on list items anyway (it would be something like
WHERE items CONTAINS > 4), so binary encoding should be fine. Thanks for
everybody's input!

Tuukka



Re: Storing values of mixed types in a list

2014-06-25 Thread Robert Coli
On Tue, Jun 24, 2014 at 11:49 PM, Tuukka Mustonen tuukka.musto...@gmail.com
 wrote:

 Unfortunately, I need to query per list items. That's why I'm running
 Cassandra 2.1rc1 (offers secondary indexes for collections).


As a general statement, if you have to use a just-added feature in a
pre-release version of the datastore to model your problem, you may be
Doing It Wrong.


 As I understood it, also Cassandra supports dynamic schemas, but only
 through Thrift protocol. Also, I don't think it changes the fact that
 collections need to be strongly-typed in Cassandra, no matter what protocol
 is used?


There is an E-A-V scheme within CQL which, if you squint your eyes and
click your heels three times, looks like what ppl think of when they think
of dynamic schema.

=Rob


Re: Storing values of mixed types in a list

2014-06-24 Thread Jeremy Jongsma
Use a ByteBuffer value type with your own serialization (we use protobuf
for complex value structures)




Re: Storing values of mixed types in a list

2014-06-24 Thread Pavel Kogan
1) You can use a list of strings that hold serialized JSON, or use a
ByteBuffer with your own serialization, as Jeremy suggested.
2) Use Cassandra 2.1 (not officially released yet), where there is a new
feature: user-defined types.

Pavel
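
For option 1, a minimal Python 3 sketch of storing each item as its own JSON string, using only the standard json module. The variable names are illustrative, and how the strings reach a list of text values is left out.

```python
import json

items = [3, "foobar", True]

# One JSON document per list element, so each element keeps its own type.
as_strings = [json.dumps(v) for v in items]   # ['3', '"foobar"', 'true']

# Reading the list back restores the original Python types.
restored = [json.loads(s) for s in as_strings]
```

The JSON text itself carries the type: the number 3 is stored as 3, while the string "3" would be stored with its quotes, so the two stay distinguishable in an exact-match lookup.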








Re: Storing values of mixed types in a list

2014-06-24 Thread Tuukka Mustonen
What if I need to query by list items?

1. Jeremy, with a blob field (ByteBuffer), I can query exact matches (just
encode the value in the query), but greater/less than queries would not work.
Any sort of serialization kills the native ways to query data.
2. Even with user-defined types, I would need to define separate fields for
each value. Running queries would be cumbersome (something like WHERE items
CONTAINS {'text_value': 'foobar'} or WHERE items CONTAINS {'int_value': 3}).
Pavel, is this what you meant?

I'm running 2.1rc1 with python driver 2.0.2.

Tuukka







Re: Storing values of mixed types in a list

2014-06-24 Thread DuyHai Doan
"Jeremy, with blob field (ByteBuffer), I can query exact matches (just
encode the value in query), but greater/less than queries would not work.
Any sort of serialization kills native ways to query data"

Not necessarily. You still use normal types (uuid, string, timestamp, ...) for
clustering columns and use them for querying. For the cells where you store
values, use the blob type.



