Re: Wide rows splitting

2017-09-18 Thread Stefano Ortolani
You might find this interesting: https://medium.com/@foundev/synthetic-sharding-in-cassandra-to-deal-with-large-partitions-2124b2fd788b Cheers, Stefano On Mon, Sep 18, 2017 at 5:07 AM, Adam Smith wrote: > Dear community, > > I have a table with inlinks to URLs, i.e.
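A minimal sketch of the synthetic-sharding idea named in the linked post's title, assuming a fixed shard count and a hash of the inlink to choose the shard. The names `NUM_SHARDS`, `shard_for`, `partition_key`, and `all_partition_keys` are hypothetical and do not come from the thread or the post.

```python
import hashlib

NUM_SHARDS = 16  # hypothetical value; pick it so each shard stays small


def shard_for(inlink: str, num_shards: int = NUM_SHARDS) -> int:
    """Deterministically map an inlink to a shard number in [0, num_shards)."""
    digest = hashlib.md5(inlink.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


def partition_key(url: str, inlink: str) -> tuple:
    """Composite partition key (url, shard) instead of url alone."""
    return (url, shard_for(inlink))


def all_partition_keys(url: str, num_shards: int = NUM_SHARDS) -> list:
    """Reading every inlink for a URL means querying each shard in turn."""
    return [(url, shard) for shard in range(num_shards)]
```

The trade-off in this sketch is that writes spread over several smaller partitions, while a full read for one URL has to fan out over all shards.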

Re: wide rows

2016-10-18 Thread Yabin Meng
With CQL data modeling, everything is called a "row". But really in CQL, a row is just a logical concept. So if you think of "wide partition" instead of "wide row" (a partition is determined by the hash of the partition key), it will help the understanding a bit: one wide-partition may
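The "wide partition" view can be sketched as follows: a toy model, not Cassandra code, in which logical rows that share a partition key are collected into one partition and kept sorted by clustering key. The sample rows and the function name `group_into_partitions` are made up for illustration.

```python
from collections import defaultdict

# Logical CQL rows modeled as (partition_key, clustering_key, value).
rows = [
    ("sensor-1", "2016-10-18T10:00", 20.5),
    ("sensor-1", "2016-10-18T10:01", 20.7),
    ("sensor-2", "2016-10-18T10:00", 18.2),
    ("sensor-1", "2016-10-18T09:59", 20.4),
]


def group_into_partitions(rows):
    """Group logical rows by partition key, ordered by clustering key."""
    partitions = defaultdict(list)
    for pk, ck, value in rows:
        partitions[pk].append((ck, value))
    for cells in partitions.values():
        cells.sort()  # rows inside one partition are stored in clustering order
    return dict(partitions)


partitions = group_into_partitions(rows)
```

In this model "sensor-1" is the wide partition: it holds three logical rows, while "sensor-2" holds one.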

Re: wide rows

2016-10-18 Thread DuyHai Doan
// user table: skinny partition
CREATE TABLE user (
    user_id uuid,
    firstname text,
    lastname text,
    PRIMARY KEY ((user_id))
);
// sensor_data table: wide partition
CREATE TABLE sensor_data (
    sensor_id uuid,
    date timestamp,
    value double,
    PRIMARY KEY

RE: wide rows

2016-10-18 Thread S Ahmed
Hi, Can someone clarify how you would model a "wide" row Cassandra table? From what I understand, a wide row table is where you keep appending columns to a given row. The other way to model a table would be the "regular" style where each row contains data, so during a SELECT you would

Re: Wide rows best practices and GC impact

2014-12-04 Thread Jabbar Azam
Hello, I saw this yesterday but didn't want to reply because I didn't know what the cause was. Basically I was using wide rows with Cassandra 1.x and was inserting data constantly. After about 18 hours the JVM would crash with a dump file. For some reason I removed the compaction throttling

Re: Wide rows best practices and GC impact

2014-12-03 Thread Robert Coli
On Tue, Dec 2, 2014 at 5:01 PM, Gianluca Borello gianl...@draios.com wrote: We mainly store time series-like data, where each data point is a binary blob of 5-20KB. We use wide rows, and try to put in the same row all the data that we usually need in a single query (but not more than that). As

Re: Wide rows best practices and GC impact

2014-12-03 Thread Gianluca Borello
Thanks Robert, I really appreciate your help! I'm still unsure why Cassandra 2.1 seems to perform much better in that same scenario (even setting the same values of compaction threshold and number of compactors), but I guess we'll revisit this when we decide to upgrade to 2.1 in production. On Dec 3,

Re: Wide Rows - Data Model Design

2014-09-19 Thread Jonathan Lacefield
Hello, Yes, this is a wide row table design. The first col is your Partition Key. The remaining 2 cols are clustering cols. You will receive ordered result sets based on client_name, record_date when running that query. Jonathan

Re: Wide Rows - Data Model Design

2014-09-19 Thread DuyHai Doan
Does my above table fall under the category of wide rows in Cassandra or not? -- It depends on the cardinality. For each distinct test_id, how many combinations of client_name/record_data do you have? By the way, why do you put the record_data as part of primary key? In your table partition

Re: Wide Rows - Data Model Design

2014-09-19 Thread Check Peck
@DuyHai - I have put that because of this condition - In this table, we can have multiple record_data for same client_name. It can be multiple combinations of client_name and record_data for each distinct test_id. On Fri, Sep 19, 2014 at 8:48 AM, DuyHai Doan doanduy...@gmail.com wrote: Does

Re: Wide Rows - Data Model Design

2014-09-19 Thread DuyHai Doan
Ahh yes, sorry, I read too fast, missed it. On Fri, Sep 19, 2014 at 5:54 PM, Check Peck comptechge...@gmail.com wrote: @DuyHai - I have put that because of this condition - In this table, we can have multiple record_data for same client_name. It can be multiple combinations of client_name

Re: Wide rows (time series data) and ORM

2013-10-23 Thread Vivek Mishra
Can Kundera work with wide rows in an ORM manner? What specifically are you looking for? A composite-column-based implementation can be built using Kundera. With recent CQL3 developments, Kundera supports most of these. I think the POJO needs to be aware of the number of fields that need to be persisted (same as

Re: Wide rows (time series data) and ORM

2013-10-23 Thread Hiller, Dean
PlayOrm supports different types of wide rows like embedded list in the object, etc. There is a list of nosql patterns mixed with playorm patterns on this page http://buffalosw.com/wiki/patterns-page/ From: Les Hartzman lhartz...@gmail.com Reply-To:

Re: Wide rows (time series data) and ORM

2013-10-23 Thread Les Hartzman
Hi Vivek, What I'm looking for are a couple of things as I'm gaining an understanding of Cassandra. With wide rows and time series data, how do you (or can you) handle this data in an ORM manner? Now I understand that with CQL3, doing a select * from time_series_data will return the data as

Re: Wide rows (time series data) and ORM

2013-10-23 Thread Les Hartzman
Thanks Dean. I'll check that page out. Les On Wed, Oct 23, 2013 at 7:52 AM, Hiller, Dean dean.hil...@nrel.gov wrote: PlayOrm supports different types of wide rows like embedded list in the object, etc. etc. There is a list of nosql patterns mixed with playorm patterns on this page

Re: Wide rows (time series data) and ORM

2013-10-23 Thread Hiller, Dean
user@cassandra.apache.org Date: Wednesday, October 23, 2013 11:12 AM To: user@cassandra.apache.org Subject: Re: Wide rows (time series

Re: Wide rows (time series data) and ORM

2013-10-23 Thread Vivek Mishra
Hi, CREATE TABLE sensor_data ( sensor_id text, date text, data_time_stamp timestamp, reading int,

Re: Wide rows (time series data) and ORM

2013-10-23 Thread Les Hartzman
Thanks Vivek. I'll look over those links tonight. On Wed, Oct 23, 2013 at 4:20 PM, Vivek Mishra mishra.v...@gmail.com wrote: Hi, CREATE TABLE sensor_data ( sensor_id text, date text,

Re: Wide rows/composite keys clarification needed

2013-10-21 Thread Les Hartzman
So looking at Patrick McFadin's data modeling videos I now know about using compound keys as a way of partitioning data on a by-day basis. My other questions probably go more to the storage engine itself. How do you refer to the columns in the wide row? What kind of names are assigned to the
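The by-day partitioning mentioned here can be sketched as below: a hedged illustration in which the partition key is the pair (sensor id, day bucket), so each sensor gets a new partition every day. The helper names `day_bucket` and `partition_key`, and the choice of UTC for the bucket boundary, are assumptions made for this example.

```python
from datetime import datetime, timezone


def day_bucket(ts: datetime) -> str:
    """Return the UTC calendar day of a timezone-aware timestamp."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d")


def partition_key(sensor_id: str, ts: datetime) -> tuple:
    """Compound partition key: one partition per sensor per day."""
    return (sensor_id, day_bucket(ts))


late = datetime(2013, 10, 21, 23, 59, tzinfo=timezone.utc)
next_day = datetime(2013, 10, 22, 0, 0, tzinfo=timezone.utc)
keys = [partition_key("sensor-1", late), partition_key("sensor-1", next_day)]
```

Readings taken one minute apart across midnight land in two different partitions, which is what keeps any single partition from growing without bound.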

Re: Wide rows/composite keys clarification needed

2013-10-21 Thread Jon Haddad
If you're working with CQL, you don't need to worry about the column names, it's handled for you. If you specify multiple keys as part of the primary key, they become clustering keys and are mapped to the column names. So if you have a sensor_id / time_stamp, all your sensor readings will be
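As a rough illustration of clustering keys being "mapped to the column names", the sketch below models one CQL row as storage cells whose names are built from the clustering values plus the CQL column name. This is a simplified toy model of the older (Thrift-era) layout, not the actual storage engine; `to_storage_cells` and the sample column names are hypothetical.

```python
def to_storage_cells(row: dict, partition_col: str, clustering_cols: list):
    """Turn one logical CQL row into (partition key, {cell name: value})."""
    pk = row[partition_col]
    # The clustering values form the prefix of every cell name in this row.
    prefix = tuple(row[c] for c in clustering_cols)
    cells = {}
    for name, value in row.items():
        if name == partition_col or name in clustering_cols:
            continue
        cells[prefix + (name,)] = value
    return pk, cells


row = {"sensor_id": "s1", "time_stamp": "2013-10-21T16:45", "reading": 42}
pk, cells = to_storage_cells(row, "sensor_id", ["time_stamp"])
```

Every reading for the same sensor_id ends up in the same partition, each under a cell name that starts with its time_stamp.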

Re: Wide rows/composite keys clarification needed

2013-10-21 Thread Les Hartzman
What if you plan on using Kundera and JPQL and not CQL? Les On Oct 21, 2013 4:45 PM, Jon Haddad j...@jonhaddad.com wrote: If you're working with CQL, you don't need to worry about the column names, it's handled for you. If you specify multiple keys as part of the primary key, they become

Re: Wide rows/composite keys clarification needed

2013-10-21 Thread Les Hartzman
So I just saw a post about how Kundera translates all JPQL to CQL. On Mon, Oct 21, 2013 at 4:45 PM, Jon Haddad j...@jonhaddad.com wrote: If you're working with CQL, you don't need to worry about the column names, it's handled for you. If you specify multiple keys as part of the primary key,

Re: Wide rows in CQL 3

2013-01-10 Thread Vegard Berget
Thanks for explaining, Sylvain. You say that it is not a mandatory one, how long could we expect it to be not mandatory? I think the new CQL stuff is great and I will probably use it heavily. I understand the upgrade path, but my question is if I should start planning for an all-CQL future, or if I

Re: Wide rows in CQL 3

2013-01-10 Thread aaron morton
Sent: Wed, 9 Jan 2013 23:14:25 +0100 Subject: Re: Wide rows in CQL 3 I'd be clear, CQL3 is meant as an upgrade from thrift. Not a mandatory one, you can stick to thrift if you don't think CQL3 is better. But if you do decide to upgrade, you should see CQL3 non compact tables as the new

Re: Wide rows in CQL 3

2013-01-09 Thread Hiller, Dean
Probably should read this http://www.datastax.com/dev/blog/cql3-for-cassandra-experts I don't see wide row support going away since they specifically made the change to enable 2 billion columns in a row according to that paper. Dean From: mrevilgnome

Re: Wide rows in CQL 3

2013-01-09 Thread Ben Hood
I'm currently in the process of porting my app from Thrift to CQL3 and it seems to me that the underlying storage layout hasn't really changed fundamentally. The difference appears to be that CQL3 offers a neater abstraction on top of the wide row format. For example, in CQL3, your query results

Re: Wide rows in CQL 3

2013-01-09 Thread Edward Capriolo
I ask myself this every day. CQL3 is a new way to do things, including wide rows with collections. There is no upgrade path. You adopt CQL3's sparse tables as soon as you start creating column families from CQL. There is not much backwards compatibility. CQL3 can query compact tables, but you may

Re: Wide rows in CQL 3

2013-01-09 Thread Sylvain Lebresne
There is no upgrade path. I don't think that's true. The goal of the blog post you've linked is to discuss that upgrade path (and in particular show that for the most part, you can access your thrift data from CQL3 without any modification whatsoever). You adopt CQL3's sparse tables as soon as

Re: Wide rows in CQL 3

2013-01-09 Thread Edward Capriolo
By no upgrade path I mean to say that if I have a table with compact storage I cannot upgrade it to sparse storage. If I have an existing COMPACT table and I want to add a Map to it, I cannot. This is what I mean by no upgrade path. Column families that mix static and dynamic columns are pretty

Re: Wide rows in CQL 3

2013-01-09 Thread Edward Capriolo
Also I have to say I do not get that blank sparse column. Ghost ranges are a little weird but they don't bother me. 1. It's a row of nothing. The definition of a waste. 2. Suppose I have 1 billion rows and my distribution is mostly rows of 1 or 2 columns. My database is now significantly bigger.

Re: Wide rows in CQL 3

2013-01-09 Thread Janne Jalkanen
On 10 Jan 2013, at 01:30, Edward Capriolo edlinuxg...@gmail.com wrote: Column families that mix static and dynamic columns are pretty common. In fact it is pretty much the default case, you have a default validator then some columns have specific validators. In the old days people used to

Re: Wide rows and reads

2012-07-05 Thread Philip Shon
From what I understand, wide rows have quite a bit of overhead, especially if you are picking columns that are far apart from each other for a given row. This post by Aaron Morton was quite good at explaining this issue http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ -Phil On Thu,

Re: Wide rows or tons of rows?

2010-10-11 Thread Edward Capriolo
2010/10/11 Héctor Izquierdo Seliva izquie...@strands.com: Hi everyone. I'm sure this question or similar has come up before, but I can't find a clear answer. I have to store an unknown number of items in Cassandra, which can vary from a few hundred to a few million per customer. I read

Re: Wide rows or tons of rows?

2010-10-11 Thread Héctor Izquierdo Seliva
On Mon, 11-10-2010 at 11:08 -0400, Edward Capriolo wrote: Inlined: 2010/10/11 Héctor Izquierdo Seliva izquie...@strands.com: Hi everyone. I'm sure this question or similar has come up before, but I can't find a clear answer. I have to store an unknown number of items in Cassandra,

Re: Wide rows or tons of rows?

2010-10-11 Thread Jeremy Davis
Thanks for this reply. I'm wondering about the same issue... Should I bucket things into Wide rows (say 10M rows), or narrow (say 10K or 100K).. Of course it depends on my access patterns right... Does anyone know if a partial row cache is a feasible feature to implement? My use case is something

Re: Wide rows or tons of rows?

2010-10-11 Thread Aaron Morton
No idea about a partial row cache, but I would start with fat rows in your use case. If you find that performance is really a problem then you could add a second "recent / oldest" CF that you maintain with the most recent entries and use the row cache there. OR add more nodes. Aaron On 12 Oct,