Re: Columns limit

2010-08-07 Thread Thomas Heller
Ok, I think the part I was missing was the concatenation of the key and partition to do the lookups. Is this the preferred way of accomplishing needs such as this? Are there alternative ways? Depending on your needs you can concatenate the row key or use super columns. How would one then query
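
As an illustration of concatenating a key and a partition for lookups, here is a minimal sketch in Python. It is not taken from the thread: the function names, the ":" separator, and the daily YYYYMMDD partition format are all assumptions chosen for the example. The idea is that each day gets its own row, so a "last X days" query becomes a list of row keys to fetch.

```python
from datetime import date, timedelta


def partitioned_key(base_key: str, day: date) -> str:
    """Build a row key by appending a daily partition to the base key.

    The ':' separator and the YYYYMMDD format are assumptions for this sketch.
    """
    return f"{base_key}:{day:%Y%m%d}"


def keys_for_last_days(base_key: str, days: int, today: date) -> list[str]:
    """Return the row keys covering the last `days` days, newest first."""
    return [partitioned_key(base_key, today - timedelta(days=n)) for n in range(days)]


# Example: the three rows to read for one remote address (hypothetical value).
print(keys_for_last_days("remote_addr:10.0.0.1", 3, date(2010, 8, 7)))
```

Each returned key names one bounded row; the client reads those rows and merges the results itself.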

Re: Columns limit

2010-08-07 Thread Mark
On 8/7/10 11:30 AM, Mark wrote: On 8/7/10 4:22 AM, Thomas Heller wrote: Ok, I think the part I was missing was the concatenation of the key and partition to do the lookups. Is this the preferred way of accomplishing needs such as this? Are there alternative ways? Depending on your needs

Re: Columns limit

2010-08-07 Thread Benjamin Black
Right, this is an index row per time interval (your previous email was not). On Sat, Aug 7, 2010 at 11:43 AM, Mark static.void@gmail.com wrote: On 8/7/10 11:30 AM, Mark wrote: On 8/7/10 4:22 AM, Thomas Heller wrote: Ok, I think the part I was missing was the concatenation of the key and

Re: Columns limit

2010-08-07 Thread Mark
On 8/7/10 2:33 PM, Benjamin Black wrote: Right, this is an index row per time interval (your previous email was not). On Sat, Aug 7, 2010 at 11:43 AM, Mark static.void@gmail.com wrote: On 8/7/10 11:30 AM, Mark wrote: On 8/7/10 4:22 AM, Thomas Heller wrote: Ok, I think

Re: Columns limit

2010-08-07 Thread Benjamin Black
Certainly it matters: your previous version is not bounded on time, so it will grow without bound. Ergo, it is not a good fit for Cassandra. On Sat, Aug 7, 2010 at 2:51 PM, Mark static.void@gmail.com wrote: On 8/7/10 2:33 PM, Benjamin Black wrote: Right, this is an index row per time

Re: Columns limit

2010-08-07 Thread Mark
On 8/7/10 7:04 PM, Benjamin Black wrote: Certainly it matters: your previous version is not bounded on time, so it will grow without bound. Ergo, it is not a good fit for Cassandra. On Sat, Aug 7, 2010 at 2:51 PM, Mark static.void@gmail.com wrote: On 8/7/10 2:33 PM, Benjamin Black wrote:

Re: Columns limit

2010-08-07 Thread Benjamin Black
Certainly. There is also a performance penalty to unbounded row sizes. That penalty is your nodes OOMing. I strongly recommend you abandon that direction. On Sat, Aug 7, 2010 at 9:06 PM, Mark static.void@gmail.com wrote: On 8/7/10 7:04 PM, Benjamin Black wrote: certainly it matters:

Re: Columns limit

2010-08-06 Thread Software Dev
I'm a little slow here. Can you explain this in a little more depth? What do you mean by "index rows named"? Do you mean create a separate ColumnFamily? On Sat, Jul 31, 2010 at 9:32 PM, Benjamin Black b...@b3k.us wrote: Have the TimeUUID as the key, and then index rows named for the time

Re: Columns limit

2010-08-06 Thread Thomas Heller
Howdy, thought I'd jump in here. I did something similar, meaning I had lots of items coming in per day and wanted to somehow partition them to avoid running into the column limit (it was also logging related). The solution was pretty simple: log data is immutable, so no SuperColumn is needed.

Re: Columns limit

2010-08-06 Thread Software Dev
Thanks for the suggestion. I somewhat understand all that; the point where my head begins to explode is when I want to figure out something like this, continuing with your example: "Over the last X days, give me all the logs for remote_addr:XXX." I'm guessing I would need to create a

Re: Columns limit

2010-08-06 Thread Thomas Heller
Thanks for the suggestion. I somewhat understand all that; the point where my head begins to explode is when I want to figure out something like this, continuing with your example: "Over the last X days, give me all the logs for remote_addr:XXX." I'm guessing I would need to create a

Re: Columns limit

2010-08-06 Thread Benjamin Black
Yes, it is common to create distinct CFs for indices. On Fri, Aug 6, 2010 at 4:40 PM, Software Dev static.void@gmail.com wrote: Thanks for the suggestion. I somewhat understand all that; the point where my head begins to explode is when I want to figure out something like this, continuing

Re: Columns limit

2010-08-06 Thread Mark
On 8/6/10 4:50 PM, Thomas Heller wrote: Thanks for the suggestion. I somewhat understand all that; the point where my head begins to explode is when I want to figure out something like this, continuing with your example: "Over the last X days, give me all the logs for remote_addr:XXX." I'm

Re: Columns limit

2010-08-06 Thread Benjamin Black
Same answer as on other thread right now about how to index: http://maxgrinev.com/2010/07/12/do-you-really-need-sql-to-do-it-all-in-cassandra/ http://www.slideshare.net/benjaminblack/cassandra-basics-indexing On Fri, Aug 6, 2010 at 6:18 PM, Mark static.void@gmail.com wrote: On 8/6/10 4:50

Re: Columns limit

2010-08-06 Thread Mark
On 8/6/10 6:36 PM, Benjamin Black wrote: Same answer as on other thread right now about how to index: http://maxgrinev.com/2010/07/12/do-you-really-need-sql-to-do-it-all-in-cassandra/ http://www.slideshare.net/benjaminblack/cassandra-basics-indexing On Fri, Aug 6, 2010 at 6:18 PM,

Re: Columns limit

2010-07-31 Thread Benjamin Black
The proper way to handle this is to have a row per time interval such that the number of columns per row is constrained. On Thu, Jul 29, 2010 at 2:39 PM, Mark static.void@gmail.com wrote: Are there any limitations on the number of columns a row can have? Does all the data for a single key
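
To make "constrained" concrete, the following Python sketch estimates how many columns a row would collect per interval and picks the coarsest interval that stays under a budget. The event rate, the column budget, and the candidate intervals are hypothetical numbers chosen for illustration; none of them come from the thread.

```python
# Candidate row intervals, coarsest first (name, length in seconds).
INTERVALS = [("day", 86_400), ("hour", 3_600), ("minute", 60)]


def pick_interval(events_per_second: float, max_columns_per_row: int) -> str:
    """Return the coarsest interval whose expected column count fits the budget.

    One event is assumed to produce one column in the interval's row.
    """
    for name, seconds in INTERVALS:
        if events_per_second * seconds <= max_columns_per_row:
            return name
    raise ValueError("even one-minute rows exceed the column budget")


# At 100 events/s a daily row would hold 8,640,000 columns,
# an hourly row 360,000; with a budget of 1,000,000 the hourly row fits.
print(pick_interval(100, 1_000_000))
```

The same arithmetic works for bytes per row instead of columns per row if the average column size is known.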

Re: Columns limit

2010-07-31 Thread Benjamin Black
Have the TimeUUID as the key, and then index rows named for the time intervals, each containing columns with TimeUUID names giving the data in those intervals. On Sat, Jul 31, 2010 at 5:13 PM, Mark static.void@gmail.com wrote: So have the TimeUUID as the key? SearchLogs : {    TimeUUID_1
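
The layout described here (data rows keyed by TimeUUID, plus index rows named for time intervals whose column names are TimeUUIDs) can be sketched with plain Python dictionaries standing in for the two column families. This is only an illustration under assumptions: the hourly interval, the YYYYMMDDHH row naming, and the names `data_rows`, `index_rows`, `insert_log`, and `logs_in_interval` are invented for the example, and no Cassandra client is involved.

```python
import uuid
from collections import defaultdict
from datetime import datetime, timezone

# 100 ns ticks between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000

data_rows = {}                   # row key: TimeUUID -> columns of one log entry
index_rows = defaultdict(dict)   # row key: interval name -> {TimeUUID: empty value}


def interval_name(time_uuid: uuid.UUID) -> str:
    """Name of the hourly index row (an assumed granularity) for a version 1 UUID."""
    unix_seconds = (time_uuid.time - UUID_EPOCH_OFFSET) / 1e7
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).strftime("%Y%m%d%H")


def insert_log(columns: dict) -> uuid.UUID:
    """Store the entry under a new TimeUUID and record that key in its interval's index row."""
    key = uuid.uuid1()
    data_rows[key] = columns
    index_rows[interval_name(key)][key] = b""
    return key


def logs_in_interval(name: str) -> list[dict]:
    """Read one index row, then fetch the data rows it points to, oldest first."""
    keys = sorted(index_rows.get(name, {}), key=lambda u: u.time)
    return [data_rows[k] for k in keys]


key = insert_log({"query": "cassandra column limit"})
print(interval_name(key), logs_in_interval(interval_name(key)))
```

Because every index row covers a fixed interval, its column count is bounded by the write rate in that interval, while the data rows stay small.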

Re: Columns limit

2010-07-30 Thread Jonathan Ellis
On Thu, Jul 29, 2010 at 4:39 PM, Mark static.void@gmail.com wrote: Are there any limitations on the number of columns a row can have? 2GB of data in a row in 0.6, 2 billion columns in 0.7. Does all the data for a single key need to reside on a single host? Yes. If the ratio of rows :

Columns limit

2010-07-29 Thread Mark
Are there any limitations on the number of columns a row can have? Does all the data for a single key need to reside on a single host? If so, wouldn't that mean there is an implicit limit on the number of columns one can have, i.e. the disk size of that machine? What is the proper way to handle