Re: cql query

2013-05-02 Thread Jabbar Azam
, Sri Ramya ramya.1...@gmail.com wrote: Hi, can somebody tell me whether it is possible to query on multiple columns in Cassandra, like Select * from columnfamily where name='foo' and age ='21' and timestamp = 'unixtimestamp' ; Please give me some guidance for these kinds of queries. Thank you

Re: cql query

2013-05-02 Thread Sri Ramya
columnfamily ( name varchar, age varchar, tstamp timestamp, primary key((name, age), tstamp) ); Thanks Jabbar Azam On 2 May 2013 11:45, Sri Ramya ramya.1...@gmail.com wrote: Hi, can somebody tell me whether it is possible to query on multiple columns in Cassandra, like Select * from

Re: Anyway To Query Just The Partition Key?

2013-04-22 Thread Sylvain Lebresne
What you want is https://issues.apache.org/jira/browse/CASSANDRA-4536 I believe. On Sat, Apr 13, 2013 at 8:16 PM, Gareth Collins gareth.o.coll...@gmail.com wrote: Edward, Thanks for the response. This is what I thought. The only reason why I am doing it like this is that I don't know these

Anyway To Query Just The Partition Key?

2013-04-13 Thread Gareth Collins
Hello, If I have a cql3 table like this (I don't have a table with this data - this is just for example): create table ( surname text, city text, country text, event_id timeuuid, data text, PRIMARY KEY ((surname, city, country),event_id)); there is no way of (easily)

Re: Anyway To Query Just The Partition Key?

2013-04-13 Thread Jabbar Azam
With your example you can do an equality search with surname and city and then use in with country Eg. Select * from yourtable where surname=blah and city=blah blah and country in (country1, country2) Hope that helps Jabbar Azam On 13 Apr 2013 07:06, Gareth Collins gareth.o.coll...@gmail.com

Re: Anyway To Query Just The Partition Key?

2013-04-13 Thread Edward Capriolo
You can 'list' or 'select *' the column family and you get them in a pseudo random order. When you say subset it implies you might want a specific range which is something this schema cannot do. On Sat, Apr 13, 2013 at 2:05 AM, Gareth Collins gareth.o.coll...@gmail.com wrote: Hello, If I

Re: Anyway To Query Just The Partition Key?

2013-04-13 Thread Gareth Collins
Thank you for the answer. My apologies. I should have been clearer with my question. Say, for example, I have 1000 partition keys and 10,000 rows per partition key. I am trying to avoid bringing back 10 million rows to find the 1000 partition keys. I assume I cannot avoid bringing back the 10

Re: Anyway To Query Just The Partition Key?

2013-04-13 Thread Gareth Collins
Edward, Thanks for the response. This is what I thought. The only reason why I am doing it like this is that I don't know these partition keys in advance (otherwise I would design this differently). So when I need to insert data, it looks like I need to insert to both the data table and the table

Re: Getting NullPointerException while executing query

2013-04-11 Thread Kuldeep Mishra
I am using cassandra 1.2.0, Thanks Kuldeep On Wed, Apr 10, 2013 at 10:40 PM, Sylvain Lebresne sylv...@datastax.com wrote: On which version of Cassandra are you? I can't reproduce the NullPointerException on Cassandra 1.2.3. That being said, that query is not valid, so you will get

Re: describe keyspace or column family query not working

2013-04-11 Thread aaron morton
tables created without COMPACT STORAGE are still visible in cassandra-cli. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 11/04/2013, at 5:40 AM, Tyler Hobbs ty...@datastax.com wrote: On Wed, Apr 10, 2013 at

describe keyspace or column family query not working

2013-04-10 Thread Kuldeep Mishra
Hi, I am trying to execute the following query, but it is not working and throws an exception. QUERY: Cassandra.Client client; client.execute_cql3_query(ByteBuffer.wrap("describe keyspace mykeyspace".getBytes(Constants.CHARSET_UTF8)), Compression.NONE, ConsistencyLevel.ONE

Getting NullPointerException while executing query

2013-04-10 Thread Kuldeep Mishra
Hi , TABLE - CREATE TABLE CQLUSER ( id int PRIMARY KEY, age int, name text ) Query - select * from CQLUSER where token(name) token(deep); ERROR - Bad Request: Failed parsing statement: [select * from CQLUSER where token(name) token(deep);] reason: NullPointerException

Re: describe keyspace or column family query not working

2013-04-10 Thread Tyler Hobbs
DESCRIBE is a cqlsh feature, not a part of the CQL language. On Wed, Apr 10, 2013 at 2:37 AM, Kuldeep Mishra kuld.cs.mis...@gmail.com wrote: Hi, I am trying to execute the following query, but it is not working and throws an exception. QUERY: Cassandra.Client client

Re: describe keyspace or column family query not working

2013-04-10 Thread Vivek Mishra
Mishra kuld.cs.mis...@gmail.com wrote: Hi, I am trying to execute the following query, but it is not working and throws an exception. QUERY: Cassandra.Client client; client.execute_cql3_query(ByteBuffer.wrap("describe keyspace mykeyspace".getBytes(Constants.CHARSET_UTF8)), Compression.NONE

Re: Getting NullPointerException while executing query

2013-04-10 Thread Sylvain Lebresne
On which version of Cassandra are you? I can't reproduce the NullPointerException on Cassandra 1.2.3. That being said, that query is not valid, so you will get an error message. There are 2 reasons why it's not valid: 1) in token(deep), deep is not a valid term. So you should have something like

Re: describe keyspace or column family query not working

2013-04-10 Thread Tyler Hobbs
On Wed, Apr 10, 2013 at 11:09 AM, Vivek Mishra mishra.v...@gmail.com wrote: Ok. A column family and keyspace created via cqlsh using cql3 is visible via cassandra-cli or thrift API? The column family will only be visible via cassandra-cli and the Thrift API if it was created WITH COMPACT

Re: Counter batches query

2013-04-08 Thread aaron morton
For #1 Storage Proxy (server wide) metrics are per request, so 1 in your example. CF level metrics are per row, so 5 in your example. Not sure what graph you were looking at in ops centre, probably best to ask on here http://www.datastax.com/support-forums/ Cheers - Aaron
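The per-request versus per-row counting described in this reply can be illustrated with a small sketch. This is only a toy model, not Cassandra code: the metric names and the `record_batch` helper are hypothetical, and the batch is assumed to be a mapping from column family name to the row keys it touches.

```python
from collections import Counter


def record_batch(metrics, batch):
    """Count one batch write the way the reply describes it.

    metrics: a Counter holding hypothetical metric names.
    batch:   dict mapping column family name -> list of row keys written.
    """
    # Server-wide (storage proxy) counter: one increment per client request.
    metrics["storage_proxy.writes"] += 1
    # Column-family counters: one increment per distinct row touched.
    for column_family, row_keys in batch.items():
        metrics["cf." + column_family + ".writes"] += len(set(row_keys))
    return metrics


if __name__ == "__main__":
    batch = {"counters_a": ["r1", "r2", "r3"], "counters_b": ["r4", "r5"]}
    for name, value in sorted(record_batch(Counter(), batch).items()):
        print(name, value)
```

Under these assumptions a single batch touching five rows shows up as one write at the server-wide level and five writes summed over the column families, which would explain a throughput graph that looks lower than the row count.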

Re: Counter batches query

2013-04-06 Thread Edward Capriolo
For #2 There are two mutates in thrift: batch_mutate and atomic_batch_mutate. The atomic version was just added. If you care more about the performance do not use the atomic version. On Sat, Apr 6, 2013 at 12:03 AM, Matt K infinitelimittes...@gmail.com wrote: Hi, I have an application that

Re: Data Model and Query

2013-04-05 Thread aaron morton
for modelling the data for the above query accordingly. Regards, Shubham

Re: Data Model and Query

2013-04-05 Thread Hiller, Dean
I would partition either with cassandra's partitioning or PlayOrm partitioning and query like so Where beginOfMonth=x and startDateX and counter Y. This only returns stuff after X in that partition though so you may need to run multiple queries like this and if you have billions of rows

Counter batches query

2013-04-05 Thread Matt K
Hi, I have an application that does batch (counter) writes to multiple CFs. The application itself is multi-threaded and I'm using C* 1.2.2 and Astyanax driver. Could someone share insights on: 1) When I see the cluster write throughput graph in opscenter, the number is not reflective of actual

Re: Unable to prefix in astyanax read query

2013-04-03 Thread aaron morton
combination. I am using the following query to obtain the results: RowSliceQuery<String, ApBaseData> query = adu.keyspace .prepareQuery(columnFamily) .getKeySlice(timeStamp) .withColumnRange(new RangeBuilder() .setStart(deviceID+deviceName+_\u0) .setEnd(deviceID+deviceName+_\u

Data Model and Query

2013-04-03 Thread shubham srivastava
Hi, What's the recommendation on querying a data model like StartDate “X” and counter “Y”? It's a kind of range query across multiple columns and the key. I have the flexibility for modelling the data for the above query accordingly. Regards, Shubham

Re: Unable to prefix in astyanax read query

2013-04-02 Thread Hiller, Dean
(Device ID:Device Name: ...). So I believe we can use these 2 fields as prefix to obtain all the entries for a particular time-device combination. I am using the following query to obtain the results: RowSliceQuery<String, ApBaseData> query = adu.keyspace .prepareQuery(columnFamily) .getKeySlice

Unable to prefix in astyanax read query

2013-04-01 Thread Apurva Jalit
timestamp and device, the column names would be in the pattern (Device ID:Device Name: ...). So I believe we can use these 2 fields as prefix to obtain all the entries for a particular time-device combination. I am using the following query to obtain the results: RowSliceQueryString

Re: Digest Query Seems to be corrupt on certain cases

2013-03-31 Thread aaron morton
When I manually inspected this byte array, it seems to hold all details correctly, except the super-column name, causing it to fetch the entire wide row. What is the CF definition and what is the exact query you are sending? There does not appear to be anything obvious in the QueryPath serde

Digest Query Seems to be corrupt on certain cases

2013-03-27 Thread Ravikumar Govindarajan
in digest query and not during actual reads. I am pasting the serialized byte array of SliceByNamesReadCommand, which seems to be corrupt on issuing certain digest queries. //Type is SliceByNamesReadCommand body[0] = (byte)1; //This is a digest query here. body[1] = (byte)1; //Table

Re: Digest Query Seems to be corrupt on certain cases

2013-03-27 Thread aaron morton
to be happening only in digest query and not during actual reads. I am pasting the serialized byte array of SliceByNamesReadCommand, which seems to be corrupt on issuing certain digest queries. //Type is SliceByNamesReadCommand body[0] = (byte)1

Re: Digest Query Seems to be corrupt on certain cases

2013-03-27 Thread Ravikumar Govindarajan
-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly error stack was containing 2 threads for the same key, stalling on digest query The below bytes which I referred is the actual value of _body variable in org.apache.cassandra.net.Message object got from the heap dump. As I understand from

Re: cql query not giving any result.

2013-03-18 Thread Sylvain Lebresne
Kuldeep On Fri, Mar 15, 2013 at 4:05 PM, Kuldeep Mishra kuld.cs.mis...@gmail.com wrote: Hi, the following cql query is not returning any result: cqlsh:KunderaExamples> select * from DOCTOR where key='kuldeep'; I have enabled secondary indexes on both columns. Screenshot is attached

Re: cql query not giving any result.

2013-03-18 Thread Vivek Mishra
, the following cql query is not returning any result: cqlsh:KunderaExamples> select * from DOCTOR where key='kuldeep'; I have enabled secondary indexes on both columns. Screenshot is attached. Please help -- Thanks and Regards Kuldeep Kumar Mishra +919540965199 -- Thanks

Re: cql query not giving any result.

2013-03-18 Thread Sylvain Lebresne
one is the column name. No, it shouldn't be possible and that is your problem. How did you create that table? -- Sylvain Thanks and Regards Kuldeep On Fri, Mar 15, 2013 at 4:05 PM, Kuldeep Mishra kuld.cs.mis...@gmail.com wrote: Hi, the following cql query is not returning any result

Re: cql query not giving any result.

2013-03-15 Thread Kuldeep Mishra
...@gmail.com wrote: Hi, the following cql query is not returning any result: cqlsh:KunderaExamples> select * from DOCTOR where key='kuldeep'; I have enabled secondary indexes on both columns. Screenshot is attached. Please help -- Thanks and Regards Kuldeep Kumar Mishra +919540965199

Re: cql query not giving any result.

2013-03-15 Thread Jason Wee
, first one is the rowkey and second one is the column name. Thanks and Regards Kuldeep On Fri, Mar 15, 2013 at 4:05 PM, Kuldeep Mishra kuld.cs.mis...@gmail.com wrote: Hi, the following cql query is not returning any result: cqlsh:KunderaExamples> select * from DOCTOR where key='kuldeep'; I

Re: cql query not giving any result.

2013-03-15 Thread Sylvain Lebresne
. No, it shouldn't be possible and that is your problem. How did you create that table? -- Sylvain Thanks and Regards Kuldeep On Fri, Mar 15, 2013 at 4:05 PM, Kuldeep Mishra kuld.cs.mis...@gmail.com wrote: Hi, the following cql query is not returning any result: cqlsh:KunderaExamples> select

Re: cql query not giving any result.

2013-03-15 Thread Kuldeep Mishra
name. No, it shouldn't be possible and that is your problem. How did you create that table? -- Sylvain Thanks and Regards Kuldeep On Fri, Mar 15, 2013 at 4:05 PM, Kuldeep Mishra kuld.cs.mis...@gmail.com wrote: Hi, the following cql query is not returning any result

Re: cql query not giving any result.

2013-03-15 Thread Vivek Mishra
problem. How did you create that table? -- Sylvain Thanks and Regards Kuldeep On Fri, Mar 15, 2013 at 4:05 PM, Kuldeep Mishra kuld.cs.mis...@gmail.com wrote: Hi, the following cql query is not returning any result: cqlsh:KunderaExamples> select * from DOCTOR where key='kuldeep

Re: cql query not giving any result.

2013-03-15 Thread Vivek Mishra
, the following cql query is not returning any result: cqlsh:KunderaExamples> select * from DOCTOR where key='kuldeep'; I have enabled secondary indexes on both columns. Screenshot is attached. Please help -- Thanks and Regards Kuldeep Kumar Mishra +919540965199 -- Thanks and Regards

Re: CQL query issue

2013-03-05 Thread Vivek Mishra
.table_name WHERE clause AND clause ... ALLOW FILTERING LIMIT n ORDER BY compound_key_2 ASC | DESC Is this an issue? -Vivek On Tue, Mar 5, 2013 at 5:21 PM, Vivek Mishra mishra.v...@gmail.com wrote: Hi, I am trying to execute a cql3 query as : SELECT * FROM

Re: CQL query issue

2013-03-05 Thread Vivek Mishra
Is this an issue? -Vivek On Tue, Mar 5, 2013 at 5:21 PM, Vivek Mishra mishra.v...@gmail.com wrote: Hi, I am trying to execute a cql3 query as: SELECT * FROM CompositeUser WHERE userId='mevivs' ALLOW FILTERING LIMIT 100 and getting the error given below: Caused

Re: CQL query issue

2013-03-05 Thread Sylvain Lebresne
Is this an issue? -Vivek On Tue, Mar 5, 2013 at 5:21 PM, Vivek Mishra mishra.v...@gmail.com wrote: Hi, I am trying to execute a cql3 query as: SELECT * FROM CompositeUser WHERE userId='mevivs' ALLOW FILTERING LIMIT 100 and getting the error given below: Caused

Re: Column Slice Query performance after deletions

2013-03-03 Thread aaron morton
I need something to keep the deleted columns away from my query fetch. Not only the tombstones. It looks like the min compaction might help on this. But I'm not sure yet on what would be a reasonable value for its threshold. Your tombstones will not be purged in a compaction until after

Column Slice Query performance after deletions

2013-03-02 Thread Víctor Hugo Oliveira Molinar
are constantly updated. But the write-load is not that intensive. I estimate it as 100w/sec in the column family. - Each column represents a message which is read and processed by another process. After reading it, the column is marked for deletion in order to keep it out from the next query

Re: Column Slice Query performance after deletions

2013-03-02 Thread Michael Kjellman
which is read and processed by another process. After reading it, the column is marked for deletion in order to keep it out from the next query on this row. OK, so I've figured out that after many insertions plus deletion updates, my queries (column slice queries) are taking more time

Re: Column Slice Query performance after deletions

2013-03-02 Thread Víctor Hugo Oliveira Molinar
. After reading it, the column is marked for deletion in order to keep it out from the next query on this row. OK, so I've figured out that after many insertions plus deletion updates, my queries (column slice queries) are taking more time to be performed. Even if there are only a few columns

Re: Column Slice Query performance after deletions

2013-03-02 Thread Michael Kjellman
process. After reading it, the column is marked for deletion in order to keep it out from the next query on this row. OK, so I've figured out that after many insertions plus deletion updates, my queries (column slice queries) are taking more time to be performed. Even if there are only

Re: Column Slice Query performance after deletions

2013-03-02 Thread Edward Capriolo
. But the write-load is not that intensive. I estimate it as 100w/sec in the column family. - Each column represents a message which is read and processed by another process. After reading it, the column is marked for deletion in order to keep it out from the next query on this row. Ok, so, I've been

Re: Column Slice Query performance after deletions

2013-03-02 Thread Víctor Hugo Oliveira Molinar
query on this row. OK, so I've figured out that after many insertions plus deletion updates, my queries (column slice queries) are taking more time to be performed. Even if there are only a few columns, fewer than 100. So it looks like the larger the number of columns being

Re: Column Slice Query performance after deletions

2013-03-02 Thread Michael Kjellman
is marked for deletion in order to keep it out from the next query on this row. OK, so I've figured out that after many insertions plus deletion updates, my queries (column slice queries) are taking more time to be performed. Even if there are only a few columns, fewer than 100. So it looks

Re: Column Slice Query performance after deletions

2013-03-02 Thread Víctor Hugo Oliveira Molinar
the deleted columns away from my query fetch. Not only the tombstones. It looks like the min compaction might help on this. But I'm not sure yet on what would be a reasonable value for its threshold. On Sat, Mar 2, 2013 at 4:22 PM, Michael Kjellman mkjell...@barracuda.com wrote: Tombstones stay

Query data in a CF within a timestamp range

2013-02-28 Thread Kasun Weranga
Hi all, I have a column family with some data + timestamp values and I want to query the column family to fetch data within a timestamp range. AFAIK it is not advisable to use a secondary index on the timestamp due to its high cardinality. Is there a way to achieve this functionality? Thanks, Kasun.

Re: Query data in a CF within a timestamp range

2013-02-28 Thread Edward Capriolo
. Not an easy way around that other than sharding the reverse index. On Thu, Feb 28, 2013 at 5:49 PM, Kasun Weranga kas...@wso2.com wrote: Hi all, I have a column family with some data + timestamp values and I want to query the column family to fetch data within a timestamp range. AFAIK

Re: How to limit query results like from row 50 to 100

2013-02-21 Thread aaron morton
CQL does not support offset but does have limit. See http://www.datastax.com/docs/1.2/cql_cli/cql/SELECT#specifying-rows-returned-using-limit Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 20/02/2013, at 1:47 PM,
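Since there is LIMIT but no OFFSET, one common workaround for "rows 50 to 100" is to walk pages and resume each page from the last key seen. The sketch below is a pure in-memory simulation, not driver code: `fetch_page` only mimics what a query of the form SELECT ... WHERE token(key) > token(last_key) LIMIT n would return, and the names `fetch_page` and `rows_between` are made up for illustration.

```python
from bisect import bisect_right


def fetch_page(sorted_tokens, after=None, limit=50):
    """Simulate one LIMIT query that resumes strictly after token `after`."""
    start = 0 if after is None else bisect_right(sorted_tokens, after)
    return sorted_tokens[start:start + limit]


def rows_between(sorted_tokens, first, last, page_size=50):
    """Return rows first..last (0-based, end exclusive) by walking pages."""
    out, seen, after = [], 0, None
    while seen < last:
        page = fetch_page(sorted_tokens, after, page_size)
        if not page:
            break  # ran out of data before reaching `last`
        for row in page:
            if first <= seen < last:
                out.append(row)
            seen += 1
        after = page[-1]  # resume point for the next page
    return out


if __name__ == "__main__":
    tokens = list(range(0, 1000, 3))  # stand-in for rows in token order
    print(rows_between(tokens, 50, 100)[:3])
```

The skipped rows still have to be read, so this costs as much as fetching rows 0 to 100; remembering the resume key between requests avoids repeating that work.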

How to limit query results like from row 50 to 100

2013-02-19 Thread Mateus Ferreira e Freitas
With CQL or an API.

Re: Secondary index query + 2 Datacenters + Row Cache + Restart = 0 rows

2013-02-05 Thread Alexei Bakanov
I tried to run with tracing, but it says 'Scanned 0 rows and matched 0'. I found existing issue on this bug https://issues.apache.org/jira/browse/CASSANDRA-4973 I made a d-test for reproducing it and attached to the ticket. Alexei On 2 February 2013 23:00, aaron morton aa...@thelastpickle.com

Re: Secondary index query + 2 Datacenters + Row Cache + Restart = 0 rows

2013-02-02 Thread aaron morton
Can you run the select in cqlsh and enable tracing (see the cqlsh online help). If you can replicate it then please raise a ticket on https://issues.apache.org/jira/browse/CASSANDRA and update the email thread. Thanks - Aaron Morton Freelance Cassandra Developer New Zealand

Secondary index query + 2 Datacenters + Row Cache + Restart = 0 rows

2013-02-01 Thread Alexei Bakanov
Hello, I've found a combination that doesn't work: A column family that has a secondary index and caching='ALL' with data in two datacenters and I do a restart of the nodes, then my secondary index queries start returning 0 rows. It happens when the amount of data goes over a certain threshold, so I

Re: Perfroming simple CQL Query using pyhton db-api 2.0 fails

2013-01-24 Thread aaron morton
On 24/01/2013, at 7:14 AM, Paul van Hoven paul.van.ho...@googlemail.com wrote: I try to access my local cassandra database via python. Therefore I installed db-api 2.0 and thrift for accessing the database. Opening and closing a connection works fine. But a simple query is not working

Re: Perfroming simple CQL Query using pyhton db-api 2.0 fails

2013-01-24 Thread Paul van Hoven
On 24/01/2013, at 7:14 AM, Paul van Hoven paul.van.ho...@googlemail.com wrote: I try to access my local cassandra database via python. Therefore I installed db-api 2.0 and thrift for accessing the database. Opening and closing a connection works fine. But a simple query is not working

Perfroming simple CQL Query using pyhton db-api 2.0 fails

2013-01-23 Thread Paul van Hoven
I try to access my local cassandra database via python. Therefore I installed db-api 2.0 and thrift for accessing the database. Opening and closing a connection works fine. But a simple query is not working: The script looks like this: c = conn.cursor() c.execute(select * from users

Composite Keys Query

2013-01-17 Thread Renato Marroquín Mogrovejo
Hi all, I am using some composite keys to get just some specific composite columns names which I am using as follows: create column family video_event with comparator = 'CompositeType(UTF8Type,UTF8Type)' and key_validation_class = 'UTF8Type' and default_validation_class = 'UTF8Type';

Re: Query column names

2013-01-16 Thread Renato Marroquín Mogrovejo
family name the event name plus the timestamp of when it occurred. The thing is that now I want to find out the latest event and I don't know how to query asking for the last event without a RangeSlicesQuery, getting all rows, and columns, and asking one by one. Is there any other better way of doing

Re: Query column names

2013-01-16 Thread Renato Marroquín Mogrovejo
plus the timestamp of when it occurred. The thing is that now I want to find out the latest event and I don't know how to query asking for the last event without a RangeSlicesQuery, getting all rows, and columns, and asking one by one. Is there any other better way of doing this using Hector client

Re: Collecting of tombstones columns during read query fills up heap

2013-01-14 Thread aaron morton
Just so I understand, the file contents are *not* stored in the column value ? No, on that particular CF the columns are SuperColumns with 5 sub columns (size, is_dir, hash, name, revision). Each super column is small, I didn't mention super columns before because they don't seem to be

Collecting of tombstones columns during read query fills up heap

2013-01-10 Thread André Cruz
that there were a lot of tombstones between those 2 columns. Is there anything, other than schema changes or throttling on the application side, that I can do to prevent problems like these? Basically I would like Cassandra to stop a query if the resultset already has X items whether they are tombstones

Re: Collecting of tombstones columns during read query fills up heap

2013-01-10 Thread aaron morton
than schema changes or throttling on the application side, that I can do to prevent problems like these? Basically I would like Cassandra to stop a query if the resultset already has X items whether they are tombstones or not, and return an error. Or maybe it can stop if the resultset

Re: Collecting of tombstones columns during read query fills up heap

2013-01-10 Thread André Cruz
On Jan 10, 2013, at 8:01 PM, aaron morton aa...@thelastpickle.com wrote: So, one column represents a file in that directory and it has no value. Just so I understand, the file contents are *not* stored in the column value ? No, on that particular CF the columns are SuperColumns with 5 sub

Re: Query regarding SSTable timestamps and counts

2012-12-10 Thread B. Todd Burruss
my two cents ... i know this thread is a bit old, but the fact that odd-sized SSTABLEs (usually large ones) will hang around for a while can be very troublesome on disk space and planning. our data is temporal in cassandra, being deleted constantly. we have seen space usage in the 1+ TB range

How to query secondary indexes

2012-11-28 Thread Oren Karmi
greater than. Let's say I have a room with people and every timestamp, I measure the temperature of the room and number of people. I use the timestamp as my key and I want to select all timestamps where temperature was over 50 degrees but I can't seem to be able to do it with a regular query even

Re: How to query secondary indexes

2012-11-28 Thread Blake Eggleston
You're going to have a problem doing this in a single query because you're asking cassandra to select a non-contiguous set of rows. Also, to my knowledge, you can only use non-equality operators on clustering keys. The best solution I could come up with would be to define your table like so: CREATE

Re: Query regarding SSTable timestamps and counts

2012-11-20 Thread aaron morton
My understanding of the compaction process was that since data files keep continuously merging we should not have data files with very old last modified timestamps It is perfectly OK to have very old SSTables. But performing an upgradesstables did decrease the number of data files and

Re: Query regarding SSTable timestamps and counts

2012-11-20 Thread Edward Capriolo
On Tue, Nov 20, 2012 at 5:23 PM, aaron morton aa...@thelastpickle.com wrote: My understanding of the compaction process was that since data files keep continuously merging we should not have data files with very old last modified timestamps It is perfectly OK to have very old SSTables. But

Re: Query regarding SSTable timestamps and counts

2012-11-20 Thread Ananth Gundabattula
Thanks a lot Aaron and Edward. The mail thread clarifies some things for me. For letting others know on this thread, running an upgradesstables did decrease our bloom filter false positive ratios a lot. ( upgradesstables was run not to upgrade from a cassandra version to a higher cassandra

Re: Query regarding SSTable timestamps and counts

2012-11-20 Thread aaron morton
upgradesstables re-writes every sstable to have the same contents in the newest format. Agree. In the world of compaction, and excluding upgrades, having older sstables is expected. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton

Re: Collections, query for contains?

2012-11-19 Thread Edward Capriolo
This was my first question after I got the inserts working. Hive has udfs like array contains. It also has lateral view syntax that is similar to transposed. On Monday, November 19, 2012, Timmy Turner timm.t...@gmail.com wrote: Is there no option to query for the contents of a collection

Re: Collections, query for contains?

2012-11-19 Thread Sylvain Lebresne
lateral view syntax that is similar to transposed. On Monday, November 19, 2012, Timmy Turner timm.t...@gmail.com wrote: Is there no option to query for the contents of a collection? Something like select * from cf where c_list contains('some_value') or select * from cf where

Re: Query regarding SSTable timestamps and counts

2012-11-19 Thread Rob Coli
On Sun, Nov 18, 2012 at 7:57 PM, Ananth Gundabattula agundabatt...@gmail.com wrote: As per the above url, After running a major compaction, automatic minor compactions are no longer triggered, frequently requiring you to manually run major compactions on a routine basis. ( Just before the

Re: Query regarding SSTable timestamps and counts

2012-11-18 Thread aaron morton
As per datastax documentation, a manual compaction forces the admin to start compaction manually and disables the automated compaction (at least for major compactions but not minor compactions) It does not disable compaction. It creates one big file, which will not be compacted until there

Re: Query regarding SSTable timestamps and counts

2012-11-18 Thread Ananth Gundabattula
Hello Aaron, Thanks a lot for the reply. Looks like the documentation is confusing. Here is the link I am referring to: http://www.datastax.com/docs/1.1/operations/tuning#tuning-compaction It does not disable compaction. As per the above url, After running a major compaction, automatic

Re: Strange delay in query

2012-11-13 Thread aaron morton
I don't think that statement is accurate. Which part ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 13/11/2012, at 6:31 AM, Binh Nguyen binhn...@gmail.com wrote: I don't think that statement is accurate. The minor compaction is still

Re: Strange delay in query

2012-11-13 Thread André Cruz
On Nov 13, 2012, at 8:54 AM, aaron morton aa...@thelastpickle.com wrote: I don't think that statement is accurate. Which part ? Probably this part: After running a major compaction, automatic minor compactions are no longer triggered, frequently requiring you to manually run major compactions

Re: Strange delay in query

2012-11-13 Thread J. D. Jordan
Correct On Nov 13, 2012, at 5:21 AM, André Cruz andre.c...@co.sapo.pt wrote: On Nov 13, 2012, at 8:54 AM, aaron morton aa...@thelastpickle.com wrote: I don't think that statement is accurate. Which part ? Probably this part: After running a major compaction, automatic minor compactions

Re: Strange delay in query

2012-11-13 Thread aaron morton
Minor compactions will still be triggered whenever a size tier gets 4+ sstables (for the default compaction strategy). So it does not affect new data. It just takes longer for the biggest size tier to get to 4 files. So it takes longer to compact the big output from the major compaction.
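The "size tier gets 4+ sstables" rule above can be sketched as a toy model. This is a simplification and not Cassandra's implementation: it assumes sstables are grouped into a tier when their size falls within 0.5x to 1.5x of the tier's running average, and that a tier becomes a compaction candidate once it holds at least `min_threshold` files. The function names are invented for the illustration.

```python
def bucket_by_size(sizes, low=0.5, high=1.5):
    """Group sstable sizes into tiers of roughly similar size."""
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            # No existing tier is close enough in size: start a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes, min_threshold=4):
    """Tiers that hold enough similar-sized sstables to be compacted."""
    return [b for b in bucket_by_size(sizes) if len(b) >= min_threshold]


if __name__ == "__main__":
    # Four small flushed sstables plus one big output of a major compaction (MB).
    print(compaction_candidates([10, 11, 9, 12, 5000]))
```

In this model the 5000 MB file sits alone in its tier, so it is left untouched until three more files of comparable size exist, while the small files keep compacting as usual.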

Re: Strange delay in query

2012-11-12 Thread Binh Nguyen
I don't think that statement is accurate. The minor compaction is still triggered for small sstables but for the big sstables it may or may not. By default Cassandra will wait until it finds 4 sstables of the same size to trigger the compaction so if the sstables are big then it may take a while

Re: Strange delay in query

2012-11-11 Thread André Cruz
On Nov 11, 2012, at 12:01 AM, Binh Nguyen binhn...@gmail.com wrote: FYI: Repair does not remove tombstones. To remove tombstones you need to run compaction. If you have a lot of data then make sure you run compaction on all nodes before running repair. We had big trouble with our system

Re: Strange delay in query

2012-11-11 Thread aaron morton
If you have a long lived row with a lot of tombstones or overwrites, it's often more efficient to select a known list of columns. There are short circuits in the read path that can avoid older tombstones filled fragments of the row being read. (Obviously this is hard to do if you don't know the

Re: Strange delay in query

2012-11-10 Thread Binh Nguyen
FYI: Repair does not remove tombstones. To remove tombstones you need to run compaction. If you have a lot of data then make sure you run compaction on all nodes before running repair. We had big trouble with our system regarding tombstones and it took us a long time to figure out the reason. It
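A rough model of why compaction, not repair, is what gets rid of tombstones, and why reads slow down in the meantime. This sketch is a simplification under stated assumptions: a tombstone is dropped during compaction only once it is older than the grace period (`gc_grace_seconds`, assumed here to be the 10-day default of 864000 seconds), and it ignores the extra checks a real node performs against other sstables. The `Column` class and both functions are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Column:
    name: str
    value: Optional[str]       # None for a tombstone
    deleted_at: Optional[int]  # deletion time in seconds, None if live


def compact(columns, now, gc_grace_seconds=864000):
    """Keep live columns and tombstones still inside the grace period."""
    kept = []
    for col in columns:
        is_tombstone = col.deleted_at is not None
        if is_tombstone and now - col.deleted_at > gc_grace_seconds:
            continue  # old enough to be purged
        kept.append(col)
    return kept


def live_columns(columns):
    """What a slice query returns: it must still step over every tombstone."""
    return [c for c in columns if c.deleted_at is None]


if __name__ == "__main__":
    day = 86400
    row = [Column("a", "1", None), Column("b", None, 0), Column("c", None, 5 * day)]
    after = compact(row, now=11 * day)
    print([c.name for c in after], [c.name for c in live_columns(after)])
```

With these inputs the tombstone deleted at time 0 is purged after 11 days, while the one deleted on day 5 survives compaction and still has to be scanned by every read of the row.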

Re: Strange delay in query

2012-11-09 Thread André Cruz
That must be it. I dumped the sstables to json and there are lots of records, including ones that are returned to my application, that have the deletedAt attribute. I think this is because the regular repair job was not running for some time, surely more than the grace period, and lots of

Re: Strange delay in query

2012-11-08 Thread André Cruz
On Nov 7, 2012, at 12:15 PM, André Cruz andre.c...@co.sapo.pt wrote: This error also happens on my application that uses pycassa, so I don't think this is the same bug. I have narrowed it down to a slice between two consecutive columns. Observe this behaviour using pycassa:

Re: Strange delay in query

2012-11-08 Thread Andrey Ilinykh
What is the size of columns? Probably those two are huge. On Thu, Nov 8, 2012 at 4:01 AM, André Cruz andre.c...@co.sapo.pt wrote: On Nov 7, 2012, at 12:15 PM, André Cruz andre.c...@co.sapo.pt wrote: This error also happens on my application that uses pycassa, so I don't think this is the

Re: Strange delay in query

2012-11-08 Thread Josep Blanquer
Can it be that you have tons and tons of tombstoned columns in the middle of these two? I've seen plenty of performance issues with wide rows littered with column tombstones (you could check with dumping the sstables...) Just a thought... Josep M. On Thu, Nov 8, 2012 at 12:23 PM, André Cruz

Re: Strange delay in query

2012-11-07 Thread André Cruz
this issue on all 3 nodes. Also, I have a replication factor of 3. 2. What's the result when querying without limit? This row has 600k columns. I issued a count, and after some 10s: [disco@Disco] count NamespaceRevision[3cd88d97-ffde-44ca-8ae9-5336caaebc4e]; 609054 columns 3. What's the result

Re: Strange delay in query

2012-11-06 Thread Chuan-Heng Hsiao
query without limit? 3. What's the result after doing nodetool repair -pr on that particular column family and that node? btw, there seems to be some minor bug in the 1.1.5 cassandra-cli (but not in 1.1.6). I got an error msg after creating an empty keyspace and updating the replication factor to 3

Re: How does Cassandra optimize this query?

2012-11-05 Thread Sylvain Lebresne
On Mon, Nov 5, 2012 at 4:12 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Is this query the equivalent of a full table scan? Without a starting point get_range_slice is just starting at token 0? It is, but that's what you asked for after all. If you want to start at a given token you can

Re: How does Cassandra optimize this query?

2012-11-05 Thread Edward Capriolo
I see. It is fairly misleading because it is a query that does not work at scale. This syntax is only helpful if you have fewer than a few thousand rows in Cassandra. On Mon, Nov 5, 2012 at 12:24 PM, Sylvain Lebresne sylv...@datastax.com wrote: On Mon, Nov 5, 2012 at 4:12 PM, Edward Capriolo

Re: How does Cassandra optimize this query?

2012-11-05 Thread Sylvain Lebresne
On Mon, Nov 5, 2012 at 6:55 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I see. It is fairly misleading because it is a query that does not work at scale. This syntax is only helpful if you have fewer than a few thousand rows in Cassandra. Just for the sake of argument, how

Re: How does Cassandra optimize this query?

2012-11-05 Thread Edward Capriolo
from executing some queries that are not efficient, yet it allows this one. If I am new to Cassandra and developing, this query works and produces a result then once my database gets real data produces a different result (likely an empty one). When I first saw this query two things came to my mind

Re: How does Cassandra optimize this query?

2012-11-05 Thread Sylvain Lebresne
. It is misleading because it is not useful in any other context besides someone playing around with a ten row table in cqlsh. CQL stops me from executing some queries that are not efficient, yet it allows this one. If I am new to Cassandra and developing, this query works and produces a result

How does Cassandra optimize this query?

2012-11-04 Thread Edward Capriolo
If we create a column family: CREATE TABLE videos ( videoid uuid, videoname varchar, username varchar, description varchar, tags varchar, upload_date timestamp, PRIMARY KEY (videoid,videoname) ); The CLI views this column like so: create column family videos with column_type =

Re: How does Cassandra optimize this query?

2012-11-04 Thread Sylvain Lebresne
On Sun, Nov 4, 2012 at 7:49 PM, Edward Capriolo edlinuxg...@gmail.com wrote: CQL3 Allows me to search the second component of a primary key. Which really just seems to be component 1 of a composite column. So what thrift operation does this correspond to? This looks like a column slice
