Re: CQL : Date comparison in where clause fails

2013-02-03 Thread Paul van Hoven
Thanks for the answer. If I understand that correctly, I have to do the
following to repair my query:

cqlsh:demodb> select * from ola where date > '2013-01-01' and date =
'2013-01-01' limit 10;
Bad Request: datum cannot be restricted by more than one relation if
it includes an Equal
Perhaps you meant to use CQL 2? Try using the -2 option when starting cqlsh.

So this still fails. Therefore I'm not sure whether I misunderstand
the issue or whether this does not solve my problem.

2013/2/3 Manu Zhang owenzhang1...@gmail.com:
 On Sun 03 Feb 2013 07:36:58 AM CST, Paul van Hoven wrote:

 I've got a table that has a column called date. I created an index on
 the column date with the following command:

 CREATE INDEX date_key ON ola (date);

 Now, I can perform the following command:

 select * from ola where date = '2013-01-01' limit 10;

 The results are correctly displayed.

 But the following command fails:
 cqlsh:demodb> select * from ola where date > '2013-01-01' limit 10;
 Bad Request: No indexed columns present in by-columns clause with Equal
 operator
 Perhaps you meant to use CQL 2? Try using the -2 option when starting
 cqlsh.

 The same happens when using
 cqlsh:demodb> select * from ola where date >= '2013-01-01' limit 10;
 Bad Request: No indexed columns present in by-columns clause with Equal
 operator
 Perhaps you meant to use CQL 2? Try using the -2 option when starting
 cqlsh.

 Why does this happen?


 Because a query on a secondary index needs at least one relation that
 uses the EQ operator on an indexed column. There is a similar question
 in an earlier thread, and as Sylvain pointed out,
 https://issues.apache.org/jira/browse/CASSANDRA-4476 may finally solve it.


CQL : Request did not complete within rpc_timeout

2013-02-03 Thread Paul van Hoven
After figuring out how to use the > operator on a secondary index, I
noticed that in a column family of about 5.5 million rows I get an
rpc_timeout when trying to read data from this table. Specifically,
I want to request data younger than January 1, 2013. About 1 million
rows should match. When running the query I get a timeout error:

cqlsh:demodb> select * from ola where date > '2013-01-01' and hour = 0
limit 10 allow filtering;
Request did not complete within rpc_timeout.

I find this very confusing, since I would expect an exceptional
performance gain in comparison to a similar SQL query. Therefore I
think the query I'm performing is not appropriate for Cassandra,
although this is how I would write it for an SQL database. So my
question now is: how should I perform this query on Cassandra?


Re: CQL : Request did not complete within rpc_timeout

2013-02-03 Thread Paul van Hoven
Okay, here is the schema (it is actually in German; in the query I
translated the table and column names so that it is easier to read for
an international audience):

cqlsh:demodb> describe table offerten_log_archiv;

CREATE TABLE offerten_log_archiv (
  offerte_id int PRIMARY KEY,
  aktionen int,
  angezeigt bigint,
  datum timestamp,
  gutschrift bigint,
  kampagne_id int,
  klicks int,
  klicks_ungueltig int,
  kosten bigint,
  statistik_id bigint,
  stunden int,
  werbeflaeche_id int,
  werbemittel_id int
) WITH
  bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.00 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.10 AND
  replicate_on_write='true' AND
  compaction={'class': 'SizeTieredCompactionStrategy'};

CREATE INDEX datum_key ON offerten_log_archiv (datum);

CREATE INDEX stunden_key ON offerten_log_archiv (stunden);

cqlsh:demodb>

This is the query I'm trying to perform:
cqlsh:demodb> select * from ola where date > '2013-01-01' and hour = 0
limit 10 allow filtering;
Request did not complete within rpc_timeout.

ola = offerten_log_archiv (table name)
hour = stunden (column name)
date = datum (column name)

I hope this information makes my problem clearer.



2013/2/3 Edward Capriolo edlinuxg...@gmail.com:
 Without seeing your schema it is hard to say, but in some cases ALLOW
 FILTERING might be read as "EXPECT THIS COULD BE SLOW". It could
 mean the query is not hitting an index and is going to page through
 large amounts of data.
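
To illustrate the point about paging through large amounts of data, here is a toy in-memory model in Python. It is not how Cassandra stores or reads data; the row layout and the names make_rows, filtered_scan, build_index and keyed_lookup are all invented for this sketch. It only shows why a filtered scan with a small LIMIT may still have to examine many rows, while a lookup by an exact key does not.

```python
from collections import defaultdict


def make_rows(n):
    """Generate n toy rows; day and hour cycle deterministically."""
    return [{"id": i, "day": i % 30, "hour": i % 24} for i in range(n)]


def filtered_scan(rows, min_day, hour, limit):
    """Walk the rows in order and filter each one until `limit` rows match.

    Returns the matches and the number of rows that had to be examined.
    """
    matches, examined = [], 0
    for row in rows:
        examined += 1
        if row["day"] > min_day and row["hour"] == hour:
            matches.append(row)
            if len(matches) == limit:
                break
    return matches, examined


def build_index(rows):
    """Group rows by the exact (day, hour) pair, like an equality lookup key."""
    index = defaultdict(list)
    for row in rows:
        index[(row["day"], row["hour"])].append(row)
    return index


def keyed_lookup(index, day, hour, limit):
    """Fetch up to `limit` rows for one exact (day, hour) key, without scanning."""
    return index.get((day, hour), [])[:limit]


if __name__ == "__main__":
    rows = make_rows(1000)
    found, examined = filtered_scan(rows, min_day=20, hour=0, limit=3)
    print("scan returned %d rows after examining %d" % (len(found), examined))
    print("lookup returned %d rows" % len(keyed_lookup(build_index(rows), 24, 0, 3)))
```

In this model the scan cost depends on where the matching rows happen to sit, which is the same reason a query can look fast on a 10-row test table and time out on millions of rows.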



Re: CQL : Request did not complete within rpc_timeout

2013-02-03 Thread Paul van Hoven
I'm not sure if I understood your answer.

 When you have GB or TB of data, any query that needs ALLOW FILTERING
 will not work at scale.
1. Do you mean that any query that requires ALLOW FILTERING is slow?

 Secondary indexes need at least one equality. If you want to do this
 at scale you might need a different design.
2. And what design would you recommend then?

3. What should the query look like so that it scales?



2013/2/3 Edward Capriolo edlinuxg...@gmail.com:
 Secondary indexes need at least one equality. If you want to do this
 at scale you might need a different design.

 Using ALLOW FILTERING and LIMIT 10 simply grabs the first few
 arbitrary rows that match your criteria.

 When you have GB or TB of data, any query that needs ALLOW FILTERING
 will not work at scale.

 This is why it was added to the language. CQL lets you run some queries
 that seem fast when you're developing with 10 rows; without this
 clause you would not know whether a query is fast because it hits a
 Cassandra index or just because the results were found in
 the first 10 rows.

 Edward
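
One possible reading of "a different design" is sketched below in Python; it is an assumption for illustration, not something proposed in the thread. The idea is to store the data in a second table whose partition key is the (day, hour) pair, so that every read uses only equality on the key and a date range becomes one query per day. The table offer_log_by_hour, its columns, and the helpers day_buckets and bucket_queries are hypothetical names.

```python
from datetime import date, timedelta

# Hypothetical table: one partition per (day, hour), rows ordered by offer_id.
# The names are illustrative and do not come from the thread.
CREATE_TABLE = """
CREATE TABLE offer_log_by_hour (
  day text,
  hour int,
  offer_id int,
  clicks int,
  PRIMARY KEY ((day, hour), offer_id)
);
"""


def day_buckets(start, end):
    """Return every day from start to end (inclusive) as 'YYYY-MM-DD' strings."""
    days = (end - start).days
    return [(start + timedelta(days=n)).isoformat() for n in range(days + 1)]


def bucket_queries(start, end, hour, limit=10):
    """Build one equality-only SELECT per day bucket in the range."""
    template = ("SELECT * FROM offer_log_by_hour "
                "WHERE day = '{day}' AND hour = {hour} LIMIT {limit};")
    return [template.format(day=d, hour=int(hour), limit=int(limit))
            for d in day_buckets(start, end)]


if __name__ == "__main__":
    for query in bucket_queries(date(2013, 1, 1), date(2013, 1, 3), hour=0):
        print(query)
```

The trade-off in this sketch is that the application writes each row to the extra table and issues several small queries instead of one range query; whether that fits depends on the real access pattern.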



Re: CQL : Request did not complete within rpc_timeout

2013-02-03 Thread Paul van Hoven
Thanks for the answer. Can anybody else answer my other two questions,
because my problem is not solved yet?

2013/2/3 Edward Capriolo edlinuxg...@gmail.com:
 This was the issue that prompted ALLOW FILTERING:

 https://issues.apache.org/jira/browse/CASSANDRA-4915

 Cassandra's storage system can only optimize certain queries.



CQL : Date comparison in where clause fails

2013-02-02 Thread Paul van Hoven
I've got a table that has a column called date. I created an index on
the column date with the following command:

CREATE INDEX date_key ON ola (date);

Now, I can perform the following command:

select * from ola where date = '2013-01-01' limit 10;

The results are correctly displayed.

But the following command fails:
cqlsh:demodb> select * from ola where date > '2013-01-01' limit 10;
Bad Request: No indexed columns present in by-columns clause with Equal operator
Perhaps you meant to use CQL 2? Try using the -2 option when starting cqlsh.

The same happens when using
cqlsh:demodb> select * from ola where date >= '2013-01-01' limit 10;
Bad Request: No indexed columns present in by-columns clause with Equal operator
Perhaps you meant to use CQL 2? Try using the -2 option when starting cqlsh.

Why does this happen?


cql: show tables in a keystone

2013-01-28 Thread Paul van Hoven
Is there some way in CQL to get a list of all tables (column
families) that belong to a keyspace, like SHOW TABLES in SQL?
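
The thread has no reply, so the following is only a hedged sketch. In cqlsh, DESCRIBE TABLES lists the tables of the current keyspace. From client code, Cassandra 1.2-era releases expose the schema in the system.schema_columnfamilies table; later releases moved it, so treat the table name as an assumption tied to that version. The helper list_tables_query is a hypothetical name and only builds the statement.

```python
import re


def list_tables_query(keyspace):
    """Build a SELECT that lists the column families of one keyspace.

    Assumes the Cassandra 1.2-era system.schema_columnfamilies table.
    The keyspace name is validated because it is interpolated directly.
    """
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", keyspace):
        raise ValueError("not a plain keyspace name: %r" % keyspace)
    return ("SELECT columnfamily_name FROM system.schema_columnfamilies "
            "WHERE keyspace_name = '%s';" % keyspace)


if __name__ == "__main__":
    print(list_tables_query("demodb"))
```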


Re: Perfroming simple CQL Query using pyhton db-api 2.0 fails

2013-01-24 Thread Paul van Hoven
The reason for the error was that I opened the connection to the
database incorrectly.

I did:
con = cql.connect(host, port, keyspace)

but the correct call is:
con = cql.connect(host, port, keyspace, cql_version='3.0.0')

Now it works fine. Thanks for reading.
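
A small Python sketch of the fix above, so the CQL version cannot be forgotten again. The call shape connect(host, port, keyspace, cql_version=...) is taken from this message; the wrapper connect_cql3 and the stand-in fake_connect are hypothetical and exist only so the sketch runs without the driver installed.

```python
def connect_cql3(connect, host, port, keyspace, cql_version="3.0.0"):
    """Open a connection while always passing cql_version explicitly.

    `connect` is the driver's connect function (cql.connect in this thread).
    """
    return connect(host, port, keyspace, cql_version=cql_version)


def fake_connect(host, port, keyspace, cql_version=None):
    """Stand-in for cql.connect, used only to demonstrate the call."""
    return {"host": host, "port": port, "keyspace": keyspace,
            "cql_version": cql_version}


if __name__ == "__main__":
    print(connect_cql3(fake_connect, "localhost", 9160, "demodb"))
```

With the driver installed, the same wrapper would be called as connect_cql3(cql.connect, host, port, keyspace).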

2013/1/24 aaron morton aa...@thelastpickle.com:
 How did you create the table?

 Anyways that looks like a bug, I *think* they should go here
 http://code.google.com/a/apache-extras.org/p/cassandra-dbapi2/issues/list

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com





Perfroming simple CQL Query using pyhton db-api 2.0 fails

2013-01-23 Thread Paul van Hoven
I am trying to access my local Cassandra database via Python. For that
I installed db-api 2.0 and Thrift. Opening and closing a connection
works fine, but a simple query does not work.

The script looks like this:

# conn is the connection opened earlier with cql.connect(...)
c = conn.cursor()
c.execute("select * from users;")
data = c.fetchall()
print("Query: select * from users; returned the following result:")
print(str(data))


The table users looks like this:
cqlsh:demodb> select * from users;

 user_name | birth_year | gender | password | session_token | state
-----------+------------+--------+----------+---------------+-------
    jsmith |       null |   null |   secret |          null |  null



But when I try to execute it I get the following error:
Open connection to localhost:9160 on keyspace demodb
Traceback (most recent call last):
  File "/Users/Tom/Freelancing/Company/Python/ApacheCassandra/src/CassandraDemo.py", line 56, in <module>
    perfromSimpleCQLQuery()
  File "/Users/Tom/Freelancing/Company/Python/ApacheCassandra/src/CassandraDemo.py", line 46, in perfromSimpleCQLQuery
    c.execute("select * from users;")
  File "/Library/Python/2.7/site-packages/cql/cursor.py", line 81, in execute
    return self.process_execution_results(response, decoder=decoder)
  File "/Library/Python/2.7/site-packages/cql/thrifteries.py", line 116, in process_execution_results
    self.get_metadata_info(self.result[0])
  File "/Library/Python/2.7/site-packages/cql/cursor.py", line 97, in get_metadata_info
    name, nbytes, vtype, ctype = self.get_column_metadata(colid)
  File "/Library/Python/2.7/site-packages/cql/cursor.py", line 104, in get_column_metadata
    return self.decoder.decode_metadata_and_type(column_id)
  File "/Library/Python/2.7/site-packages/cql/decoders.py", line 45, in decode_metadata_and_type
    name = self.name_decode_error(e, namebytes, comptype.cql_parameterized_type())
  File "/Library/Python/2.7/site-packages/cql/decoders.py", line 29, in name_decode_error
    % (namebytes, expectedtype, err))
cql.apivalues.ProgrammingError: column name '\x00\x00\x00' can't be
deserialized as 'org.apache.cassandra.db.marshal.CompositeType':
global name 'self' is not defined

I'm not sure if this is the right place to ask, but am I doing
something wrong here?


Creating a keyspace fails

2013-01-22 Thread Paul van Hoven
I just started with Cassandra. Currently I'm reading the following
tutorial about CQL:
http://www.datastax.com/docs/1.1/dml/using_cql#use-cql

But I already fail when trying to create a keyspace:


$ ./cqlsh --cql3
Connected to Test Cluster at localhost:9160.
[cqlsh 2.3.0 | Cassandra 1.2.0 | CQL spec 3.0.0 | Thrift protocol 19.35.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE demodb WITH strategy_class = 'SimpleStrategy'
AND strategy_options:replication_factor='1';
Bad Request: line 1:82 mismatched input ':' expecting '='
Perhaps you meant to use CQL 2? Try using the -2 option when starting cqlsh.


What is wrong?


Re: Creating a keyspace fails

2013-01-22 Thread Paul van Hoven
Okay, that worked. Why is the statement from the tutorial wrong? I
mean, why would a company like DataStax post something like this?

2013/1/22 Jason Wee peich...@gmail.com:
 cqlsh> CREATE KEYSPACE demodb WITH replication = {'class': 'SimpleStrategy',
 'replication_factor': 3};
 cqlsh> use demodb;
 cqlsh:demodb>







Re: Creating a keyspace fails

2013-01-22 Thread Paul van Hoven
Alright. Thanks for your quick help. :)

2013/1/22 Jason Wee peich...@gmail.com:
 Maybe it is a typo, or they forgot to update the docs. Anyway, you can
 use the HELP command when you are in cqlsh, for example:

 cqlsh> HELP CREATE_KEYSPACE;

 CREATE KEYSPACE ksname
 WITH replication = {'class':'strategy' [,'option':val]};
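
For scripts that create keyspaces, the map syntax shown by HELP can be assembled with a small Python helper. This is only a sketch: create_keyspace_cql is a hypothetical name, and it assumes option values are integers such as replication_factor.

```python
def create_keyspace_cql(name, strategy="SimpleStrategy", **options):
    """Build a CQL 3 CREATE KEYSPACE statement using the replication map.

    Option values are assumed to be integers (e.g. replication_factor=1).
    """
    parts = ["'class': '%s'" % strategy]
    for key in sorted(options):
        parts.append("'%s': %d" % (key, options[key]))
    return "CREATE KEYSPACE %s WITH replication = {%s};" % (name, ", ".join(parts))


if __name__ == "__main__":
    print(create_keyspace_cql("demodb", replication_factor=1))
```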


