Thanks, Robert, for the explanation.

 

Please correct me if I am wrong.

 

I am currently running a single-node Cassandra cluster. The object_id column
is the primary key in both the RDBMS and Cassandra.

 

As you correctly pointed out, the RDBMS does not need to touch the base
table. It can just scan the primary key B-tree index to count the rows:

 

 

       |ROOT:EMIT Operator (VA = 2)
       |
       |   |SCALAR AGGREGATE Operator (VA = 1)
       |   |  Evaluate Ungrouped COUNT AGGREGATE.
       |   |
       |   |   |SCAN Operator (VA = 0)
       |   |   |  FROM TABLE
       |   |   |  t
       |   |   |  Using Clustered Index.
       |   |   |  Index : t_ui
       |   |   |  Forward Scan.
       |   |   |  Positioning at index start.
       |   |   |  Index contains all needed columns. Base table will not be read.
       |   |   |  Using I/O Size 64 Kbytes for index leaf pages.
       |   |   |  With LRU Buffer Replacement Strategy for index leaf pages.

 

 

Total estimated I/O cost for statement 1 (at line 1): 144996.

 

 

-----------

      300000

 

 

Whereas Cassandra has to retrieve every row and count them all, saving only
the cost of sending the rows back to the client?

 

What are the alternatives, if any, to make it faster?

 

 

Cheers,

 

 

Mich Talebzadeh

 

http://talebzadehmich.wordpress.com

 

Author of the books "A Practitioner's Guide to Upgrading to Sybase ASE 15",
ISBN 978-0-9563693-0-7. 

co-author "Sybase Transact SQL Guidelines Best Practices", ISBN
978-0-9759693-0-4

Publications due shortly:

Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and
Coherence Cache

Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume
one out shortly

 

NOTE: The information in this email is proprietary and confidential. This
message is for the designated recipient only, if you are not the intended
recipient, you should destroy it immediately. Any information in this
message shall not be understood as given or endorsed by Peridale Ltd, its
subsidiaries or their employees, unless expressly so stated. It is the
responsibility of the recipient to ensure that this email is virus free,
therefore neither Peridale Ltd, its subsidiaries nor their employees accept
any responsibility.

 

From: Robert Wille [mailto:rwi...@fold3.com] 
Sent: 22 April 2015 15:00
To: user@cassandra.apache.org
Subject: Re: OperationTimedOut in selerct count statement in cqlsh

 

I should have been more clear. What I meant was that it's about the same
amount of work for the cluster to do a "select count(l)" as it is to do a
"select l" (unlike in the RDBMS world, where count(l) can use the primary
key index). The reason is that the coordinator has to retrieve all the rows
from all the nodes and count them. The only thing you're saving is that the
rows don't have to be sent to the client.
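
Since every row has to be read either way, one workaround sometimes used is to break the count into several smaller counts over token sub-ranges and add up the results on the client, so that no single request has to scan the whole table. The following is only a sketch under assumptions: it assumes the default Murmur3Partitioner token range, the helper names token_ranges and count_queries are made up for illustration, and it only builds the CQL strings for table t and key object_id rather than running them.

```python
# Assumed token bounds for Murmur3Partitioner (not stated in this thread).
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def token_ranges(splits):
    """Return `splits` contiguous (start, end) pairs covering the token range."""
    if splits < 1:
        raise ValueError("splits must be at least 1")
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + (span * i) // splits for i in range(splits + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def count_queries(table, key, splits):
    """Build one count statement per token sub-range (hypothetical helper).

    The first range uses >= so the lowest token is included; every other
    range is (start, end], so no token falls into two ranges.
    """
    queries = []
    for i, (start, end) in enumerate(token_ranges(splits)):
        lower = ">=" if i == 0 else ">"
        queries.append(
            f"SELECT count(*) FROM {table} "
            f"WHERE token({key}) {lower} {start} AND token({key}) <= {end}"
        )
    return queries


if __name__ == "__main__":
    # Print the statements; summing their results gives the total row count.
    for query in count_queries("t", "object_id", 4):
        print(query)
```

Each statement can then be run separately (from cqlsh or a driver) and the partial counts summed. The total work for the cluster is the same; only the size of each individual request shrinks.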

 

I heard from another Cassandra user that they found "select l" to be faster
than "select count(l)". I don't know why that would be, but I've seen
stranger things.

 

Robert

 

On Apr 22, 2015, at 7:49 AM, Mich Talebzadeh <m...@peridale.co.uk> wrote:





Thanks Robert,

 

In an RDBMS, select count(1) simply returns the number of rows.

 

1> select count(1) from t

2> go

 

-----------

      300000

 

(1 row affected)

 

Is count(1) fundamentally different in Cassandra?

 

Does count(1) mean returning (in my case) 1 three hundred thousand times?

 

Cheers,

 

 

Mich Talebzadeh

 


 

From: Robert Wille [mailto:rwi...@fold3.com] 
Sent: 22 April 2015 14:44
To: user@cassandra.apache.org
Subject: Re: OperationTimedOut in selerct count statement in cqlsh

 

Keep in mind that "select count(l)" and "select l" amount to essentially the
same thing.

 

On Apr 22, 2015, at 3:41 AM, Tommy Stendahl <tommy.stend...@ericsson.com> wrote:






Hi,

Check out CASSANDRA-8899; my guess is that you have to increase the timeout
in cqlsh.

/Tommy
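
As a rough illustration of raising the cqlsh timeout, the sketch below renders a cqlshrc fragment with a timeout set in the [connection] section. The option name is an assumption: it has differed between cqlsh releases (client_timeout in some older ones, request_timeout in later ones), so check the cqlshrc.sample shipped with your version. The helper name cqlshrc_with_timeout and the value of 120 seconds are made up for the example.

```python
import configparser
import io


def cqlshrc_with_timeout(seconds, option="request_timeout"):
    """Render a cqlshrc fragment setting a timeout under [connection].

    `option` is an assumption; the key name depends on the cqlsh version.
    """
    if seconds <= 0:
        raise ValueError("timeout must be positive")
    config = configparser.ConfigParser()
    config["connection"] = {option: str(seconds)}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Paste the printed fragment into ~/.cassandra/cqlshrc (merge by hand
    # if a [connection] section already exists).
    print(cqlshrc_with_timeout(120))
```

A longer timeout only lets the count finish; it does not make the query itself any cheaper.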

On 2015-04-22 11:15, Mich Talebzadeh wrote:

Hi,

 

I have a table of 300,000 rows.

 

When I try to do a simple

 

cqlsh:ase> select count(1) from t;

OperationTimedOut: errors={}, last_host=127.0.0.1

 

I would appreciate any feedback.

 

Thanks,

 

Mich

 

 


 
