[ 
https://issues.apache.org/jira/browse/CASSANDRA-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mateusz Moneta updated CASSANDRA-8824:
--------------------------------------
    Description: 
When we query a partition that has a static column and more than 5000 entities, 
some of the rows come back with the static value unset. Querying the same data 
from cqlsh works fine.

Here is an example; {{expire}} is a static column and {{folder_id}} is the primary key.
{noformat}
cqlsh> select id, parent_id, expire, mtime from share.entity where folder_id='68f2af3a2d1e4f95a231d5cb47e57cf2' and mtime < '2015-02-01 06:21:25+0000';

 id                               | parent_id | expire                   | mtime
----------------------------------+-----------+--------------------------+--------------------------
 68f2af3a2d1e4f95a231d5cb47e57cf2 |      null | 2015-02-22 10:51:27+0000 | 2015-02-01 06:21:24+0000

cqlsh> select count(*) from share.entity where folder_id='68f2af3a2d1e4f95a231d5cb47e57cf2';
 count
-------
  5547

In [1]: from django.db import connection

In [2]: ses = connection.connection.session

In [3]: from cassandra.query import SimpleStatement

In [13]: query = "select * from share.entity where folder_id='68f2af3a2d1e4f95a231d5cb47e57cf2'";

In [14]: st = SimpleStatement(query)

In [15]: c, d = 0, 0

In [16]: for e in ses.execute(st):
    if e['expire'] is None:
        c += 1
    else:
        d += 1

In [17]: c
Out[17]: 547

In [18]: d
Out[18]: 5000

{noformat}
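
The counting loop above can be factored into a small helper so that the same check can be run against any result set. This is only a sketch: {{count_unset}} is a hypothetical name, and it assumes rows are dict-like, as they are in the session above where {{e['expire']}} works.

```python
def count_unset(rows, column):
    """Return (unset, present) counts for `column` over dict-like rows.

    `rows` can be any iterable, so a paged driver result set can be
    passed in directly and is consumed one row at a time.
    """
    unset = 0
    present = 0
    for row in rows:
        if row[column] is None:
            unset += 1
        else:
            present += 1
    return unset, present
```

Called as {{count_unset(ses.execute(st), 'expire')}} against the session from the transcript, it should give the same pair of counts as {{c}} and {{d}} there.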

After further digging, it turned out that this is a problem with the fetch_size 
parameter, and it can be easily reproduced:

{noformat}
In [1]: from cassandra.query import SimpleStatement

In [2]: from django.db import connection

In [3]: ses = connection.connection.session

In [4]: ses.execute(SimpleStatement("create table t (k text, s text static, i int, primary key(k, i));"))

In [5]: for i in range(1, 500):
   ....:     ses.execute(SimpleStatement("insert into share.t (k, i) values ('k', %d);" % i))

In [6]: c, d = 0, 0

In [7]: for e in ses.execute(SimpleStatement("select * from share.t", fetch_size=100)):
    if e['s'] is None:
        c += 1
    else:
        d += 1
   ....:         

In [8]: c
Out[8]: 400

In [9]: d
Out[9]: 100

{noformat}
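
Both sets of numbers line up with page boundaries. The sketch below splits a row count into pages of {{fetch_size}} rows; the helper name {{page_sizes}} is made up for illustration. It assumes the first query ran with the Python driver's default fetch size of 5000, which the transcript does not state explicitly. In both cases the number of rows with a populated static value equals the size of the first page. That suggests, but does not prove, that only the first page carries the static column.

```python
def page_sizes(total_rows, fetch_size):
    """Split total_rows into consecutive pages of at most fetch_size rows."""
    if fetch_size <= 0:
        raise ValueError("fetch_size must be positive")
    full, rest = divmod(total_rows, fetch_size)
    return [fetch_size] * full + ([rest] if rest else [])


# First session: 5547 rows, assumed default fetch size of 5000.
first = page_sizes(5547, 5000)
# Second session: 500 rows counted (c + d), explicit fetch_size=100.
second = page_sizes(500, 100)

# Rows on the first page vs. rows on all later pages.
print(first[0], sum(first[1:]))    # 5000 547
print(second[0], sum(second[1:]))  # 100 400
```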


> cassandra python driver returns None when querying static column on partition 
> bigger than 5000 entities
> -----------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8824
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8824
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Mateusz Moneta



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
