[ https://issues.apache.org/jira/browse/CASSANDRA-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mateusz Moneta updated CASSANDRA-8824:
--------------------------------------
Description:
When we query a partition that has a static column and more than 5000 entities
(rows) through the Python driver, some of the returned rows have the static
value unset (None). Querying the same partition with cqlsh returns the correct
value.
Here is an example; {{expire}} is a static column and {{folder_id}} is the partition key.
{noformat}
cqlsh> select id, parent_id, expire, mtime from share.entity where folder_id='68f2af3a2d1e4f95a231d5cb47e57cf2' and mtime < '2015-02-01 06:21:25+0000';

 id                               | parent_id | expire                   | mtime
----------------------------------+-----------+--------------------------+--------------------------
 68f2af3a2d1e4f95a231d5cb47e57cf2 |      null | 2015-02-22 10:51:27+0000 | 2015-02-01 06:21:24+0000

cqlsh> select count(*) from share.entity where folder_id='68f2af3a2d1e4f95a231d5cb47e57cf2';

 count
-------
  5547

In [1]: from django.db import connection
In [2]: ses = connection.connection.session
In [3]: from cassandra.query import SimpleStatement
In [13]: query = "select * from share.entity where folder_id='68f2af3a2d1e4f95a231d5cb47e57cf2'";
In [14]: st = SimpleStatement(query)
In [15]: c, d = 0, 0
In [16]: for e in ses.execute(st):
   ....:     if e['expire'] is None:
   ....:         c += 1
   ....:     else:
   ....:         d += 1
   ....:
In [17]: c
Out[17]: 547
In [18]: d
Out[18]: 5000
{noformat}
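The counting loop in the session above can be factored into a small helper, so the same check can be run against any result set. This is only a sketch: {{count_unset}} is a hypothetical name, and rows are assumed to be dict-like, as they appear to be in the session ({{e['expire']}}). Plain dicts stand in for driver rows here.

```python
def count_unset(rows, column):
    """Return (unset, populated) counts for `column` over dict-like rows."""
    unset = 0
    populated = 0
    for row in rows:
        if row[column] is None:
            unset += 1
        else:
            populated += 1
    return unset, populated


# Stand-in data (not from the report): two rows carry the static value,
# one row has it missing.
sample_rows = [
    {"id": "a", "expire": "2015-02-22 10:51:27+0000"},
    {"id": "b", "expire": "2015-02-22 10:51:27+0000"},
    {"id": "c", "expire": None},
]

print(count_unset(sample_rows, "expire"))
```

With the session from the report, the equivalent call would be {{count_unset(ses.execute(st), 'expire')}}.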
After further digging, it turned out that this is a problem with the fetch_size
parameter, and it can be easily reproduced:
{noformat}
In [1]: from cassandra.query import SimpleStatement
In [2]: from django.db import connection
In [3]: ses = connection.connection.session
In [4]: ses.execute(SimpleStatement("create table t (k text, s text static, i int, primary key(k, i));"))
In [5]: for i in range(1, 500):
   ....:     ses.execute(SimpleStatement("insert into t (k, i) values ('k', %d);" % i))
   ....:
In [6]: c, d = 0, 0
In [7]: for e in ses.execute(SimpleStatement("select * from t", fetch_size=100)):
   ....:     if e['s'] is None:
   ....:         c += 1
   ....:     else:
   ....:         d += 1
   ....:
In [8]: c
Out[8]: 400
In [9]: d
Out[9]: 100
{noformat}
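Both sets of counts fit one simple pattern: the number of rows with a populated static value equals exactly one page ({{fetch_size}}), and all remaining rows come back as None. The sketch below only does that arithmetic. It is an observation about the reported numbers, not a statement about the driver's or server's internals. {{first_page_split}} is a hypothetical helper, and the 5000 in the first case assumes the Python driver's default fetch size, since the first query does not set one.

```python
def first_page_split(total_rows, fetch_size):
    """Split a row count into (rows in the first page, rows in later pages)."""
    first = min(total_rows, fetch_size)
    return first, total_rows - first


# First report: 5547 rows, no explicit fetch_size (assumed default of 5000).
print(first_page_split(5547, 5000))  # (5000, 547), matching d=5000, c=547

# Reproduction: 500 rows counted in total, fetch_size=100.
print(first_page_split(500, 100))    # (100, 400), matching d=100, c=400
```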
was:
When we query a partition that has a static column and more than 5000 entities
(rows) through the Python driver, some of the returned rows have the static
value unset (None). Querying the same partition with cqlsh returns the correct
value.
Here is an example; {{expire}} is a static column and {{folder_id}} is the partition key.
{noformat}
cqlsh> select id, parent_id, expire, mtime from share.entity where folder_id='68f2af3a2d1e4f95a231d5cb47e57cf2' and mtime < '2015-02-01 06:21:25+0000';

 id                               | parent_id | expire                   | mtime
----------------------------------+-----------+--------------------------+--------------------------
 68f2af3a2d1e4f95a231d5cb47e57cf2 |      null | 2015-02-22 10:51:27+0000 | 2015-02-01 06:21:24+0000

cqlsh> select count(*) from share.entity where folder_id='68f2af3a2d1e4f95a231d5cb47e57cf2';

 count
-------
  5547

In [1]: from django.db import connection
In [2]: ses = connection.connection.session
In [3]: from cassandra.query import SimpleStatement
In [13]: query = "select * from share.entity where folder_id='68f2af3a2d1e4f95a231d5cb47e57cf2'";
In [14]: st = SimpleStatement(query)
In [15]: c, d = 0, 0
In [16]: for e in ses.execute(st):
   ....:     if e['expire'] is None:
   ....:         c += 1
   ....:     else:
   ....:         d += 1
   ....:
In [17]: c
Out[17]: 547
In [18]: d
Out[18]: 5000
{noformat}
After further digging, it turned out that this is a problem with the fetch_size
parameter, and it can be easily reproduced:
{noformat}
In [1]: from cassandra.query import SimpleStatement
In [2]: from django.db import connection
In [3]: ses = connection.connection.session
In [4]: ses.execute(SimpleStatement("create table t (k text, s text static, i int, primary key(k, i));"))
In [5]: for i in range(1, 500):
   ....:     ses.execute(SimpleStatement("insert into share.t (k, i) values ('k', %d);" % i))
   ....:
In [6]: c, d = 0, 0
In [7]: for e in ses.execute(SimpleStatement("select * from share.t", fetch_size=100)):
   ....:     if e['s'] is None:
   ....:         c += 1
   ....:     else:
   ....:         d += 1
   ....:
In [8]: c
Out[8]: 400
In [9]: d
Out[9]: 100
{noformat}
> cassandra python driver returns None when querying static column on partition
> bigger than 5000 entities
> -----------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-8824
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8824
> Project: Cassandra
> Issue Type: Bug
> Reporter: Mateusz Moneta
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)