Marco Matarazzo created CASSANDRA-5501:
------------------------------------------
Summary: Missing data on SELECT on secondary index
Key: CASSANDRA-5501
URL: https://issues.apache.org/jira/browse/CASSANDRA-5501
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.2.4
Environment: linux ubuntu 12.04
Reporter: Marco Matarazzo
We have a 3-node cluster and a keyspace with RF = 3.
From cassandra-cli everything is fine (we never actually use it; I just
launched it as a check in this particular case).
[default@goh_master] get agents where station_id = ascii(1110129);
-------------------
RowKey: 6c8efeb6-7209-11e2-890a-aacc00000216
=> (column=, value=, timestamp=1364580868176000)
=> (column=character_points, value=, timestamp=1361030686890000)
=> (column=component_id, value=0, timestamp=1364580868176000)
=> (column=corporation_id, value=3efc729e-7209-11e2-890a-aacc00000216,
timestamp=1361030686890000)
=> (column=entity_id, value=0, timestamp=1364580868176000)
=> (column=manufacturing, value=, timestamp=1361030686890000)
=> (column=model, value=500005, timestamp=1361030686890000)
=> (column=name, value=Jenny Olifield, timestamp=1361030686890000)
=> (column=name_check, value=jenny_olifield, timestamp=1361030686890000)
=> (column=station_id, value=1110129, timestamp=1364580868176000)
=> (column=stats_intellect, value=8, timestamp=1361030686890000)
=> (column=stats_reflexes, value=8, timestamp=1361030686890000)
=> (column=stats_stamina, value=7, timestamp=1361030686890000)
=> (column=stats_technology, value=7, timestamp=1361030686890000)
=> (column=trading, value=, timestamp=1361030686890000)
-------------------
RowKey: dc413373-6b06-11e2-8943-aacc00000216
=> (column=, value=, timestamp=1366568185220000)
=> (column=character_points, value=100, timestamp=1364580381651000)
=> (column=component_id, value=, timestamp=1364580381651000)
=> (column=corporation_id, value=574934cc-6b06-11e2-a512-aacc00000200,
timestamp=1364580381651000)
=> (column=entity_id, value=0, timestamp=1364580381651000)
=> (column=manufacturing, value=, timestamp=1364580381651000)
=> (column=model, value=500018, timestamp=1364580381651000)
=> (column=name, value=Darren Matar, timestamp=1364580381651000)
=> (column=name_check, value=darren_matar, timestamp=1364580381651000)
=> (column=station_id, value=1110129, timestamp=1364580381651000)
=> (column=stats_intellect, value=10, timestamp=1364580381651000)
=> (column=stats_reflexes, value=10, timestamp=1364580381651000)
=> (column=stats_stamina, value=10, timestamp=1364580381651000)
=> (column=stats_technology, value=10, timestamp=1364580381651000)
=> (column=trading, value=1, timestamp=1366568185220000)
-------------------
RowKey: 0e7074ac-64bd-11e2-8c38-aacc00000201
=> (column=, value=, timestamp=1364828039093000)
=> (column=character_points, value=, timestamp=1361030686760000)
=> (column=component_id, value=0, timestamp=1364828039093000)
=> (column=corporation_id, value=e398294e-64bc-11e2-8c38-aacc00000201,
timestamp=1361030686760000)
=> (column=entity_id, value=0, timestamp=1364828039093000)
=> (column=manufacturing, value=1, timestamp=1362517535613000)
=> (column=model, value=500008, timestamp=1361030686760000)
=> (column=name, value=Tom Bishop, timestamp=1361030686760000)
=> (column=name_check, value=tom_bishop, timestamp=1361030686760000)
=> (column=station_id, value=1110129, timestamp=1364828039093000)
=> (column=stats_intellect, value=9, timestamp=1361030686760000)
=> (column=stats_reflexes, value=7, timestamp=1361030686760000)
=> (column=stats_stamina, value=5, timestamp=1361030686760000)
=> (column=stats_technology, value=9, timestamp=1361030686760000)
=> (column=trading, value=, timestamp=1361030686760000)
-------------------
RowKey: 1b462f09-65f3-4148-a1a6-536b52b3bcfa
=> (column=, value=, timestamp=1366568185096000)
=> (column=character_points, value=100, timestamp=1364580381537000)
=> (column=component_id, value=, timestamp=1364580381537000)
=> (column=corporation_id, value=1d2a8803-d139-4b50-85eb-92cb1082de2e,
timestamp=1364580381537000)
=> (column=entity_id, value=0, timestamp=1364580381537000)
=> (column=manufacturing, value=, timestamp=1364580381537000)
=> (column=model, value=500003, timestamp=1364580381537000)
=> (column=name, value=Andrea Len, timestamp=1364580381537000)
=> (column=name_check, value=andrea_len, timestamp=1364580381537000)
=> (column=station_id, value=1110129, timestamp=1364580381537000)
=> (column=stats_intellect, value=10, timestamp=1364580381537000)
=> (column=stats_reflexes, value=10, timestamp=1364580381537000)
=> (column=stats_stamina, value=10, timestamp=1364580381537000)
=> (column=stats_technology, value=10, timestamp=1364580381537000)
=> (column=trading, value=1, timestamp=1366568185096000)
4 Rows Returned.
From CQLSH, however, the result is different, and 2 rows are missing.
cqlsh:goh_master> select agent_id,name,station_id from agents where
station_id='1110129';
agent_id | name | station_id
--------------------------------------+----------------+------------
6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield | 1110129
0e7074ac-64bd-11e2-8c38-aacc00000201 | Tom Bishop | 1110129
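To make the discrepancy explicit, the row keys from the two outputs above can be diffed. This is only an illustrative sketch: the keys are copied by hand from the cassandra-cli and cqlsh output in this report, and the helper name missing_from_index is made up for the example. It does not talk to the cluster.

```python
# Row keys copied from the cassandra-cli "get agents where station_id = ..." output (4 rows).
cli_keys = {
    "6c8efeb6-7209-11e2-890a-aacc00000216",
    "dc413373-6b06-11e2-8943-aacc00000216",
    "0e7074ac-64bd-11e2-8c38-aacc00000201",
    "1b462f09-65f3-4148-a1a6-536b52b3bcfa",
}

# agent_id values copied from the cqlsh SELECT on station_id (2 rows).
cqlsh_keys = {
    "6c8efeb6-7209-11e2-890a-aacc00000216",
    "0e7074ac-64bd-11e2-8c38-aacc00000201",
}


def missing_from_index(expected, indexed):
    """Return the keys present in `expected` but absent from `indexed`, sorted."""
    return sorted(set(expected) - set(indexed))


missing = missing_from_index(cli_keys, cqlsh_keys)
for key in missing:
    print(key)
```

The two keys it reports are the rows for Andrea Len and Darren Matar, which cassandra-cli returns but the cqlsh index query does not.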
cqlsh:goh_master> select agent_id, name, station_id from agents where agent_id
= '1b462f09-65f3-4148-a1a6-536b52b3bcfa';
agent_id | name | station_id
--------------------------------------+------------+------------
1b462f09-65f3-4148-a1a6-536b52b3bcfa | Andrea Len | 1110129
Updating one column makes the single row reappear in the index, but only for
that row and that column/index.
cqlsh:goh_master> update agents set station_id = '1110129' where agent_id =
'1b462f09-65f3-4148-a1a6-536b52b3bcfa';
cqlsh:goh_master> select agent_id,name,station_id from agents where
station_id='1110129';
agent_id | name | station_id
--------------------------------------+----------------+------------
6c8efeb6-7209-11e2-890a-aacc00000216 | Jenny Olifield | 1110129
0e7074ac-64bd-11e2-8c38-aacc00000201 | Tom Bishop | 1110129
1b462f09-65f3-4148-a1a6-536b52b3bcfa | Andrea Len | 1110129
Updating one column does not make the whole row reappear in all indexes (as
might be expected), only in the index on the updated column.
cqlsh:goh_master> select * from agents where name = 'Andrea Len';
cqlsh:goh_master>
Running nodetool rebuild_index on all three nodes apparently DOES NOT fix the
problem, and neither does nodetool repair.
We also used COPY TO to dump the entire row to check for hidden spaces or
anything like that, but we can't see anything:
....
dc413373-6b06-11e2-8943-aacc00000216,100,,574934cc-6b06-11e2-a512-aacc00000200,0,,500018,Darren Matar,darren_matar,1110129,10,10,10,10,1
1b462f09-65f3-4148-a1a6-536b52b3bcfa,100,,1d2a8803-d139-4b50-85eb-92cb1082de2e,0,,500003,Andrea Len,andrea_len,1110129,10,10,10,10,1
....
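The visual check on the dump can also be done mechanically. The sketch below is an assumption about how one might do it, not what we actually ran: it parses the two rows as pasted above (with the mail line wrap undone) and flags any field with leading or trailing whitespace or non-printable characters. The function name suspicious_fields is hypothetical, and the check only sees the text as it appears in this message, so anything lost in copy and paste would not show up.

```python
import csv
import io

# The two rows from the COPY TO dump, as pasted in this report.
dump = (
    "dc413373-6b06-11e2-8943-aacc00000216,100,,574934cc-6b06-11e2-a512-aacc00000200,"
    "0,,500018,Darren Matar,darren_matar,1110129,10,10,10,10,1\n"
    "1b462f09-65f3-4148-a1a6-536b52b3bcfa,100,,1d2a8803-d139-4b50-85eb-92cb1082de2e,"
    "0,,500003,Andrea Len,andrea_len,1110129,10,10,10,10,1\n"
)


def suspicious_fields(text):
    """Return (row, column, repr(value)) for each CSV field that has
    leading/trailing whitespace or contains non-printable characters."""
    findings = []
    for row_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        for col_no, value in enumerate(row, start=1):
            if value != value.strip() or not value.isprintable():
                findings.append((row_no, col_no, repr(value)))
    return findings


findings = suspicious_fields(dump)
print(findings if findings else "no hidden whitespace or control characters found")
```

On the two rows shown here it reports nothing, which matches what we saw by eye.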
The problem still persists, so if needed I am available to run whatever checks
would help diagnose it.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira