[ 
https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021387#comment-13021387
 ] 

Jonathan Ellis commented on CASSANDRA-2401:
-------------------------------------------

The more I think about it, the more I think that there is a rare race condition 
here -- we do a kind of row lock during updates of indexed data, but we do not 
lock during reads. So it's possible for an index read to say "row X has this 
value" and then have that value deleted (by another client's request) before we 
can read row X.
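
To make the interleaving concrete, here is a minimal sketch that replays it in 
a single thread. It is a hypothetical in-memory model, not Cassandra code: the 
class IndexRaceSketch and its methods are invented for illustration, and one 
object-wide lock stands in for the per-row locking done on updates.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model: a map of rows plus a secondary index from value to row keys.
class IndexRaceSketch {
    private final Map<String, String> rows = new HashMap<>();
    private final Map<String, Set<String>> index = new HashMap<>();

    // Writers take the lock, so a row and its index entry change together.
    synchronized void put(String key, String value) {
        delete(key);
        rows.put(key, value);
        index.computeIfAbsent(value, v -> new HashSet<>()).add(key);
    }

    synchronized void delete(String key) {
        String old = rows.remove(key);
        if (old != null) {
            Set<String> keys = index.get(old);
            if (keys != null) {
                keys.remove(key);
            }
        }
    }

    // Read step 1 (no lock): snapshot the keys the index reports for a value.
    List<String> indexLookup(String value) {
        return new ArrayList<>(index.getOrDefault(value, Collections.emptySet()));
    }

    // Read step 2 (no lock): fetch the row; null if it was deleted after step 1.
    String readRow(String key) {
        return rows.get(key);
    }

    // Replays the race deterministically and returns the keys that survive.
    static List<String> demo() {
        IndexRaceSketch store = new IndexRaceSketch();
        store.put("rowX", "pdf");
        List<String> candidates = store.indexLookup("pdf"); // index says rowX matches
        store.delete("rowX");                               // another client's delete
        List<String> found = new ArrayList<>();
        for (String key : candidates) {
            if (store.readRow(key) == null) {
                continue; // stale index hit: the row vanished between the two steps
            }
            found.add(key);
        }
        return found;
    }
}
```

In this model the stale hit is transient: a second indexLookup after the delete 
no longer reports rowX.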

But that does not look like what you are seeing: if I understand correctly, 
in your case the index has permanently missed a delete operation.

> getColumnFamily() return null, which is not checked in ColumnFamilyStore.java 
> scan() method, causing Timeout Exception in query
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2401
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2401
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.7.4
>         Environment: Hector 0.7.0-28, Cassandra 0.7.4, Windows 7, Eclipse
>            Reporter: Tey Kar Shiang
>
> Near line 1680 of ColumnFamilyStore.java, in "ColumnFamily data = 
> getColumnFamily(new QueryFilter(dk, path, firstFilter))", data is returned 
> as null. This causes a null pointer exception in "satisfies(data, clause, 
> primary)" that is not caught. The callback then times out and a timeout 
> exception is returned to Hector.
> As far as I could trace it, the data is empty: the column count is 0 in 
> removeDeletedCF(), which returns null there. (I am new and still trying to 
> understand the logic.) Instead of crashing on the null, could we skip such 
> data?
> About my test:
> A stress-test program adds, modifies and deletes data in the keyspace. 30 
> threads simulate concurrent users performing these actions and periodically 
> query all rows. The column family has one row per file, with indexed columns 
> (e.g. userID, fileType).
> There was no issue on the first day of the test, after which the test was 
> stopped for 3 days. When I restarted it on the 4th day, one of the users 
> failed to query the files (a timeout exception was received). Most of the 
> users can still query successfully.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
