[ 
https://issues.apache.org/jira/browse/CASSANDRA-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716167#comment-14716167
 ] 

Stefania commented on CASSANDRA-10109:
--------------------------------------

[~benedict], I'm about to start work on this. I would like to spell out the 
algorithm to avoid confusion.

{code}
Up to max_attempts:
- List all files.
- Read the txn logs; on error, retry.
- If all tracked files are present: the txn should be in progress, otherwise 
retry. If the txn is in progress, return NEW files as temporary and OLD files 
as final (the client must handle OLD files disappearing if the txn completes 
after we've read the txn logs).
- If some files are missing:
-- If only OLD files are missing: the txn should be committed, otherwise retry. 
If the txn is committed, return NEW files as final and do not return OLD files.
-- If only NEW files are missing: the txn should be rolled back, otherwise retry 
(there is a window in which new files are tracked before they are created). If 
the txn is rolled back, return OLD files as final and do not return NEW files.
-- If both OLD and NEW files are missing: retry (or panic ??)
{code}

Existing logic after MAX_ATTEMPTS have failed stays the same.
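
To make the cases concrete, here is a rough Python sketch of the steps above. It is only an illustration, not the actual patch: the names (TxnState, Retry, classify, list_files), the callback signatures and the default of 5 attempts are all hypothetical, it only classifies files tracked by the txn log, and returning None stands in for falling back to the existing logic.

```python
from enum import Enum


class TxnState(Enum):
    # Hypothetical states a txn log could report.
    IN_PROGRESS = "in progress"
    COMMITTED = "committed"
    ROLLED_BACK = "rolled back"


class Retry(Exception):
    """The directory listing and the txn log disagree; list again."""


def classify(present, old, new, state):
    """One attempt: return (final, temporary) sets of tracked files.

    present: files seen in the directory listing
    old, new: OLD and NEW files tracked by the txn log
    state: TxnState read from the txn log
    Raises Retry when the listing is inconsistent with the log.
    """
    missing_old = old - present
    missing_new = new - present

    if not missing_old and not missing_new:
        # All tracked files present: only valid while the txn is in progress.
        if state is TxnState.IN_PROGRESS:
            return old, new
        raise Retry

    if missing_old and not missing_new:
        # Only OLD files missing: expected once the txn has committed.
        if state is TxnState.COMMITTED:
            return new, set()
        raise Retry

    if missing_new and not missing_old:
        # Only NEW files missing: expected once the txn has rolled back.
        if state is TxnState.ROLLED_BACK:
            return old, set()
        raise Retry

    # Both OLD and NEW files missing: retry (the open "or panic" question).
    raise Retry


def list_files(list_dir, read_txn_log, max_attempts=5):
    """Retry loop. list_dir and read_txn_log are assumed callbacks."""
    for _ in range(max_attempts):
        present = set(list_dir())
        try:
            old, new, state = read_txn_log()
            return classify(present, set(old), set(new), state)
        except (OSError, Retry):
            continue  # read error or inconsistent snapshot: try again
    return None  # assumption: the caller falls back to the existing logic
```

With OLD = {a} and NEW = {b}, a listing of {a, b} plus a COMMITTED log is treated as a stale snapshot and retried; a later listing of just {b} returns b as final.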

bq. We should make behaviour in the case of missing files optional, as in many 
cases missing the new files would be fine,

Can you elaborate a bit more on this? I did not understand why it is necessary. 

> Windows dtest 3.0: ttl_test.py failures
> ---------------------------------------
>
>                 Key: CASSANDRA-10109
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10109
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Joshua McKenzie
>            Assignee: Stefania
>              Labels: Windows
>             Fix For: 3.0.0 rc1
>
>
> ttl_test.py:TestTTL.update_column_ttl_with_default_ttl_test2
> ttl_test.py:TestTTL.update_multiple_columns_ttl_test
> ttl_test.py:TestTTL.update_single_column_ttl_test
> Errors locally are different from those on CI yesterday. Yesterday on CI we 
> had timeouts and general node hangs. Today, on all 3 tests when run locally, I see:
> {noformat}
> Traceback (most recent call last):
>   File "c:\src\cassandra-dtest\dtest.py", line 532, in tearDown
>     raise AssertionError('Unexpected error in %s node log: %s' % (node.name, 
> errors))
> AssertionError: Unexpected error in node1 node log: ['ERROR [main] 2015-08-17 
> 16:53:43,120 NoSpamLogger.java:97 - This platform does not support atomic 
> directory streams (SecureDirectoryStream); race conditions when loading 
> sstable files could occurr']
> {noformat}
> This traces back to the commit for CASSANDRA-7066 today by [~Stefania] and 
> [~benedict].  Stefania - care to take this ticket and also look further into 
> whether or not we're going to have issues with 7066 on Windows? That error 
> message certainly *sounds* like it's not a good thing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
