[jira] [Commented] (CASSANDRA-4190) Apparent data loss using super columns and row cache via ConcurrentLinkedHashCacheProvider
    [ https://issues.apache.org/jira/browse/CASSANDRA-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262637#comment-13262637 ]

Jonathan Ellis commented on CASSANDRA-4190:
-------------------------------------------

The Javadoc also needs an update; otherwise +1.

> Apparent data loss using super columns and row cache via ConcurrentLinkedHashCacheProvider
> ------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4190
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4190
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.9, 1.1.0
>         Environment: Linux 2.6.27
>            Reporter: Mina Naguib
>            Assignee: Sylvain Lebresne
>              Labels: ConcurrentLinkedHashCacheProvider, cache, supercolumns
>             Fix For: 1.0.10, 1.1.1
>
>         Attachments: 4190.txt
>
>
> Tested on a vanilla single-node cassandra 1.0.9 installation.
> When using super columns along with row caching via
> ConcurrentLinkedHashCacheProvider (the default if no JNA is available, or when
> explicitly configured even if JNA is available), there is what appears to be
> transient data loss.
> Given this script executed in cassandra-cli:
> {quote}
> create keyspace Test;
> use Test;
> create column family Users with column_type='Super' and
> key_validation_class='UTF8Type' and comparator='UTF8Type' and
> subcomparator='UTF8Type' and default_validation_class='UTF8Type' and
> rows_cached=75000 and row_cache_provider='ConcurrentLinkedHashCacheProvider';
> set Users['mina']['attrs']['name'] = 'Mina';
> get Users['mina'];
> set Users['mina']['attrs']['country'] = 'Canada';
> get Users['mina'];
> set Users['mina']['attrs']['region'] = 'Quebec';
> get Users['mina'];
> {quote}
> The output from the 3 gets above is as follows:
> {quote}
> => (super_column=attrs,
>      (column=name, value=Mina, timestamp=1335377788441000))
> Returned 1 results.
> {quote}
> {quote}
> => (super_column=attrs,
>      (column=name, value=Mina, timestamp=1335377788441000))
> Returned 1 results.
> {quote}
> {quote}
> => (super_column=attrs,
>      (column=name, value=Mina, timestamp=1335377788441000))
> Returned 1 results.
> {quote}
> The columns written by the second and third set commands (country, region)
> are clearly missing from the returned results.
> If the row cache is explicitly invalidated (in a second terminal, via
> `nodetool -h localhost invalidaterowcache Test Users`), the missing data
> reappears on the next 'get':
> {quote}
> [default@Test] get Users['mina'];
> => (super_column=attrs,
>      (column=country, value=Canada, timestamp=1335377839592000)
>      (column=name, value=Mina, timestamp=1335377788441000)
>      (column=region, value=Quebec, timestamp=1335377871353000))
> Returned 1 results.
> {quote}
> From cursory checks, this does not appear to happen with regular columns,
> nor with JNA enabled + SerializingCacheProvider.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4190) Apparent data loss using super columns and row cache via ConcurrentLinkedHashCacheProvider
    [ https://issues.apache.org/jira/browse/CASSANDRA-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262005#comment-13262005 ]

Mina Naguib commented on CASSANDRA-4190:
----------------------------------------

I can confirm that testing with cassandra 1.1.0 leads to the same conclusion.

To reproduce against cassandra 1.1.0, edit cassandra.yaml and set:
{quote}
row_cache_provider: ConcurrentLinkedHashCacheProvider
row_cache_size_in_mb: 200
{quote}

And use this slightly updated script to accommodate the 1.1.0 changes:
{quote}
create keyspace Test;
use Test;
create column family Users with column_type='Super' and key_validation_class='UTF8Type' and comparator='UTF8Type' and subcomparator='UTF8Type' and default_validation_class='UTF8Type' and caching='ALL';
set Users['mina']['attrs']['name'] = 'Mina';
get Users['mina'];
set Users['mina']['attrs']['country'] = 'Canada';
get Users['mina'];
set Users['mina']['attrs']['region'] = 'Quebec';
get Users['mina'];
{quote}

The rest of the observations are the same as with the cassandra 1.0.9 test.

> Apparent data loss using super columns and row cache via ConcurrentLinkedHashCacheProvider
> ------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4190
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4190
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.9
>         Environment: Linux 2.6.27
>            Reporter: Mina Naguib
>              Labels: ConcurrentLinkedHashCacheProvider, cache, supercolumns
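
The reported symptom (writes invisible until `invalidaterowcache` is run, then all present) is consistent with the writes reaching storage while reads keep being served from a stale cached copy of the row. The following is only a toy Python model of that symptom, not Cassandra code and not a statement of the actual root cause; `ToyStore` and all of its methods are hypothetical names invented for illustration.

```python
class ToyStore:
    """Hypothetical toy model: a backing store with a read-through row cache."""

    def __init__(self):
        self.disk = {}       # stands in for durable storage
        self.row_cache = {}  # stands in for the row cache

    def set(self, key, column, value):
        # The write reaches storage, but the cached copy of the row is
        # deliberately left untouched to mimic the reported symptom.
        self.disk.setdefault(key, {})[column] = value

    def get(self, key):
        # Read-through: populate the cache on a miss, then always serve
        # the cached copy.
        if key not in self.row_cache:
            self.row_cache[key] = dict(self.disk.get(key, {}))
        return self.row_cache[key]

    def invalidate_row_cache(self):
        # Plays the role of `nodetool invalidaterowcache` in this model.
        self.row_cache.clear()


store = ToyStore()
store.set("mina", "name", "Mina")
first = dict(store.get("mina"))        # caches the row with only 'name'
store.set("mina", "country", "Canada")
store.set("mina", "region", "Quebec")
stale = dict(store.get("mina"))        # still served from the stale cache
store.invalidate_row_cache()
fresh = dict(store.get("mina"))        # re-read from storage
print(first, stale, fresh, sep="\n")
```

In this model nothing is ever lost from `disk`, which matches the report's description of the loss as transient: invalidating the cache forces the next read to rebuild the row from storage.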