[
https://issues.apache.org/jira/browse/CASSANDRA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168849#comment-13168849
]
Sylvain Lebresne commented on CASSANDRA-3143:
---------------------------------------------
{quote}
bq. Preceding point apart, we would at least need a way to deactivate row
caching on a per-cf basis. We may also want that for key cache, though this
seems less critical. My initial idea would be to either have a boolean flag if
we only want to allow disabling row cache, or some multi-value caches option
that could be "none", "key_only", "row_only" or "all".
This is going to be moved to the separate task.
{quote}
I'm not a fan of that idea. We just cannot release this without a way to
deactivate the row cache as this would make the row cache unusable for most
users. IMHO, that's a good definition of something that should not be moved to
a separate task.
{quote}
bq. Why does the getRowCacheKeysToSave() option disappeared?
Because we don't control that anymore, rely on cache LRU policy instead.
{quote}
I don't understand how "relying on cache LRU policy" has anything to do with
that. The initial motivation for that option is that people don't want to
reload the full extend of the row cache on restart because it takes forever,
but they don't want to start with cold caches either. I don't see how making
the cache global changes anything on that. I agree that the number of row cache
key to save should now be a global option, but I disagree that it should be
removed.
Otherwise:
* The code around CFS.prepareRowForCaching is weird. First the comment seems to
suggest that prepareRowForCaching is used exclusively from CacheService while
it's use below in cacheRow. It also adds a copy of the columns which I don't
think is necessary since we already copy in MappedFileDataInput. Overall I'm
not sure prepareRowForCaching is useful and CacheService.readSavedRowCache
could use cacheRow directly
* I don't think CacheService.reloadKeyCache works correctly. It only populate
the cache with fake values that won't get updated unless you reload the
sstables, which has no reason to happen. I'm fine removing the key cache
reloading altogether, but as an alternative, why not save the value of the key
cache too? The thing is, I'm not very comfortable with the current 'two phase'
key cache loading: if we ever have a bug in the SSTReader.load method, the
actual pre-loading with -1 values will be harmful, which seems unnecessarily
fragile. Saving the values on disk would avoid that.
* Talking of the key cache save, the format used by the patch is really really
not compact. For each key we save the full path to the sstable, which can
easily be > 50 bytes. Maybe we could associate an int to each descriptor during
the save and save the association of descriptor -> id separately. * Still
worth allowing to chose how may keys to save
* The cache sizings don't take the keys into account. For the row cache, one
could make the argument that the overhead of the keys is negligible compared to
the values. For the key cache however, the key are bigger than the values.
* The patch mistakenly remove the help for 'nodetool upgradesstables' (in
NodeCmd.java)
* Would be worth adding a global cache log line in StatusLogger.
* Patch wrongly reintroduces memtable_operations and memtable_throughput to
CliHelp.
* The default row cache provider since 1.0 is the serializing one, this patch
sets the ConcurrentLinkedHashCacheProvider instead.
And a number of nits:
* In CFS, it's probably faster/simpler to use metadata.cfId rather than
Schema.instance.getId(table.name, this.columnFamily)
* In CacheService, calling scheduleSaving with -1 as second argument would be
slightly faster than using Integer.MAX_VALUE.
* In SSTableReader.cacheKey, the assert {{key.key == null}} is useless in trunk
(DK with key == null can't be constructed).
* In AbstractCassandraDaemon, there's a unecessary import of
javax.management.RuntimeErrorException
* There is some comments duplication in the yaml file
* I wonder if the reduce cache capacity thing still makes sense after this
patch?
* In AutosavingCache, I think we could declare AutoSavingCache<K extends
CacheKey, V> and get rid of the translateKey() method
> Global caches (key/row)
> -----------------------
>
> Key: CASSANDRA-3143
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3143
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Pavel Yaskevich
> Assignee: Pavel Yaskevich
> Priority: Minor
> Labels: Core
> Fix For: 1.1
>
> Attachments: 0001-global-key-cache.patch,
> 0002-global-row-cache-and-ASC.readSaved-changed-to-abstra.patch,
> 0003-CacheServiceMBean-and-correct-key-cache-loading.patch,
> 0004-key-row-cache-tests-and-tweaks.patch,
> 0005-cleanup-of-the-CFMetaData-and-thrift-avro-CfDef-and-.patch,
> 0006-row-key-cache-improvements-according-to-Sylvain-s-co.patch
>
>
> Caches are difficult to configure well as ColumnFamilies are added, similar
> to how memtables were difficult pre-CASSANDRA-2006.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira