[ 
https://issues.apache.org/jira/browse/CASSANDRA-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15888084#comment-15888084
 ] 

Benjamin Roth commented on CASSANDRA-13279:
-------------------------------------------

I can understand your concern about the deployment issues of centralized 
settings in a non-centralized settings file.

But I have to disagree on the second point. By "somewhat hidden" I don't mean
that it does not exist, but that an average user won't come across the
documentation or the valuable information (why should I tweak that?) related
to it.
It is very difficult to find the right resource / doc in the CS ecosystem.
There is DataStax, there is the official CS site (which contains a lot of TODOs
and empty pages), wiki.apache.org (looks very outdated) and there are zillions
of scattered resources like blogs all over the net. Finding the right
information (as a new user) is like looking for the famous needle in the
haystack. You have been a user / developer since the early days and know every
corner of the CS universe, but for new users it is hard to get an overview, and
much of it is 'somewhat hidden'.

To be honest:
When I first installed and tested CS, I was totally lost. I had to test a lot,
read many many many different resources, and go through the hell of trial and
error, analyzing, debugging, compiling and testing again with a lot of pain to
get the knowledge I have today. Tweaking chunk_size was much the same. I
tried a lot of stuff, posted on lists, ... and after some days I was like
"Wait, there was this setting in DevCenter with that 'chunk_size', what exactly
does it do and what happens if ... AAAAAH it works!".

How about creating a structure in the official Cassandra docs with use cases
and Q&A for performance tuning?
Something like a structured version of Al Tobey's tuning guide with a
Problem > Solution section.


> Table default settings file
> ---------------------------
>
>                 Key: CASSANDRA-13279
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13279
>             Project: Cassandra
>          Issue Type: Wish
>          Components: Configuration
>            Reporter: Romain Hardouin
>            Priority: Minor
>              Labels: config, documentation
>
> Following CASSANDRA-13241 we often see that there is no one-size-fits-all 
> value for settings. We can't find a sweet spot for every use case.
> It's true for settings in cassandra.yaml but as [~brstgt] said for 
> {{chunk_length_in_kb}}: "this is somewhat hidden for the average user". 
> Many table settings are somewhat hidden for the average user. Some people 
> will think RTFM but if a file - say tables.yaml - contains default values for 
> table settings, more people would pay attention to them. And of course this 
> file could contain useful comments and guidance. 
> Example with SSTable compression options:
> {code}
> # General comments about sstable compression
> compression:
>     # First of all: explain what it is. We split each SSTable into chunks,
>     # etc.
>     # Explain when users should lower this value (e.g. 4) or when a higher
>     # value like 64 or 128 is recommended.
>     # Explain the trade-off between read latency and off-heap compression
>     # metadata size.
>     chunk_length_in_kb: 16
>
>     # List of available compressors: LZ4Compressor, SnappyCompressor, and
>     # DeflateCompressor
>     # Explain trade-offs, some specific use cases (e.g. archives), etc.
>     class: 'LZ4Compressor'
>
>     # If you want to disable compression by default, uncomment the following
>     # line
>     #enabled: false
> {code}
> So instead of hard coded values we would end up with something like 
> TableConfig + TableDescriptor à la Config + DatabaseDescriptor.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
