[18/34] cassandra git commit: Reorganize document

slebresne Mon, 27 Jun 2016 11:34:11 -0700

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/operating/metrics.rst
----------------------------------------------------------------------
diff --git a/doc/source/operating/metrics.rst b/doc/source/operating/metrics.rst
new file mode 100644
index 0000000..5884cad
--- /dev/null
+++ b/doc/source/operating/metrics.rst
@@ -0,0 +1,619 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+.. highlight:: none
+
+Monitoring
+----------
+
+Metrics in Cassandra are managed using the `Dropwizard Metrics 
<http://metrics.dropwizard.io>`__ library. These metrics
+can be queried via JMX or pushed to external monitoring systems using a number 
of `built in
+<http://metrics.dropwizard.io/3.1.0/getting-started/#other-reporting>`__ and 
`third party
+<http://metrics.dropwizard.io/3.1.0/manual/third-party/>`__ reporter plugins.
+
+Metrics are collected for a single node. It's up to the operator to use an 
external monitoring system to aggregate them.
+
+Metric Types
+^^^^^^^^^^^^
+All metrics reported by Cassandra fit into one of the following types.
+
+``Gauge``
+    An instantaneous measurement of a value.
+
+``Counter``
+    A gauge for an ``AtomicLong`` instance. Typically this is consumed by 
monitoring the change since the last call to
+    see if there is a large increase compared to the norm.
+
+``Histogram``
+    Measures the statistical distribution of values in a stream of data.
+
+    In addition to minimum, maximum, mean, etc., it also measures median, 
75th, 90th, 95th, 98th, 99th, and 99.9th
+    percentiles.
+
+``Timer``
+    Measures both the rate that a particular piece of code is called and the 
histogram of its duration.
+
+``Latency``
+    Special type that tracks latency (in microseconds) with a ``Timer`` plus a 
``Counter`` that tracks the total latency
+    accrued since starting. The latter is useful if you track the change in 
total latency since the last check. Each
+    metric name of this type will have 'Latency' and 'TotalLatency' appended 
to it.
+
+``Meter``
+    A meter metric which measures mean throughput and one-, five-, and 
fifteen-minute exponentially-weighted moving
+    average throughputs.
+
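As a rough illustration of how a ``Counter`` and a ``Latency`` metric are typically consumed, the sketch below computes the change between two polls and derives a mean latency for that interval. It is not part of Cassandra: the helper names (``counter_delta``, ``interval_mean_latency_us``) and the sample values are hypothetical, and it assumes the poller already holds the ``TotalLatency`` counter and the timer's event count from two consecutive checks.

```python
def counter_delta(previous, current):
    """Change in a monotonically increasing counter between two polls.

    Assumption: a value lower than the previous one means the node
    restarted and the counter began again from zero.
    """
    if current < previous:
        return current
    return current - previous


def interval_mean_latency_us(prev_total_us, cur_total_us, prev_count, cur_count):
    """Mean latency (microseconds) of the requests seen between two polls."""
    requests = counter_delta(prev_count, cur_count)
    if requests == 0:
        return 0.0
    return counter_delta(prev_total_us, cur_total_us) / requests


# Hypothetical samples taken one polling interval apart.
mean_us = interval_mean_latency_us(1_000_000, 1_450_000, 2_000, 2_300)
print(mean_us)
```

Dividing the two deltas gives an average for the polling interval only, which is usually more telling than the average since startup.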
+Table Metrics
+^^^^^^^^^^^^^
+
+Each table in Cassandra has metrics responsible for tracking its state and 
performance.
+
+The metric names are all appended with the specific ``Keyspace`` and ``Table`` 
name.
+
+Reported name format:
+
+**Metric Name**
+    
``org.apache.cassandra.metrics.Table.{{MetricName}}.{{Keyspace}}.{{Table}}``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=Table keyspace={{Keyspace}} 
scope={{Table}} name={{MetricName}}``
+
+.. NOTE::
+    There is a special table called '``all``' without a keyspace. This 
represents the aggregation of metrics across
+    **all** tables and keyspaces on the node.
+
+
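The two name formats above can be filled in mechanically. The following is a minimal sketch with hypothetical helper names and a hypothetical keyspace and table. It assumes the JMX key properties are joined with commas, as JMX object names require, whereas the format above lists them separated by spaces.

```python
def table_metric_name(metric, keyspace, table):
    """Dotted name as reported to external systems, per the format above."""
    return f"org.apache.cassandra.metrics.Table.{metric}.{keyspace}.{table}"


def table_mbean_name(metric, keyspace, table):
    """JMX object name for the same metric.

    Assumption: key properties are comma-separated, which is what JMX
    expects; the documented format shows them separated by spaces.
    """
    return (
        "org.apache.cassandra.metrics:"
        f"type=Table,keyspace={keyspace},scope={table},name={metric}"
    )


# "ks1" and "users" are made-up names used only for illustration.
print(table_metric_name("ReadLatency", "ks1", "users"))
print(table_mbean_name("ReadLatency", "ks1", "users"))
```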
+======================================= ============== ===========
+Name                                    Type           Description
+======================================= ============== ===========
+MemtableOnHeapSize                      Gauge<Long>    Total amount of data 
stored in the memtable that resides **on**-heap, including column related 
overhead and partitions overwritten.
+MemtableOffHeapSize                     Gauge<Long>    Total amount of data 
stored in the memtable that resides **off**-heap, including column related 
overhead and partitions overwritten.
+MemtableLiveDataSize                    Gauge<Long>    Total amount of live 
data stored in the memtable, excluding any data structure overhead.
+AllMemtablesOnHeapSize                  Gauge<Long>    Total amount of data 
stored in the memtables (2i and pending flush memtables included) that resides 
**on**-heap.
+AllMemtablesOffHeapSize                 Gauge<Long>    Total amount of data 
stored in the memtables (2i and pending flush memtables included) that resides 
**off**-heap.
+AllMemtablesLiveDataSize                Gauge<Long>    Total amount of live 
data stored in the memtables (2i and pending flush memtables included) that 
resides off-heap, excluding any data structure overhead.
+MemtableColumnsCount                    Gauge<Long>    Total number of columns 
present in the memtable.
+MemtableSwitchCount                     Counter        Number of times flush 
has resulted in the memtable being switched out.
+CompressionRatio                        Gauge<Double>  Current compression 
ratio for all SSTables.
+EstimatedPartitionSizeHistogram         Gauge<long[]>  Histogram of estimated 
partition size (in bytes).
+EstimatedPartitionCount                 Gauge<Long>    Approximate number of 
keys in table.
+EstimatedColumnCountHistogram           Gauge<long[]>  Histogram of estimated 
number of columns.
+SSTablesPerReadHistogram                Histogram      Histogram of the number 
of sstable data files accessed per read.
+ReadLatency                             Latency        Local read latency for 
this table.
+RangeLatency                            Latency        Local range scan 
latency for this table.
+WriteLatency                            Latency        Local write latency for 
this table.
+CoordinatorReadLatency                  Timer          Coordinator read 
latency for this table.
+CoordinatorScanLatency                  Timer          Coordinator range scan 
latency for this table.
+PendingFlushes                          Counter        Estimated number of 
flush tasks pending for this table.
+BytesFlushed                            Counter        Total number of bytes 
flushed since server [re]start.
+CompactionBytesWritten                  Counter        Total number of bytes 
written by compaction since server [re]start.
+PendingCompactions                      Gauge<Integer> Estimate of number of 
pending compactions for this table.
+LiveSSTableCount                        Gauge<Integer> Number of SSTables on 
disk for this table.
+LiveDiskSpaceUsed                       Counter        Disk space used by 
SSTables belonging to this table (in bytes).
+TotalDiskSpaceUsed                      Counter        Total disk space used 
by SSTables belonging to this table, including obsolete ones waiting to be GC'd.
+MinPartitionSize                        Gauge<Long>    Size of the smallest 
compacted partition (in bytes).
+MaxPartitionSize                        Gauge<Long>    Size of the largest 
compacted partition (in bytes).
+MeanPartitionSize                       Gauge<Long>    Size of the average 
compacted partition (in bytes).
+BloomFilterFalsePositives               Gauge<Long>    Number of false 
positives on table's bloom filter.
+BloomFilterFalseRatio                   Gauge<Double>  False positive ratio of 
table's bloom filter.
+BloomFilterDiskSpaceUsed                Gauge<Long>    Disk space used by 
bloom filter (in bytes).
+BloomFilterOffHeapMemoryUsed            Gauge<Long>    Off-heap memory used by 
bloom filter.
+IndexSummaryOffHeapMemoryUsed           Gauge<Long>    Off-heap memory used by 
index summary.
+CompressionMetadataOffHeapMemoryUsed    Gauge<Long>    Off-heap memory used by 
compression metadata.
+KeyCacheHitRate                         Gauge<Double>  Key cache hit rate for 
this table.
+TombstoneScannedHistogram               Histogram      Histogram of tombstones 
scanned in queries on this table.
+LiveScannedHistogram                    Histogram      Histogram of live cells 
scanned in queries on this table.
+ColUpdateTimeDeltaHistogram             Histogram      Histogram of column 
update time delta on this table.
+ViewLockAcquireTime                     Timer          Time taken acquiring a 
partition lock for materialized view updates on this table.
+ViewReadTime                            Timer          Time taken during the 
local read of a materialized view update.
+TrueSnapshotsSize                       Gauge<Long>    Disk space used by 
snapshots of this table including all SSTable components.
+RowCacheHitOutOfRange                   Counter        Number of table row 
cache hits that do not satisfy the query filter, thus went to disk.
+RowCacheHit                             Counter        Number of table row 
cache hits.
+RowCacheMiss                            Counter        Number of table row 
cache misses.
+CasPrepare                              Latency        Latency of paxos 
prepare round.
+CasPropose                              Latency        Latency of paxos 
propose round.
+CasCommit                               Latency        Latency of paxos commit 
round.
+PercentRepaired                         Gauge<Double>  Percent of table data 
that is repaired on disk.
+SpeculativeRetries                      Counter        Number of times 
speculative retries were sent for this table.
+WaitingOnFreeMemtableSpace              Histogram      Histogram of time spent 
waiting for free memtable space, either on- or off-heap.
+DroppedMutations                        Counter        Number of dropped 
mutations on this table.
+======================================= ============== ===========
+
+Keyspace Metrics
+^^^^^^^^^^^^^^^^
+Each keyspace in Cassandra has metrics responsible for tracking its state and 
performance.
+
+These metrics are the same as the ``Table Metrics`` above, only they are 
aggregated at the Keyspace level.
+
+Reported name format:
+
+**Metric Name**
+    ``org.apache.cassandra.metrics.keyspace.{{MetricName}}.{{Keyspace}}``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=Keyspace scope={{Keyspace}} 
name={{MetricName}}``
+
+ThreadPool Metrics
+^^^^^^^^^^^^^^^^^^
+
+Cassandra splits work of a particular type into its own thread pool.  This 
provides back-pressure and asynchrony for
+requests on a node.  It's important to monitor the state of these thread pools 
since they can tell you how saturated a
+node is.
+
+The metric names are all appended with the specific ``ThreadPool`` name.  The 
thread pools are also categorized under a
+specific type.
+
+Reported name format:
+
+**Metric Name**
+    
``org.apache.cassandra.metrics.ThreadPools.{{MetricName}}.{{Path}}.{{ThreadPoolName}}``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=ThreadPools scope={{ThreadPoolName}} 
type={{Type}} name={{MetricName}}``
+
+===================== ============== ===========
+Name                  Type           Description
+===================== ============== ===========
+ActiveTasks           Gauge<Integer> Number of tasks being actively worked on 
by this pool.
+PendingTasks          Gauge<Integer> Number of tasks queued up on this 
pool.
+CompletedTasks        Counter        Number of tasks completed.
+TotalBlockedTasks     Counter        Number of tasks that were blocked due to 
queue saturation.
+CurrentlyBlockedTask  Counter        Number of tasks that are currently 
blocked due to queue saturation but on retry will become unblocked.
+MaxPoolSize           Gauge<Integer> The maximum number of threads in this 
pool.
+===================== ============== ===========
+
+The following thread pools can be monitored.
+
+============================ ============== ===========
+Name                         Type           Description
+============================ ============== ===========
+Native-Transport-Requests    transport      Handles client CQL requests
+CounterMutationStage         request        Responsible for counter writes
+ViewMutationStage            request        Responsible for materialized view 
writes
+MutationStage                request        Responsible for all other writes
+ReadRepairStage              request        ReadRepair happens on this thread 
pool
+ReadStage                    request        Local reads run on this thread pool
+RequestResponseStage         request        Coordinator requests to the 
cluster run on this thread pool
+AntiEntropyStage             internal       Builds merkle tree for repairs
+CacheCleanupExecutor         internal       Cache maintenance performed on 
this thread pool
+CompactionExecutor           internal       Compactions are run on these 
threads
+GossipStage                  internal       Handles gossip requests
+HintsDispatcher              internal       Performs hinted handoff
+InternalResponseStage        internal       Responsible for intra-cluster 
callbacks
+MemtableFlushWriter          internal       Writes memtables to disk
+MemtablePostFlush            internal       Cleans up commit log after 
memtable is written to disk
+MemtableReclaimMemory        internal       Memtable recycling
+MigrationStage               internal       Runs schema migrations
+MiscStage                    internal       Miscellaneous tasks run here
+PendingRangeCalculator       internal       Calculates token range
+PerDiskMemtableFlushWriter_0 internal       Responsible for writing a spec 
(there is one of these per disk 0-N)
+Sampler                      internal       Responsible for re-sampling the 
index summaries of SSTables
+SecondaryIndexManagement     internal       Performs updates to secondary 
indexes
+ValidationExecutor           internal       Performs validation compaction or 
scrubbing
+============================ ============== ===========
+
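One way to act on these thread pool metrics is to flag pools whose queue is growing or whose tasks are blocked. The sketch below is only an illustration: the function name, the threshold, and the sample readings are hypothetical, and it assumes the values have already been collected through JMX or a reporter.

```python
def saturated_pools(samples, pending_threshold=100):
    """Return the names of pools that look saturated, sorted by name.

    `samples` maps a pool name to a dict of metric values already read
    from JMX or a reporter. The default threshold is an arbitrary
    placeholder, not a recommended value.
    """
    flagged = []
    for pool, metrics in samples.items():
        pending = metrics.get("PendingTasks", 0)
        blocked = metrics.get("CurrentlyBlockedTask", 0)
        if pending > pending_threshold or blocked > 0:
            flagged.append(pool)
    return sorted(flagged)


# Hypothetical readings for three of the pools listed above.
samples = {
    "MutationStage": {"PendingTasks": 250, "CurrentlyBlockedTask": 0},
    "ReadStage": {"PendingTasks": 3, "CurrentlyBlockedTask": 0},
    "Native-Transport-Requests": {"PendingTasks": 0, "CurrentlyBlockedTask": 2},
}
print(saturated_pools(samples))
```

A sensible threshold depends on the workload, so in practice it would be tuned per pool rather than shared.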
+.. |nbsp| unicode:: 0xA0 .. nonbreaking space
+
+Client Request Metrics
+^^^^^^^^^^^^^^^^^^^^^^
+
+Client requests have their own set of metrics that encapsulate the work 
happening at coordinator level.
+
+Different types of client requests are broken down by ``RequestType``.
+
+Reported name format:
+
+**Metric Name**
+    
``org.apache.cassandra.metrics.ClientRequest.{{MetricName}}.{{RequestType}}``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=ClientRequest scope={{RequestType}} 
name={{MetricName}}``
+
+
+:RequestType: CASRead
+:Description: Metrics related to transactional read requests.
+:Metrics:
+    ===================== ============== 
=============================================================
+    Name                  Type           Description
+    ===================== ============== 
=============================================================
+    Timeouts              Counter        Number of timeouts encountered.
+    Failures              Counter        Number of transaction failures 
encountered.
+    |nbsp|                Latency        Transaction read latency.
+    Unavailables          Counter        Number of unavailable exceptions 
encountered.
+    UnfinishedCommit      Counter        Number of transactions that were 
committed on read.
+    ConditionNotMet       Counter        Number of transaction preconditions 
that did not match current values.
+    ContentionHistogram   Histogram      How many contended reads were 
encountered
+    ===================== ============== 
=============================================================
+
+:RequestType: CASWrite
+:Description: Metrics related to transactional write requests.
+:Metrics:
+    ===================== ============== 
=============================================================
+    Name                  Type           Description
+    ===================== ============== 
=============================================================
+    Timeouts              Counter        Number of timeouts encountered.
+    Failures              Counter        Number of transaction failures 
encountered.
+    |nbsp|                Latency        Transaction write latency.
+    UnfinishedCommit      Counter        Number of transactions that were 
committed on write.
+    ConditionNotMet       Counter        Number of transaction preconditions 
that did not match current values.
+    ContentionHistogram   Histogram      How many contended writes were 
encountered
+    ===================== ============== 
=============================================================
+
+
+:RequestType: Read
+:Description: Metrics related to standard read requests.
+:Metrics:
+    ===================== ============== 
=============================================================
+    Name                  Type           Description
+    ===================== ============== 
=============================================================
+    Timeouts              Counter        Number of timeouts encountered.
+    Failures              Counter        Number of read failures encountered.
+    |nbsp|                Latency        Read latency.
+    Unavailables          Counter        Number of unavailable exceptions 
encountered.
+    ===================== ============== 
=============================================================
+
+:RequestType: RangeSlice
+:Description: Metrics related to token range read requests.
+:Metrics:
+    ===================== ============== 
=============================================================
+    Name                  Type           Description
+    ===================== ============== 
=============================================================
+    Timeouts              Counter        Number of timeouts encountered.
+    Failures              Counter        Number of range query failures 
encountered.
+    |nbsp|                Latency        Range query latency.
+    Unavailables          Counter        Number of unavailable exceptions 
encountered.
+    ===================== ============== 
=============================================================
+
+:RequestType: Write
+:Description: Metrics related to regular write requests.
+:Metrics:
+    ===================== ============== 
=============================================================
+    Name                  Type           Description
+    ===================== ============== 
=============================================================
+    Timeouts              Counter        Number of timeouts encountered.
+    Failures              Counter        Number of write failures encountered.
+    |nbsp|                Latency        Write latency.
+    Unavailables          Counter        Number of unavailable exceptions 
encountered.
+    ===================== ============== 
=============================================================
+
+
+:RequestType: ViewWrite
+:Description: Metrics related to materialized view writes.
+:Metrics:
+    ===================== ============== 
=============================================================
+    Timeouts              Counter        Number of timeouts encountered.
+    Failures              Counter        Number of transaction failures 
encountered.
+    Unavailables          Counter        Number of unavailable exceptions 
encountered.
+    ViewReplicasAttempted Counter        Total number of attempted view 
replica writes.
+    ViewReplicasSuccess   Counter        Total number of succeeded view replica 
writes.
+    ViewPendingMutations  Gauge<Long>    ViewReplicasAttempted - 
ViewReplicasSuccess.
+    ViewWriteLatency      Timer          Time between when mutation is applied 
to base table and when CL.ONE is achieved on view.
+    ===================== ============== 
=============================================================
+
+Cache Metrics
+^^^^^^^^^^^^^
+
+Cassandra caches have metrics to track the effectiveness of the caches, though 
the ``Table Metrics`` might be more useful.
+
+Reported name format:
+
+**Metric Name**
+    ``org.apache.cassandra.metrics.Cache.{{MetricName}}.{{CacheName}}``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=Cache scope={{CacheName}} 
name={{MetricName}}``
+
+========================== ============== ===========
+Name                       Type           Description
+========================== ============== ===========
+Capacity                   Gauge<Long>    Cache capacity in bytes.
+Entries                    Gauge<Integer> Total number of cache entries.
+FifteenMinuteCacheHitRate  Gauge<Double>  15m cache hit rate.
+FiveMinuteCacheHitRate     Gauge<Double>  5m cache hit rate.
+OneMinuteCacheHitRate      Gauge<Double>  1m cache hit rate.
+HitRate                    Gauge<Double>  All time cache hit rate.
+Hits                       Meter          Total number of cache hits.
+Misses                     Meter          Total number of cache misses.
+MissLatency                Timer          Latency of misses.
+Requests                   Gauge<Long>    Total number of cache requests.
+Size                       Gauge<Long>    Total size of occupied cache, in 
bytes.
+========================== ============== ===========
+
+The following caches are covered:
+
+============================ ===========
+Name                         Description
+============================ ===========
+CounterCache                 Keeps hot counters in memory for performance.
+ChunkCache                   In process uncompressed page cache.
+KeyCache                     Cache for partition to sstable offsets.
+RowCache                     Cache for rows kept in memory.
+============================ ===========
+
+.. NOTE::
+    Misses and MissLatency are only defined for the ChunkCache.
+
+CQL Metrics
+^^^^^^^^^^^
+
+Metrics specific to CQL prepared statement caching.
+
+Reported name format:
+
+**Metric Name**
+    ``org.apache.cassandra.metrics.CQL.{{MetricName}}``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=CQL name={{MetricName}}``
+
+========================== ============== ===========
+Name                       Type           Description
+========================== ============== ===========
+PreparedStatementsCount    Gauge<Integer> Number of cached prepared statements.
+PreparedStatementsEvicted  Counter        Number of prepared statements 
evicted from the prepared statement cache
+PreparedStatementsExecuted Counter        Number of prepared statements 
executed.
+RegularStatementsExecuted  Counter        Number of **non** prepared 
statements executed.
+PreparedStatementsRatio    Gauge<Double>  Percentage of statements that are 
prepared vs unprepared.
+========================== ============== ===========
+
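To make the relationship between these counters concrete, here is a small sketch of how a ratio such as ``PreparedStatementsRatio`` could be derived from the two execution counters. It assumes the ratio is prepared executions over all executions. That reading of the description is an assumption, and the function name and sample values are hypothetical.

```python
def prepared_statements_ratio(prepared_executed, regular_executed):
    """Fraction of executed statements that were prepared.

    Assumption: the ratio is prepared / (prepared + regular); returns
    0.0 when nothing has been executed yet.
    """
    total = prepared_executed + regular_executed
    if total == 0:
        return 0.0
    return prepared_executed / total


# Hypothetical counter values read from a node.
print(prepared_statements_ratio(900, 100))
```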
+
+DroppedMessage Metrics
+^^^^^^^^^^^^^^^^^^^^^^
+
+Metrics specific to tracking dropped messages for different types of requests.
+Dropped writes are stored and retried by ``Hinted Handoff``.
+
+Reported name format:
+
+**Metric Name**
+    ``org.apache.cassandra.metrics.DroppedMessage.{{MetricName}}.{{Type}}``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=DroppedMessage scope={{Type}} 
name={{MetricName}}``
+
+========================== ============== ===========
+Name                       Type           Description
+========================== ============== ===========
+CrossNodeDroppedLatency    Timer          The dropped latency across nodes.
+InternalDroppedLatency     Timer          The dropped latency within node.
+Dropped                    Meter          Number of dropped messages.
+========================== ============== ===========
+
+The different types of messages tracked are:
+
+============================ ===========
+Name                         Description
+============================ ===========
+BATCH_STORE                  Batchlog write
+BATCH_REMOVE                 Batchlog cleanup (after successfully applied)
+COUNTER_MUTATION             Counter writes
+HINT                         Hint replay
+MUTATION                     Regular writes
+READ                         Regular reads
+READ_REPAIR                  Read repair
+PAGED_SLICE                  Paged read
+RANGE_SLICE                  Token range read
+REQUEST_RESPONSE             RPC Callbacks
+_TRACE                       Tracing writes
+============================ ===========
+
+Streaming Metrics
+^^^^^^^^^^^^^^^^^
+
+Metrics reported during ``Streaming`` operations, such as repair, bootstrap, 
rebuild.
+
+These metrics are specific to a peer endpoint, with the source node being the 
node you are pulling the metrics from.
+
+Reported name format:
+
+**Metric Name**
+    ``org.apache.cassandra.metrics.Streaming.{{MetricName}}.{{PeerIP}}``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=Streaming scope={{PeerIP}} 
name={{MetricName}}``
+
+========================== ============== ===========
+Name                       Type           Description
+========================== ============== ===========
+IncomingBytes              Counter        Number of bytes streamed to this 
node from the peer.
+OutgoingBytes              Counter        Number of bytes streamed to the peer 
endpoint from this node.
+========================== ============== ===========
+
+
+Compaction Metrics
+^^^^^^^^^^^^^^^^^^
+
+Metrics specific to ``Compaction`` work.
+
+Reported name format:
+
+**Metric Name**
+    ``org.apache.cassandra.metrics.Compaction.{{MetricName}}``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=Compaction name={{MetricName}}``
+
+========================== ======================================== 
===============================================
+Name                       Type                                     Description
+========================== ======================================== 
===============================================
+BytesCompacted             Counter                                  Total 
number of bytes compacted since server [re]start.
+PendingTasks               Gauge<Integer>                           Estimated 
number of compactions remaining to perform.
+CompletedTasks             Gauge<Long>                              Number of 
completed compactions since server [re]start.
+TotalCompactionsCompleted  Meter                                    Throughput 
of completed compactions since server [re]start.
+PendingTasksByTableName    Gauge<Map<String, Map<String, Integer>>> Estimated 
number of compactions remaining to perform, grouped by keyspace and then table 
name. This info is also kept in ``Table Metrics``.
+========================== ======================================== 
===============================================
+
+CommitLog Metrics
+^^^^^^^^^^^^^^^^^
+
+Metrics specific to the ``CommitLog``.
+
+Reported name format:
+
+**Metric Name**
+    ``org.apache.cassandra.metrics.CommitLog.{{MetricName}}``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=CommitLog name={{MetricName}}``
+
+========================== ============== ===========
+Name                       Type           Description
+========================== ============== ===========
+CompletedTasks             Gauge<Long>    Total number of commit log messages 
written since [re]start.
+PendingTasks               Gauge<Long>    Number of commit log messages 
written but yet to be fsync'd.
+TotalCommitLogSize         Gauge<Long>    Current size, in bytes, used by all 
the commit log segments.
+WaitingOnSegmentAllocation Timer          Time spent waiting for a 
CommitLogSegment to be allocated - under normal conditions this should be zero.
+WaitingOnCommit            Timer          The time spent waiting on CL fsync; 
for Periodic this only occurs when the sync is lagging its sync interval.
+========================== ============== ===========
+
+Storage Metrics
+^^^^^^^^^^^^^^^
+
+Metrics specific to the storage engine.
+
+Reported name format:
+
+**Metric Name**
+    ``org.apache.cassandra.metrics.Storage.{{MetricName}}``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=Storage name={{MetricName}}``
+
+========================== ============== ===========
+Name                       Type           Description
+========================== ============== ===========
+Exceptions                 Counter        Number of internal exceptions 
caught. Under normal conditions this should be zero.
+Load                       Counter        Size, in bytes, of the on disk data 
size this node manages.
+TotalHints                 Counter        Number of hint messages written to 
this node since [re]start. Includes one entry for each host to be hinted per 
hint.
+TotalHintsInProgress       Counter        Number of hints attempting to be sent 
currently.
+========================== ============== ===========
+
+HintedHandoff Metrics
+^^^^^^^^^^^^^^^^^^^^^
+
+Metrics specific to Hinted Handoff.  There are also some metrics related to 
hints tracked in ``Storage Metrics``.
+
+These metrics include the peer endpoint **in the metric name**.
+
+Reported name format:
+
+**Metric Name**
+    ``org.apache.cassandra.metrics.HintedHandOffManager.{{MetricName}}``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=HintedHandOffManager 
name={{MetricName}}``
+
+=========================== ============== ===========
+Name                        Type           Description
+=========================== ============== ===========
+Hints_created-{{PeerIP}}    Counter        Number of hints on disk for this 
peer.
+Hints_not_stored-{{PeerIP}} Counter        Number of hints not stored for this 
peer, due to being down past the configured hint window.
+=========================== ============== ===========
+
+SSTable Index Metrics
+^^^^^^^^^^^^^^^^^^^^^
+
+Metrics specific to the SSTable index metadata.
+
+Reported name format:
+
+**Metric Name**
+    ``org.apache.cassandra.metrics.Index.{{MetricName}}.RowIndexEntry``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=Index scope=RowIndexEntry 
name={{MetricName}}``
+
+=========================== ============== ===========
+Name                        Type           Description
+=========================== ============== ===========
+IndexedEntrySize            Histogram      Histogram of the on-heap size, in 
bytes, of the index across all SSTables.
+IndexInfoCount              Histogram      Histogram of the number of on-heap 
index entries managed across all SSTables.
+IndexInfoGets               Histogram      Histogram of the number of index seeks 
performed per SSTable.
+=========================== ============== ===========
+
+BufferPool Metrics
+^^^^^^^^^^^^^^^^^^
+
+Metrics specific to the internal recycled buffer pool Cassandra manages.  This 
pool is meant to keep allocations and GC
+lower by recycling on and off heap buffers.
+
+Reported name format:
+
+**Metric Name**
+    ``org.apache.cassandra.metrics.BufferPool.{{MetricName}}``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=BufferPool name={{MetricName}}``
+
+=========================== ============== ===========
+Name                        Type           Description
+=========================== ============== ===========
+Size                        Gauge<Long>    Size, in bytes, of the managed 
buffer pool.
+Misses                      Meter          The rate of misses in the pool. 
The higher this is the more allocations incurred.
+=========================== ============== ===========
+
+
+Client Metrics
+^^^^^^^^^^^^^^
+
+Metrics specific to client management.
+
+Reported name format:
+
+**Metric Name**
+    ``org.apache.cassandra.metrics.Client.{{MetricName}}``
+
+**JMX MBean**
+    ``org.apache.cassandra.metrics:type=Client name={{MetricName}}``
+
+=========================== ============== ===========
+Name                        Type           Description
+=========================== ============== ===========
+connectedNativeClients      Counter        Number of clients connected to this 
node's native protocol server.
+connectedThriftClients      Counter        Number of clients connected to this 
node's thrift protocol server.
+=========================== ============== ===========
+
+JMX
+^^^
+
+Any JMX-based client can access metrics from Cassandra.
+
+If you wish to access JMX metrics over http it's possible to download 
`Mx4jTool <http://mx4j.sourceforge.net/>`__ and
+place ``mx4j-tools.jar`` into the classpath.  On startup you will see in the 
log::
+
+    HttpAdaptor version 3.0.2 started on port 8081
+
+To choose a different port (8081 is the default) or a different listen address 
(0.0.0.0 is not the default) edit
+``conf/cassandra-env.sh`` and uncomment::
+
+    #MX4J_ADDRESS="-Dmx4jaddress=0.0.0.0"
+
+    #MX4J_PORT="-Dmx4jport=8081"
+
+
+Metric Reporters
+^^^^^^^^^^^^^^^^
+
+As mentioned at the top of this section on monitoring, the Cassandra metrics can be exported to a number of monitoring
+systems using a number of `built in <http://metrics.dropwizard.io/3.1.0/getting-started/#other-reporting>`__ and
+`third party <http://metrics.dropwizard.io/3.1.0/manual/third-party/>`__ reporter plugins.
+
+The configuration of these plugins is managed by the `metrics reporter config 
project
+<https://github.com/addthis/metrics-reporter-config>`__. There is a sample 
configuration file located at
+``conf/metrics-reporter-config-sample.yaml``.
+
+Once configured, you simply start cassandra with the flag
+``-Dcassandra.metricsReporterConfigFile=metrics-reporter-config.yaml``. The 
specified .yaml file plus any 3rd party
+reporter jars must all be in Cassandra's classpath.


http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/operating/read_repair.rst
----------------------------------------------------------------------
diff --git a/doc/source/operating/read_repair.rst 
b/doc/source/operating/read_repair.rst
new file mode 100644
index 0000000..0e52bf5
--- /dev/null
+++ b/doc/source/operating/read_repair.rst
@@ -0,0 +1,22 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+.. highlight:: none
+
+Read repair
+-----------
+
+.. todo:: todo

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/operating/repair.rst
----------------------------------------------------------------------
diff --git a/doc/source/operating/repair.rst b/doc/source/operating/repair.rst
new file mode 100644
index 0000000..97d8ce8
--- /dev/null
+++ b/doc/source/operating/repair.rst
@@ -0,0 +1,22 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+.. highlight:: none
+
+Repair
+------
+
+.. todo:: todo

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/operating/security.rst
----------------------------------------------------------------------
diff --git a/doc/source/operating/security.rst 
b/doc/source/operating/security.rst
new file mode 100644
index 0000000..80a33f4
--- /dev/null
+++ b/doc/source/operating/security.rst
@@ -0,0 +1,410 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+.. highlight:: none
+
+Security
+--------
+
+There are three main components to the security features provided by Cassandra:
+
+- TLS/SSL encryption for client and inter-node communication
+- Client authentication
+- Authorization
+
+TLS/SSL Encryption
+^^^^^^^^^^^^^^^^^^
+Cassandra provides secure communication between a client machine and a 
database cluster and between nodes within a
+cluster. Enabling encryption ensures that data in flight is not compromised 
and is transferred securely. The options for
+client-to-node and node-to-node encryption are managed separately and may be 
configured independently.
+
+In both cases, the JVM defaults for supported protocols and cipher suites are used when encryption is enabled. These can
+be overridden using the settings in ``cassandra.yaml``, but this is not recommended unless there are policies in place
+which dictate certain settings or a need to disable vulnerable ciphers or protocols in cases where the JVM cannot be
+updated.
+
+FIPS compliant settings can be configured at the JVM level and should not 
involve changing encryption settings in
+cassandra.yaml. See `the java document on FIPS 
<https://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/FIPS.html>`__
+for more details.
+
+For information on generating the keystore and truststore files used in SSL 
communications, see the
+`java documentation on creating keystores 
<http://download.oracle.com/javase/6/docs/technotes/guides/security/jsse/JSSERefGuide.html#CreateKeystore>`__
+
+Inter-node Encryption
+~~~~~~~~~~~~~~~~~~~~~
+
+The settings for managing inter-node encryption are found in ``cassandra.yaml`` in the ``server_encryption_options``
+section. To enable inter-node encryption, change the ``internode_encryption`` setting from its default value of ``none``
+to one of ``rack``, ``dc`` or ``all``.
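+
+As a rough illustration, a ``server_encryption_options`` section that encrypts all inter-node traffic might look like
+the following sketch. The keystore and truststore paths and passwords are placeholders, not recommendations; substitute
+the files generated for your own cluster::
+
+    server_encryption_options:
+        # one of: none, rack, dc, all
+        internode_encryption: all
+        # placeholder paths and passwords, for illustration only
+        keystore: conf/.keystore
+        keystore_password: changeit
+        truststore: conf/.truststore
+        truststore_password: changeit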
+
+Client to Node Encryption
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The settings for managing client to node encryption are found in 
``cassandra.yaml`` in the ``client_encryption_options``
+section. There are two primary toggles here for enabling encryption, 
``enabled`` and ``optional``.
+
+- If neither is set to ``true``, client connections are entirely unencrypted.
+- If ``enabled`` is set to ``true`` and ``optional`` is set to ``false``, all 
client connections must be secured.
+- If both options are set to ``true``, both encrypted and unencrypted 
connections are supported using the same port.
+  Client connections using encryption with this configuration will be 
automatically detected and handled by the server.
+
+As an alternative to the ``optional`` setting, separate ports can also be configured for encrypted and unencrypted
+connections where operational requirements demand it. To do so, set ``optional`` to ``false`` and use the
+``native_transport_port_ssl`` setting in ``cassandra.yaml`` to specify the port to be used for secure client
+communication.
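+
+The separate-port setup described above could be sketched as follows. The port number, keystore path and password
+are illustrative assumptions rather than required values::
+
+    client_encryption_options:
+        enabled: true
+        # false: the encrypted port does not accept unencrypted connections
+        optional: false
+        # placeholder path and password, for illustration only
+        keystore: conf/.keystore
+        keystore_password: changeit
+
+    # dedicated port for encrypted client connections (illustrative value)
+    native_transport_port_ssl: 9142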
+
+.. _operation-roles:
+
+Roles
+^^^^^
+
+Cassandra uses database roles, which may represent either a single user or a 
group of users, in both authentication and
+permissions management. Role management is an extension point in Cassandra and 
may be configured using the
+``role_manager`` setting in ``cassandra.yaml``. The default setting uses 
``CassandraRoleManager``, an implementation
+which stores role information in the tables of the ``system_auth`` keyspace.
+
+See also the :ref:`CQL documentation on roles <roles>`.
+
+Authentication
+^^^^^^^^^^^^^^
+
+Authentication is pluggable in Cassandra and is configured using the 
``authenticator`` setting in ``cassandra.yaml``.
+Cassandra ships with two options included in the default distribution.
+
+By default, Cassandra is configured with ``AllowAllAuthenticator`` which 
performs no authentication checks and therefore
+requires no credentials. It is used to disable authentication completely. Note 
that authentication is a necessary
+condition of Cassandra's permissions subsystem, so if authentication is 
disabled, effectively so are permissions.
+
+The default distribution also includes ``PasswordAuthenticator``, which stores 
encrypted credentials in a system table.
+This can be used to enable simple username/password authentication.
+
+.. _password-authentication:
+
+Enabling Password Authentication
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Before enabling client authentication on the cluster, client applications 
should be pre-configured with their intended
+credentials. When a connection is initiated, the server will only ask for 
credentials once authentication is
+enabled, so setting up the client side config in advance is safe. In contrast, 
as soon as a server has authentication
+enabled, any connection attempt without proper credentials will be rejected 
which may cause availability problems for
+client applications. Once clients are set up and ready for authentication to be enabled, follow this procedure to enable
+it on the cluster.
+
+Pick a single node in the cluster on which to perform the initial 
configuration. Ideally, no clients should connect
+to this node during the setup process, so you may want to remove it from 
client config, block it at the network level
+or possibly add a new temporary node to the cluster for this purpose. On that 
node, perform the following steps:
+
+1. Open a ``cqlsh`` session and change the replication factor of the ``system_auth`` keyspace. By default, this keyspace
+   uses ``SimpleStrategy`` and a ``replication_factor`` of 1. It is recommended to change this for any
+   non-trivial deployment to ensure that, should nodes become unavailable, login is still possible. Best practice is to
+   configure a replication factor of 3 to 5 per DC.
+
+::
+
+    ALTER KEYSPACE system_auth WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};
+
+2. Edit ``cassandra.yaml`` to change the ``authenticator`` option like so:
+
+::
+
+    authenticator: PasswordAuthenticator
+
+3. Restart the node.
+
+4. Open a new ``cqlsh`` session using the credentials of the default superuser:
+
+::
+
+    cqlsh -u cassandra -p cassandra
+
+5. During login, the credentials for the default superuser are read with a 
consistency level of ``QUORUM``, whereas
+   those for all other users (including superusers) are read at ``LOCAL_ONE``. 
In the interests of performance and
+   availability, as well as security, operators should create another 
superuser and disable the default one. This step
+   is optional, but highly recommended. While logged in as the default 
superuser, create another superuser role which
+   can be used to bootstrap further configuration.
+
+::
+
+    -- create a new superuser
+    CREATE ROLE dba WITH SUPERUSER = true AND LOGIN = true AND PASSWORD = 'super';
+
+6. Start a new ``cqlsh`` session, this time logging in as the new superuser, and disable the default superuser.
+
+::
+
+    ALTER ROLE cassandra WITH SUPERUSER = false AND LOGIN = false;
+
+7. Finally, set up the roles and credentials for your application users with 
:ref:`CREATE ROLE <create-role-statement>`
+   statements.
+
+At the end of these steps, the one node is configured to use password 
authentication. To roll that out across the
+cluster, repeat steps 2 and 3 on each node in the cluster. Once all nodes have 
been restarted, authentication will be
+fully enabled throughout the cluster.
+
+Note that using ``PasswordAuthenticator`` also requires the use of 
:ref:`CassandraRoleManager <operation-roles>`.
+
+See also: :ref:`setting-credentials-for-internal-authentication`, :ref:`CREATE ROLE <create-role-statement>`,
+:ref:`ALTER ROLE <alter-role-statement>`, :ref:`ALTER KEYSPACE <calter-keyspace-statement>` and
+:ref:`GRANT PERMISSION <create-permission-statement>`.
+
+Authorization
+^^^^^^^^^^^^^
+
+Authorization is pluggable in Cassandra and is configured using the 
``authorizer`` setting in ``cassandra.yaml``.
+Cassandra ships with two options included in the default distribution.
+
+By default, Cassandra is configured with ``AllowAllAuthorizer`` which performs 
no checking and so effectively grants all
+permissions to all roles. This must be used if ``AllowAllAuthenticator`` is 
the configured authenticator.
+
+The default distribution also includes ``CassandraAuthorizer``, which does 
implement full permissions management
+functionality and stores its data in Cassandra system tables.
+
+Enabling Internal Authorization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Permissions are modelled as a whitelist, with the default assumption that a 
given role has no access to any database
+resources. The implication of this is that once authorization is enabled on a 
node, all requests will be rejected until
+the required permissions have been granted. For this reason, it is strongly 
recommended to perform the initial setup on
+a node which is not processing client requests.
+
+The following assumes that authentication has already been enabled via the 
process outlined in
+:ref:`password-authentication`. Perform these steps to enable internal 
authorization across the cluster:
+
+1. On the selected node, edit ``cassandra.yaml`` to change the ``authorizer`` 
option like so:
+
+::
+
+    authorizer: CassandraAuthorizer
+
+2. Restart the node.
+
+3. Open a new ``cqlsh`` session using the credentials of a role with superuser 
credentials:
+
+::
+
+    cqlsh -u dba -p super
+
+4. Configure the appropriate access privileges for your clients using `GRANT 
PERMISSION <cql.html#grant-permission>`_
+   statements. On the other nodes, until configuration is updated and the node 
restarted, this will have no effect so
+   disruption to clients is avoided.
+
+::
+
+    GRANT SELECT ON ks.t1 TO db_user;
+
+5. Once all the necessary permissions have been granted, repeat steps 1 and 2 
for each node in turn. As each node
+   restarts and clients reconnect, the enforcement of the granted permissions 
will begin.
+
+See also: :ref:`GRANT PERMISSION <grant-permission-statement>`, `GRANT ALL 
<grant-all>` and :ref:`REVOKE PERMISSION
+<revoke-permission-statement>`
+
+Caching
+^^^^^^^
+
+Enabling authentication and authorization places additional load on the cluster by frequently reading from the
+``system_auth`` tables. Furthermore, these reads are in the critical paths of many client operations, and so have the
+potential to severely impact quality of service. To mitigate this, auth data such as credentials, permissions and role
+details are cached for a configurable period. The caching can be configured (and even disabled) from ``cassandra.yaml``
+or using a JMX client. The JMX interface also supports invalidation of the various caches, but any changes made via JMX
+are not persistent: the settings will be re-read from ``cassandra.yaml`` when the node is restarted.
+
+Each cache has 3 options which can be set:
+
+Validity Period
+    Controls the expiration of cache entries. After this period, entries are 
invalidated and removed from the cache.
+Refresh Rate
+    Controls the rate at which background reads are performed to pick up any 
changes to the underlying data. While these
+    async refreshes are performed, caches will continue to serve (possibly) 
stale data. Typically, this will be set to a
+    shorter time than the validity period.
+Max Entries
+    Controls the upper bound on cache size.
+
+The naming for these options in ``cassandra.yaml`` follows the convention:
+
+* ``<type>_validity_in_ms``
+* ``<type>_update_interval_in_ms``
+* ``<type>_cache_max_entries``
+
+Where ``<type>`` is one of ``credentials``, ``permissions``, or ``roles``.
+
+As mentioned, these are also exposed via JMX in the mbeans under the 
``org.apache.cassandra.auth`` domain.
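+
+Applying that naming convention to the ``permissions`` cache gives settings like the sketch below. The values are
+illustrative only and should be tuned for the deployment; the ``roles`` and ``credentials`` caches follow the same
+pattern::
+
+    # entries expire after 2 seconds (illustrative value)
+    permissions_validity_in_ms: 2000
+    # background refresh interval, shorter than the validity period (illustrative value)
+    permissions_update_interval_in_ms: 1000
+    # upper bound on the number of cached entries (illustrative value)
+    permissions_cache_max_entries: 1000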
+
+JMX access
+^^^^^^^^^^
+
+Access control for JMX clients is configured separately to that for CQL. For 
both authentication and authorization, two
+providers are available; the first based on standard JMX security and the 
second which integrates more closely with
+Cassandra's own auth subsystem.
+
+The default settings for Cassandra make JMX accessible only from localhost. To enable remote JMX connections, edit
+``cassandra-env.sh`` (or ``cassandra-env.ps1`` on Windows) to change the ``LOCAL_JMX`` setting to ``no``. Under the
+standard configuration, when remote JMX connections are enabled, :ref:`standard JMX authentication <standard-jmx-auth>`
+is also switched on.
+
+Note that by default, local-only connections are not subject to 
authentication, but this can be enabled.
+
+If enabling remote connections, it is recommended to also use :ref:`SSL 
<jmx-with-ssl>` connections.
+
+Finally, after enabling auth and/or SSL, ensure that tools which use JMX, such 
as :ref:`nodetool <nodetool>`, are
+correctly configured and working as expected.
+
+.. _standard-jmx-auth:
+
+Standard JMX Auth
+~~~~~~~~~~~~~~~~~
+
+Users permitted to connect to the JMX server are specified in a simple text 
file. The location of this file is set in
+``cassandra-env.sh`` by the line:
+
+::
+
+    JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password"
+
+Edit the password file to add username/password pairs:
+
+::
+
+    jmx_user jmx_password
+
+Secure the credentials file so that only the user running the Cassandra process can read it:
+
+::
+
+    $ chown cassandra:cassandra /etc/cassandra/jmxremote.password
+    $ chmod 400 /etc/cassandra/jmxremote.password
+
+Optionally, enable access control to limit the scope of what defined users can 
do via JMX. Note that this is a fairly
+blunt instrument in this context as most operational tools in Cassandra 
require full read/write access. To configure a
+simple access file, uncomment this line in ``cassandra-env.sh``:
+
+::
+
+    #JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.access.file=/etc/cassandra/jmxremote.access"
+
+Then edit the access file to grant your JMX user readwrite permission:
+
+::
+
+    jmx_user readwrite
+
+Cassandra must be restarted to pick up the new settings.
+
+See also: `Using File-Based Password Authentication In JMX
+<http://docs.oracle.com/javase/7/docs/technotes/guides/management/agent.html#gdenv>`__
+
+
+Cassandra Integrated Auth
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+An alternative to the out-of-the-box JMX auth is to use Cassandra's own authentication and/or authorization providers
+for JMX clients. This is potentially more flexible and secure, but it comes with one major caveat: it is not
+available until *after* a node has joined the ring, because the auth subsystem is not fully configured until that point.
+However, it is often critical for monitoring purposes to have JMX access, particularly during bootstrap. So it is
+recommended, where possible, to use local-only JMX auth during bootstrap and then, if remote connectivity is required,
+to switch to integrated auth once the node has joined the ring and initial setup is complete.
+
+With this option, the same database roles used for CQL authentication can be used to control access to JMX, so updates
+can be managed centrally using just ``cqlsh``. Furthermore, fine-grained control over exactly which operations are
+permitted on particular MBeans can be achieved via :ref:`GRANT PERMISSION <cgrant-permission-statement>`.
+
+To enable integrated authentication, edit ``cassandra-env.sh`` to uncomment 
these lines:
+
+::
+
+    #JVM_OPTS="$JVM_OPTS -Dcassandra.jmx.remote.login.config=CassandraLogin"
+    #JVM_OPTS="$JVM_OPTS -Djava.security.auth.login.config=$CASSANDRA_HOME/conf/cassandra-jaas.config"
+
+And disable the JMX standard auth by commenting this line:
+
+::
+
+    JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.password.file=/etc/cassandra/jmxremote.password"
+
+To enable integrated authorization, uncomment this line:
+
+::
+
+    #JVM_OPTS="$JVM_OPTS -Dcassandra.jmx.authorizer=org.apache.cassandra.auth.jmx.AuthorizationProxy"
+
+Check standard access control is off by ensuring this line is commented out:
+
+::
+
+    #JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.access.file=/etc/cassandra/jmxremote.access"
+
+With integrated authentication and authorization enabled, operators can define 
specific roles and grant them access to
+the particular JMX resources that they need. For example, a role with the 
necessary permissions to use tools such as
+jconsole or jmc in read-only mode would be defined as:
+
+::
+
+    CREATE ROLE jmx WITH LOGIN = false;
+    GRANT SELECT ON ALL MBEANS TO jmx;
+    GRANT DESCRIBE ON ALL MBEANS TO jmx;
+    GRANT EXECUTE ON MBEAN 'java.lang:type=Threading' TO jmx;
+    GRANT EXECUTE ON MBEAN 'com.sun.management:type=HotSpotDiagnostic' TO jmx;
+
+    -- Grant the jmx role to one with login permissions so that it can access the JMX tooling
+    CREATE ROLE ks_user WITH PASSWORD = 'password' AND LOGIN = true AND SUPERUSER = false;
+    GRANT jmx TO ks_user;
+
+Fine grained access control to individual MBeans is also supported:
+
+::
+
+    GRANT EXECUTE ON MBEAN 'org.apache.cassandra.db:type=Tables,keyspace=test_keyspace,table=t1' TO ks_user;
+    GRANT EXECUTE ON MBEAN 'org.apache.cassandra.db:type=Tables,keyspace=test_keyspace,table=*' TO ks_owner;
+
+This permits the ``ks_user`` role to invoke methods on the MBean representing 
a single table in ``test_keyspace``, while
+granting the same permission for all table level MBeans in that keyspace to 
the ``ks_owner`` role.
+
+Adding/removing roles and granting/revoking of permissions is handled 
dynamically once the initial setup is complete, so
+no further restarts are required if permissions are altered.
+
+See also: :ref:`Permissions <permissions>`.
+
+.. _jmx-with-ssl:
+
+JMX With SSL
+~~~~~~~~~~~~
+
+JMX SSL configuration is controlled by a number of system properties, some of 
which are optional. To turn on SSL, edit
+the relevant lines in ``cassandra-env.sh`` (or ``cassandra-env.ps1`` on 
Windows) to uncomment and set the values of these
+properties as required:
+
+``com.sun.management.jmxremote.ssl``
+    set to true to enable SSL
+``com.sun.management.jmxremote.ssl.need.client.auth``
+    set to true to enable validation of client certificates
+``com.sun.management.jmxremote.registry.ssl``
+    enables SSL sockets for the RMI registry from which clients obtain the JMX 
connector stub
+``com.sun.management.jmxremote.ssl.enabled.protocols``
+    by default, the protocols supported by the JVM will be used, override with 
a comma-separated list. Note that this is
+    not usually necessary and using the defaults is the preferred option.
+``com.sun.management.jmxremote.ssl.enabled.cipher.suites``
+    by default, the cipher suites supported by the JVM will be used, override 
with a comma-separated list. Note that
+    this is not usually necessary and using the defaults is the preferred 
option.
+``javax.net.ssl.keyStore``
+    set the path on the local filesystem of the keystore containing server 
private keys and public certificates
+``javax.net.ssl.keyStorePassword``
+    set the password of the keystore file
+``javax.net.ssl.trustStore``
+    if validation of client certificates is required, use this property to 
specify the path of the truststore containing
+    the public certificates of trusted clients
+``javax.net.ssl.trustStorePassword``
+    set the password of the truststore file
+
+See also: `Oracle Java7 Docs 
<http://docs.oracle.com/javase/7/docs/technotes/guides/management/agent.html#gdemv>`__,
+`Monitor Java with JMX 
<https://www.lullabot.com/articles/monitor-java-with-jmx>`__

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/operating/snitch.rst
----------------------------------------------------------------------
diff --git a/doc/source/operating/snitch.rst b/doc/source/operating/snitch.rst
new file mode 100644
index 0000000..faea0b3
--- /dev/null
+++ b/doc/source/operating/snitch.rst
@@ -0,0 +1,78 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+.. highlight:: none
+
+Snitch
+------
+
+In Cassandra, the snitch has two functions:
+
+- it teaches Cassandra enough about your network topology to route requests 
efficiently.
+- it allows Cassandra to spread replicas around your cluster to avoid 
correlated failures. It does this by grouping
+  machines into "datacenters" and "racks."  Cassandra will do its best not to 
have more than one replica on the same
+  "rack" (which may not actually be a physical location).
+
+Dynamic snitching
+^^^^^^^^^^^^^^^^^
+
+The dynamic snitch monitors read latencies to avoid reading from hosts that have slowed down. The dynamic snitch is
+configured with the following properties in ``cassandra.yaml``:
+
+- ``dynamic_snitch``: whether the dynamic snitch should be enabled or disabled.
+- ``dynamic_snitch_update_interval_in_ms``: controls how often to perform the 
more expensive part of host score
+  calculation.
+- ``dynamic_snitch_reset_interval_in_ms``: if set greater than zero and 
read_repair_chance is < 1.0, this will allow
+  'pinning' of replicas to hosts in order to increase cache capacity.
+- ``dynamic_snitch_badness_threshold``: controls how much worse the pinned host has to be
+  before the dynamic snitch will prefer other replicas over it. This is expressed as a double which represents a
+  percentage. Thus, a value of 0.2 means Cassandra would continue to prefer the static snitch values until the pinned
+  host was 20% worse than the fastest.
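+
+Put together, the dynamic snitch settings in ``cassandra.yaml`` could look like the sketch below. The interval values
+are illustrative assumptions, and the threshold of 0.2 simply mirrors the 20% example above::
+
+    dynamic_snitch: true
+    # how often the more expensive part of the host score is recalculated (illustrative value)
+    dynamic_snitch_update_interval_in_ms: 100
+    # greater than zero, so replica pinning is allowed (illustrative value)
+    dynamic_snitch_reset_interval_in_ms: 600000
+    # keep preferring the pinned host until it is 20% worse than the fastest
+    dynamic_snitch_badness_threshold: 0.2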
+
+Snitch classes
+^^^^^^^^^^^^^^
+
+The ``endpoint_snitch`` parameter in ``cassandra.yaml`` should be set to the class that implements
+``IEndpointSnitch``, which will be wrapped by the dynamic snitch and decide if two endpoints are in the same data center
+or on the same rack. Out of the box, Cassandra provides the following snitch implementations:
+
+GossipingPropertyFileSnitch
+    This should be your go-to snitch for production use. The rack and datacenter for the local node are defined in
+    ``cassandra-rackdc.properties`` and propagated to other nodes via gossip. If ``cassandra-topology.properties`` exists,
+    it is used as a fallback, allowing migration from the PropertyFileSnitch.
+
+SimpleSnitch
+    Treats Strategy order as proximity. This can improve cache locality when 
disabling read repair. Only appropriate for
+    single-datacenter deployments.
+
+PropertyFileSnitch
+    Proximity is determined by rack and data center, which are explicitly 
configured in
+    ``cassandra-topology.properties``.
+
+Ec2Snitch
+    Appropriate for EC2 deployments in a single Region. Loads Region and 
Availability Zone information from the EC2 API.
+    The Region is treated as the datacenter, and the Availability Zone as the 
rack. Only private IPs are used, so this
+    will not work across multiple regions.
+
+Ec2MultiRegionSnitch
+    Uses public IPs as broadcast_address to allow cross-region connectivity 
(thus, you should set seed addresses to the
+    public IP as well). You will need to open the ``storage_port`` or 
``ssl_storage_port`` on the public IP firewall
+    (For intra-Region traffic, Cassandra will switch to the private IP after 
establishing a connection).
+
+RackInferringSnitch
+    Proximity is determined by rack and data center, which are assumed to 
correspond to the 3rd and 2nd octet of each
+    node's IP address, respectively.  Unless this happens to match your 
deployment conventions, this is best used as an
+    example of writing a custom Snitch class and is provided in that spirit.
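+
+As a minimal sketch, selecting one of the snitches listed above is a single ``cassandra.yaml`` setting. The choice of
+``GossipingPropertyFileSnitch`` here only follows the production recommendation given in its description::
+
+    # name of one of the bundled snitch classes
+    endpoint_snitch: GossipingPropertyFileSnitch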

http://git-wip-us.apache.org/repos/asf/cassandra/blob/54f7335c/doc/source/operating/topo_changes.rst
----------------------------------------------------------------------
diff --git a/doc/source/operating/topo_changes.rst 
b/doc/source/operating/topo_changes.rst
new file mode 100644
index 0000000..9d6a2ba
--- /dev/null
+++ b/doc/source/operating/topo_changes.rst
@@ -0,0 +1,122 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+..
+..     http://www.apache.org/licenses/LICENSE-2.0
+..
+.. Unless required by applicable law or agreed to in writing, software
+.. distributed under the License is distributed on an "AS IS" BASIS,
+.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+.. See the License for the specific language governing permissions and
+.. limitations under the License.
+
+.. highlight:: none
+
+Adding, replacing, moving and removing nodes
+--------------------------------------------
+
+Bootstrap
+^^^^^^^^^
+
+Adding new nodes is called "bootstrapping". The ``num_tokens`` parameter defines the number of virtual nodes
+(tokens) the joining node will be assigned during bootstrap. The tokens define the sections of the ring (token ranges)
+the node will become responsible for.
+
+Token allocation
+~~~~~~~~~~~~~~~~
+
+With the default token allocation algorithm the new node will pick ``num_tokens`` random tokens to become responsible
+for. Since tokens are distributed randomly, load distribution improves with a higher number of virtual nodes, but this
+also increases token management overhead. The default of 256 virtual nodes should provide a reasonable load balance with
+acceptable overhead.
+
+On 3.0+ a new token allocation algorithm was introduced to allocate tokens 
based on the load of existing virtual nodes
+for a given keyspace, and thus yield an improved load distribution with a 
lower number of tokens. To use this approach,
+the new node must be started with the JVM option 
``-Dcassandra.allocate_tokens_for_keyspace=<keyspace>``, where
+``<keyspace>`` is the keyspace from which the algorithm can find the load 
information to optimize token assignment for.
+
+Manual token assignment
+"""""""""""""""""""""""
+
+You may specify a comma-separated list of tokens manually with the ``initial_token`` parameter in ``cassandra.yaml``.
+If it is specified, Cassandra will skip the token allocation process. This may be useful when doing token assignment
+with an external tool or when restoring a node with its previous tokens.
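+
+As an illustration of what such an external tool might compute, the sketch below spreads one token per node evenly
+over the ring. It assumes the ``Murmur3Partitioner`` token range (-2^63 to 2^63 - 1) and a single token per node;
+``evenly_spaced_tokens`` is a hypothetical helper, not something shipped with Cassandra.
+

```python
def evenly_spaced_tokens(num_nodes):
    """One token per node, spread evenly over the assumed range -2**63 .. 2**63 - 1."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    # Split the 2**64 possible tokens into num_nodes equal steps, starting at the minimum.
    return [i * 2**64 // num_nodes - 2**63 for i in range(num_nodes)]


if __name__ == "__main__":
    # Each printed value would go into the initial_token setting of one node.
    for index, token in enumerate(evenly_spaced_tokens(4)):
        print(f"node {index}: initial_token: {token}")
```

+
+Each value would be placed in the ``initial_token`` setting of a different node.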
+
+Range streaming
+~~~~~~~~~~~~~~~~
+
+After the tokens are allocated, the joining node will pick current replicas of the token ranges it will become
+responsible for to stream data from. By default it will stream from the primary replica of each token range in order to
+guarantee that data on the new node will be consistent with the current state.
+
+If any replica is unavailable, the consistent bootstrap process will fail. To override this behavior and
+potentially miss data from an unavailable replica, set the JVM flag ``-Dcassandra.consistent.rangemovement=false``.
+
+Resuming failed/hanged bootstrap
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+On 2.2+, if the bootstrap process fails, it's possible to resume bootstrap from the previously saved state by calling
+``nodetool bootstrap resume``. If for some reason the bootstrap hangs or stalls, it may also be resumed by simply
+restarting the node. In order to clean up the bootstrap state and start fresh, you may set the JVM startup flag
+``-Dcassandra.reset_bootstrap_progress=true``.
+
+On lower versions, when the bootstrap process fails it is recommended to wipe the node (remove all the data) and
+restart the bootstrap process.
+
+Manual bootstrapping
+~~~~~~~~~~~~~~~~~~~~
+
+It's possible to skip the bootstrapping process entirely and join the ring straight away by setting the hidden parameter
+``auto_bootstrap: false``. This may be useful when restoring a node from a backup or creating a new data-center.
+
+Removing nodes
+^^^^^^^^^^^^^^
+
+You can take a live node out of the cluster by running ``nodetool decommission`` on it, or remove a dead node by
+running ``nodetool removenode`` on any other machine. This will assign the ranges the old node was responsible for to
+other nodes, and replicate the appropriate data there. If decommission is used, the data will stream from the
+decommissioned node. If removenode is used, the data will stream from the remaining replicas.
+
+No data is removed automatically from the node being decommissioned, so if you want to put the node back into service at
+a different token on the ring, its data should be removed manually.
+
+Moving nodes
+^^^^^^^^^^^^
+
+When ``num_tokens: 1`` is set, it's possible to move the node's position in the ring with ``nodetool move``. Moving is
+both more convenient and more efficient than decommission + bootstrap. After moving a node, ``nodetool cleanup`` should
+be run to remove any unnecessary data.
+
+Replacing a dead node
+^^^^^^^^^^^^^^^^^^^^^
+
+In order to replace a dead node, start Cassandra with the JVM startup flag
+``-Dcassandra.replace_address_first_boot=<dead_node_ip>``. Once this property is enabled the node starts in a hibernate
+state, during which all the other nodes will see this node as down.
+
+The replacing node will now start to bootstrap the data from the rest of the nodes in the cluster. The main difference
+from normal bootstrapping of a new node is that this new node will not accept any writes during this phase.
+
+Once the bootstrapping is complete the node will be marked "UP". Hinted handoff is then relied upon to make this node
+consistent (since it has not accepted writes since the start of the bootstrap).
+
+.. Note:: If the replacement process takes longer than ``max_hint_window_in_ms`` you **MUST** run repair to make the
+   replaced node consistent again, since it missed ongoing writes during bootstrapping.
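+
+The rule in the note above amounts to a simple comparison, sketched below. The default of three hours used here is
+an assumption about the shipped ``cassandra.yaml``; check the value configured on your own cluster. The helper name
+``repair_required`` is made up for this example.
+

```python
# Assumed default for max_hint_window_in_ms (3 hours); verify against your cassandra.yaml.
DEFAULT_MAX_HINT_WINDOW_IN_MS = 3 * 60 * 60 * 1000


def repair_required(replacement_duration_s, max_hint_window_in_ms=DEFAULT_MAX_HINT_WINDOW_IN_MS):
    """True if the replacement ran longer than the hint window, so hints alone cannot catch the node up."""
    return replacement_duration_s * 1000 > max_hint_window_in_ms


if __name__ == "__main__":
    print(repair_required(2 * 3600))  # replacement took 2 hours
    print(repair_required(4 * 3600))  # replacement took 4 hours
```
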
+
+Monitoring progress
+^^^^^^^^^^^^^^^^^^^
+
+Bootstrap, replace, move and remove progress can be monitored using ``nodetool netstats``, which will show the progress
+of the streaming operations.
+
+Cleanup data after range movements
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+As a safety measure, Cassandra does not automatically remove data from nodes that "lose" part of their token range due
+to a range movement operation (bootstrap, move, replace). Run ``nodetool cleanup`` on the nodes that lost ranges to the
+joining node when you are satisfied the new node is up and working. If you do not do this, the old data will still be
+counted against the load on that node.
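+
+To make "nodes that lost ranges" concrete, the sketch below compares replica sets before and after new tokens join
+the ring. It is a simplified model, not Cassandra's code: it assumes a single data center where the replicas of a
+range are the next ``rf`` distinct nodes clockwise on the ring, and the names ``replicas`` and
+``nodes_needing_cleanup`` are hypothetical.
+

```python
from bisect import bisect_left


def replicas(ring, key, rf):
    """First rf distinct nodes found walking clockwise from key; ring is a sorted list of (token, node)."""
    tokens = [token for token, _ in ring]
    start = bisect_left(tokens, key) % len(ring)  # first token >= key, wrapping past the end
    found = []
    for offset in range(len(ring)):
        node = ring[(start + offset) % len(ring)][1]
        if node not in found:
            found.append(node)
        if len(found) == rf:
            break
    return found


def nodes_needing_cleanup(ring, new_tokens, rf):
    """Nodes that stop being a replica for at least one range once new_tokens join the ring."""
    old_ring = sorted(ring)
    new_ring = sorted(ring + new_tokens)
    losers = set()
    for token, _ in new_ring:
        # The end token of each new range is a representative key of that range.
        before = set(replicas(old_ring, token, rf))
        after = set(replicas(new_ring, token, rf))
        losers |= before - after
    return losers


if __name__ == "__main__":
    ring = [(0, "A"), (100, "B"), (200, "C")]
    print(nodes_needing_cleanup(ring, [(50, "D")], rf=2))
```

+
+Under these assumptions, the returned nodes are the ones on which ``nodetool cleanup`` would reclaim space.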
