Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "MemtableThresholds_JP" page has been changed by MakiWatanabe.
http://wiki.apache.org/cassandra/MemtableThresholds_JP?action=diff&rev1=12&rev2=13

--------------------------------------------------

  ## page was copied from MemtableThresholds
+ 
+ <<TableOfContents>>
+ 
+ == Don't Touch that Dial ==
+ 
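The settings described here should be changed only when you are facing a quantifiable performance problem. The effect of each setting on a cluster varies with the use case and the load. That said, the defaults were chosen carefully and conservatively.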
+ 
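+ == JVM Heap Size ==
+ By default, Cassandra's startup script sets the maximum JVM heap size to 1GB. Consider increasing it, but do so gradually!
+ When Cassandra or other processes use too much memory, the OS file buffers and caches shrink.
+ These are as important to Cassandra's performance as its internal data structures.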
+ 
+ 
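Setting the maximum JVM heap too large is riskier than setting it too small. An undersized heap is easy to diagnose through JMX, whereas it is hard to establish that an oversized heap is the cause of a problem.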
+ 
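+ Even on a high-end machine with 48GB of RAM, 4GB is a reasonable starting value for the maximum JVM heap size. The OS is smarter than you think.
+ As a rough guide, the memory needed by Cassandra's internal data structures can be estimated with the following formula: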
+ {{{memtable_throughput_in_mb * 3 * number of hot CFs + 1G + internal caches}}}
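As a worked example of the formula above, here is a minimal Python sketch. The function name `estimate_heap_mb`, the choice to treat "1G" as 1024 MB, and the sample inputs (64 MB memtables, 4 hot column families, 256 MB of internal caches) are assumptions made for illustration only, not recommended values.

```python
def estimate_heap_mb(memtable_throughput_in_mb, hot_cfs,
                     internal_caches_mb=0, base_mb=1024):
    """Rough heap estimate in MB:
    memtable_throughput_in_mb * 3 * number of hot CFs + 1G + internal caches.
    base_mb=1024 assumes that "1G" means 1024 MB.
    """
    return memtable_throughput_in_mb * 3 * hot_cfs + base_mb + internal_caches_mb


# Hypothetical node: 64 MB memtables, 4 hot column families, 256 MB of caches.
estimate = estimate_heap_mb(64, 4, internal_caches_mb=256)
print(estimate)  # 64 * 3 * 4 + 1024 + 256 = 2048
```

With these sample numbers the estimate is about 2GB, well below the 4GB starting point suggested above.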
+ 
+ Also, if you run up against the heap limit under load, that is probably a symptom of other problems. Diagnose those first.
+ 
+ == Virtual Memory and Swap ==
+ 
+ 
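On a dedicated Cassandra machine, the best swap setting is to have no swap at all. The OS will then kill the Cassandra Java process instead of swapping it out, but that is better than having the whole host thrash and become unreachable.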
+ Linux users should fully understand and consider the swappiness, overcommit_memory, and 
overcommit_ratio parameters.
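Before changing any of these parameters it helps to see their current values. The Python sketch below reads them from `/proc/sys/vm`, where Linux exposes them; the helper name `read_vm_params` and the `proc_root` argument are illustrative assumptions, and the script only reads values without changing anything.

```python
from pathlib import Path

VM_PARAMS = ("swappiness", "overcommit_memory", "overcommit_ratio")


def read_vm_params(proc_root="/proc/sys/vm"):
    """Return {name: int or None}; None means the value could not be read."""
    values = {}
    for name in VM_PARAMS:
        try:
            values[name] = int((Path(proc_root) / name).read_text().strip())
        except (OSError, ValueError):
            # Not Linux, no permission, or unexpected file content.
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_vm_params().items():
        print(f"vm.{name} = {value}")
```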
+ 
+ == Memtable Thresholds ==
+ 
+ 
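When a write operation is performed, Cassandra stores the data in column-family-specific, in-memory data structures called Memtables. A Memtable is flushed to disk as soon as any one of several configurable thresholds is exceeded.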
+ 
The initial settings (64mb/0.3) are purposefully conservative. Tuning these parameters properly is important to keep nodes from going down because they run out of memory.
+ 
+ == Configuring Thresholds ==
+ 
+ '''Larger Memtables take memory away from caches:'''
+ 
+ Since Memtables store actual column values, they consume at least as 
much memory as the size of the data inserted. However, there is also overhead 
associated with the structures used to index this data. When the number of 
columns and rows is high compared to the size of the values, this overhead can 
become quite significant (possibly greater than the data itself).
+   In other words, which threshold(s) to use, and what to set them to, is not 
just a function of how much memory you have, but of how many column families, 
how many columns per column family, and the size of the values being stored.
+ 
+ '''Larger Memtables don't improve write performance: '''Increasing the 
memtable capacity will cause less-frequent flushes but doesn't improve write 
performance directly: writes go directly to memory regardless. (In fact, if 
your commitlog and sstables share a volume, they might contend, so if at all 
possible, put them on separate volumes.)
+ 
+ '''Larger memtables do absorb more overwrites''': If your write load sees 
some rows written more often than others (e.g., upvotes of a front-page story), a 
larger memtable will absorb those overwrites, creating more efficient sstables 
and thus better read performance.  If your write load is batch oriented or if 
you have a massive row set, rows are not likely to be rewritten for a long 
time, and so this benefit will pay a smaller dividend.
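A toy model can make the overwrite effect concrete. The Python sketch below is not Cassandra's implementation: it assumes a memtable behaves like a dictionary keyed by column and is flushed once it holds `capacity` distinct keys, and it counts how many entries end up on disk. The function name `flushed_entries` and the flush rule are illustrative assumptions.

```python
def flushed_entries(writes, capacity):
    """Count the entries written to disk for a stream of (key, value) writes."""
    memtable = {}
    flushed = 0
    for key, value in writes:
        memtable[key] = value  # a repeated key is overwritten in memory
        if len(memtable) >= capacity:
            flushed += len(memtable)  # flush everything currently held
            memtable.clear()
    return flushed + len(memtable)  # final flush of whatever remains


# One hot key rewritten between writes to four cold keys.
writes = [("hot", 1), ("a", 1), ("hot", 2), ("b", 1),
          ("hot", 3), ("c", 1), ("hot", 4), ("d", 1)]

print(flushed_entries(writes, capacity=2))  # 8: every write reaches disk
print(flushed_entries(writes, capacity=8))  # 5: three overwrites were absorbed
```

In this model the larger capacity writes fewer entries to disk for the same write stream, which is the effect described above. If every key in the stream were distinct, both capacities would write the same number of entries.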
+ 
+ '''Larger memtables do lead to more effective compaction''': Since compaction 
is tiered, large sstables are preferable: turning over tons of tiny memtables 
is bad. Again, this impacts read performance (by improving the overall 
io-contention weather), but not writes.
+ 
+ Listed below are the thresholds found in `storage-conf.xml`, along with a 
description.
+ 
+ === MemtableThroughputInMB ===
+ As the name indicates, this sets the maximum size in megabytes that the Memtable 
will store before triggering a threshold violation and being flushed 
to disk. It corresponds to the size of the values inserted (plus the size of 
the containing column).
+ 
+ If left unconfigured (missing from the config), this defaults to 128MB.
+ 
+ ''Note: This was referred to as MemtableSizeInMB in versions of Cassandra 
before 0.6.0. In version 0.7b2+, the value will be applied on a 
[[https://issues.apache.org/jira/browse/CASSANDRA-1007|per column-family 
basis]].''
+ 
+ === MemtableOperationsInMillions ===
+ This directive sets a threshold on the number of columns stored.
+ 
+ If left unconfigured (missing from the config), this defaults to 0.1 (or 
100,000 objects). The config file's initial setting of 0.3 (or 300,000 objects) 
is a conservative starting point.
+ 
+ ''Note: This was referred to as MemtableObjectCountInMillions in versions of 
Cassandra before 0.6.0. In version 0.7b2+, the value will be applied on a 
[[https://issues.apache.org/jira/browse/CASSANDRA-1007|per column-family 
basis]].''
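The interaction of the two thresholds can be sketched in a few lines of Python. This is an illustration of the rule "flush when either threshold is exceeded", not Cassandra's code: the function name `exceeds_threshold`, the use of `>=` as the comparison, and the assumption that one megabyte is 1024 * 1024 bytes are all choices made for this example. The default arguments mirror the unconfigured defaults quoted above (128 MB and 0.1 million).

```python
def exceeds_threshold(data_bytes, column_count,
                      throughput_in_mb=128, operations_in_millions=0.1):
    """Return True if either Memtable threshold has been reached."""
    size_limit = throughput_in_mb * 1024 * 1024
    count_limit = int(round(operations_in_millions * 1_000_000))
    return data_bytes >= size_limit or column_count >= count_limit


# Many tiny columns: the operations threshold trips long before the size one.
print(exceeds_threshold(data_bytes=2_000_000, column_count=300_000,
                        throughput_in_mb=64, operations_in_millions=0.3))  # True
# A few large values: only the size threshold matters.
print(exceeds_threshold(data_bytes=10 * 1024 * 1024, column_count=50,
                        throughput_in_mb=64, operations_in_millions=0.3))  # False
```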
+ 
+ == Using Jconsole To Optimize Thresholds ==
+ Cassandra's column-family mbeans have a number of attributes that can prove 
invaluable in determining optimal thresholds. One way to access this 
instrumentation is by using Jconsole, a graphical monitoring and management 
application that ships with your JDK.
+ 
+ Launching Jconsole with no arguments will display the "New Connection" dialog 
box. If you are running Jconsole on the same machine that Cassandra is running 
on, then you can connect using the PID; otherwise you will need to connect 
remotely. The default startup scripts for Cassandra cause the VM to listen on 
port 8080 using the JVM option:
+ 
+ . -Dcom.sun.management.jmxremote.port=8080
+ 
+ The remote JMX url is then:
+ 
+ service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi
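For a remote node, only the host and port in that URL change. A small Python helper, whose name `jmx_service_url` is hypothetical, shows the pattern; it only formats the string and does not open a connection.

```python
def jmx_service_url(host="localhost", port=8080):
    """Build the remote JMX URL for a node listening on the given port."""
    return f"service:jmx:rmi:///jndi/rmi://{host}:{port}/jmxrmi"


print(jmx_service_url())
# service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi
print(jmx_service_url("cass1.example.com"))  # hypothetical host name
```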
+ 
+ This URL is used internally by bin/nodetool 
(src/java/org/apache/cassandra/tools/nodetool.java).
+ 
+ {{attachment:jconsole_connect.png}}
+ 
+ Once connected, select the ''MBeans'' tab, expand the  
''org.apache.cassandra.db'' section, and finally one of your column families.
+ 
+ There are three interesting attributes here.
+ 
+ 1. ''!MemtableColumnsCount'', representing the total number of column entries 
in this table. If you store 100 rows that each have 100 columns, expect to see 
this value increase by 10,000. This attribute is useful in setting the 
[[#MemtableOperationsInMillions|MemtableOperationsInMillions]] threshold.
+ 1. ''!MemtableDataSize'', which is used to determine the total size of stored 
data. This is the sum of all the values stored and does not account for 
Memtable overhead (i.e., it is not indicative of the actual memory used by the 
Memtable). Use this value when adjusting [[#MemtableThroughputInMB|MemtableThroughputInMB]].
+ 1. Finally there is ''!MemtableSwitchCount'' which increases by one each time 
a column family flushes its Memtable to disk.
+ 
+ ''Note: You'll need to manually mash the `Refresh` button to update these 
values.''
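One way to use these readings is to compare how full each threshold is for a single sample. The Python sketch below assumes that you copy ''!MemtableColumnsCount'' and ''!MemtableDataSize'' by hand from Jconsole, that the data size is reported in bytes, and that one megabyte is 1024 * 1024 bytes. The function name `threshold_usage` and the sample numbers are made up for illustration.

```python
MIB = 1024 * 1024


def threshold_usage(columns_count, data_size_bytes,
                    throughput_in_mb, operations_in_millions):
    """Return how full each threshold is (1.0 means the threshold is reached)."""
    return {
        "MemtableThroughputInMB":
            data_size_bytes / (throughput_in_mb * MIB),
        "MemtableOperationsInMillions":
            columns_count / (operations_in_millions * 1_000_000),
    }


# Hypothetical sample: 150,000 columns holding 8 MiB of values,
# measured against the config file's initial settings (64 / 0.3).
usage = threshold_usage(columns_count=150_000, data_size_bytes=8 * MIB,
                        throughput_in_mb=64, operations_in_millions=0.3)
closest = max(usage, key=usage.get)
print(closest)  # MemtableOperationsInMillions
```

In this sample the column-count threshold is half full while the size threshold is only one eighth full, so flushes for this column family would be driven by the column count.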
+ 
+ {{attachment:jconsole_attributes.png}}
+ 
+ It is also possible to schedule an immediate flush using the `forceFlush()` 
operation.
+ 
+ {{attachment:jconsole_operations.png}}
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ 
+ ## old translation ##
+ = Old Translation =
+ 
  
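When a write operation is performed, Cassandra stores the value in a Memtable, an in-memory data structure for the !ColumnFamily. The Memtable is flushed to disk once a configured threshold is exceeded. Tuning the thresholds correctly is important for making good use of the available system memory and for keeping nodes from going down because they run out of memory.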
  
  
(With the default settings in bin/cassandra.in.sh, the maximum JVM heap size is 1GB (-Xmx1G), which is too small for production. You should also consider increasing this value.)
