[Cassandra Wiki] Update of "Operations_JP" by MakiWatanabe

Apache Wiki Fri, 18 Feb 2011 21:11:31 -0800

The "Operations_JP" page has been changed by MakiWatanabe.
http://wiki.apache.org/cassandra/Operations_JP?action=diff&rev1=94&rev2=95

--------------------------------------------------

  
  Repair should be run on only one node at a time. (This restriction is removed in 0.7.)
  
- === Frequency of nodetool repair ===
+ === Frequency of nodetool repair ===
  
- Unless your application performs no deletes, it is vital that production 
clusters run `nodetool repair` periodically on all nodes in the cluster. The 
hard requirement for repair frequency is the value used for GCGraceSeconds (see 
[[DistributedDeletes]]). Running nodetool repair often enough to guarantee that 
all nodes have performed a repair in a given period GCGraceSeconds long, 
ensures that deletes are not "forgotten" in the cluster.
+ Unless your application performs no deletes at all, it is essential to run 
`nodetool repair` periodically on every node of a production cluster. The upper 
bound on the interval between repairs is determined by the GCGraceSeconds 
setting (see [[DistributedDeletes|DistributedDelete]]). To ensure that deletes 
are not "forgotten", every node must be repaired within each period of 
GCGraceSeconds.
  
- Consider how to schedule your repairs. A repair causes additional disk and 
CPU activity on the nodes participating in the repair, and it will typically be 
a good idea to spread repairs out over time so as to minimize the chances of 
repairs running concurrently on many nodes.
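+ Plan your repair schedule carefully. Repair causes additional disk and CPU 
usage on the participating nodes, so it is generally a good idea to spread 
repairs out over time. This reduces the chance of repairs running concurrently 
on many nodes.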
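One way to spread repairs out is to give each node its own start offset inside a repeating cycle that is shorter than GCGraceSeconds. The following is a minimal sketch, not a Cassandra feature: the function name, the node names, and the `safety_factor` parameter are all hypothetical, and the 10-day value is the default GCGraceSeconds mentioned on this page.

```python
from datetime import timedelta


def staggered_repair_offsets(nodes, gc_grace_seconds, safety_factor=0.5):
    """Return {node: offset} spreading one repair per node evenly over a cycle.

    The cycle length is gc_grace_seconds * safety_factor (an assumed margin),
    so each node is scheduled again well before GCGraceSeconds elapses.
    """
    if not nodes:
        return {}
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in (0, 1]")
    cycle = gc_grace_seconds * safety_factor
    step = cycle / len(nodes)  # gap between consecutive repair starts
    return {node: timedelta(seconds=i * step) for i, node in enumerate(nodes)}


GC_GRACE_SECONDS = 10 * 24 * 3600  # default of 10 days
schedule = staggered_repair_offsets(
    ["node1", "node2", "node3", "node4"], GC_GRACE_SECONDS
)
for node, offset in schedule.items():
    print(f"{node}: start repair {offset} into each cycle")
```

With four nodes and the assumed factor of 0.5, the cycle is 5 days and the repair starts are 30 hours apart. The offsets could then be fed to cron or another scheduler that invokes `nodetool repair` on each node.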
  
- ==== Dealing with the consequences of nodetool repair not running within 
GCGraceSeconds ====
+ ==== Dealing with nodetool repair not having run within GCGraceSeconds ====
  
- If `nodetool repair` has not been run often enough to the point that 
GCGraceSeconds has passed, you risk forgotten deletes (see 
[[DistributedDeletes]]). In addition to data popping up that has been deleted, 
you may see inconsistencies in data returned from different nodes that will not 
self-heal by read-repair or further `nodetool repair`. Some further details on 
this latter effect are documented in 
[[https://issues.apache.org/jira/browse/CASSANDRA-1316|CASSANDRA-1316]].
+ If `nodetool repair` is not run even once before GCGraceSeconds elapses, 
deletes may be forgotten (see [[DistributedDeletes]]).
+ In that case, in addition to deleted data reappearing, the values returned by 
different replica nodes may be inconsistent. Such inconsistencies are not 
resolved by read repair or `nodetool repair`. The latter problem is described 
in detail in 
[[https://issues.apache.org/jira/browse/CASSANDRA-1316|CASSANDRA-1316]].
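To notice this situation before it happens, an operator can record when each node last finished a repair and compare that against GCGraceSeconds. The sketch below assumes such timestamps are kept by the operator (for example in a log written by the repair job). The function name, node names, and dates are hypothetical illustrations.

```python
from datetime import datetime, timedelta


def overdue_nodes(last_repair, gc_grace_seconds, now):
    """Return the sorted names of nodes whose last completed repair is
    older than gc_grace_seconds, i.e. nodes at risk of forgotten deletes."""
    limit = timedelta(seconds=gc_grace_seconds)
    return sorted(
        node for node, finished in last_repair.items() if now - finished > limit
    )


GC_GRACE_SECONDS = 10 * 24 * 3600  # default of 10 days
now = datetime(2011, 2, 18)
last_repair = {  # hypothetical, operator-maintained record
    "node1": datetime(2011, 2, 15),
    "node2": datetime(2011, 2, 1),
    "node3": datetime(2011, 2, 9),
}
overdue = overdue_nodes(last_repair, GC_GRACE_SECONDS, now)
print(overdue)
```

Here only node2, last repaired 17 days earlier, exceeds the 10-day limit and would need one of the remedies listed on this page.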
  
- There are at least three ways to deal with this scenario.
+ There are at least three ways to deal with this scenario.
  
-  1. Treat the node in question as failed, and replace it as described further 
below.
-  2. To minimize the amount of forgotten deletes, first increase 
GCGraceSeconds across the cluster (rolling restart required), perform a full 
repair on all nodes, and then change GCGraceSeconds back again. This has the 
advantage of ensuring tombstones spread as much as possible, minimizing the 
amount of data that may "pop back up" (forgotten delete).
-  3. Yet another option, that will result in more forgotten deletes than the 
previous suggestion but is easier to do, is to ensure 'nodetool repair' has 
been run on all nodes, and then perform a compaction to expire tombstones. 
Following this, read-repair and regular `nodetool repair` should cause the 
cluster to converge.
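+  1. Treat the node in question as failed and replace it as described below.
+  2. To minimize forgotten deletes, first increase GCGraceSeconds across the 
cluster (a rolling restart is required). Run a full repair on all nodes, then 
set GCGraceSeconds back to its original value. This approach redistributes 
tombstones as widely as possible, minimizing the amount of deleted data that 
reappears.
+  3. Another option is simply to run 'nodetool repair' on all nodes and then 
run a compaction to expire tombstones. From then on, read repair and regular 
'nodetool repair' restore consistency. This is easier than the previous 
approach but results in more forgotten deletes.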
  
- === Handling failure ===
- If a node goes down and comes back up, the ordinary repair mechanisms will be 
adequate to deal with any inconsistent data.  Remember though that if a node 
misses updates and is not repaired for longer than your configured 
GCGraceSeconds (default: 10 days), it could have missed remove operations 
permanently.  Unless your application performs no removes, you should wipe its 
data directory, re-bootstrap it, and removetoken its old entry in the ring (see 
below).
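+ === Handling node failure ===
+ If a node recovers after a temporary outage, the ordinary repair mechanisms 
should be enough to restore data consistency. Note, however, that if updates 
were made while the node was down and the node was not repaired within the 
configured GCGraceSeconds (default: 10 days), delete operations from that 
period are permanently lost. Unless your application performs no deletes at 
all, in that case you need to wipe the failed node's data completely, 
re-bootstrap it, and run removetoken for the token it previously used (see 
below).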
  
- If a node goes down entirely, then you have two options:
+ If a node has gone down entirely with no prospect of recovery, you have two options:
  
-  1. (Recommended approach) Bring up the replacement node with a new IP 
address, and !AutoBootstrap set to true in storage-conf.xml. This will place 
the replacement node in the cluster and find the appropriate position 
automatically. Then the bootstrap process begins. While this process runs, the 
node will not receive reads until finished. Once this process is finished on 
the replacement node, run `nodetool removetoken` once, supplying the token of 
the dead node, and `nodetool cleanup` on each node.
-  1. You can obtain the dead node's token by running `nodetool ring` on any 
live node, unless there was some kind of outage, and the others came up but not 
the down one -- in that case, you can retrieve the token from the live nodes' 
system tables.
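+  1. (Recommended approach) Bring up a replacement node with a new IP address 
and set the !AutoBootstrap parameter to true in the configuration file. With 
this setting the replacement node automatically finds the appropriate position 
in the cluster ring and bootstraps. The replacement node does not accept reads 
until bootstrap completes. After bootstrap completes, remove the token that was 
assigned to the failed node from the cluster with `nodetool removetoken`, then 
run `nodetool cleanup` on each node. This removes old Hinted Handoff writes 
stored for the failed node.
+  1. (Alternative approach) Run `nodetool ring` against a live node to obtain 
the token that was assigned to the failed node. Assign the failed node's token 
to the replacement node, bring it up with the same IP address as the failed 
node, and run `nodetool repair`. Until the repair completes, clients reading 
only from this node get no data back. Specifying a higher !ConsistencyLevel on 
reads avoids this.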
  
-  1. (Alternative approach) Bring up a replacement node with the same IP and 
token as the old, and run `nodetool repair`. Until the repair process is 
complete, clients reading only from this node may get no data back.  Using a 
higher !ConsistencyLevel on reads will avoid this.
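+ == Backing up data ==
+ `nodetool snapshot` takes a snapshot of the data while the node is online. 
You can back up the snapshots with any system you like, although on very large 
clusters leaving them where they were taken is also an option.
+ `nodetool snapshot` flushes all data on the node, so every write made before 
the snapshot command ran is included in the snapshot.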
  
+ With some combinations of OS and JVM, an error related to process creation may be reported during a snapshot. For example, on Linux:
- The reason why you run `nodetool cleanup` on all live nodes is to remove old 
Hinted Handoff writes stored for the dead node.
- 
- == Backing up data ==
- Cassandra can snapshot data while online using `nodetool snapshot`.  You can 
then back up those snapshots using any desired system, although leaving them 
where they are is probably the option that makes the most sense on large 
clusters. `nodetool snapshot` triggers a node-wide flush, so all data written 
before the execution of the snapshot command is contained within the snapshot.
- 
- With some combinations of operating system/jvm you may receive an error 
related to the inability to create a process during the snapshotting, such as 
this on Linux
  
  {{{
  Exception in thread "main" java.io.IOException: Cannot run program "ln": 
java.io.IOException: error=12, Cannot allocate memory
  }}}
- This is caused by the operating system trying to allocate the child "ln" 
process a memory space as large as the parent process (the cassandra server), 
even though '''it's not going to use it'''. So if you have a machine with 8GB 
of RAM and no swap, and you gave 6GB to the cassandra server, it will fail 
during this because the operating system wants 12 GB of virtual memory before 
allowing you to create the process. 
  
- This error can be worked around by either :
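+ This happens because the OS tries to allocate for the child process "ln" as 
much memory as the parent process (the Cassandra server) uses, even though that 
is far more than "ln" needs. So on a system with 8GB of RAM and no swap where 
6GB is assigned to Cassandra, the OS requires 12GB of virtual memory in total 
and process creation fails.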
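The arithmetic behind this failure can be written out as a small sketch. It uses a deliberately simplified model in which the child needs exactly as much virtual memory as the parent, as this page describes, and it ignores other processes and the kernel's actual overcommit heuristics. The function names are hypothetical.

```python
def fork_would_fail(ram_gb, swap_gb, parent_gb):
    """Simplified model: spawning a child needs as much virtual memory as the
    parent already uses, so the total demand is twice the parent's size."""
    required = 2 * parent_gb
    available = ram_gb + swap_gb
    return required > available


def swap_needed_gb(ram_gb, parent_gb):
    """Smallest swap size (in GB) that makes the spawn succeed in this model."""
    return max(0, 2 * parent_gb - ram_gb)


# The example from the text: 8GB RAM, no swap, 6GB given to Cassandra.
print(fork_would_fail(ram_gb=8, swap_gb=0, parent_gb=6))
print(swap_needed_gb(ram_gb=8, parent_gb=6))
```

Under this model the example machine needs 12GB but has 8GB, so the spawn fails, and a temporary 4GB swap file would be enough to cover the gap.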
  
-  * dropping the jna.jar file into Cassandra's lib folder (requires at least 
Cassandra 0.6.6)
+ This error can be worked around in any of the following ways:
  
- OR
+  * Place the jna.jar file in Cassandra's lib directory (requires Cassandra 0.6.6 or later)
  
+ OR
-  * creating a swap file, snapshotting, removing swap file
- OR
-  * turning on "memory overcommit"
  
- To restore a snapshot:
+  * Create a swap file, take the snapshot, and remove the swap file
  
+ OR
-  1. shut down the node
-  1. clear out the old commitlog and sstables
-  1. move the sstables from the snapshot location to the live data directory.
  
+  * Enable "memory overcommit"
- === Consistent backups ===
- You can get an eventually consistent backup by snapshotting all nodes; no 
individual node's backup is guaranteed to be consistent but if you restore from 
that snapshot then clients will get eventually consistent behavior as usual.
  
- There is no such thing as a consistent view of the data in the strict sense, 
except in the trivial case of writes with consistency level = ALL.
+ To restore a snapshot:
  
- === Import / export ===
- As an alternative to taking snapshots it's possible to export SSTables to 
JSON format using the `bin/sstable2json` command:
+  1. Shut down the node
+  1. Clear out the old commitlog and sstables
+  1. Move the sstables from the snapshot to the live data directory
+ 
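+ === Consistent backups ===
+ Snapshotting all nodes gives you an eventually consistent backup. No 
individual node's backup is guaranteed to be consistent, but clients accessing 
a cluster restored from the snapshots get eventually consistent behavior as 
usual.
+ 
+ Unless writes are performed with !ConsistencyLevel=ALL, there is no strictly consistent view of the data.
+ 
+ === Import / export ===
+ As an alternative to taking snapshots, you can export SSTables to JSON format with the `bin/sstable2json` command: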
  
  {{{
  Usage: sstable2json [-f outfile] <sstable> [-k key [-k key [...]]]
  }}}
- `bin/sstable2json` accepts as a required argument, the full path to an 
SSTable data file, (files ending in -Data.db), and an optional argument for an 
output file (by default, output is written to stdout). You can also pass the 
names of specific keys using the `-k` argument to limit what is exported.
  
- Note: If you are not running the exporter on in-place SSTables, there are a 
couple of things to keep in mind.
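+ `bin/sstable2json` takes the full path to an SSTable data file (a file name 
ending in -Data.db) as a required argument, and an output file as an optional 
argument (by default, output goes to stdout). The `-k` option limits the export 
to specific keys.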
  
+ Note: When running the exporter on SSTables outside Cassandra's runtime environment, there are a few things to keep in mind.
-  1. The corresponding configuration must be present (same as it would be to 
run a node).
-  1. SSTables are expected to be in a directory named for the keyspace (same 
as they would be on a production node).
  
- JSON exported SSTables can be "imported" to create new SSTables using 
`bin/json2sstable`:
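+  1. The corresponding configuration must be present (the same as when running a node).
+  1. The SSTables must be in a directory named for the keyspace (the same as on a production node).
+ 
+ SSTables exported to JSON can be imported as new SSTables with `bin/json2sstable`: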
  
  {{{
  Usage: json2sstable -K keyspace -c column_family <json> <sstable>
  }}}
- `bin/json2sstable` takes arguments for keyspace and column family names, and 
full paths for the JSON input file and the destination SSTable file name.
  
- You can also import pre-serialized rows of data using the BinaryMemtable 
interface.  This is useful for importing via Hadoop or another source where you 
want to do some preprocessing of the data to import.
+ `bin/json2sstable` takes the keyspace and column family names, the full path 
of the JSON input file, and the full path of the destination SSTable file as 
arguments.
  
- NOTE: Starting with version 0.7, json2sstable and sstable2json must be run in 
such a way that the schema can be loaded from system tables.  This means that 
cassandra.yaml must be found in the classpath and refer to valid storage 
directories.
+ You can also import pre-serialized rows of data using the BinaryMemtable 
interface. This is useful for importing data that has been preprocessed in 
Hadoop or another data source.
  
- == Monitoring ==
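+ Note: Starting with version 0.7, json2sstable and sstable2json must be run in 
an environment where the schema can be loaded from the system tables. This 
means cassandra.yaml must be on the classpath and configured to refer to valid 
storage directories.
+ 
+ == Monitoring ==
+ Cassandra exposes internal metrics as JMX data. This is common in the JVM world. OpenNMS, Nagios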
  Cassandra exposes internal metrics as JMX data.  This is a common standard in 
the JVM world; OpenNMS, Nagios, and Munin at least offer some level of JMX 
support. The specifics of the JMX Interface are documented at JmxInterface.
  
  Running `nodetool cfstats` can provide an overview of each Column Family, and 
important metrics to graph your cluster. For those who prefer to work with 
non-JMX clients, there is a JMX-to-REST bridge available at 
http://code.google.com/p/polarrose-jmx-rest-bridge/
