On Mon, Jul 23, 2012 at 1:25 PM, Mike Heffner <m...@librato.com> wrote:

> Hi,
>
> We are migrating from a 0.8.8 ring to a 1.1.2 ring and we are noticing
> missing data post-migration. We use pre-built/configured AMIs so our
> preferred route is to leave our existing production 0.8.8 untouched and
> bring up a parallel 1.1.2 ring and migrate data into it. Data is written to
> the rings via batch processes, so we can easily ensure that both the
> existing and new rings will have the same data post-migration.
>
> <snip>


> The steps we are taking are:
>
> 1. Bring up a 1.1.2 ring in the same AZ/data center configuration with
> tokens matching the corresponding nodes in the 0.8.8 ring.
> 2. Create the same keyspace on 1.1.2.
> 3. Create each CF in the keyspace on 1.1.2.
> 4. Flush each node of the 0.8.8 ring.
> 5. Rsync each non-compacted sstable from 0.8.8 to the corresponding node
> in 1.1.2.
> 6. Move each 0.8.8 sstable into the 1.1.2 directory structure by renaming
> the file to the /cassandra/data/<keyspace>/<cf>/<keyspace>-<cf>... format.
> For example, for the keyspace "Metrics" and CF "epochs_60" we get:
> "cassandra/data/Metrics/epochs_60/Metrics-epochs_60-g-941-Data.db".
> 7. On each 1.1.2 node run `nodetool -h localhost refresh Metrics <CF>` for
> each CF in the keyspace. We notice that storage load jumps accordingly.
> 8. On each 1.1.2 node run `nodetool -h localhost upgradesstables`. This
> takes a while but appears to correctly rewrite each sstable in the new 1.1.x
> format. Storage load drops as sstables are compressed.
>
>
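(For concreteness, per node steps 4-7 above boil down to roughly the
following; host names, sstable generation numbers, and the staging
directory are placeholders from our test setup.)

  # step 4, on each 0.8.8 node
  nodetool -h localhost flush Metrics
  # steps 5-6, on the corresponding 1.1.2 node: pull the flushed sstables over
  rsync -av old-088-node:/raid0/cassandra/data/Metrics/ /tmp/metrics-0.8.8/
  # ...and move each sstable's components into the per-CF directory 1.1.2 expects
  mv /tmp/metrics-0.8.8/Metrics-epochs_60-g-941-* /raid0/cassandra/data/Metrics/epochs_60/
  # step 7: load the files into the live node
  nodetool -h localhost refresh Metrics epochs_60
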
So, after some further testing we've observed that the `upgradesstables`
command is removing data from the sstables, leading to our missing data.
We've repeated the steps above with several variations:

WORKS: refresh -> scrub
WORKS: refresh -> scrub -> major compaction

FAILS: refresh -> upgradesstables
FAILS: refresh -> scrub -> upgradesstables
FAILS: refresh -> scrub -> major compaction -> upgradesstables
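
Each variation above is the corresponding sequence of nodetool commands run
per node once the sstables are in place, where by "major compaction" we
mean a plain `nodetool compact` on the CF. The last failing case, for
example, is roughly:

  nodetool -h localhost refresh Metrics metrics_900
  nodetool -h localhost scrub Metrics metrics_900
  nodetool -h localhost compact Metrics metrics_900
  nodetool -h localhost upgradesstables Metrics metrics_900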

So, we are able to migrate our test CFs from a 0.8.8 ring to a 1.1.2 ring
when we use scrub. However, whenever we run an upgradesstables command the
sstables are shrunk significantly and our tests show missing data:

 INFO [CompactionExecutor:4] 2012-07-24 04:27:36,837 CompactionTask.java
(line 109) Compacting
[SSTableReader(path='/raid0/cassandra/data/Metrics/metrics_900/Metrics-metrics_900-hd-51-Data.db')]
 INFO [CompactionExecutor:4] 2012-07-24 04:27:51,090 CompactionTask.java
(line 221) Compacted to
[/raid0/cassandra/data/Metrics/metrics_900/Metrics-metrics_900-hd-58-Data.db,].
 60,449,155 to 2,578,102 (~4% of original) bytes for 4,002 keys at
0.172562MB/s.  Time: 14,248ms.

Is there a scenario where upgradesstables would remove data that a scrub
command wouldn't? According to the documentation, scrub is actually the
more destructive of the two in terms of removing data, and on 1.1.x
upgradesstables, rather than scrub, is the documented upgrade command.

The keyspace is defined as:

Keyspace: Metrics:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
    Options: [us-east:3]

And the column family above defined as:

    ColumnFamily: metrics_900
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator:
org.apache.cassandra.db.marshal.BytesType
      Columns sorted by:
org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.LongType,org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type)
      GC grace seconds: 0
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.1
      DC Local Read repair chance: 0.0
      Replicate on write: true
      Caching: KEYS_ONLY
      Bloom Filter FP chance: default
      Built indexes: []
      Compaction Strategy:
org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression:
org.apache.cassandra.io.compress.SnappyCompressor

All rows have a TTL of 30 days, so it's possible that, combined with
gc_grace=0, a small number of expired rows would be removed during a
compaction/scrub/upgradesstables step. However, the majority should still
be kept since their TTLs have not expired yet.
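
One way we're planning to sanity-check the TTL theory is to dump the
sstable that upgradesstables reads and the one it writes with sstable2json
and compare which keys survive. A rough sketch, using the generation
numbers from the log above as placeholders:

  sstable2json /raid0/cassandra/data/Metrics/metrics_900/Metrics-metrics_900-hd-51-Data.db > before.json
  nodetool -h localhost upgradesstables Metrics metrics_900
  sstable2json /raid0/cassandra/data/Metrics/metrics_900/Metrics-metrics_900-hd-58-Data.db > after.json
  diff before.json after.json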

We are still experimenting to see under what conditions this happens, but I
thought I'd send out some more info in case there is something we're
clearly doing wrong here.


Thanks,

Mike
-- 

  Mike Heffner <m...@librato.com>
  Librato, Inc.
