Re: Built in trigger: double-write for app migration

2018-10-19 Thread Antonis Papaioannou
It reminds me of the “shadow writes” described in [1]: during data migration, the
coordinator forwards a copy of any write request for tokens that are being
transferred to the new node.

[1] Incremental Elasticity for NoSQL Data Stores, SRDS’17,  
https://ieeexplore.ieee.org/document/8069080
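
To make the idea concrete, here is a rough schematic of what the coordinator-side
logic amounts to (illustrative types and names only, neither the paper's code nor
Cassandra's actual classes):

  import java.util.List;
  import java.util.function.Consumer;

  // Illustrative sketch: while a token range is being transferred, the coordinator
  // applies the write as usual and also "shadow writes" it to the range's new owner.
  final class ShadowWriteCoordinator {
      static final class Range {
          final long left, right;
          Range(long left, long right) { this.left = left; this.right = right; }
          boolean contains(long token) { return token > left && token <= right; }
      }

      private final List<Range> migratingRanges;      // ranges currently being moved
      private final Consumer<byte[]> currentReplicas; // normal write path
      private final Consumer<byte[]> newOwner;        // write path to the joining node

      ShadowWriteCoordinator(List<Range> migratingRanges,
                             Consumer<byte[]> currentReplicas,
                             Consumer<byte[]> newOwner) {
          this.migratingRanges = migratingRanges;
          this.currentReplicas = currentReplicas;
          this.newOwner = newOwner;
      }

      void write(long token, byte[] mutation) {
          currentReplicas.accept(mutation);           // usual replication
          for (Range r : migratingRanges) {
              if (r.contains(token)) {
                  newOwner.accept(mutation);          // shadow write during migration
                  break;
              }
          }
      }
  }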


> On 18 Oct 2018, at 18:53, Carl Mueller  
> wrote:
> 
> tl;dr: a generic trigger on TABLES that will mirror all writes to
> facilitate data migrations between clusters or systems. What is necessary
> to ensure full write mirroring/coherency?
> 
> When a cassandra cluster hosts several "apps", aka keyspaces serving
> colocated applications, and one app/keyspace's bandwidth and size demands
> begin impacting the other keyspaces/apps, then one strategy is to migrate
> that keyspace to its own dedicated cluster.
> 
> With backups/sstableloading, this will entail a delay and therefore a
> "coherency" shortfall between the clusters. So typically one would employ a
> "double write, read once":
> 
> - all updates are mirrored to both clusters
> - reads come from the currently most coherent cluster.
> 
> Often two sstable loads are done:
> 
> 1) first load
> 2) turn on double writes/write mirroring
> 3) a second load is done to finalize coherency
> 4) switch the app to point to the new cluster now that it is coherent
> 
> The double writes and the read switchover are the sticking point. We could do
> it at the app layer, but if the app wasn't written with that in mind, it means
> a lot of testing and customization specific to the app's framework.
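> 
> If we did do it at the app layer, the core of the pattern is just two sessions
> and a mirrored execute, something like the sketch below (DataStax java-driver
> 3.x assumed; contact points, keyspace, and table/column names are placeholders):
> 
>   import com.datastax.driver.core.Cluster;
>   import com.datastax.driver.core.Session;
>   import com.datastax.driver.core.Statement;
>   import com.datastax.driver.core.querybuilder.QueryBuilder;
> 
>   public class DoubleWriteSketch {
>       public static void main(String[] args) {
>           // Placeholder contact points for the source and target clusters.
>           Cluster oldCluster = Cluster.builder().addContactPoint("old-cluster-host").build();
>           Cluster newCluster = Cluster.builder().addContactPoint("new-cluster-host").build();
>           Session oldSession = oldCluster.connect("my_keyspace");
>           Session newSession = newCluster.connect("my_keyspace");
> 
>           Statement write = QueryBuilder.insertInto("my_table")
>                   .value("id", "k1")
>                   .value("field0", "f1a");
> 
>           // Mirror every write to both clusters; keep reads on the old cluster
>           // until the second sstable load has made the new one coherent.
>           oldSession.execute(write);
>           newSession.executeAsync(write); // best effort; real code needs retries
> 
>           oldCluster.close();
>           newCluster.close();
>       }
>   }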
> 
> We could theoretically do some sort of proxying of the java-driver somehow,
> but all the async structures and complex interfaces/APIs would be difficult to
> proxy. Maybe there is a lower level of the java-driver where this would be
> possible. This also would only apply to the java-driver, and not to the
> python/go/javascript/other drivers.
> 
> Finally, I suppose we could do a trigger on the tables. It would be really
> nice if we could add to the cassandra toolbox the basics of a write-mirroring
> trigger that could be activated "fairly easily"... now I know there are the
> complexities of inter-cluster access, and the question of whether cassandra is
> even the target mirror system (for example, there is an article on using
> triggers to write-mirror to kafka:
> https://dzone.com/articles/cassandra-to-kafka-data-pipeline-part-1).
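> 
> To sketch what I mean: if I have the 3.x trigger interface right, the hook we
> would build on boils down to something like this (the actual forwarding is
> stubbed out; the class name and packaging are up to us):
> 
>   import java.util.Collection;
>   import java.util.Collections;
> 
>   import org.apache.cassandra.db.Mutation;
>   import org.apache.cassandra.db.partitions.Partition;
>   import org.apache.cassandra.triggers.ITrigger;
> 
>   // Skeleton of a write-mirroring trigger: ship every mutation on the table to
>   // an external sink (another cluster, kafka, ...) and leave the local write
>   // untouched. Attached per table with something like:
>   //   CREATE TRIGGER mirror ON my_keyspace.my_table USING 'MirrorTrigger';
>   public class MirrorTrigger implements ITrigger
>   {
>       @Override
>       public Collection<Mutation> augment(Partition update)
>       {
>           // update.metadata() identifies the keyspace/table and
>           // update.partitionKey() the affected partition; serializing the rows
>           // and pushing them to the target system is the hard part, omitted here.
>           forwardToMirror(update);
> 
>           // Return no extra mutations: the local write proceeds unchanged.
>           return Collections.emptyList();
>       }
> 
>       private void forwardToMirror(Partition update)
>       {
>           // TODO: async, failure-tolerant forwarding; this is where the
>           // hinted-handoff-style durability questions come in.
>       }
>   }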
> 
> And this starts to get into the complexities of hinted handoff as well. But
> fundamentally this seems like something that would be a very nice feature to
> have in the core of cassandra (especially when you NEED it).
> 
> Finally, is the mutation hook in triggers sufficient to track all incoming
> mutations (outside of, shudder, other triggers generating data)?



SSTable index format

2016-06-15 Thread Antonis Papaioannou

Hi,

I'm interested in the SSTable index file format, particularly in Cassandra 2.2,
which uses the SSTable version "ma".
Apart from the keys and their corresponding offsets in the data file, what else
is included in each index entry?


I'm trying to trace the code path when an SSTable is flushed (especially in
class BigTableWriter.java).
I see that each RowIndexEntry may contain a ColumnIndex, which in turn holds a
list of IndexHelper.IndexInfo entries.

So I would expect the index format to be something like this:


On the other hand, it seems that the ColumnIndex does not contain all the
columns of the data row.
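
My current reading of those classes, written out as a simplified stand-in (field
names approximate, not the real Cassandra types), is roughly the following; if I
understand it correctly, IndexInfo entries are emitted per index block of about
column_index_size_in_kb rather than per column, and small rows carry none at all:

  import java.util.List;

  // Simplified paraphrase of what I see in RowIndexEntry / IndexHelper.IndexInfo
  // (stand-in types, approximate field names):
  class IndexEntrySketch {
      long position;                      // offset of the partition in Data.db

      // The promoted column index only exists for rows whose serialized size
      // exceeds column_index_size_in_kb.
      static class Indexed extends IndexEntrySketch {
          long deletionTime;              // top-level tombstone info (simplified)
          List<IndexInfo> columnsIndex;   // one entry per index block, not per column
      }

      // Each IndexInfo covers roughly one column_index_size_in_kb block of the row:
      static class IndexInfo {
          byte[] firstName;               // first column name in the block
          byte[] lastName;                // last column name in the block
          long offset;                    // offset of the block within the row
          long width;                     // serialized size of the block
      }
  }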


Let me give you an example.
Assume the following schema for the column family used below:
usertable (y_id varchar primary key, field0 varchar, field1 varchar,
field2 varchar);


In this case, if I execute the queries below:
INSERT INTO ycsb.usertable (y_id, field0, field1, field2) VALUES ('k1', 
'f1a', 'f1b', 'f1c');

INSERT INTO ycsb.usertable (y_id, field0) VALUES ('k2', 'f2a');

and then flush the table, I would expect the index to have the following 
info:

k1, [field0, field1, field2], 
k2, [field0], 

Is this correct?
Is there a documentation page with the file format of the index file?