Hi Julien,
We have such a use case on some clusters. If you want to insert big batches at
a fast pace, the only viable solution is to generate SSTables on the Spark side
and stream them to C*. The last time we benchmarked such a job, we achieved
1.3 million partitions inserted per second on a 3-node C* test cluster, which
is impossible with regular inserts.
Best,
Romain
On Monday, 5 February 2018 at 03:54:09 UTC+1, kurt greaves
<[email protected]> wrote:
Would you know if there is evidence that inserting skinny rows in sorted order
(no batching) helps C*?
This won't have any effect as each insert will be handled separately by the
coordinator (or a different coordinator, even). Sorting is also very unlikely
to help even if you did batch.
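One way to see why client-side ordering is lost: Cassandra places each partition on the ring by hashing its partition key, so keys that are adjacent in sort order land on unrelated token ranges. Here is a minimal Python sketch of that effect. It assumes MD5 as a stand-in for the partitioner (the default is Murmur3Partitioner), a hypothetical 3-node ring with equal contiguous token ranges, and made-up `user-...` keys; it ignores replication and vnodes.

```python
import hashlib

NUM_NODES = 3            # hypothetical cluster size
TOKEN_SPACE = 2 ** 128   # MD5 produces a 128-bit value


def token(partition_key: str) -> int:
    """Stand-in for the partitioner: hash the key to a position on the ring."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def owner(partition_key: str) -> int:
    """Index of the node owning the key, assuming equal contiguous token ranges."""
    return token(partition_key) * NUM_NODES // TOKEN_SPACE


def owner_switches(keys) -> int:
    """Count how often two consecutive inserts go to different nodes."""
    owners = [owner(k) for k in keys]
    return sum(1 for a, b in zip(owners, owners[1:]) if a != b)


# Made-up keys, inserted in sorted order.
keys = sorted(f"user-{i:06d}" for i in range(10_000))
print("node switches:", owner_switches(keys), "out of", len(keys) - 1, "steps")
```

With a uniform hash, consecutive sorted keys change owner on most steps, so sending them in sorted order gives no locality on the server side.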
Also, in the case of wide rows, is there evidence that sorting clustering keys
within partition batches helps ease C*'s job?
No evidence, seems very unlikely.