On the subject of DSBulk, sstableloader is the tool of choice for this
scenario.
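For reference, a minimal invocation looks something like this (hosts and
paths are illustrative, and the snapshot files must first be arranged in a
<keyspace>/<table> directory layout):

    # stream the snapshot's SSTables to the nodes that own them in the target ring
    sstableloader -d 10.0.0.1,10.0.0.2,10.0.0.3 /backups/my_keyspace/my_table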
+1 to Sergio and I'm confirming that DSBulk is designed as a bulk loader
for CSV/JSON formats. Cheers!
> If I may just loop this back to the question at hand:
>
> I'm curious if there are any gotchas with using sstableloader to restore
> snapshots taken from 256-token nodes into a cluster with 32-token (or your
> preferred number of tokens) nodes (otherwise same # of nodes and same RF).
>
No, there are no gotchas: sstableloader streams data according to the
target cluster's token ownership, so the number of tokens on the source
nodes doesn't matter.
If I may just loop this back to the question at hand:
I'm curious if there are any gotchas with using sstableloader to restore
snapshots taken from 256-token nodes into a cluster with 32-token (or your
preferred number of tokens) nodes (otherwise same # of nodes and same RF).
Just a thought along those lines. If the memtable flush isn’t keeping up, you
might find that manifested in the I/O queue length and dirty page stats leading
into the time the OOM event took place. If you do see that, then you might
need to do some I/O tuning as well.
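A quick way to watch for that on Linux (a sketch; column names vary
slightly by sysstat version):

    # extended device stats; watch the queue length (avgqu-sz/aqu-sz) and %util
    iostat -x 5
    # dirty pages waiting for writeback
    grep -E '^(Dirty|Writeback):' /proc/meminfo
    # kernel writeback thresholds that may need tuning
    sysctl vm.dirty_ratio vm.dirty_background_ratio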
https://docs.datastax.com/en/dsbulk/doc/dsbulk/reference/dsbulkLoad.html
Just skimming through the docs, I only see examples of loading from CSV /
JSON. Maybe there is some other command or doc page that I am missing.
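For what it's worth, everything I see there is along these lines
(keyspace/table names illustrative), hence my question:

    # export a table to CSV
    dsbulk unload -h 10.0.0.1 -k my_keyspace -t my_table -url /tmp/export
    # load that CSV into the new cluster
    dsbulk load -h 10.0.1.1 -k my_keyspace -t my_table -url /tmp/export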
On Fri, Jan 24, 2020, 9:10 AM Nitan Kainth wrote:
> Dsbulk works the same as sstableloader.
Dsbulk works the same as sstableloader.
Regards,
Nitan
Cell: 510 449 9629
> On Jan 24, 2020, at 10:40 AM, Sergio wrote:
>
>
> I was wondering if that improvement for token allocation would work even with
> just one rack. It should but I am not sure.
>
> Does Dsbulk support migration cluster to cluster without CSV or JSON
> export?
Ah, I focused too much on the literal meaning of startup. If it's happening
JUST AFTER startup, it's probably getting flooded with hints from the other
hosts when it comes online.
If that's the case, it may simply be overrunning the memtable, or it
may be a deadlock like https://issues.apache
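If it is hint flooding, one thing that might help while the node settles
is slowing or pausing hint delivery (value illustrative):

    # throttle hint replay (KB/s, per delivery thread)
    nodetool sethintedhandoffthrottlekb 1024
    # or pause delivery entirely and resume once flushes catch up
    nodetool pausehandoff
    nodetool resumehandoff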
6 GB of mutations on heap
Startup would replay commitlog, which would re-materialize all of those
mutations and put them into the memtable. The memtable would flush over
time to disk, and clear the commitlog.
It looks like PERHAPS the commitlog replay is faster than the memtable
flush, so you're building up mutations on the heap faster than the flush
can drain them.
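If that's what's happening, the usual cassandra.yaml knobs look like this
(values purely illustrative, not recommendations):

    memtable_flush_writers: 4          # more parallel flush threads
    memtable_cleanup_threshold: 0.2    # trigger flushes earlier
    memtable_heap_space_in_mb: 2048    # cap memtable heap usage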
Why? Seems to me that the old Cassandra -> CSV/JSON and CSV/JSON -> new
Cassandra are unnecessary steps in my case.
On Fri, Jan 24, 2020 at 10:34 AM Nitan Kainth wrote:
> Instead of sstableloader consider dsbulk by datastax.
>
> On Fri, Jan 24, 2020 at 10:20 AM Reid Pinchback <rpinchb...@trip
I was wondering if that improvement for token allocation would work even
with just one rack. It should but I am not sure.
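For context, this is the 3.11 cassandra.yaml setting I mean (keyspace name
illustrative), set on each node before it joins:

    num_tokens: 32
    # optimize token allocation for this keyspace's replication settings
    allocate_tokens_for_keyspace: my_keyspace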
Does Dsbulk support migration cluster to cluster without CSV or JSON export?
Thanks and Regards
On Fri, Jan 24, 2020, 8:34 AM Nitan Kainth wrote:
> Instead of sstableloader consider dsbulk by datastax.
Some environment details like Cassandra version, amount of physical RAM,
JVM configs (heap and others), and any other non-default cassandra.yaml
configs would help. The amount of data and the number of keyspaces &
tables (since you mention "clients") would also be helpful for people to
suggest tuning options.
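For example, something like (paths vary by install; a sketch):

    nodetool version
    nodetool info                                        # heap usage, uptime, caches
    grep -E '^#?-Xm[sx]' /etc/cassandra/jvm.options      # heap settings, if set there
    nodetool tablestats | grep -E 'Keyspace|Space used'  # data volume per keyspace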
Instead of sstableloader consider dsbulk by datastax.
On Fri, Jan 24, 2020 at 10:20 AM Reid Pinchback
wrote:
> Jon Haddad has previously made the case for num_tokens=4. His Accelerate
> 2019 talk is available at:
>
> https://www.youtube.com/watch?v=swL7bCnolkU
>
> You might want to check that out.
Jon Haddad has previously made the case for num_tokens=4. His Accelerate 2019
talk is available at:
https://www.youtube.com/watch?v=swL7bCnolkU
You might want to check that out. Also I think the amount of effort you put
into evening out the token distribution increases as vnode count shrinks.
Running 3.11.x, 4 nodes RF=3, default 256 tokens; moving to a different 4
node RF=3 cluster.
I've read that 256 is not an optimal default num_tokens value, and that 32
is likely a better option.
We have the "opportunity" to switch, as we're migrating environments and
will likely be using sstableloader.
Hi Anthony!
I have a follow-up question:
> Check to make sure that no other node in the cluster is assigned any of the
> four tokens specified above. If there is another node in the cluster that
> is assigned one of the above tokens, increment the conflicting token by
> values of one until no other node in the cluster is assigned that token.
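In case it helps anyone following the same procedure, a quick conflict
check (a sketch; assumes the token is the last column of nodetool ring
output):

    # print any token value assigned more than once
    nodetool ring | awk '{print $NF}' | sort -n | uniq -d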
We recently had a lot of OOMs in C*, generally happening during startup.
We took some heap dumps but still cannot pinpoint the exact cause, so we
need some help from experts.
Our clients are not explicitly deleting data, but they have TTLs enabled.
C* details:
> show version
[cqlsh 5.