Re: "Failed to enable shuffling" error

Tim Heckman Mon, 08 Sep 2014 13:22:30 -0700

On Mon, Sep 8, 2014 at 11:19 AM, Robert Coli <rc...@eventbrite.com> wrote:
> On Mon, Sep 8, 2014 at 11:08 AM, Tim Heckman <t...@pagerduty.com> wrote:
>>
>> I'm looking to convert our recently upgraded Cassandra cluster from a
>> single token per node to using vnodes. We've determined, based on
>> our data consistency and usage patterns, that shuffling will be the
>> best way to convert our live cluster.
>
>
> You apparently haven't read anything else about shuffling, or you would have
> learned that no one has ever successfully done it in a real production
> cluster. ;)


I've definitely seen the horror stories that have come out of shuffle.
:) We plan on giving this a trial run on production-sized data before
actually doing it on our production hardware.

>>
>> Unfortunately, the underlying error is not printed so I'm effectively
>> troubleshooting in the dark.
>
>
> This mysterious error is protecting you from a probably quite negative
> experience with shuffle.

We're still at the exploratory stage on systems that are not
production-facing but contain production-like data. Based on our
placement strategy we have some concerns that the new datacenter
approach may be riskier or more difficult. We're just trying to gauge
both paths and see what works best for us.

>>
>> I've done some mailing list diving, as well as Google skimming, and
>> all the suggestions did not seem to work.
>
>
> What version of Cassandra are you running? I would not be surprised if
> shuffle is in fact completely broken in 2.0.x release, not only hazardous to
> attempt.
>
> Why do you believe that you want to shuffle and/or enable vnodes? How large
> is the cluster and how large is it likely to become?

We're still on the 1.2 line of Cassandra: 1.2.16 for the majority of
our clusters, with one cluster that was created after the 1.2.18
release.

The cluster I'm testing this on is a 5 node cluster with a placement
strategy such that all nodes contain 100% of the data. In practice we
have six clusters of similar size that are used for different
services. These different clusters may need additional capacity at
different times, so it's hard to answer the maximum size question. For
now let's just assume that the clusters may never see an 11th
member... but no guarantees.
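
To make the placement concrete, here is a minimal sketch, assuming a
simplified clockwise ring walk (not Cassandra's actual replication
code) and hypothetical node names and tokens. With a replication
factor equal to the node count, every token ends up on all five nodes:

```python
from bisect import bisect_left

def replicas_for(token, ring, rf):
    """Return the nodes holding `token`: start at the first node whose
    token is >= `token` and walk clockwise until `rf` distinct nodes
    (or all nodes, if fewer exist) have been collected."""
    tokens = sorted(ring)
    distinct_nodes = len(set(ring.values()))
    i = bisect_left(tokens, token) % len(tokens)  # wrap past the last token
    owners = []
    while len(owners) < min(rf, distinct_nodes):
        node = ring[tokens[i % len(tokens)]]
        if node not in owners:
            owners.append(node)
        i += 1
    return owners

# Hypothetical five-node ring with a single token per node.
ring = {0: "n1", 20: "n2", 40: "n3", 60: "n4", 80: "n5"}

# With rf == 5 every token maps to the full node set.
for t in (5, 25, 85):
    print(t, sorted(replicas_for(t, ring, rf=5)))
```

With a lower replication factor the same walk returns only a subset of
the nodes, which is the case where token movement changes which nodes
must hold a given row.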

We're looking to use vnodes to ease the administrative work of
scaling out the cluster, along with other improvements such as
streaming data during repairs.

Shuffle looks like it may be easier than adding a new datacenter and
then having to adjust the schema for the new "datacenter" to come to
life. We also weren't sure whether the same pitfalls of shuffle would
affect us, given that all data is on all nodes.

> =Rob
>

Thanks for the quick reply, Rob.

-Tim
