Re: A mini postmortem on snapshot failures

2017-01-04 Thread Erb, Stephan
under heavy load. Thx From: John Sirois To: dev@aurora.apache.org Sent: Wednesday, October 5, 2016 11:04 AM Subject: Re: A mini postmortem on snapshot failures On Tue, Oct 4, 2016 at 11:55 AM, Joshua Cohen wrote: > Hi Zameer, >

Re: A mini postmortem on snapshot failures

2016-12-30 Thread meghdoot bhattacharya
Is there a ticket for this? In our large test cluster, we ran into this under heavy load. Thx From: John Sirois To: dev@aurora.apache.org Sent: Wednesday, October 5, 2016 11:04 AM Subject: Re: A mini postmortem on snapshot failures On Tue, Oct 4, 2016 at 11:55 AM, Joshua Cohen

Re: A mini postmortem on snapshot failures

2016-10-05 Thread John Sirois
On Tue, Oct 4, 2016 at 11:55 AM, Joshua Cohen wrote: > Hi Zameer, > > Thanks for this writeup! > > I think one other option to consider would be using a connection for > writing the snapshots that's not bound by the pool's maximum checkout time. > I'm not sure if this is feasible or not, but I wo

Re: A mini postmortem on snapshot failures

2016-10-04 Thread Erb, Stephan
Thanks for the pointers regarding the broken documentation. I will fix that. The configuration options have moved and are now described here http://aurora.apache.org/documentation/latest/operations/configuration/#replicated-log-configuration On 03/10/16 09:05, "meghdoot bhattacharya" wrote:

Re: A mini postmortem on snapshot failures

2016-10-04 Thread Erb, Stephan
An immediate failover seems rather drastic to me. However, I have no anecdotal evidence to back up this feeling or any other default config changes. Maybe Joshua can share what they are using so that we can adjust the default values accordingly? Other thoughts: • Have you tried this magic tric

Re: A mini postmortem on snapshot failures

2016-10-04 Thread Joshua Cohen
Hi Zameer, Thanks for this writeup! I think one other option to consider would be using a connection for writing the snapshots that's not bound by the pool's maximum checkout time. I'm not sure if this is feasible or not, but I worry that there's potentially no upper bound on raising the maximum

Re: A mini postmortem on snapshot failures

2016-10-03 Thread meghdoot bhattacharya
Zameer, thanks for sharing this. For folks who are looking to operate Aurora with HA this is very valuable. Operational insights from Aurora experts are always welcome. Not to hijack the conversation on the 3 questions you asked, I found in http://aurora.apache.org/documentation/latest/operations/s