Hello!

First of all, this is my first post to this user group.  If I'm in the 
wrong place please don't hesitate to point me in a different direction.

Since around mid-December I've been unable to complete a backup-push.
After running for an hour or so, the server stops responding to network
requests.  The only thing I can do is wait until the backup-push process
exits, at which point I can ssh back in to the server.

Once the server is back online, I find the following problems:

   1. dmesg repeats this error: *[1107575.808936] xen_netfront: xennet: skb 
   rides the rocket: 19 slots*
   2. WAL-E complains about HTTP 500 errors when pushing files to S3 (sorry,
   I don't have a copy of this error handy)
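
For anyone who wants to check their own logs for the same dmesg message, here is a minimal sketch in Python. The function names and the regular expression are my own illustration, based only on the single log line quoted above; other kernel versions may format the message differently.

```python
import re
import subprocess

# Matches lines such as:
# [1107575.808936] xen_netfront: xennet: skb rides the rocket: 19 slots
# (pattern derived from the one line quoted in this post; an assumption
# for any other kernel version)
ROCKET_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+xen_netfront: xennet: skb rides the rocket: (\d+) slots"
)


def parse_rocket_errors(dmesg_text):
    """Return a list of (timestamp_seconds, slots) for each matching line."""
    return [(float(ts), int(slots)) for ts, slots in ROCKET_RE.findall(dmesg_text)]


def read_dmesg():
    """Return the current kernel ring buffer as text (may require root)."""
    return subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    ).stdout
```

Calling `parse_rocket_errors(read_dmesg())` on the affected server should give a rough idea of how often the error fires and how many slots each oversized skb needed.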
   
My server is configured as follows (let me know if more info is helpful): 

   - amazon ec2 i2.4xlarge
   - ubuntu 14.04 lts
   - postgres 9.3
   - wal-e 7.3
   - database size is ~2.4TB

From what I've been able to find so far, there may be a bug in the xennet
driver that is causing the "rides the rocket" error; see here
<http://www.brendangregg.com/blog/2014-09-11/perf-kernel-line-tracing.html>
and here <https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1317811>.
I've tried turning off some of the features with ethtool, as suggested in
the links. That seems to have prevented the "rides the rocket" errors, but
backup-push still doesn't complete.
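
As a rough sketch of what such an ethtool change can look like, the Python below builds an `ethtool -K` command that switches offload features off. The interface name `eth0` and the feature list (`sg` for scatter-gather, plus `tso` and `gso`) are assumptions for illustration, not necessarily the exact settings used on this server, and the helper names are hypothetical.

```python
import subprocess


def build_offload_cmd(interface, features):
    """Build an `ethtool -K <interface> <feature> off ...` argument list."""
    cmd = ["ethtool", "-K", interface]
    for feature in features:
        cmd += [feature, "off"]
    return cmd


def disable_offloads(interface="eth0", features=("sg",), dry_run=True):
    """Return the command; run it too when dry_run is False (needs root)."""
    cmd = build_offload_cmd(interface, features)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd
```

With the default `dry_run=True` nothing is changed on the machine, so the command can be inspected before it is run for real. Settings applied this way do not persist across a reboot.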

I've since used an older backup-push to bring up another server for
testing, and it has the same problem.

Has anyone else seen this?  If so, were you able to resolve it?

Cheers,
Brian


