Re: Xen virtual network (Netfront) driver

2016-01-25 Thread Jonathon Sisson
On Mon, Jan 25, 2016 at 12:04:03PM +0100, Mike Belopuhov wrote:
> On 25 January 2016 at 01:38, Jonathon Sisson <open...@j3z.org> wrote:
> > Not certain if this is debug output put there intentionally or
> 
> Yes.
> 
> > if this is some error condition?
> 
> It is.  Transmission is stuck and these are watchdog timeouts.
> Did it get a lease on c4.8xlarge?  So far it looks like it happens
> on m4.10xlarge instance only.  No idea why.
> 
Here's the console output:

user@host:~$ grep 'tx prod' dmesg/*
dmesg/c4.8xlarge_dmesg.txt:xnf0: tx prod 2 cons 2,0 evt 3,1
dmesg/c4.8xlarge_dmesg.txt:xnf0: tx prod 3 cons 3,0 evt 4,1
dmesg/c4.8xlarge_dmesg.txt:xnf0: tx prod 4 cons 4,0 evt 5,1
dmesg/c4.8xlarge_dmesg.txt:xnf0: tx prod 5 cons 5,0 evt 6,1
dmesg/c4.8xlarge_dmesg.txt:xnf0: tx prod 6 cons 6,0 evt 7,1
dmesg/c4.8xlarge_dmesg.txt:xnf0: tx prod 7 cons 7,0 evt 8,1
dmesg/d2.8xlarge_dmesg.txt:xnf0: tx prod 2 cons 2,0 evt 3,1
dmesg/d2.8xlarge_dmesg.txt:xnf0: tx prod 3 cons 3,0 evt 4,1
dmesg/d2.8xlarge_dmesg.txt:xnf0: tx prod 4 cons 4,0 evt 5,1
dmesg/d2.8xlarge_dmesg.txt:xnf0: tx prod 5 cons 5,0 evt 6,1
dmesg/d2.8xlarge_dmesg.txt:xnf0: tx prod 6 cons 6,0 evt 7,1
dmesg/g2.8xlarge_dmesg.txt:xnf0: tx prod 5 cons 5,0 evt 6,1
dmesg/g2.8xlarge_dmesg.txt:xnf0: tx prod 6 cons 6,0 evt 7,1
dmesg/i2.8xlarge_dmesg.txt:xnf0: tx prod 3 cons 3,0 evt 4,1
dmesg/i2.8xlarge_dmesg.txt:xnf0: tx prod 4 cons 4,0 evt 5,1
dmesg/i2.8xlarge_dmesg.txt:xnf0: tx prod 5 cons 5,0 evt 6,1
dmesg/i2.8xlarge_dmesg.txt:xnf0: tx prod 6 cons 6,0 evt 7,1
dmesg/i2.8xlarge_dmesg.txt:xnf0: tx prod 7 cons 7,0 evt 8,1
dmesg/m4.10xlarge_dmesg.txt:xnf0: tx prod 2 cons 2,0 evt 3,1
dmesg/m4.10xlarge_dmesg.txt:xnf0: tx prod 3 cons 3,0 evt 4,1
dmesg/m4.10xlarge_dmesg.txt:xnf0: tx prod 4 cons 4,0 evt 5,1
dmesg/m4.10xlarge_dmesg.txt:xnf0: tx prod 5 cons 5,0 evt 6,1
dmesg/m4.10xlarge_dmesg.txt:xnf0: tx prod 6 cons 6,0 evt 7,1
dmesg/r3.8xlarge_dmesg.txt:xnf0: tx prod 5 cons 5,0 evt 6,1
dmesg/r3.8xlarge_dmesg.txt:xnf0: tx prod 6 cons 6,0 evt 7,1
dmesg/r3.8xlarge_dmesg.txt:xnf0: tx prod 7 cons 7,0 evt 8,1
dmesg/r3.8xlarge_dmesg.txt:xnf0: tx prod 8 cons 8,0 evt 9,1

It happens on c4.8x, d2.8x, g2.8x, i2.8x, m4.10x, r3.8x.
Basically all of the largest instance sizes...on newer
gen instance types that support enhanced networking?

I can re-test these and see if it occurs frequently or
if it was just a fluke.  I'll update in a bit.
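
For anyone repeating the comparison, here is a minimal sketch that tallies the
'tx prod' lines per instance type from grep output like the above.  It assumes
the dmesg/<type>_dmesg.txt file naming used here; the function name and the
regex are illustrative only, not anything from the driver.

```python
import re
from collections import defaultdict

# Matches grep output lines such as:
#   dmesg/c4.8xlarge_dmesg.txt:xnf0: tx prod 2 cons 2,0 evt 3,1
# The file naming (dmesg/<type>_dmesg.txt) is an assumption taken from
# the listing in this message.
LINE_RE = re.compile(
    r"^dmesg/(?P<itype>[^_/]+)_dmesg\.txt:"
    r"xnf\d+: tx prod \d+ cons \d+,\d+ evt \d+,\d+$"
)

def summarize(grep_output):
    """Return {instance_type: number of 'tx prod' lines seen}."""
    counts = defaultdict(int)
    for line in grep_output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("itype")] += 1
    return dict(counts)

sample = """\
dmesg/c4.8xlarge_dmesg.txt:xnf0: tx prod 2 cons 2,0 evt 3,1
dmesg/c4.8xlarge_dmesg.txt:xnf0: tx prod 3 cons 3,0 evt 4,1
dmesg/m4.10xlarge_dmesg.txt:xnf0: tx prod 2 cons 2,0 evt 3,1
"""

for itype, count in sorted(summarize(sample).items()):
    print(f"{itype}: {count} 'tx prod' line(s)")
```

Instance types that never appear in the result logged no such lines in the
dmesg that was collected.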




Re: Xen virtual network (Netfront) driver

2016-01-24 Thread Jonathon Sisson
On Sun, Jan 24, 2016 at 02:16:37PM +0100, Mike Belopuhov wrote:
> Hi Jonathon,
> 
> Thanks a lot for taking your time to test this.
>
No, thank you guys for all of the work you're doing to get
this working.  I'm just a user heh.
 
> 
> Trying newer kernels would be the most helpful. I've just enabled tcp/udp
> checksum offloading in the xnf on Friday and would welcome any network
> tests.
>
I rebuilt with a source checkout earlier today, and after
rebooting to the new kernel I can't seem to get a dhcp lease.
I'm working on building userland to determine if there is
some issue with dhclient, but I haven't finished that step
yet.  Has anyone else noted the dhcp issue? 
 



Re: Xen virtual network (Netfront) driver

2016-01-24 Thread Jonathon Sisson
On Sun, Jan 24, 2016 at 09:08:32PM +0100, Mike Belopuhov wrote:
> On 24 January 2016 at 20:55, Jonathon Sisson <open...@j3z.org> wrote:
> > On Sun, Jan 24, 2016 at 02:16:37PM +0100, Mike Belopuhov wrote:
> >> Hi Jonathon,
> >>
> >> Thanks a lot for taking your time to test this.
> >>
> > No, thank you guys for all of the work you're doing to get
> > this working.  I'm just a user heh.
> >
> >>
> >> Trying newer kernels would be the most helpful. I've just enabled tcp/udp
> >> checksum offloading in the xnf on Friday and would welcome any network
> >> tests.
> >>
> > I rebuilt with a source checkout earlier today, and after
> > rebooting to the new kernel I can't seem to get a dhcp lease.
> > I'm working on building userland to determine if there is
> > some issue with dhclient, but I haven't finished that step
> > yet.  Has anyone else noted the dhcp issue?
> >
> 
> I haven't seen that on my test box (not AWS), but maybe reverting
> the minimum number of rx slots back to 32 can help?
> 
> http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pv/if_xnf.c.diff?r1=1.9&r2=1.10
> 
Reverting to 32 fixed the dhcp issue.

I'll go ahead and get those dmesgs for you now =)

Thanks again!



Re: Xen virtual network (Netfront) driver

2016-01-23 Thread Jonathon Sisson
On Sat, Jan 23, 2016 at 12:19:29PM +0100, Reyk Floeter wrote:
> No, you have to *enable* SR-IOV in the image.
> 
> Machines with the Intel NIC will not show any netfront in the device list via 
> XenStore (just try Ubuntu).
> 
> Reyk

That's correct, but I think what was being pointed out is that
an instance with SR-IOV enabled cannot have it *disabled* (i.e.
to switch back to xnf NICs).  I was able to get xnf operational
on a c3.large (enhanced networking-capable) by creating an instance
with CentOS and swapping the root volume out.  Any AMI constructed
on Amazon Linux or Ubuntu will have enhanced networking enabled
by default, whereas CentOS doesn't appear to have it enabled (unless
you manually enable it).
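
To see which state a given instance is in, something like the sketch below
should work.  It assumes the third-party boto3 package and configured AWS
credentials; the helper names are made up for illustration.  It asks EC2 for
the sriovNetSupport instance attribute, which reads "simple" when enhanced
networking is enabled and comes back empty otherwise.

```python
def sriov_enabled(attr_response):
    """Interpret a DescribeInstanceAttribute response for sriovNetSupport.

    The attribute value is "simple" when enhanced networking is enabled;
    an empty or missing entry is treated as not enabled.
    """
    return attr_response.get("SriovNetSupport", {}).get("Value") == "simple"

def check_instance(instance_id, region):
    """Query EC2 for the attribute (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the helper above stands alone

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_instance_attribute(
        InstanceId=instance_id,
        Attribute="sriovNetSupport",
    )
    return sriov_enabled(response)

# Example responses, shaped like the EC2 API output:
print(sriov_enabled({"SriovNetSupport": {"Value": "simple"}}))  # enabled
print(sriov_enabled({"SriovNetSupport": {}}))                   # not enabled
```

check_instance() takes the instance ID and region as arguments; an instance
that reports the attribute as unset is the kind that would still come up
with xnf.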



Re: Xen virtual network (Netfront) driver

2016-01-23 Thread Jonathon Sisson
On Sat, Jan 23, 2016 at 10:57:21PM +0100, Reyk Floeter wrote:
> 
> > On 23.01.2016, at 22:27, Jonathon Sisson <open...@j3z.org> wrote:
> > 
> > On Sat, Jan 23, 2016 at 12:19:29PM +0100, Reyk Floeter wrote:
> >> No, you have to *enable* SR-IOV in the image.
> >> 
> >> Machines with the Intel NIC will not show any netfront in the device list 
> >> via XenStore (just try Ubuntu).
> >> 
> >> Reyk
> > 
> > That's correct, but I think what was being pointed out is that
> > an instance with SR-IOV enabled cannot have it *disabled* (i.e.
> > to switch back to xnf NICs).  I was able to get xnf operational
> > on a c3.large (enhanced networking-capable) by creating an instance
> > with CentOS and swapping the root volume out.  Any AMI constructed
> > on Amazon Linux or Ubuntu will have enhanced networking enabled
> > by default, whereas CentOS doesn't appear to have it enabled (unless
> > you manually enable it).
> 
> Ah, OK.
> 
> I recommend uploading new images or using my public OpenBSD
> images to bootstrap new AMIs.
> 
> The "dd from Linux" trick is just a hack if you don't want to install the
> aws and ec2 cli tools - but we have ports now.
> 
> Reyk
> 
Fair enough =)

I wasn't certain if the experimental images were considered ready
for testing.  I'll switch to using them for any other testing I do.

Speaking of testing, is there any particular area non-devs could
assist with at this time?  Gathering dmesgs for different instance
types?



Re: Xen virtual network (Netfront) driver

2016-01-23 Thread Jonathon Sisson
On Sat, Jan 23, 2016 at 02:18:17PM -0800, Jonathon Sisson wrote:
> Speaking of testing, is there any particular area non-devs could
> assist with at this time?  Gathering dmesgs for different instance
> types?
> 
I decided to spin up one of each instance type and grab the console
output in case it would be beneficial to the ongoing work:

http://update.j3z.org/dmesg/c3.2xlarge_dmesg.txt
http://update.j3z.org/dmesg/c3.4xlarge_dmesg.txt
http://update.j3z.org/dmesg/c3.8xlarge_dmesg.txt
http://update.j3z.org/dmesg/c3.large_dmesg.txt
http://update.j3z.org/dmesg/c3.xlarge_dmesg.txt
http://update.j3z.org/dmesg/c4.2xlarge_dmesg.txt
http://update.j3z.org/dmesg/c4.4xlarge_dmesg.txt
http://update.j3z.org/dmesg/c4.8xlarge_dmesg.txt
http://update.j3z.org/dmesg/c4.large_dmesg.txt
http://update.j3z.org/dmesg/c4.xlarge_dmesg.txt
http://update.j3z.org/dmesg/d2.2xlarge_dmesg.txt
http://update.j3z.org/dmesg/d2.4xlarge_dmesg.txt
http://update.j3z.org/dmesg/d2.8xlarge_dmesg.txt
http://update.j3z.org/dmesg/d2.xlarge_dmesg.txt
http://update.j3z.org/dmesg/g2.2xlarge_dmesg.txt
http://update.j3z.org/dmesg/g2.8xlarge_dmesg.txt
http://update.j3z.org/dmesg/i2.2xlarge_dmesg.txt
http://update.j3z.org/dmesg/i2.4xlarge_dmesg.txt
http://update.j3z.org/dmesg/i2.8xlarge_dmesg.txt
http://update.j3z.org/dmesg/i2.xlarge_dmesg.txt
http://update.j3z.org/dmesg/m3.2xlarge_dmesg.txt
http://update.j3z.org/dmesg/m3.large_dmesg.txt
http://update.j3z.org/dmesg/m3.medium_dmesg.txt
http://update.j3z.org/dmesg/m3.xlarge_dmesg.txt
http://update.j3z.org/dmesg/m4.10xlarge_dmesg.txt
http://update.j3z.org/dmesg/m4.2xlarge_dmesg.txt
http://update.j3z.org/dmesg/m4.4xlarge_dmesg.txt
http://update.j3z.org/dmesg/m4.large_dmesg.txt
http://update.j3z.org/dmesg/m4.xlarge_dmesg.txt
http://update.j3z.org/dmesg/r3.2xlarge_dmesg.txt
http://update.j3z.org/dmesg/r3.4xlarge_dmesg.txt
http://update.j3z.org/dmesg/r3.8xlarge_dmesg.txt
http://update.j3z.org/dmesg/r3.large_dmesg.txt
http://update.j3z.org/dmesg/r3.xlarge_dmesg.txt
http://update.j3z.org/dmesg/t2.large_dmesg.txt
http://update.j3z.org/dmesg/t2.medium_dmesg.txt
http://update.j3z.org/dmesg/t2.micro_dmesg.txt
http://update.j3z.org/dmesg/t2.nano_dmesg.txt
http://update.j3z.org/dmesg/t2.small_dmesg.txt

If it is deemed helpful, I can keep them updated as
new AMIs come out.
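
In case it saves anyone some clicking, here is a rough sketch for pulling one
of these files and picking out the xnf 'tx prod' debug lines.  It assumes the
URL layout shown in the list above and that the files are still being served;
the function names are illustrative only.

```python
import urllib.request

# Base URL taken from the list above; availability is not guaranteed.
BASE_URL = "http://update.j3z.org/dmesg/"

def dmesg_url(instance_type):
    """Build the dmesg URL for an instance type, e.g. 'm4.10xlarge'."""
    return f"{BASE_URL}{instance_type}_dmesg.txt"

def watchdog_lines(text):
    """Return the lines of a dmesg that contain the 'tx prod' output."""
    return [line for line in text.splitlines() if "tx prod" in line]

def fetch_watchdog_lines(instance_type, timeout=10):
    """Download one dmesg and return its 'tx prod' lines (needs network)."""
    with urllib.request.urlopen(dmesg_url(instance_type), timeout=timeout) as resp:
        text = resp.read().decode("utf-8", errors="replace")
    return watchdog_lines(text)
```

Calling fetch_watchdog_lines("m4.10xlarge") would download that one file and
return only the matching lines; looping over the instance types listed above
covers the whole set.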

Thanks!

-Jonathon