Re: [gentoo-user] Nameserver lookups fail on virtual server after Kernel upgrade from version 4.9 to 4.12

2017-08-21 Thread Mick
On Monday, 21 August 2017 11:35:44 BST Ralph Seichter wrote:
> On 21.08.2017 08:08, Neil Bothwick wrote:
> > Ah, so silentoldconfig is effectively the same as oldconfig.
> 
> Silentoldconfig is quite a useful make target, since it only asks about
> newly introduced kernel options.

... and it also takes account of dependencies without asking.  The difference 
between the two is that silentoldconfig displays nothing on the screen other 
than the new entries for which user input is required.

I prefer oldconfig because it helps me orientate myself down the kernel tree, 
by looking at modules preceding the new entries.
-- 
Regards,
Mick



Re: [gentoo-user] Nameserver lookups fail on virtual server after Kernel upgrade from version 4.9 to 4.12

2017-08-21 Thread Ralph Seichter
On 21.08.2017 08:08, Neil Bothwick wrote:

> Ah, so silentoldconfig is effectively the same as oldconfig.

Silentoldconfig is quite a useful make target, since it only asks about
newly introduced kernel options.

> Does your old kernel still work as before? Just wondering if another
> update could have caused this.

Both affected virtual servers run fine when I boot kernel 4.9.34 while
leaving everything else unchanged. I'm currently comparing .config files
for kernel versions 4.9.34 and 4.12.5 again, but nothing catches my eye.
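A comparison like this can be scripted. The following is only a minimal sketch, not necessarily how the comparison was done here; the function names normalize_config and config_diff are made up for illustration. It rewrites "# CONFIG_X is not set" lines as CONFIG_X=n so that an option flipping between "not set" and y/m shows up as a single changed line.

```shell
# Hypothetical helpers for comparing two kernel .config files.

# Print only option lines, with "# CONFIG_X is not set" rewritten as CONFIG_X=n.
normalize_config() {
    sed -e 's/^# \(CONFIG_[A-Za-z0-9_]*\) is not set$/\1=n/' \
        -e '/^CONFIG_/!d' "$1" | LC_ALL=C sort
}

# Diff two config files after normalization.
# Returns diff's exit status: 0 if identical, 1 if they differ.
config_diff() {
    old_tmp=$(mktemp) || return 2
    new_tmp=$(mktemp) || { rm -f "$old_tmp"; return 2; }
    normalize_config "$1" > "$old_tmp"
    normalize_config "$2" > "$new_tmp"
    diff "$old_tmp" "$new_tmp"
    rc=$?
    rm -f "$old_tmp" "$new_tmp"
    return "$rc"
}
```

It would be called as "config_diff old.config new.config", where the two file names are placeholders for copies of the 4.9.34 and 4.12.5 configurations. The kernel tree also ships scripts/diffconfig for the same purpose.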

As for the symptoms, I wonder: Outbound nameserver access is seizing up
completely, while I can still access the servers via SSH at the same
time. Could it be something that kills UDP after the servers are active
for a while, but leaves TCP alive?
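One way to test the UDP-versus-TCP idea would be to send the same query over both transports once the timeouts have started. This is a rough sketch under assumptions: probe_dns is a hypothetical helper name, the DIG variable exists only so that a different binary can be substituted, and the two-second timeout is arbitrary. It relies on dig's +tcp, +notcp, +ignore, +time and +tries options.

```shell
# Hypothetical helper: query one resolver over UDP and over TCP and
# report which transport still gets an answer.
probe_dns() {
    resolver=$1
    name=$2
    dig_cmd=${DIG:-dig}   # assumption: override hook, defaults to plain dig

    # +ignore keeps dig from retrying a truncated UDP answer over TCP.
    if "$dig_cmd" +notcp +ignore +time=2 +tries=1 "@$resolver" "$name" >/dev/null 2>&1; then
        udp=ok
    else
        udp=fail
    fi

    if "$dig_cmd" +tcp +time=2 +tries=1 "@$resolver" "$name" >/dev/null 2>&1; then
        tcp=ok
    else
        tcp=fail
    fi

    printf 'udp=%s tcp=%s\n' "$udp" "$tcp"
}
```

Called as "probe_dns RESOLVER_IP www.ibm.com", with RESOLVER_IP standing in for one of the hoster's resolvers. A result of "udp=fail tcp=ok" would point towards UDP specifically rather than DNS in general.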

-Ralph



Re: [gentoo-user] Nameserver lookups fail on virtual server after Kernel upgrade from version 4.9 to 4.12

2017-08-21 Thread Neil Bothwick
Ah, so silentoldconfig is effectively the same as oldconfig. I've not used it 
before and must have been confusing it with olddefconfig. Sorry for the noise. 

Does your old kernel still work as before? Just wondering if another update 
could have caused this. 



Re: [gentoo-user] Nameserver lookups fail on virtual server after Kernel upgrade from version 4.9 to 4.12

2017-08-20 Thread Ralph Seichter
On 20.08.2017 08:17, Neil Bothwick wrote:

> I'd try again with a clean kernel tree but using make oldconfig. It's
> possible the automagic stuff answered n somewhere where you need a y.

As https://wiki.gentoo.org/wiki/Kernel/Upgrade/en#make_silentoldconfig
describes, "make silentoldconfig" (which I used) asks for a decision for
all newly introduced kernel options.

Most default to "no" anyway, but I have painstakingly read each of the
new descriptions to figure out if I might need the options. I've done it
several times, and I still cannot figure out if I missed anything. Here
is a subset of the options I have configured, perhaps you can spot if
something is amiss? Of course, grep VIRT is not exactly the most precise
approach...

  ### Server 1 (high volume traffic)
  $ grep VIRT .config | sort
  CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
  CONFIG_BLK_MQ_VIRTIO=y
  # CONFIG_DEBUG_VIRTUAL is not set
  # CONFIG_DMA_VIRT_OPS is not set
  CONFIG_DMA_VIRTUAL_CHANNELS=y
  # CONFIG_DRM_VIRTIO_GPU is not set
  # CONFIG_FB_VIRTUAL is not set
  CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
  CONFIG_HW_RANDOM_VIRTIO=y
  CONFIG_PARAVIRT_CLOCK=y
  # CONFIG_PARAVIRT_DEBUG is not set
  CONFIG_PARAVIRT_SPINLOCKS=y
  # CONFIG_PARAVIRT_TIME_ACCOUNTING is not set
  CONFIG_PARAVIRT=y
  CONFIG_SCSI_VIRTIO=y
  # CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
  # CONFIG_VIRT_DRIVERS is not set
  CONFIG_VIRTIO_BALLOON=m
  # CONFIG_VIRTIO_BLK_SCSI is not set
  CONFIG_VIRTIO_BLK=y
  CONFIG_VIRTIO_CONSOLE=y
  # CONFIG_VIRTIO_INPUT is not set
  # CONFIG_VIRTIO_MMIO is not set
  CONFIG_VIRTIO_NET=y
  CONFIG_VIRTIO_PCI_LEGACY=y
  CONFIG_VIRTIO_PCI=y
  CONFIG_VIRTIO=y
  CONFIG_VIRT_TO_BUS=y
  # CONFIG_VIRTUALIZATION is not set

I have since updated a second virtual Gentoo server to Kernel 4.12. This
server sees a lot less network traffic, but after a couple of hours it
runs into the same timeouts when attempting to contact resolvers. Kernel
settings include:

  ### Server 2 (low volume traffic)
  $ grep VIRT .config | sort
  CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
  CONFIG_BLK_MQ_VIRTIO=y
  # CONFIG_DEBUG_VIRTUAL is not set
  # CONFIG_DMA_VIRT_OPS is not set
  CONFIG_DMA_VIRTUAL_CHANNELS=y
  # CONFIG_DRM_VIRTIO_GPU is not set
  # CONFIG_FB_VIRTUAL is not set
  CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
  CONFIG_HW_RANDOM_VIRTIO=y
  CONFIG_PARAVIRT_CLOCK=y
  # CONFIG_PARAVIRT_DEBUG is not set
  CONFIG_PARAVIRT_SPINLOCKS=y
  # CONFIG_PARAVIRT_TIME_ACCOUNTING is not set
  CONFIG_PARAVIRT=y
  CONFIG_SCSI_VIRTIO=y
  # CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
  # CONFIG_VIRT_DRIVERS is not set
  CONFIG_VIRTIO_BALLOON=m
  # CONFIG_VIRTIO_BLK_SCSI is not set
  CONFIG_VIRTIO_BLK=y
  CONFIG_VIRTIO_CONSOLE=y
  # CONFIG_VIRTIO_INPUT is not set
  # CONFIG_VIRTIO_MMIO is not set
  CONFIG_VIRTIO_NET=y
  CONFIG_VIRTIO_PCI_LEGACY=y
  CONFIG_VIRTIO_PCI=y
  CONFIG_VIRTIO=y
  CONFIG_VIRT_TO_BUS=y
  CONFIG_VIRTUALIZATION=y

As you can see, I used CONFIG_VIRTUALIZATION=y in this case, even though
I believe this only affects running as a VM host. I carried this option
over from the previous 4.9 kernel.

-Ralph



Re: [gentoo-user] Nameserver lookups fail on virtual server after Kernel upgrade from version 4.9 to 4.12

2017-08-20 Thread Neil Bothwick
I'd try again with a clean kernel tree but using make oldconfig. It's possible 
the automagic stuff answered n somewhere where you need a y. 



Re: [gentoo-user] Nameserver lookups fail on virtual server after Kernel upgrade from version 4.9 to 4.12

2017-08-19 Thread Ralph Seichter
On 20.08.17 00:15, Craig MacKinder wrote:

> It sounds like this vmxnet3 bug causing intermittent interface problems
> https://bugzilla.kernel.org/show_bug.cgi?id=191201
> Try adding this to the VMware guest advanced settings.
> vmxnet3.rev.30 = FALSE
> And restart the guests.

Interesting idea. However, the host server is running KVM, not VMware.
If I am not mistaken, the bug you mentioned is specific to VMware? I
also don't have the option to define VM host server settings beyond the
choice of network adapter type, as the host is run by a third party.

I have included support for both Intel E1000 and VirtIO networking in my
Linux kernels, with the hosting company recommending VirtIO. Choosing
one over the other does not seem to affect the nameserver timeout
problem.
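To confirm which driver is actually bound to the interface after switching adapter types, the sysfs driver link can be read. A small sketch, with assumptions: nic_driver is a hypothetical helper name, the interface name is whatever the guest uses (eth0 below is only an example), and the optional second argument exists only so that a different sysfs root can be passed in.

```shell
# Hypothetical helper: print the kernel driver bound to a network interface,
# e.g. "virtio_net" or "e1000". Returns 1 if no driver link is found.
nic_driver() {
    iface=$1
    sys_root=${2:-/sys}   # assumption: override hook, defaults to /sys
    link="$sys_root/class/net/$iface/device/driver"
    [ -L "$link" ] || return 1
    basename "$(readlink "$link")"
}
```

For example, "nic_driver eth0" should print virtio_net when the VirtIO adapter is in use.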

-Ralph




Re: [gentoo-user] Nameserver lookups fail on virtual server after Kernel upgrade from version 4.9 to 4.12

2017-08-19 Thread Craig MacKinder
It sounds like this vmxnet3 bug causing intermittent interface problems
https://bugzilla.kernel.org/show_bug.cgi?id=191201
Try adding this to the VMware guest advanced settings.
vmxnet3.rev.30 = FALSE
And restart the guests.

--
Craig




[gentoo-user] Nameserver lookups fail on virtual server after Kernel upgrade from version 4.9 to 4.12

2017-08-19 Thread Ralph Seichter
It seems strange to me as I write it, but since I updated one of my
virtual servers from Kernel version 4.9.34 to 4.12.5, the server (Gentoo
Linux running as a KVM guest) is experiencing timeouts when trying to
connect to DNS resolvers. For the Kernel update, I followed the same
steps I used for years, like

  cd /usr/src/linux
  zcat /proc/config.gz >.config
  make silentoldconfig   # answering "no" wherever possible
  make ...

After booting with Kernel 4.12, commands like "dig +trace www.ibm.com"
work just fine for a while, duration depending on server load, but after
some threshold is passed, all further attempts to contact resolvers fail
due to timeouts.
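To pin down when the lookups start failing, so that the moment can be matched against load or kernel log entries, a probe can be repeated until it first fails. This is just a sketch; wait_for_failure is a hypothetical helper name, and the probe command and interval are up to the caller.

```shell
# Hypothetical helper: run a probe command every INTERVAL seconds until it
# fails, then report how many runs succeeded and the UTC time of the failure.
# Usage: wait_for_failure INTERVAL COMMAND [ARGS...]
wait_for_failure() {
    interval=$1
    shift
    count=0
    while "$@" >/dev/null 2>&1; do
        count=$((count + 1))
        sleep "$interval"
    done
    printf 'probe failed after %d successful runs at %s\n' \
        "$count" "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}
```

For example, "wait_for_failure 30 dig +time=2 +tries=1 www.ibm.com" would query every 30 seconds and stop at the first timeout.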

I have tried running a local, caching resolver (BIND 9) on the server,
like I usually do, and also tried using the hoster's dedicated resolvers.
With Kernel 4.12, I see timeouts in both cases. These problems do not
occur when I boot with the 4.9 Kernel which I have been using for the
past two months.

It is also worth noting that I updated two other servers to Kernel 4.12
without any issues, but these are "real" servers, not VMs. At this point
I am searching for ways to debug the issue, vaguely suspecting some KVM
magic behind it (without any proof). I know that Kernel 4.11 introduced
several KVM related changes, but that's about it.

I appreciate all pointers.

-Ralph