Re: Stability connection problems in ath9k kernel 4.7

2016-09-09 Thread Valerio Passini
On Friday, 9 September 2016 10:38:32 CEST Joerg Roedel wrote:
> Hi Valerio,
> 
> On Thu, Sep 08, 2016 at 09:07:56PM +0200, Valerio Passini wrote:
> > I hope I have done it right, and I can try your first suggestion, but I
> > really cannot solve this problem by myself: sorry, I have no ability to
> > program in any computer language, known or unknown. I can certainly
> > test any patches you want and report the results, but that is the best
> > I can do. Best regards
> 
> Can you please send me the full dmesg after boot? The Intel-IOMMU is not
> enabled by default, so I want to check if it is either enabled by
> kernel-config or kernel command-line.
> 
> Thanks,
> 
>   Joerg

Hi Joerg,

It is enabled by kernel config. Indeed, disabling that option makes the
problem disappear.

Valerio

Full dmesg, as you requested:
[0.00] microcode: microcode updated early to revision 0x20, date = 2016-03-16
[0.00] Linux version 4.7.3 (valerio@Automatix) (gcc version 6.2.0 20160901 (Debian 6.2.0-3) ) #1 SMP PREEMPT Fri Sep 9 16:15:07 CEST 2016
[0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-4.7.3 root=UUID=917286ea-e4a0-4a3e-9307-102636ba2a20 ro quiet acpi_osi=
[0.00] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[0.00] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[0.00] x86/fpu: Using 'eager' FPU context switches.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x00057fff] usable
[0.00] BIOS-e820: [mem 0x00058000-0x00058fff] reserved
[0.00] BIOS-e820: [mem 0x00059000-0x0009dfff] usable
[0.00] BIOS-e820: [mem 0x0009e000-0x0009] reserved
[0.00] BIOS-e820: [mem 0x0010-0xb5df7fff] usable
[0.00] BIOS-e820: [mem 0xb5df8000-0xb5dfefff] ACPI NVS
[0.00] BIOS-e820: [mem 0xb5dff000-0xb6719fff] usable
[0.00] BIOS-e820: [mem 0xb671a000-0xb69b7fff] reserved
[0.00] BIOS-e820: [mem 0xb69b8000-0xc6045fff] usable
[0.00] BIOS-e820: [mem 0xc6046000-0xc624] reserved
[0.00] BIOS-e820: [mem 0xc625-0xc6402fff] usable
[0.00] BIOS-e820: [mem 0xc6403000-0xc6b08fff] ACPI NVS
[0.00] BIOS-e820: [mem 0xc6b09000-0xc6f59fff] reserved
[0.00] BIOS-e820: [mem 0xc6f5a000-0xc6ffefff] type 20
[0.00] BIOS-e820: [mem 0xc6fff000-0xc6ff] usable
[0.00] BIOS-e820: [mem 0xc7c0-0xcfdf] reserved
[0.00] BIOS-e820: [mem 0xf800-0xfbff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xfed0-0xfed03fff] reserved
[0.00] BIOS-e820: [mem 0xfed1c000-0xfed1] reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee00fff] reserved
[0.00] BIOS-e820: [mem 0xff00-0x] reserved
[0.00] BIOS-e820: [mem 0x0001-0x00022f1f] usable
[0.00] NX (Execute Disable) protection: active
[0.00] efi: EFI v2.31 by American Megatrends
[0.00] efi:  ESRT=0xc6f58798  ACPI 2.0=0xc648b000  ACPI=0xc648b000  SMBIOS=0xc6f58398
[0.00] esrt: Reserving ESRT space from 0xc6f58798 to 0xc6f587d0.
[0.00] SMBIOS 2.7 present.
[0.00] DMI: ASUSTeK COMPUTER INC. N551JW/N551JW, BIOS N551JW.207 08/03/2015
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x22f200 max_arch_pfn = 0x4
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-E uncachable
[0.00]   F-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00]   0 base 00 mask 7E write-back
[0.00]   1 base 02 mask 7FE000 write-back
[0.00]   2 base 022000 mask 7FF000 write-back
[0.00]   3 base 00E000 mask 7FE000 uncachable
[0.00]   4 base 00D000 mask 7FF000 uncachable
[0.00]   5 base 00C800 mask 7FF800 uncachable
[0.00]   6 base 00C7C0 mask 7FFFC0 uncachable
[0.00]   7 base 022F80 mask 7FFF80 uncachable
[0.00]   8 base 022F40 mask 7FFFC0 uncachable
[0.00]   9 base 022F20 

Re: Stability connection problems in ath9k kernel 4.7

2016-09-09 Thread Joerg Roedel
Hi Valerio,

On Thu, Sep 08, 2016 at 09:07:56PM +0200, Valerio Passini wrote:
> I hope I have done it right, and I can try your first suggestion, but I
> really cannot solve this problem by myself: sorry, I have no ability to
> program in any computer language, known or unknown. I can certainly test
> any patches you want and report the results, but that is the best I can do.
> Best regards

Can you please send me the full dmesg after boot? The Intel-IOMMU is not
enabled by default, so I want to check if it is either enabled by
kernel-config or kernel command-line.

Thanks,

Joerg
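
One way to answer the question above without reading the whole dmesg is to look at the kernel build configuration and the boot command line directly. The following is only a sketch: the function name `iommu_status` is made up for illustration, it assumes the config file is available as plain text, and it only looks at `CONFIG_INTEL_IOMMU_DEFAULT_ON` and the `intel_iommu=` boot option, not at every setting that can affect the IOMMU.

```shell
#!/bin/sh
# Hypothetical helper: report how the Intel IOMMU ends up enabled or disabled.
#   $1: kernel config file
#   $2: file holding the kernel command line
iommu_status() {
    config_file=$1
    cmdline_file=$2

    # An explicit intel_iommu= option on the command line overrides
    # the build-time default, so check it first.
    if grep -Eq '(^|[[:space:]])intel_iommu=on([[:space:],]|$)' "$cmdline_file"; then
        echo "enabled by command line"
    elif grep -Eq '(^|[[:space:]])intel_iommu=off([[:space:],]|$)' "$cmdline_file"; then
        echo "disabled by command line"
    elif grep -q '^CONFIG_INTEL_IOMMU_DEFAULT_ON=y' "$config_file"; then
        echo "enabled by kernel config"
    else
        echo "not enabled by default"
    fi
}
```

On a running system this could be called as `iommu_status "/boot/config-$(uname -r)" /proc/cmdline`, assuming the config file was installed under that name, which depends on how the kernel was installed.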




Re: Stability connection problems in ath9k kernel 4.7

2016-09-08 Thread Valerio Passini
I hope I have done it right, and I can try your first suggestion, but I
really cannot solve this problem by myself: sorry, I have no ability to
program in any computer language, known or unknown. I can certainly
test any patches you want and report the results, but that is the best I
can do. Best regards

Valerio

On 08 Sep 2016 19:25, "Kalle Valo" wrote:

> Valerio Passini  writes:
>
> > On Wednesday, 7 September 2016 11:32:24 CEST Kalle Valo wrote:
> >> Valerio Passini  writes:
> >> > I have seen connection problems with ath9k since the 4.7 release that
> >> > make the wifi pretty much useless. I think it might be something in the
> >> > power management, because the signal seems really low. Previously, up to
> >> > kernel 4.6.7, everything worked very well.
> >> >
> >> > This is a sample of dmesg in kernel 4.7.2:
> >> > [  239.898935] wlp4s0: authenticate with XX:XX:XX:XX:XX:XX
> >> >
> >> > [  239.919995] wlp4s0: send auth to XX:XX:XX:XX:XX:XX  (try 1/3)
> >> > [  239.931877] wlp4s0: authenticated
> >> > [  239.932357] wlp4s0: associate with XX:XX:XX:XX:XX:XX  (try 1/3)
> >> > [  239.942171] wlp4s0: RX AssocResp from XX:XX:XX:XX:XX:XX  (capab=0x431
> >> > status=0 aid=2)
> >> > [  239.942301] wlp4s0: associated
> >> > [  244.802853] ath: phy0: DMA failed to stop in 10 ms AR_CR=0x0024
> >> > AR_DIAG_SW=0x0220 DMADBG_7=0x
> >> > 6100
> >> > [  245.931832] wlp4s0: authenticate with XX:XX:XX:XX:XX:XX
> >> > [  245.953028] wlp4s0: send auth to XX:XX:XX:XX:XX:XX  (try 1/3)
> >> > [  245.958702] wlp4s0: authenticated
> >> > [  245.960386] wlp4s0: associate with XX:XX:XX:XX:XX:XX  (try 1/3)
> >> > [  245.980543] wlp4s0: RX AssocResp from XX:XX:XX:XX:XX:XX  (capab=0x431
> >> > status=0 aid=2)
> >> >
> >> > lspci on 4.6.7 kernel:
> >> > 04:00.0 Network controller: Qualcomm Atheros AR9485 Wireless Network
> >> > Adapter (rev 01)
> >> >
> >> > Subsystem: AzureWave AR9485 Wireless Network Adapter
> >> > Flags: bus master, fast devsel, latency 0, IRQ 18
> >> > Memory at f790 (64-bit, non-prefetchable) [size=512K]
> >> > Expansion ROM at f798 [disabled] [size=64K]
> >> > Capabilities: [40] Power Management version 2
> >> > Capabilities: [50] MSI: Enable- Count=1/4 Maskable+ 64bit+
> >> > Capabilities: [70] Express Endpoint, MSI 00
> >> > Capabilities: [100] Advanced Error Reporting
> >> > Capabilities: [140] Virtual Channel
> >> > Capabilities: [160] Device Serial Number 00-00-00-00-00-00-00-00
> >> > Kernel driver in use: ath9k
> >> > Kernel modules: ath9k
> >> >
> >> > Probably you need some debugging output, but before recompiling the kernel
> >> > I would like to know if you are interested in any kind of help from me
> >> > and what steps I should take (I'm able to help in testing patches but I'm
> >> > not familiar with git). Thank you
> >>
> >> Usually it's really helpful if you can find the commit id which broke
> >> it. 'git bisect' is a great tool to do that and this seems to be a nice
> >> tutorial how to use it:
> >>
> >> http://webchick.net/node/99
> >>
> >> Instead of commit ids you can use release tags like v4.6 and v4.7 to
> >> make it easier to start the bisect. Just make sure that v4.7 is really
> >> broken and v4.6 works before you start the bisection.
> >
> > Hi Kalle,
> >
> > I tried to understand the whole procedure related to git and git bisect, and
> > this is the first time I have tried it, so I may have made some mistakes. In
> > the git log you'll find the commit that could be responsible for the
> > behaviour I reported yesterday. Anyhow, the resulting commit doesn't make
> > any sense to me.
>
> So your bisect found this as the bad commit:
>
> commit 9257b4a206fc0229dd5f84b78e4d1ebf3f91d270
> Author: Omer Peleg 
> Date:   Wed Apr 20 11:34:11 2016 +0300
>
> iommu/iova: introduce per-cpu caching to iova allocation
>
> The ath9k log you provided has a DMA warning, and iommu problems can
> cause DMA problems, but I cannot draw any conclusions yet. To confirm
> that this commit really is the problem you could try to revert it with
> 'git revert -n 9257b4a206fc0229dd5f84b78e4d1ebf3f91d270'. For some
> reason I got conflicts but if you are good enough with C you could try
> to fix those yourself. Another option is that you disable iommu and see
> if that helps.
>
> I'm adding more people and mailing lists related to this commit,
> hopefully they have better ideas.
>
> This is Valerio's bisect log:
>
> git bisect start
> # good: [2dcd0af568b0cf583645c8a317dd12e344b1c72a] Linux 4.6
> git bisect good 2dcd0af568b0cf583645c8a317dd12e344b1c72a
> # bad: [523d939ef98fd712632d93a5a2b588e477a7565e] Linux 4.7
> git bisect bad 523d939ef98fd712632d93a5a2b588e477a7565e
> # good: [0694f0c9e20c47063e4237e5f6649ae5ce5a369a] radix tree test suite: remove dependencies on height
> git 

Re: Stability connection problems in ath9k kernel 4.7

2016-09-08 Thread Kalle Valo
Valerio Passini  writes:

> On Wednesday, 7 September 2016 11:32:24 CEST Kalle Valo wrote:
>> Valerio Passini  writes:
>> > I have seen connection problems with ath9k since the 4.7 release that
>> > make the wifi pretty much useless. I think it might be something in the
>> > power management, because the signal seems really low. Previously, up to
>> > kernel 4.6.7, everything worked very well.
>> > 
>> > This is a sample of dmesg in kernel 4.7.2:
>> > [  239.898935] wlp4s0: authenticate with XX:XX:XX:XX:XX:XX
>> > 
>> > [  239.919995] wlp4s0: send auth to XX:XX:XX:XX:XX:XX  (try 1/3)
>> > [  239.931877] wlp4s0: authenticated
>> > [  239.932357] wlp4s0: associate with XX:XX:XX:XX:XX:XX  (try 1/3)
>> > [  239.942171] wlp4s0: RX AssocResp from XX:XX:XX:XX:XX:XX  (capab=0x431
>> > status=0 aid=2)
>> > [  239.942301] wlp4s0: associated
>> > [  244.802853] ath: phy0: DMA failed to stop in 10 ms AR_CR=0x0024
>> > AR_DIAG_SW=0x0220 DMADBG_7=0x
>> > 6100
>> > [  245.931832] wlp4s0: authenticate with XX:XX:XX:XX:XX:XX
>> > [  245.953028] wlp4s0: send auth to XX:XX:XX:XX:XX:XX  (try 1/3)
>> > [  245.958702] wlp4s0: authenticated
>> > [  245.960386] wlp4s0: associate with XX:XX:XX:XX:XX:XX  (try 1/3)
>> > [  245.980543] wlp4s0: RX AssocResp from XX:XX:XX:XX:XX:XX  (capab=0x431
>> > status=0 aid=2)
>> > 
>> > lspci on 4.6.7 kernel:
>> > 04:00.0 Network controller: Qualcomm Atheros AR9485 Wireless Network
>> > Adapter (rev 01)
>> > 
>> > Subsystem: AzureWave AR9485 Wireless Network Adapter
>> > Flags: bus master, fast devsel, latency 0, IRQ 18
>> > Memory at f790 (64-bit, non-prefetchable) [size=512K]
>> > Expansion ROM at f798 [disabled] [size=64K]
>> > Capabilities: [40] Power Management version 2
>> > Capabilities: [50] MSI: Enable- Count=1/4 Maskable+ 64bit+
>> > Capabilities: [70] Express Endpoint, MSI 00
>> > Capabilities: [100] Advanced Error Reporting
>> > Capabilities: [140] Virtual Channel
>> > Capabilities: [160] Device Serial Number 00-00-00-00-00-00-00-00
>> > Kernel driver in use: ath9k
>> > Kernel modules: ath9k
>> > 
>> > Probably you need some debugging output, but before recompiling the kernel
>> > I would like to know if you are interested in any kind of help from me
>> > and what steps I should take (I'm able to help in testing patches but I'm
>> > not familiar with git). Thank you
>> 
>> Usually it's really helpful if you can find the commit id which broke
>> it. 'git bisect' is a great tool to do that and this seems to be a nice
>> tutorial how to use it:
>> 
>> http://webchick.net/node/99
>> 
>> Instead of commit ids you can use release tags like v4.6 and v4.7 to
>> make it easier to start the bisect. Just make sure that v4.7 is really
>> broken and v4.6 works before you start the bisection.
>
> Hi Kalle,
>
> I tried to understand the whole procedure related to git and git bisect, and
> this is the first time I have tried it, so I may have made some mistakes. In
> the git log you'll find the commit that could be responsible for the behaviour
> I reported yesterday. Anyhow, the resulting commit doesn't make any sense to me.

So your bisect found this as the bad commit:

commit 9257b4a206fc0229dd5f84b78e4d1ebf3f91d270
Author: Omer Peleg 
Date:   Wed Apr 20 11:34:11 2016 +0300

iommu/iova: introduce per-cpu caching to iova allocation

The ath9k log you provided has a DMA warning, and iommu problems can
cause DMA problems, but I cannot draw any conclusions yet. To confirm
that this commit really is the problem you could try to revert it with
'git revert -n 9257b4a206fc0229dd5f84b78e4d1ebf3f91d270'. For some
reason I got conflicts but if you are good enough with C you could try
to fix those yourself. Another option is that you disable iommu and see
if that helps.
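
The revert test suggested above could be wrapped roughly as follows. This is a sketch only: the function name `try_revert` is invented for illustration, and it does nothing beyond reporting whether `git revert -n` applied cleanly. It has to be run inside a kernel git tree checked out at the broken version.

```shell
#!/bin/sh
# Hypothetical helper: try to revert one commit without committing the result.
#   $1: id of the commit to revert
try_revert() {
    commit=$1

    # -n (--no-commit) applies the reverse patch to the working tree
    # and index only, so the result can be built and tested first.
    if git revert -n "$commit"; then
        echo "reverted cleanly: rebuild and retest"
    else
        echo "revert did not apply cleanly: resolve the conflicts by hand, or test with the IOMMU disabled instead" >&2
        return 1
    fi
}
```

For the commit in question the call would be `try_revert 9257b4a206fc0229dd5f84b78e4d1ebf3f91d270`. If it reports conflicts, as happened here, the other route is to boot with `intel_iommu=off` on the kernel command line or to disable the option in the kernel config.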

I'm adding more people and mailing lists related to this commit,
hopefully they have better ideas.

This is Valerio's bisect log:

git bisect start
# good: [2dcd0af568b0cf583645c8a317dd12e344b1c72a] Linux 4.6
git bisect good 2dcd0af568b0cf583645c8a317dd12e344b1c72a
# bad: [523d939ef98fd712632d93a5a2b588e477a7565e] Linux 4.7
git bisect bad 523d939ef98fd712632d93a5a2b588e477a7565e
# good: [0694f0c9e20c47063e4237e5f6649ae5ce5a369a] radix tree test suite: remove dependencies on height
git bisect good 0694f0c9e20c47063e4237e5f6649ae5ce5a369a
# good: [e4f7bdc2ec0d0dcc27f7d70db27a620dfdc1f697] Merge branch 'for-4.7-zac' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
git bisect good e4f7bdc2ec0d0dcc27f7d70db27a620dfdc1f697
# bad: [049ec1b5a76d34a6980cccdb7c0baeb4eed7a993] Merge tag 'drm-fixes-for-v4.7-rc2' of git://people.freedesktop.org/~airlied/linux
git bisect bad 049ec1b5a76d34a6980cccdb7c0baeb4eed7a993
# good: [a10c38a4f385f5d7c173a263ff6bb2d36021b3bb] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
git bisect
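
For anyone wanting to repeat a bisection like the one logged in this thread, the start of the procedure with release tags in place of commit ids might look like the sketch below. The function name `start_bisect` is made up for illustration, and it assumes a kernel git tree in which both tags exist.

```shell
#!/bin/sh
# Hypothetical helper: start a bisection between two release tags.
#   $1: known-good tag (for example v4.6)
#   $2: known-bad tag  (for example v4.7)
start_bisect() {
    good=$1
    bad=$2

    # Mark the endpoints; git then checks out a commit roughly halfway
    # between them for the first build-and-test round.
    git bisect start &&
    git bisect bad "$bad" &&
    git bisect good "$good"
}
```

After `start_bisect v4.6 v4.7`, each round consists of building and booting the checked-out kernel, testing the wifi, and running `git bisect good` or `git bisect bad` until git names the first bad commit. `git bisect log` prints a log in the format shown above, and `git bisect reset` returns the tree to where it started.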