from:"Anatoly Pugachev"

Re: Mozilla Software on Sparc64/Linux

2021-11-16 Thread Anatoly Pugachev

On Mon, Nov 15, 2021 at 5:21 PM Connor McLaughlan  wrote:
>
> Hello Anatoly,
>
> i am using the highest available gdb for sparc64:
> root@SunUltra25:/work/firefox62# aptitude versions gdb
> i   10.1-2
> unstable  
> 500
>
> Should i use something else?

Version 10.1-2 of gdb has a critical bug that breaks printing of
backtraces and even of plain variables;
please see the explanation in
https://sourceware.org/bugzilla/show_bug.cgi?id=27147#c16

So use either an older gdb 9.x (it can be installed from
snapshot.debian.org) or a more recent one, i.e. 10.2 or even 11.x (not
yet packaged for Debian, but it can be compiled from source).
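
The version advice above can be turned into a quick check. This is a minimal sketch, assuming a POSIX shell; the function name gdb_version_usable is hypothetical, and the "affected" pattern only encodes what is said above (the 10.1 series is buggy on sparc64, while 9.x, 10.2 and 11.x are fine):

```shell
#!/bin/sh
# Return 0 if the given gdb version string is outside the 10.1 series,
# 1 if it is inside it (the series described above as buggy on sparc64).
gdb_version_usable() {
    case "$1" in
        10.1|10.1[-.]*) return 1 ;;   # e.g. 10.1-2, 10.1.90.20210103-git
        *)              return 0 ;;
    esac
}

# Usage: take the last field of the first line of `gdb --version`.
if command -v gdb >/dev/null 2>&1; then
    ver=$(gdb --version | head -n 1 | awk '{print $NF}')
    if gdb_version_usable "$ver"; then
        echo "gdb $ver: not in the affected 10.1 series"
    else
        echo "gdb $ver: affected, use 9.x, 10.2 or newer"
    fi
fi
```

The check is deliberately narrow: it only matches version strings, it does not test whether backtraces actually work.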

Re: Mozilla Software on Sparc64/Linux

2021-11-15 Thread Anatoly Pugachev

On Mon, Nov 15, 2021 at 5:10 PM Anatoly Pugachev  wrote:
>
> On Mon, Nov 15, 2021 at 5:00 PM Connor McLaughlan  wrote:
> >
> > i installed firefox_62.0.3-1_sparc64.deb. On start i get a bus error, no 
> > window comes up.
> >
> > gdb output:
> >
> > connor@SunUltra25:/usr/lib/firefox$ gdb firefox
> > GNU gdb (Debian 10.1-2) 10.1.90.20210103-git
>
> make sure to use recent gdb (11+) or more ancient one (9.x), since
> version 10.x is buggy on sparc64
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=988711

Sorry for the misinformation: gdb 10.2.x does work on sparc64, please see
https://sourceware.org/bugzilla/show_bug.cgi?id=27147

Re: Mozilla Software on Sparc64/Linux

2021-11-15 Thread Anatoly Pugachev

On Mon, Nov 15, 2021 at 5:00 PM Connor McLaughlan  wrote:
>
> i installed firefox_62.0.3-1_sparc64.deb. On start i get a bus error, no 
> window comes up.
>
> gdb output:
>
> connor@SunUltra25:/usr/lib/firefox$ gdb firefox
> GNU gdb (Debian 10.1-2) 10.1.90.20210103-git

Make sure to use a recent gdb (11+) or an older one (9.x), since
version 10.x is buggy on sparc64.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=988711

Re: [sparc64] locking/atomic, kernel OOPS on running stress-ng

2021-07-07 Thread Anatoly Pugachev

On Tue, Jul 6, 2021 at 3:00 PM Mark Rutland  wrote:
> On Tue, Jul 06, 2021 at 02:51:06PM +0300, Anatoly Pugachev wrote:
> > On Tue, Jul 6, 2021 at 12:11 PM Mark Rutland  wrote:
> > > Fixes: ff5b4f1ed580c59d ("locking/atomic: sparc: move to ARCH_ATOMIC")
> > > Signed-off-by: Mark Rutland 
> > > Reported-by: Anatoly Pugachev 
> > > Cc: "David S. Miller" 
> > > Cc: Peter Zijlstra 
> > > ---
> > >  arch/sparc/include/asm/cmpxchg_64.h | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/arch/sparc/include/asm/cmpxchg_64.h b/arch/sparc/include/asm/cmpxchg_64.h
> > > index 8c39a9981187..12d00a42c0a3 100644
> > > --- a/arch/sparc/include/asm/cmpxchg_64.h
> > > +++ b/arch/sparc/include/asm/cmpxchg_64.h
> > > @@ -201,7 +201,7 @@ static inline unsigned long __cmpxchg_local(volatile void *ptr,
> > >  #define arch_cmpxchg64_local(ptr, o, n)                                \
> > >    ({                                                                   \
> > >         BUILD_BUG_ON(sizeof(*(ptr)) != 8);                              \
> > > -       cmpxchg_local((ptr), (o), (n));                                 \
> > > +       arch_cmpxchg_local((ptr), (o), (n));                            \
> > >    })
> > >  #define arch_cmpxchg64(ptr, o, n)  arch_cmpxchg64_local((ptr), (o), (n))
> >
> >
> > Mark, thanks, fixed...
> > tested on git kernel 5.13.0-11788-g79160a603bdb-dirty (dirty - cause
> > patch has been applied).
>
> Great! Thanks for confirming.
>
> Peter, are you happy to pick that (full commit in last mail), or should
> I send a new copy?

It would be nice if the patch could land in the kernel before v5.14-rc1.

Thanks.

Re: [sparc64] locking/atomic, kernel OOPS on running stress-ng

2021-07-06 Thread Anatoly Pugachev

On Tue, Jul 6, 2021 at 12:11 PM Mark Rutland  wrote:
> Fixes: ff5b4f1ed580c59d ("locking/atomic: sparc: move to ARCH_ATOMIC")
> Signed-off-by: Mark Rutland 
> Reported-by: Anatoly Pugachev 
> Cc: "David S. Miller" 
> Cc: Peter Zijlstra 
> ---
>  arch/sparc/include/asm/cmpxchg_64.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/sparc/include/asm/cmpxchg_64.h b/arch/sparc/include/asm/cmpxchg_64.h
> index 8c39a9981187..12d00a42c0a3 100644
> --- a/arch/sparc/include/asm/cmpxchg_64.h
> +++ b/arch/sparc/include/asm/cmpxchg_64.h
> @@ -201,7 +201,7 @@ static inline unsigned long __cmpxchg_local(volatile void *ptr,
>  #define arch_cmpxchg64_local(ptr, o, n)                                \
>    ({                                                                   \
>         BUILD_BUG_ON(sizeof(*(ptr)) != 8);                              \
> -       cmpxchg_local((ptr), (o), (n));                                 \
> +       arch_cmpxchg_local((ptr), (o), (n));                            \
>    })
>  #define arch_cmpxchg64(ptr, o, n)  arch_cmpxchg64_local((ptr), (o), (n))


Mark, thanks, that fixes it.
Tested on git kernel 5.13.0-11788-g79160a603bdb-dirty (dirty because
the patch has been applied).

[sparc64] locking/atomic, kernel OOPS on running stress-ng

2021-07-05 Thread Anatoly Pugachev

Hello!

The latest sparc64 git kernel produces the following OOPS when running stress-ng as:

$ stress-ng -v --mmap 1 -t 30s

kernel OOPS (console logs):

[   27.276719] Unable to handle kernel NULL pointer dereference
[   27.276782] tsk->{mm,active_mm}->context = 03cb
[   27.276818] tsk->{mm,active_mm}->pgd = fff83a2a
[   27.276853]   \|/  \|/
[   27.276853]   "@'/ .. \`@"
[   27.276853]   /_| \__/ |_\
[   27.276853]  \__U_/
[   27.276927] stress-ng(928): Oops [#1]
[   27.276961] CPU: 0 PID: 928 Comm: stress-ng Tainted: GE
5.13.0-rc1-00111-g8e6a4b3afe64 #257
[   27.277021] TSTATE: 009911001603 TPC: 0044f3c4 TNPC:
0044f3c8 Y: Tainted: GE
[   27.277084] TPC: 
[   27.277129] g0:  g1: 3d004d800653 g2:
 g3: 0006
[   27.277180] g4: fff8370a9c00 g5: fff8000229666000 g6:
fff847404000 g7: 00090014
[   27.277231] o0: 0001 o1:  o2:
fff8370aa4b8 o3: 
[   27.277283] o4: 2ec90910 o5: 00f86c00 sp:
fff847406ec1 ret_pc: 004d197c
[   27.277337] RPC: 
[   27.277377] l0: 0119b1c0 l1:  l2:
01205e48 l3: 81123ddb8627a322
[   27.277432] l4: c269484aab0b613a l5: 0148f800 l6:
7c8086d1 l7: 01204788
[   27.277487] i0: fff843c9ce40 i1: fff800010480 i2:
fff8424d9048 i3: bd004d800653
[   27.277540] i4: 3d004d800653 i5: bd004d800653 i6:
fff847406f71 i7: 0069e4ac
[   27.277593] I7: <__split_huge_pmd_locked+0x1ec/0x5e0>
[   27.277639] Call Trace:
[   27.277662] [<0069e4ac>] __split_huge_pmd_locked+0x1ec/0x5e0
[   27.277708] [<0069ff48>] __split_huge_pmd+0x288/0x2e0
[   27.277751] [<006a0638>] split_huge_pmd_address+0x78/0xa0
[   27.277797] [<006a0780>] vma_adjust_trans_huge+0x120/0x160
[   27.277843] [<0065ac70>] __vma_adjust+0x1d0/0xb00
[   27.277887] [<0065bfb0>] __split_vma+0xf0/0x180
[   27.277927] [<00675704>] madvise_behavior+0x224/0x2a0
[   27.277972] [<00677418>] do_madvise+0x478/0x600
[   27.278011] [<0068>] sys_madvise+0x18/0x40
[   27.278050] [<00406274>] linux_sparc_syscall+0x34/0x44
[   27.278100] Disabling lock debugging due to kernel taint
[   27.278112] Caller[0069e4ac]: __split_huge_pmd_locked+0x1ec/0x5e0
[   27.278130] Caller[0069ff48]: __split_huge_pmd+0x288/0x2e0
[   27.278146] Caller[006a0638]: split_huge_pmd_address+0x78/0xa0
[   27.278161] Caller[006a0780]: vma_adjust_trans_huge+0x120/0x160
[   27.278177] Caller[0065ac70]: __vma_adjust+0x1d0/0xb00
[   27.278191] Caller[0065bfb0]: __split_vma+0xf0/0x180
[   27.278208] Caller[00675704]: madvise_behavior+0x224/0x2a0
[   27.278222] Caller[00677418]: do_madvise+0x478/0x600
[   27.278237] Caller[0068]: sys_madvise+0x18/0x40
[   27.278253] Caller[00406274]: linux_sparc_syscall+0x34/0x44
[   27.278267] Caller[010bd02c]: 0x10bd02c
[   27.278281] Instruction DUMP:
[   27.278285]  ba10001b
[   27.278295]  8210001c
[   27.278304]  84102000
[   27.278313] 
[   27.278321]  80a0401d
[   27.278330]  2264
[   27.278338]  d05e2040
[   27.278346]  106a
[   27.278354]  fa5e8000
[   27.278363]

I tried to bisect this OOPS, but was unable to narrow it down to a
single commit id without cherry-picking, since intermediate revisions
fail to build:

linux-2.6$ git describe
v5.13-rc1-111-gb9b12978a8e9

linux-2.6$ make
  CC  kernel/bounds.s
In file included from ./include/linux/atomic.h:87,
 from ./include/asm-generic/bitops/lock.h:5,
 from ./arch/sparc/include/asm/bitops_64.h:52,
 from ./arch/sparc/include/asm/bitops.h:5,
 from ./include/linux/bitops.h:32,
 from ./include/linux/kernel.h:12,
 from ./include/asm-generic/bug.h:20,
 from ./arch/sparc/include/asm/bug.h:25,
 from ./include/linux/bug.h:5,
 from ./include/linux/page-flags.h:10,
 from kernel/bounds.c:10:
./include/asm-generic/atomic-long.h: In function ‘atomic_long_add_return’:
./include/asm-generic/atomic-long.h:59:9: error: implicit declaration
of function ‘atomic64_add_return’; did you mean ‘atomic64_dec_return’?
[-Werror=implicit-function-declaration]
   59 |  return atomic64_add_return(i, v);
  | ^~~
  | atomic64_dec_return
./include/asm-generic/atomic-long.h: In function ‘atomic_long_fetch_add’:
./include/asm-generic/atomic-long.h:83:9: error: implicit declaration
of function ‘atomic64_fetch_add’; did you mean ‘atomic64_fetch_dec’?
[-Werror=implicit-function-declaration]
   83 |  return atomic64_fetch_add(i, v);
  | ^~
  | atomic64_fetch_dec


$ git bisect log
# bad:

Re: [sparc64] kernel panic from running a program in userspace

2021-06-19 Thread Anatoly Pugachev

On Sat, Jun 19, 2021 at 12:31 PM Colin Ian King
 wrote:
>
> Hi,
>
> I suspect this issue was fixed with the following commit:
>
> commit e5e8b80d352ec999d2bba3ea584f541c83f4ca3f
> Author: Rob Gardner 
> Date:   Sun Feb 28 22:48:16 2021 -0700
>
> sparc64: Fix opcode filtering in handling of no fault loads

Colin,

yes, but I believe that was quite a different kernel bug.
Besides, my current test kernel is based on git kernel 5.13.0-rc6
(released last Monday), which already includes the mentioned 'opcode'
fix.

> > stress-ng.git$ ./stress-ng --verbose --timeout 10m --opcode -1
> > stress-ng: debug: [480950] stress-ng 0.12.10 g27f90a2276bd
> > stress-ng: debug: [480950] system: Linux ttip 5.13.0-rc6 #229 SMP Tue
> > Jun 15 12:30:23 MSK 2021 sparc64
> > stress-ng: debug: [480950] RAM total: 7.8G, RAM free: 7.0G, swap free: 
> > 768.7M
> > stress-ng: debug: [480950] 8 processors online, 256 processors configured
> > stress-ng: info:  [480950] dispatching hogs: 8 opcode

[sparc64] kernel panic from running a program in userspace

2021-06-19 Thread Anatoly Pugachev

Hello!

I am getting the following in the logs
(reproducible on almost every run; I also tried a different kernel,
the Debian-packaged 5.10.0-7-sparc64-smp):

[  863.344843] stress-ng[593992]: bad register window fault: SP
fcd023ff (orig_sp fcd01c00) TPC fff80001000237fc O7
fff800010003e008
[  890.782498] CPU[4]: SUN4V mondo timeout, cpu(5) made no forward
progress after 51 retries. Total target cpus(7).
[  890.782539] CPU[3]: SUN4V mondo timeout, cpu(5) made no forward
progress after 51 retries. Total target cpus(7).
[  890.782590] Kernel panic - not syncing: SUN4V mondo timeout panic
[  890.782664] CPU: 4 PID: 480951 Comm: stress-ng Tainted: G
 E 5.13.0-rc6 #229
[  890.782713] Call Trace:
[  890.782733] [<00c806c8>] panic+0xf4/0x2d4
[  890.782773] [<0043f3a8>] hypervisor_xcall_deliver+0x288/0x320
[  890.782816] [<0043efb8>] xcall_deliver+0xf8/0x120
[  890.782860] [<00440518>] smp_flush_tlb_page+0x38/0x60
[  890.782898] [<0044ee44>] flush_tlb_pending+0x64/0xa0
[  890.782938] [<0044f1c4>] arch_leave_lazy_mmu_mode+0x24/0x40
[  890.782977] [<00651b4c>] copy_pte_range+0x5ac/0x860
[  890.783013] [<00655974>] copy_pud_range+0x1f4/0x260
[  890.783049] [<00655b2c>] copy_page_range+0x14c/0x1c0
[  890.783083] [<004613b4>] dup_mmap+0x374/0x4a0
[  890.783123] [<00461530>] dup_mm+0x50/0x200
[  890.783157] [<00462384>] copy_process+0x704/0x1280
[  890.783196] [<004631a8>] kernel_clone+0x88/0x380
[  890.783231] [<0042d170>] sparc_clone+0xb0/0xe0
[  890.783274] [<00406274>] linux_sparc_syscall+0x34/0x44
[  890.784106] CPU[7]: SUN4V mondo timeout, cpu(5) made no forward
progress after 52 retries. Total target cpus(7).
[  890.784119] CPU[6]: SUN4V mondo timeout, cpu(5) made no forward
progress after 53 retries. Total target cpus(7).
[  890.784876] Press Stop-A (L1-A) from sun keyboard or send break
[  890.784876] twice on console to return to the boot prom
[  890.784897] ---[ end Kernel panic - not syncing: SUN4V mondo
timeout panic ]---

(and the machine halts)

after running stress-ng:

stress-ng.git$ ./stress-ng --verbose --timeout 10m --opcode -1
stress-ng: debug: [480950] stress-ng 0.12.10 g27f90a2276bd
stress-ng: debug: [480950] system: Linux ttip 5.13.0-rc6 #229 SMP Tue
Jun 15 12:30:23 MSK 2021 sparc64
stress-ng: debug: [480950] RAM total: 7.8G, RAM free: 7.0G, swap free: 768.7M
stress-ng: debug: [480950] 8 processors online, 256 processors configured
stress-ng: info:  [480950] dispatching hogs: 8 opcode
stress-ng: debug: [480950] cache allocate: using cache maximum level L2
stress-ng: debug: [480950] cache allocate: shared cache buffer size: 128K
stress-ng: debug: [480950] starting stressors
stress-ng: debug: [480951] stress-ng-opcode: started [480951] (instance 0)
stress-ng: debug: [480952] stress-ng-opcode: started [480952] (instance 1)
stress-ng: debug: [480953] stress-ng-opcode: started [480953] (instance 2)
stress-ng: debug: [480955] stress-ng-opcode: started [480955] (instance 3)
stress-ng: debug: [480957] stress-ng-opcode: started [480957] (instance 4)
stress-ng: debug: [480959] stress-ng-opcode: started [480959] (instance 5)
stress-ng: debug: [480961] stress-ng-opcode: started [480961] (instance 6)
stress-ng: debug: [480950] 8 stressors started
stress-ng: debug: [480963] stress-ng-opcode: started [480963] (instance 7)
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
Inconsistency detected by ld.so: dl-runtime.c: 80: _dl_fixup:
Assertion `ELFW(R_TYPE)(reloc->r_info) == ELF_MACHINE_JMP_SLOT'
failed!
*** stack smashing detected ***: terminated
munmap_chunk(): invalid pointer
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
Inconsistency detected by ld.so: : 422: Assertion `�' failed!
*** stack smashing detected ***: terminated


The machine is my test LDOM (virtual machine), installed with and
running the latest sparc64 Debian sid (unstable).

Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-04-01 Thread Anatoly Pugachev

On Thu, Apr 1, 2021 at 2:40 PM Riccardo Mottola
 wrote:
> multix@narya:~/code/linux-stable$ time sudo make install
> sh ./arch/sparc/boot/install.sh 5.12.0-rc5+ arch/sparc/boot/zImage \
> System.map "/boot"
> run-parts: executing /etc/kernel/postinst.d/apt-auto-removal 5.12.0-rc5+
> /boot/vmlinuz-5.12.0-rc5+
> run-parts: executing /etc/kernel/postinst.d/initramfs-tools 5.12.0-rc5+
> /boot/vmlinuz-5.12.0-rc5+
> update-initramfs: Generating /boot/initrd.img-5.12.0-rc5+
> run-parts: executing /etc/kernel/postinst.d/zz-update-grub 5.12.0-rc5+
> /boot/vmlinuz-5.12.0-rc5+
> Generating grub configuration file ...
> Found linux image: /boot/vmlinuz-5.12.0-rc5+
> Found initrd image: /boot/initrd.img-5.12.0-rc5+
> Found linux image: /boot/vmlinuz-5.12.0-rc5+.old
> Found initrd image: /boot/initrd.img-5.12.0-rc5+
> Found linux image: /boot/vmlinux-5.10.0-4-sparc64-smp
> Found initrd image: /boot/initrd.img-5.10.0-4-sparc64-smp
> Found linux image: /boot/vmlinux-5.10.0-trunk-sparc64-smp
> Found initrd image: /boot/initrd.img-5.10.0-trunk-sparc64-smp
> Found linux image: /boot/vmlinux-5.9.0-5-sparc64-smp
> Found initrd image: /boot/initrd.img-5.9.0-5-sparc64-smp
> done
>
> At boot:
>
> Loading Linux 5.12.0-rc5+ ...
> error: premature end of file /vmlinuz-5.12.0-rc5+.
> Loading initial ramdisk ...
> error: you need to load the kernel first.

The current grub2 version does not support compressed kernel images, so
do the following:

gzip -dc /boot/vmlinuz-5.12.0-rc5+ > /boot/vmlinux-5.12.0-rc5+
rm /boot/vmlinuz-5.12.0-rc5+
update-grub

and reboot
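
The manual steps above can be wrapped in a small helper. This is a minimal sketch, assuming the image is gzip-compressed and named vmlinuz-<version>, as in the commands above; the function names vmlinux_name_for and decompress_kernel are hypothetical:

```shell
#!/bin/sh
# Map a compressed image path (.../vmlinuz-<version>) to the uncompressed
# name (.../vmlinux-<version>), mirroring the rename done by hand above.
vmlinux_name_for() {
    dir=$(dirname "$1")
    base=$(basename "$1")
    echo "$dir/vmlinux-${base#vmlinuz-}"
}

# Decompress one image next to the original and print the new path.
# Removing the compressed file and running update-grub is left to the
# caller, exactly as in the steps above.
decompress_kernel() {
    src=$1
    dst=$(vmlinux_name_for "$src")
    gzip -dc "$src" > "$dst" && echo "$dst"
}
```

For the case above this would be run as root as `decompress_kernel /boot/vmlinuz-5.12.0-rc5+`, followed by the `rm` and `update-grub` steps.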

Re: 5.10.0-4-sparc64-smp #1 Debian 5.10.19-1 crashes on T2000

2021-04-01 Thread Anatoly Pugachev

On Thu, Apr 1, 2021 at 12:59 PM Riccardo Mottola
 wrote:
> > This seems to only happen when the machines do a long run with high
> > workload and seemingly not when i just power them off again for night
> > with no high workload.
>
> I have a limited experience and can only share that the kernel I
> currently am running on this Fire T2000
>
> Linux narya 5.9.0-5-sparc64-smp #1 SMP Debian 5.9.15-1 (2020-12-17)
> sparc64 GNU/Linux
>
> Is quite stable for me.
> However, i did not try to run for several days compiling, so I don't
> know if it is stable for a long time.

Riccardo,

if you would like to check sparc64 kernel stability, you might want to run
stress-ng tests, like:

$ ./stress-ng --sequential 4 -v --timeout 3m --metrics-brief

It still successfully kills the latest (git) kernel (5.12.0-rc5) on my
sparc64 test LDOM running on a T5-2 hardware server.
But please take stress-ng from the git repo [1], since it has a few
recent fixes for sparc that are not yet packaged in Debian.

Thanks.

1. https://github.com/ColinIanKing/stress-ng/

Re: Regression in 028abd92 for Sun UltraSPARC T1

2021-03-24 Thread Anatoly Pugachev

On Wed, Mar 24, 2021 at 4:19 PM Frank Scheiner  wrote:
> On 24.03.21 14:16, John Paul Adrian Glaubitz wrote:
> > On 3/24/21 2:09 PM, Frank Scheiner wrote:> Kernel sources are not available 
> > on the T1000.
> >>
> >> If need be, where do they need to exist and how should the directory be
> >> named - `/usr/src/[...]`?
> >
> > Try installing "linux-source" and the "-dbg" package for your Debian kernel.
>
> But don't I need the source for the kernel at 028abd92? I figured, I
> need the sources in `/usr/src/linux-source-5.9.0-rc1+` because
> "5.9.0-rc1+" is the version the corresponding modules are installed -
> could that be correct?

Frank,

I'm using gdb from the kernel source directory (the one the running
kernel was built and installed from), like:

$ uname -a
Linux ttip 5.12.0-rc4 #203 SMP Wed Mar 24 15:50:29 MSK 2021 sparc64 GNU/Linux
$ cd linux-2.6
linux-2.6$ git describe
v5.12-rc4
linux-2.6$ gdb -q vmlinux
Reading symbols from vmlinux...
(gdb) l *(sys_mount+0x114/0x1e0)
0x6dd7c0 is in __se_sys_mount (fs/namespace.c:3431).
3426            /* ... and return the root of (sub)tree on it */
3427            return path.dentry;
3428    }
3429    EXPORT_SYMBOL(mount_subtree);
3430
3431    SYSCALL_DEFINE5(mount, char __user *, dev_name, char __user *, dir_name,
3432                    char __user *, type, unsigned long, flags, void __user *, data)
3433    {
3434            int ret;
3435            char *kernel_type;
(gdb)
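
A note on the expression passed to `l` above: in gdb's expression syntax `/` is integer division, so `sys_mount+0x114/0x1e0` evaluates to `sys_mount+0`. That is consistent with the listing above, which starts at the function entry (line 3431). To land on the instruction named in a call trace, the trailing `/<size>` part is normally dropped. A minimal sketch, assuming a POSIX shell; the helper name trace_to_gdb_list is hypothetical:

```shell
#!/bin/sh
# Turn a call-trace entry such as "sys_mount+0x114/0x1e0" into a gdb
# "list" command, dropping the trailing "/<function size>" part.
trace_to_gdb_list() {
    loc=${1%%/*}          # "sys_mount+0x114/0x1e0" -> "sys_mount+0x114"
    echo "list *($loc)"
}

trace_to_gdb_list "sys_mount+0x114/0x1e0"   # prints: list *(sys_mount+0x114)
```

The result can be passed to gdb non-interactively, e.g. `gdb -q -batch -ex "$(trace_to_gdb_list sys_mount+0x114/0x1e0)" vmlinux`. The kernel tree also ships `scripts/faddr2line`, which accepts the full `func+offset/size` form directly.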

Re: Regression in 028abd92 for Sun UltraSPARC T1

2021-03-24 Thread Anatoly Pugachev

On Wed, Mar 24, 2021 at 3:31 PM Frank Scheiner  wrote:
> Sorry, but I can't install `gdb` on my T1000 ATM, because it depends on
> "libpython3.8" for sparc64 (see [1]) and "libpython3.9" for the other
> architectures, but "libpython3.8" is actually not available for sparc64,
> "libpython3.9" is available for sparc64 though:
> ...
> The following packages have unmet dependencies:
>   gdb : Depends: libpython3.8 (>= 3.8.2) but it is not installable
> Recommends: libc-dbg
> E: Unable to correct problems, you have held broken packages.
> ```
> Something wrong with the dependencies. Any suggestions?

Frank,

you could use http://snapshot.debian.org to install older versions of
packages, e.g. gdb and libpython3.8

Re: watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [systemd:1]

2021-03-15 Thread Anatoly Pugachev

On Fri, Mar 12, 2021 at 5:27 PM Dennis Clarke  wrote:
>
>
> I have seen this for a few months now. The old old netra machine will
> run just fine endlessly but if I attempt to perform a package update
> then I am always assured to see :
>
>
> ceres# apt-get update
> Get:1 http://deb.debian.org/debian-ports sid InRelease [55.3 kB]
> Get:2 http://deb.debian.org/debian-ports sid/main sparc64 Packages [21.6 MB]
> Get:3 http://deb.debian.org/debian-ports sid/main all Packages [8,682
> kB]
> Fetched 30.3 MB in 1min 24s (361 kB/s)
>
> Reading package lists... Done
> ceres#
>
> Then try "upgrade" and the machine drops off the network :
>
> Setting up systemd (247.3-1) ...
> Timeout, server 172.16.35.61 not responding.

Dennis,

did you try testing the machine with stress-ng? It contains a lot of
tests; one of them could trigger your issue, which would probably make
it easier to hunt down.

Re: update-grub and then grub-mkconfig leads to the watchdog: BUG: soft lockup

2021-03-15 Thread Anatoly Pugachev

On Mon, Mar 15, 2021 at 4:59 AM Dennis Clarke  wrote:
>
>
> While digging around here I saw that update-grub will lead to a lockup
> every time. So I simply changed  /usr/sbin/grub-mkconfig  script to
> allow me to see everything that happens.
>
> That gets me to :
>
>  /usr/sbin/grub-probe --device /dev/sda2 --target=fs_uuid
>
> which falls to pieces perfectly :
>
> root@eros:~#
> root@eros:~# uptime
>  01:09:40 up 20 min,  2 users,  load average: 0.07, 0.14, 0.48
> root@eros:~# /usr/sbin/grub-mkconfig -o /boot/grub/grub.cfg
> + prefix=/usr
> + exec_prefix=/usr
> + datarootdir=/usr/share
> + prefix=/usr
> + exec_prefix=/usr
> + sbindir=/usr/sbin
> + bindir=/usr/bin
> + sysconfdir=/etc
> + PACKAGE_NAME=GRUB
> + PACKAGE_VERSION=2.04-16
> + host_os=linux-gnu
> + datadir=/usr/share
> + [ x = x ]
> + pkgdatadir=/usr/share/grub
> + export pkgdatadir
> + grub_cfg=
> + grub_mkconfig_dir=/etc/grub.d
> + basename /usr/sbin/grub-mkconfig
> + self=grub-mkconfig
> + grub_probe=/usr/sbin/grub-probe
> + grub_file=/usr/bin/grub-file
> + grub_editenv=/usr/bin/grub-editenv
> + grub_script_check=/usr/bin/grub-script-check
> + export TEXTDOMAIN=grub
> + export TEXTDOMAINDIR=/usr/share/locale
> + . /usr/share/grub/grub-mkconfig_lib
> + prefix=/usr
> + exec_prefix=/usr
> + datarootdir=/usr/share
> + datadir=/usr/share
> + bindir=/usr/bin
> + sbindir=/usr/sbin
> + [ x/usr/share/grub = x ]
> + test x/usr/sbin/grub-probe = x
> + test x/usr/bin/grub-file = x
> + test x = x
> + grub_mkrelpath=/usr/bin/grub-mkrelpath
> + which gettext
> + :
> + grub_tab=
> + test 2 -gt 0
> + option=-o
> + shift
> + argument -o /boot/grub/grub.cfg
> + opt=-o
> + shift
> + test 1 -eq 0
> + echo /boot/grub/grub.cfg
> + grub_cfg=/boot/grub/grub.cfg
> + shift
> + test 0 -gt 0
> + [ x = x ]
> + id -u
> + EUID=0
> + [ 0 != 0 ]
> + set /usr/sbin/grub-probe dummy
> + test -f /usr/sbin/grub-probe
> + :
> + /usr/sbin/grub-probe --target=device /
> + GRUB_DEVICE=/dev/sda2
> + /usr/sbin/grub-probe --device /dev/sda2 --target=fs_uuid
> [ 1330.951329] watchdog: BUG: soft lockup - CPU#0 stuck for 22s!
> [grub-probe:443]
> [ 1331.046350] Modules linked in: drm(E) drm_panel_orientation_quirks(E)
> i2c_core(E) sg(E) envctrl(E) display7seg(E) flash(E) fuse(E) configfs(E)
> ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E)
> crc32c_generic(E) sd_mod(E) t10_pi(E) crc_t10dif(E) st(E)
> crct10dif_generic(E) crct10dif_common(E) sym53c8xx(E)
> scsi_transport_spi(E) scsi_mod(E) sunhme(E)
> [ 1331.475596] CPU: 0 PID: 443 Comm: grub-probe Tainted: GE
> 5.10.0-4-sparc64 #1 Debian 5.10.19-1
> [ 1331.606055] TSTATE: 009911001601 TPC: 00950920 TNPC:
> 00950924 Y: Tainted: GE
> [ 1331.753728] TPC: 
> [ 1331.804124] g0: f800065e3140 g1: 1005a830 g2:
>  g3: 0149fa90
> [ 1331.918504] g4: f80009bde780 g5: 604a4edc g6:
> f8000a1ac000 g7: 0fa664c8
> [ 1332.032984] o0: 00f2c960 o1: f8000a1af8ec o2:
> f80004275b50 o3: 
> [ 1332.147464] o4:  o5:  sp:
> f8000a1aef81 ret_pc: 00950900
> [ 1332.266539] RPC: 
> [ 1332.316950] l0: 00f2c800 l1:  l2:
> 00668200 l3: 00064b73605f
> [ 1332.431439] l4: 0002 l5: f8000a1af8f0 l6:
> 00e1a000 l7: 0001
> [ 1332.545917] i0: f8000b813d90 i1: f80009bad100 i2:
> 00f2c800 i3: 00f2c978
> [ 1332.660398] i4: 00ec i5: 1005a818 i6:
> f8000a1af031 i7: 00668e38
> [ 1332.774892] I7: 
> [ 1332.826436] Call Trace:
> [ 1332.858473] [<00668e38>] chrdev_open+0x98/0x1e0
> [ 1332.927215] [<0065e430>] do_dentry_open+0x170/0x420
> [ 1333.000505] [<00660068>] vfs_open+0x28/0x40
> [ 1333.064670] [<00674948>] path_openat+0x988/0x1100
> [ 1333.135679] [<006773d0>] do_filp_open+0x50/0x100
> [ 1333.205549] [<00660330>] do_sys_openat2+0x70/0x180
> [ 1333.277710] [<00660868>] sys_openat+0x48/0xc0
> [ 1333.344164] [<00406174>] linux_sparc_syscall+0x34/0x44
> ~
> Type  'go' to resume
> ok


This kernel OOPS (backtrace) should be reported to the sparclinux@,
linux-kernel@ (lkml) and linux-fsdevel@ (vfs) Linux kernel mailing
lists.
Thanks.

Re: [sparc64] current git kernel networking is broken

2020-12-08 Thread Anatoly Pugachev

On Tue, Dec 8, 2020 at 3:42 AM Al Viro  wrote:
>
> On Tue, Dec 08, 2020 at 03:09:47AM +0300, Anatoly Pugachev wrote:
> > Hello!
> >
> > Sorry for the late report, being 5.10-rc7 is out, but current git
> > kernel 
> > (git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git)
> > is broken with the networking. It affects my openvpn tunnel and even
> > internet networking.
> >
> > ping to a local ethernet network (i.e. gateway ping) works, but i
> > cannot download files from the internet.
> > openvpn tunnel does not work.
>
> 
> 
> Could you check if the following typo fix is sufficient for your
> reproducer?
>
> diff --git a/arch/sparc/lib/csum_copy.S b/arch/sparc/lib/csum_copy.S
> index 0c0268e77155..d839956407a7 100644
> --- a/arch/sparc/lib/csum_copy.S
> +++ b/arch/sparc/lib/csum_copy.S
> @@ -71,7 +71,7 @@
>  FUNC_NAME: /* %o0=src, %o1=dst, %o2=len */
> LOAD(prefetch, %o0 + 0x000, #n_reads)
> xor %o0, %o1, %g1
> -   mov 1, %o3
> +   mov -1, %o3
> clr %o4
> andcc   %g1, 0x3, %g0
> bne,pn  %icc, 95f


Thanks Al, this patch fixes networking for me.

Re: [sparc64] current git kernel networking is broken

2020-12-07 Thread Anatoly Pugachev

On Tue, Dec 8, 2020 at 3:09 AM Anatoly Pugachev  wrote:
> bisected kernel to the following commit:
>
> linux-2.6$ git bisect good
> fdf8bee96f9aeaac4559725c2dfae6e1bd7b7043 is the first bad commit

I forgot to add that reverting this commit fixes networking for me.

[sparc64] current git kernel networking is broken

2020-12-07 Thread Anatoly Pugachev

Hello!

Sorry for the late report, now that 5.10-rc7 is already out, but
networking is broken in the current git kernel
(git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git).
It affects my openvpn tunnel and even general internet access.

Pinging the local ethernet network (i.e. the gateway) works, but I
cannot download files from the internet, and the openvpn tunnel does
not work.

I bisected the kernel to the following commit:

linux-2.6$ git bisect good
fdf8bee96f9aeaac4559725c2dfae6e1bd7b7043 is the first bad commit
commit fdf8bee96f9aeaac4559725c2dfae6e1bd7b7043
Author: Al Viro 
Date:   Sun Jul 19 18:31:07 2020 -0400

sparc64: propagate the calling convention changes down to
__csum_partial_copy_...()

... and rename them into csum_and_copy_...() - the wrappers become
pointless.
[braino fixed]

Signed-off-by: Al Viro 

 arch/sparc/include/asm/checksum.h|  1 +
 arch/sparc/include/asm/checksum_32.h |  2 --
 arch/sparc/include/asm/checksum_64.h | 41 +++-
 arch/sparc/lib/csum_copy.S   |  5 +++--
 arch/sparc/lib/csum_copy_from_user.S |  4 ++--
 arch/sparc/lib/csum_copy_to_user.S   |  4 ++--
 6 files changed, 11 insertions(+), 46 deletions(-)



full git bisect log:

$ git bisect log
git bisect start
# bad: [0477e92881850d44910a7e94fc2c46f96faa131f] Linux 5.10-rc7
git bisect bad 0477e92881850d44910a7e94fc2c46f96faa131f
# good: [bbf5c979011a099af5dc76498918ed7df445635b] Linux 5.9
git bisect good bbf5c979011a099af5dc76498918ed7df445635b
# bad: [4d0e9df5e43dba52d38b251e3b909df8fa1110be] lib, uaccess: add
failure injection to usercopy functions
git bisect bad 4d0e9df5e43dba52d38b251e3b909df8fa1110be
# bad: [f888bdf9823c85fe945c4eb3ba353f749dec3856] Merge tag
'devicetree-for-5.10' of
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
git bisect bad f888bdf9823c85fe945c4eb3ba353f749dec3856
# bad: [57218d7f2e87069f73c7a841b6ed6c1cc7acf616] Merge tag
'regmap-v5.10' of
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
git bisect bad 57218d7f2e87069f73c7a841b6ed6c1cc7acf616
# bad: [39a5101f989e8d2be557136704d53990f9b402c8] Merge branch 'linus'
of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
git bisect bad 39a5101f989e8d2be557136704d53990f9b402c8
# good: [ed016af52ee3035b4799ebd7d53f9ae59d5782c4] Merge tag
'locking-core-2020-10-12' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good ed016af52ee3035b4799ebd7d53f9ae59d5782c4
# good: [50d228345a03c882dfe11928ab41b42458b3f922] Merge tag
'docs-5.10' of git://git.lwn.net/linux
git bisect good 50d228345a03c882dfe11928ab41b42458b3f922
# good: [0f5e8323777bfc1c1d2cba71242db6a361de03b6] crypto:
arm/sha512-neon - avoid ADRL pseudo instruction
git bisect good 0f5e8323777bfc1c1d2cba71242db6a361de03b6
# good: [c2fb644638ae45cc4a34aa51a18d687d4781f8a1] hwrng: npcm -
modify readl to readb
git bisect good c2fb644638ae45cc4a34aa51a18d687d4781f8a1
# bad: [85ed13e78dbedf9433115a62c85429922bc5035c] Merge branch
'work.iov_iter' of
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
git bisect bad 85ed13e78dbedf9433115a62c85429922bc5035c
# good: [1cd95ab85df730b140156baac92fd2640290a5e5] mips: propagate the
calling convention change down into __csum_partial_copy_..._user()
git bisect good 1cd95ab85df730b140156baac92fd2640290a5e5
# good: [598b3cec831fd6ccb3cbe4919a722e868c6364a8] fs: remove
compat_sys_vmsplice
git bisect good 598b3cec831fd6ccb3cbe4919a722e868c6364a8
# bad: [70d65cd555c5e43c613700f604a47f7ebcf7b6f1] ppc: propagate the
calling conventions change down to csum_partial_copy_generic()
git bisect bad 70d65cd555c5e43c613700f604a47f7ebcf7b6f1
# bad: [fdf8bee96f9aeaac4559725c2dfae6e1bd7b7043] sparc64: propagate
the calling convention changes down to __csum_partial_copy_...()
git bisect bad fdf8bee96f9aeaac4559725c2dfae6e1bd7b7043
# good: [2a5d2bd159f33ef34484ee14705dcf8634061f2c] xtensa: propagate
the calling conventions change down into csum_partial_copy_generic()
git bisect good 2a5d2bd159f33ef34484ee14705dcf8634061f2c
# first bad commit: [fdf8bee96f9aeaac4559725c2dfae6e1bd7b7043]
sparc64: propagate the calling convention changes down to
__csum_partial_copy_...()
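
A long log like the one above is easier to scan with a short summary. This is a minimal sketch, assuming the log was produced by `git bisect log` and is read from stdin; the function name bisect_summary is hypothetical:

```shell
#!/bin/sh
# Summarise a saved `git bisect log`: count the good/bad verdicts and
# print the commit id taken from the "# first bad commit:" line.
bisect_summary() {
    awk '
        /^git bisect good / { good++ }
        /^git bisect bad /  { bad++ }
        /^# first bad commit: \[/ {
            id = substr($5, 2, length($5) - 2)   # strip the [ ] brackets
        }
        END { printf "good=%d bad=%d first_bad=%s\n", good, bad, id }
    '
}
```

Typical use is `git bisect log | bisect_summary` while the bisect session is still active (before `git bisect reset`), or `bisect_summary < saved.log` on a saved copy.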

"No support for PMU type" or early "NMI appears to be stuck (0->0)"

2020-12-05 Thread Anatoly Pugachev

Hello!

Just sharing my current experience with an updated Solaris used as
a hypervisor for Linux LDOMs.

I use a SPARC T5-2 server as a hypervisor (Solaris 11.4 in the primary
domain) for various LDOMs, some of which run Linux (Debian sid,
unstable).

Recently I updated Solaris in the primary domain to the latest version,
and some of my Linux domains started to report the following messages
at kernel boot (full log at [1]):

$ dmesg
...
[0.401140] smp: Brought up 1 node, 8 CPUs
[0.403154] devtmpfs: initialized
[0.403758] Performance events:
[0.403771] Testing NMI watchdog ...
[0.483850] WARNING: CPU#0: NMI appears to be stuck (0->0)!
[0.483861] Please report this to bugzilla.kernel.org,
[0.483872] and attach the output of the 'dmesg' command.
[0.483885] WARNING: CPU#1: NMI appears to be stuck (0->0)!
[0.483896] Please report this to bugzilla.kernel.org,
[0.483907] and attach the output of the 'dmesg' command.
[0.483925] WARNING: CPU#2: NMI appears to be stuck (0->0)!
[0.483940] Please report this to bugzilla.kernel.org,
[0.483954] and attach the output of the 'dmesg' command.
[0.483972] WARNING: CPU#3: NMI appears to be stuck (0->0)!
[0.483986] Please report this to bugzilla.kernel.org,
[0.484001] and attach the output of the 'dmesg' command.
[0.484018] WARNING: CPU#4: NMI appears to be stuck (0->0)!
[0.484032] Please report this to bugzilla.kernel.org,
[0.484047] and attach the output of the 'dmesg' command.
[0.484064] WARNING: CPU#5: NMI appears to be stuck (0->0)!
[0.484078] Please report this to bugzilla.kernel.org,
[0.484093] and attach the output of the 'dmesg' command.
[0.484110] WARNING: CPU#6: NMI appears to be stuck (0->0)!
[0.484124] Please report this to bugzilla.kernel.org,
[0.484138] and attach the output of the 'dmesg' command.
[0.484154] WARNING: CPU#7: NMI appears to be stuck (0->0)!
[0.484169] Please report this to bugzilla.kernel.org,
[0.484183] and attach the output of the 'dmesg' command.
[0.484207] No support for PMU type 'niagara5'
[0.484409] ldc.c:v1.1 (July 22, 2008)
[0.484766] clocksource: jiffies: mask: 0x max_cycles:
0x, max_idle_ns: 764504178510 ns

versus the old behavior on the same domain:
$ journalctl -k -b -2 -o short-monotonic --no-hostname
...
[0.427406] kernel: smp: Brought up 1 node, 24 CPUs
[0.429746] kernel: devtmpfs: initialized
[0.430558] kernel: Performance events:
[0.430577] kernel: Testing NMI watchdog ...
[0.510652] kernel: OK.
[0.510669] kernel: Supported PMU type is 'niagara5'
[0.511025] kernel: ldc.c:v1.1 (July 22, 2008)
[0.511485] kernel: clocksource: jiffies: mask: 0x
max_cycles: 0x, max_idle_ns: 764504178510 ns


While checking what had changed, I found that the domains which report
"NMI appears to be stuck" differ slightly in their LDOM configuration:
they have an empty perf-counters property [2]:

$ ldm list -l ldg0 | grep perf
perf-counters=

Setting "perf-counters" to any value ("strand" or "htstrand") removes
these error messages and restores the older behaviour.

Not sure if this info will be useful to anyone, but posting anyway

Thanks.

1. https://gist.github.com/mator/19769bf36625bdd1d27cecf38591ea75
2. https://docs.oracle.com/cd/E93612_01/html/E93617/useperfcounterprops.html

PS: I didn't find perf-counters being used (declared) in the LDOM
configuration on older machines, like T3-2 or T5240.

Re: Setting up systemd throws a fit. Actually any update does the same.

2020-12-04 Thread Anatoly Pugachev

On Fri, Dec 4, 2020 at 9:10 AM Dennis Clarke  wrote:
>
>
> So a few weeks ago I installed a fresh copy from the cool installer
> images :
>
> https://lists.debian.org/debian-sparc/2020/11/msg3.html
>
> Works great but I simply can not update anything.

Dennis, can you try an older kernel version, like 5.7.x (unpack and
install it without dpkg)?

Re: mkfs.ext2 - state D partitioning stops at 33% /boot

2020-06-22 Thread Anatoly Pugachev

On Mon, Jun 22, 2020 at 6:37 PM Connor McLaughlan  wrote:
>
> Can you by any chance tell me how i could obtain a list of all PCI
> cards that are possibly supported  and might work on debian sparc?

Probably most PCI cards supported by the kernel?

Re: Network not detected

2020-03-05 Thread Anatoly Pugachev

On Thu, Mar 5, 2020 at 5:03 PM Alexandre Bencz  wrote:
>
> Inside /etc/network/interfaces
>
> # This file describes the network interfaces available on your system
> # and how to activate them. For more information, see interfaces(5).
>
> source /etc/network/interfaces.d/*
>
> # The loopback network interface
> auto lo
> iface lo inet loopback
>
> # The primary network interface
> allow-hotplug enp2s0
> iface enp2s0 inet dhcp
>
> The dmesg log:
>
> $ dmesg
> [0.001496] PROMLIB: Sun IEEE Boot Prom 'OBP 3.10.24 1999/01/01 01:01'
...
> [   11.300463] ne2k-pci.c:v1.03 9/22/2003 D. Becker/P. Gortmaker
> [   11.403802] ne2k-pci :02:00.0 eth0: RealTek RTL-8029 found at 
> 0x1fe0280, IRQ 8, 52:54:00:12:34:56.
> [   12.636822] ne2k-pci :02:00.0 enp2s0: renamed from eth0

First, dmesg shows that the driver for the network card is loaded. You
can check with "ip link show" whether enp2s0 is available.

Second, check with "ip addr show" whether your enp2s0 gets an IP address...

Re: Network not detected

2020-03-05 Thread Anatoly Pugachev

On Thu, Mar 5, 2020 at 4:09 PM Alexandre Bencz  wrote:
>
> Hi
> I just installed Debian on a QEMU virtual machine, during the installation, 
> the internet connection was detected and etc, but after starting the system, 
> the connection does not exist
>
> Ps: I'm using Windows
>
> My QEMU command line is:
> qemu-system-sparc64.exe -machine sun4u,accel=tcg,usb=off -m 2048 -nographic 
> -hda debian_sparc64.qcow2 -net nic,model=ne2k_pci -net 
> user,hostfwd=tcp::-:22 -monitor telnet::4440,server,nowait -serial 
> mon:telnet::,server,wait

Alexandre,

What is in your /etc/network/interfaces file? Or do you use systemd-networkd?
Do you have a dmesg log from this machine?
Have you tried to run "modprobe ne2k-pci" after the machine boots?
What is the "ip link show" output?

Thanks.

[sparc64] debian unstable/sid kernel update/upgrade to 5.2.x

2019-08-26 Thread Anatoly Pugachev

JFYI,

since debian unstable/sid got a kernel update from the 4.19.x series to
5.2.x, there could be some userspace / kernel issues (OOPS, hangs)
after git kernel patch 7b9afb86b6328f10dc2cad9223d7def12d60e505 [1]
[2]. So please keep your older kernel (4.19) installed along with the
newer 5.2.x (or just mark it as held with "apt-mark hold
linux-image-4.19.0-5-sparc64-smp") in case of problems with 5.2.x.

1. https://marc.info/?l=linux-sparc&m=156340081413375
2. https://marc.info/?l=linux-sparc&m=156596177731501

Re: [sparc64] nft bus error

2019-07-13 Thread Anatoly Pugachev

On Sat, Jul 13, 2019 at 10:03 PM Florian Westphal  wrote:
> Anatoly Pugachev  wrote:
> > Program received signal SIGBUS, Bus error.
> > 0xfff8000100946490 in nftnl_udata_get_u32 (attr=0x1106e30) at 
> > udata.c:127
> > 127 return *data;
>
> struct nftnl_udata {
>uint8_t type;
>uint8_t len;
>unsigned char   value[];
> } __attribute__((__packed__));
>
> Sparc doesn't like doing:
>
> uint32_t nftnl_udata_get_u32(const struct nftnl_udata *attr)
> {
> uint32_t *data = (uint32_t *)attr->value;
>
> return *data;
> }
>
> Anatoly, does this help?
>
> diff --git a/src/udata.c b/src/udata.c
> --- a/src/udata.c
> +++ b/src/udata.c
> @@ -122,9 +122,11 @@ void *nftnl_udata_get(const struct nftnl_udata *attr)
>  EXPORT_SYMBOL(nftnl_udata_get_u32);
>  uint32_t nftnl_udata_get_u32(const struct nftnl_udata *attr)
>  {
> -   uint32_t *data = (uint32_t *)attr->value;
> +   uint32_t data;
>
> -   return *data;
> +   memcpy(&data, attr->value, sizeof(data));
> +
> +   return data;
>  }
>
>  EXPORT_SYMBOL(nftnl_udata_next);

Florian,

yes, works beautifully!

Thanks!

PS: missed CC list

[sparc64] nft bus error

2019-07-13 Thread Anatoly Pugachev

Hello!

I'm getting a bus error in nft (libnftnl) on a sparc64 linux machine.

Using the git versions of libnftnl and nftables (installed under /opt/nft):
mator@ttip:/1/mator/libnftnl$ git desc
libnftnl-1.1.3-7-ga6a2d0c
mator@ttip:/1/mator/libnftnl$ cd ../nftables/
mator@ttip:/1/mator/nftables$ git desc
v0.9.1-25-g87c0bee

# which nft
/opt/nft/sbin/nft
# nft  list tables
table ip sshguard
table ip6 sshguard
(loading some rules)
# nft -f /etc/nft.rules
# nft  list tables
Bus error
(run under gdb)
# gdb -q /opt/nft/sbin/nft
Reading symbols from /opt/nft/sbin/nft...done.
(gdb) set args list tables
(gdb) run
Starting program: /opt/nft/sbin/nft list tables

Program received signal SIGBUS, Bus error.
0xfff8000100946490 in nftnl_udata_get_u32 (attr=0x1106e30) at udata.c:127
127 return *data;
(gdb) bt
#0  0xfff8000100946490 in nftnl_udata_get_u32 (attr=0x1106e30) at
udata.c:127
#1  0xfff8000100168db8 in netlink_delinearize_set (ctx=0x7feee08,
nls=0x11076e0) at netlink.c:568
#2  0xfff800010016929c in list_set_cb (nls=0x11076e0,
arg=0x7feee08) at netlink.c:647
#3  0xfff800010094083c in nftnl_set_list_foreach
(set_list=0x1107640, cb=0xfff8000100169278 ,
data=0x7feee08) at set.c:780
#4  0xfff80001001693a4 in netlink_list_sets (ctx=0x7feee08,
h=0x1107160) at netlink.c:668
#5  0xfff800010013ba90 in cache_init_objects (ctx=0x7feee08,
flags=127) at rule.c:161
#6  0xfff800010013be98 in cache_init (ctx=0x7feee08, flags=127) at
rule.c:220
#7  0xfff800010013c0b8 in cache_update (nft=0x1106a20, flags=127,
msgs=0x7fef140) at rule.c:258
#8  0xfff800010018cca4 in nft_evaluate (nft=0x1106a20,
msgs=0x7fef140, cmds=0x7fef130) at libnftables.c:406
#9  0xfff800010018cf4c in nft_run_cmd_from_buffer (nft=0x1106a20,
buf=0x1106d40 "list tables") at libnftables.c:447
#10 0x01002088 in main (argc=3, argv=0x7fef618) at main.c:316
(gdb)




# cat /etc/nft.rules
# Translated by iptables-restore-translate v1.8.3 on Sat Jul 13 10:53:36 2019
add table ip filter
add chain ip filter INPUT { type filter hook input priority 0; policy accept; }
add chain ip filter FORWARD { type filter hook forward priority 0;
policy accept; }
add chain ip filter OUTPUT { type filter hook output priority 0;
policy accept; }
# -t filter -A INPUT -p tcp --dport 22 -m set --match-set sshguard4 src -j DROP
add rule ip filter INPUT iifname "lo" counter accept
add rule ip filter INPUT ct state related,established counter accept
add rule ip filter INPUT ct state new  tcp dport 22 counter accept
add rule ip filter INPUT ip saddr 10.8.1.0/24 counter accept
add rule ip filter INPUT ip protocol icmp counter accept
add rule ip filter INPUT ip protocol udp udp dport 33434-33529 counter accept
add rule ip filter INPUT iifname "eth0" ip saddr 10.190.2.0/24 ct
state new  tcp dport 445 counter accept
add rule ip filter INPUT iifname "eth0" ip saddr 10.190.2.0/24 ct
state new  udp dport 445 counter accept
add rule ip filter INPUT iifname "eth0" ip saddr 192.168.158.0/24 ct
state new  tcp dport 445 counter accept
add rule ip filter INPUT iifname "eth0" ip saddr 192.168.158.0/24 ct
state new  udp dport 445 counter accept
add rule ip filter INPUT ip protocol tcp tcp dport { 80,443} counter accept
add rule ip filter INPUT counter drop
add table ip nat
add chain ip nat PREROUTING { type nat hook prerouting priority -100;
policy accept; }
add chain ip nat INPUT { type nat hook input priority 100; policy accept; }
add chain ip nat OUTPUT { type nat hook output priority -100; policy accept; }
add chain ip nat POSTROUTING { type nat hook postrouting priority 100;
policy accept; }
# Completed on Sat Jul 13 10:53:36 2019


machine info:

nftables$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/sparc64-linux-gnu/8/lto-wrapper
Target: sparc64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian
8.3.0-7' --with-bugurl=file:///usr/share/doc/gcc-8/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --prefix=/usr
--with-gcc-major-version-only --program-suffix=-8
--program-prefix=sparc64-linux-gnu- --enable-shared
--enable-linker-build-id --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix --libdir=/usr/lib
--enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-libquadmath
--disable-libquadmath-support --enable-plugin --enable-default-pie
--with-system-zlib --disable-libphobos --enable-objc-gc=auto
--enable-multiarch --disable-werror --with-cpu-32=ultrasparc
--enable-targets=all --with-long-double-128 --enable-multilib
--enable-checking=release --build=sparc64-linux-gnu
--host=sparc64-linux-gnu --target=sparc64-linux-gnu
Thread model: posix
gcc version 8.3.0 (Debian 8.3.0-7)

nftables$ ld -V
GNU ld (GNU Binutils for Debian) 2.32.51.20190707
  Supported emulations:
   elf64_sparc
   elf32_sparc

# ldconfig -V
ldconfig (Debian GLIBC

[sparc64] possible circular locking / deadlock

2019-06-17 Thread Anatoly Pugachev

Hello!

I'm getting the following trace from the git kernel on boot, with
rc.local containing:

ipset create sshguard4 hash:net
iptables -A INPUT -p tcp --dport 22 -m set --match-set sshguard4 src -j DROP

current git kernel:

$ uname -a
Linux ttip 5.2.0-rc5 #981 SMP Mon Jun 17 09:52:04 MSK 2019 sparc64 GNU/Linux
linux-2.6$ git desc
v5.2-rc5


$ dmesg

[   10.356388] Adding 787176k swap on /dev/vdiska4.  Priority:-2
extents:1 across:787176k FS
[   10.471900] EXT4-fs (vdiska1): mounting ext3 file system using the
ext4 subsystem
[   10.487226] EXT4-fs (vdiska1): mounted filesystem with ordered data
mode. Opts: (null)
[   11.158102] random: crng init done
[   11.158155] random: 7 urandom warning(s) missed due to ratelimiting

[   11.697866] ==
[   11.697875] WARNING: possible circular locking dependency detected
[   11.697886] 5.2.0-rc5 #981 Not tainted
[   11.697894] --
[   11.697902] iptables/732 is trying to acquire lock:
[   11.697913] 4f61aa56 ([i].mutex){+.+.}, at:
nfnl_lock+0x24/0x40 [nfnetlink]
[   11.697937]
   but task is already holding lock:
[   11.697946] 0d652829 (>nft.commit_mutex){+.+.}, at:
nf_tables_valid_genid+0x18/0x60 [nf_tables]
[   11.697973]
   which lock already depends on the new lock.

[   11.697983]
   the existing dependency chain (in reverse order) is:
[   11.697992]
   -> #1 (>nft.commit_mutex){+.+.}:
[   11.698012]__mutex_lock+0x48/0x920
[   11.698021]mutex_lock_nested+0x1c/0x40
[   11.698033]nf_tables_valid_genid+0x18/0x60 [nf_tables]
[   11.698043]nfnetlink_rcv_batch+0x24c/0x620 [nfnetlink]
[   11.698053]nfnetlink_rcv+0x110/0x140 [nfnetlink]
[   11.698067]netlink_unicast+0x12c/0x1e0
[   11.698076]netlink_sendmsg+0x324/0x360
[   11.698091]sock_sendmsg+0x34/0x80
[   11.698099]___sys_sendmsg+0x228/0x240
[   11.698108]__sys_sendmsg+0x4c/0x80
[   11.698116]sys_sendmsg+0x18/0x40
[   11.698131]linux_sparc_syscall+0x34/0x44
[   11.698138]
   -> #0 ([i].mutex){+.+.}:
[   11.698157]lock_acquire+0x1a4/0x1c0
[   11.698165]__mutex_lock+0x48/0x920
[   11.698173]mutex_lock_nested+0x1c/0x40
[   11.698181]nfnl_lock+0x24/0x40 [nfnetlink]
[   11.698196]ip_set_nfnl_get_byindex+0x19c/0x280 [ip_set]
[   11.698207]set_match_v1_checkentry+0x14/0xc0 [xt_set]
[   11.698222]xt_check_match+0x238/0x260 [x_tables]
[   11.698234]__nft_match_init+0x160/0x180 [nft_compat]
[   11.698244]nft_match_init+0x18/0x40 [nft_compat]
[   11.698256]nf_tables_newrule+0x57c/0x7a0 [nf_tables]
[   11.698266]nfnetlink_rcv_batch+0x3f8/0x620 [nfnetlink]
[   11.698275]nfnetlink_rcv+0x110/0x140 [nfnetlink]
[   11.698284]netlink_unicast+0x12c/0x1e0
[   11.698292]netlink_sendmsg+0x324/0x360
[   11.698300]sock_sendmsg+0x34/0x80
[   11.698309]___sys_sendmsg+0x228/0x240
[   11.698317]__sys_sendmsg+0x4c/0x80
[   11.698325]sys_sendmsg+0x18/0x40
[   11.698334]linux_sparc_syscall+0x34/0x44
[   11.698340]
   other info that might help us debug this:

[   11.698351]  Possible unsafe locking scenario:

[   11.698359]CPU0CPU1
[   11.698366]
[   11.698372]   lock(>nft.commit_mutex);
[   11.698381]lock([i].mutex);
[   11.698390]lock(>nft.commit_mutex);
[   11.698400]   lock([i].mutex);
[   11.698408]
*** DEADLOCK ***

[   11.698418] 1 lock held by iptables/732:
[   11.698424]  #0: 0d652829 (>nft.commit_mutex){+.+.},
at: nf_tables_valid_genid+0x18/0x60 [nf_tables]
[   11.698444]
   stack backtrace:
[   11.698454] CPU: 6 PID: 732 Comm: iptables Not tainted 5.2.0-rc5 #981
[   11.698463] Call Trace:
[   11.698471]  [004cfde0] print_circular_bug+0x2e0/0x320
[   11.698480]  [004d4bd8] __lock_acquire+0x1d38/0x2900
[   11.698489]  [004d6084] lock_acquire+0x1a4/0x1c0
[   11.698498]  [00a06508] __mutex_lock+0x48/0x920
[   11.698506]  [00a06dfc] mutex_lock_nested+0x1c/0x40
[   11.698516]  [1071c024] nfnl_lock+0x24/0x40 [nfnetlink]
[   11.698527]  [107568dc] ip_set_nfnl_get_byindex+0x19c/0x280 [ip_set]
[   11.698537]  [1078e5d4] set_match_v1_checkentry+0x14/0xc0 [xt_set]
[   11.698549]  [10310ed8] xt_check_match+0x238/0x260 [x_tables]
[   11.698559]  [1077cc00] __nft_match_init+0x160/0x180 [nft_compat]
[   11.698569]  [1077ccb8] nft_match_init+0x18/0x40 [nft_compat]
[   11.698582]  [10731c3c] nf_tables_newrule+0x57c/0x7a0 [nf_tables]
[   11.698592]  [1071d238] nfnetlink_rcv_batch+0x3f8/0x620 [nfnetlink]
[   11.698602]  [1071d570] nfnetlink_rcv+0x110/0x140 [nfnetlink]
[

Re: Fwd: Fail to boot Fujitsu M10-4

2019-06-04 Thread Anatoly Pugachev

On Tue, Jun 4, 2019 at 12:42 PM Sonnie Hook  wrote:
>
> I inserted a USB disk on M10-4 and partitioned it with grub bios in rescue 
> mode. The installer automatically installed GRUB to /dev/sda. I chroot to 
> /target and executed
>
> # grub-install --force-extra-removable "/dev/sdb"
>
> Installing for sparc64-ieee1275 platform.
>
> Installation finished. No error reported.
>
>
> Then something different happened:
>
> {0} ok boot /pci@8000/pci@4/pci@0/pci@1/pci@0/usb@4,1/disk@0,0:a
>
> Boot device: /pci@8000/pci@4/pci@0/pci@1/pci@0/usb@4,1/disk@0,0:a  File and 
> args:
>
> GRUB Loading kernel.
>
> vitual-device not found.
>
> ERROR: Last Trap: Fast Data Access MMU Miss


JFYI

The last time I saw this error, on a 220R (about 10 years ago), the
only way to boot linux on it was a hard power reset.

Re: Updated installation images for Debian Ports 2019-05-09

2019-05-29 Thread Anatoly Pugachev

On Mon, May 13, 2019 at 3:51 AM Gregor Riepl  wrote:
> - Using the default partitioning scheme, the installer configures a /boot
> partition that is only 100MB (on a 120GB ATA disk). This is too small to hold
> more than one kernel plus initrd, apparently - I had trouble upgrading the
> kernel package at one point.

https://salsa.debian.org/installer-team/partman-auto/merge_requests/5

Re: installing current ports iso

2018-05-29 Thread Anatoly Pugachev

On Tue, May 29, 2018 at 5:53 PM, John Paul Adrian Glaubitz
 wrote:
> Hi Anatoly!
>
> I’m pretty sure you used expert mode which has triggered this problem in the 
> past.

Yes, I used expert mode at first. I rebooted and tried again without
choosing anything fancy, and successfully installed debian sid on a T5120.

Thanks.

installing current ports iso

2018-05-29 Thread Anatoly Pugachev

Hello!

JFYI

Tried to install current ports iso taken from

http://cdimage.debian.org/cdimage/ports/current/sparc64/iso-cd/debian-10.0-sparc64-NETINST-1.iso

and choosing

host: ftp.ports.debian.org
directory: /debian-ports

gives me the following error:

May 29 14:25:58 main-menu[5143]: INFO: Menu item 'choose-mirror' selected
May 29 14:25:58 anna-install: Installing apt-mirror-setup
May 29 14:26:13 choose-mirror[5177]: DEBUG: command: wget --no-verbose
http://ftp.ports.debian.org/debian-ports/dists/buster/Release -O - |
grep -E '^(Suite|Codename|Architectures):'
May 29 14:26:13 choose-mirror[5177]: WARNING **: mirror does not
support the specified release (buster)


and indeed, http://ftp.ports.debian.org/debian-ports/dists/ does not
have a buster subdirectory.

[sparc64] number of processors in a LDOM

2018-04-04 Thread Anatoly Pugachev

Hello!

Can someone tell me, or suggest, why getconf returns the total CPU count
available to the physical machine, and not the processor/vcpu count
allocated to the LDOM?

ttip$ getconf -a | grep PROCESSORS
_NPROCESSORS_CONF  256
_NPROCESSORS_ONLN  16

I believe nproc (from coreutils) uses getconf as well:

ttip$ nproc --all
256

ttip$ nproc
16

But this LDOM is defined as follows (16 vcpus allocated):

# ldm list
NAME  STATE   FLAGS  CONS  VCPU  MEMORY  UTIL  NORM  UPTIME
ttip  active  -n     5004  16    32G     0.0%  0.0%  3h 1m


For comparison, a Power Systems (ppc64) LPAR reports only the CPU count
allocated to the LPAR (not the CPU/core count of the physical machine).

I'm raising this issue because some userspace tools use nproc, for
example to run a parallel make. And starting with kernel 4.15 (but not
on 4.14), overcommitted CPU usage (for example, running make -j256 on an
LDOM with 16 vcpus allocated) leads to the following (reproducible):

Message from syslogd@ttip at Apr  3 14:53:15 ...
 kernel:[  942.850499] BUG: workqueue lockup - pool cpus=8 node=0
flags=0x0 nice=0 stuck for 36s!
Apr 03 14:53:15 ttip kernel: BUG: workqueue lockup - pool cpus=8
node=0 flags=0x0 nice=0 stuck for 36s!
Apr 03 14:53:15 ttip kernel: Showing busy workqueues and worker pools:
Apr 03 14:53:15 ttip kernel: workqueue mm_percpu_wq: flags=0x8
Apr 03 14:53:15 ttip kernel:   pwq 16: cpus=8 node=0 flags=0x0 nice=0
active=1/256
Apr 03 14:53:15 ttip kernel: pending: vmstat_update
Apr 03 14:53:15 ttip kernel: workqueue xfs-sync/dm-0: flags=0x4
Apr 03 14:53:15 ttip kernel:   pwq 0: cpus=0 node=0 flags=0x0 nice=0
active=1/256
Apr 03 14:53:15 ttip kernel: pending: xfs_log_worker [xfs]
^C
Message from syslogd@ttip at Apr  3 14:53:45 ...
 kernel:[  972.929725] BUG: workqueue lockup - pool cpus=8 node=0
flags=0x0 nice=0 stuck for 66s!

Message from syslogd@ttip at Apr  3 14:54:15 ...
 kernel:[ 1003.008979] BUG: workqueue lockup - pool cpus=8 node=0
flags=0x0 nice=0 stuck for 96s!

Message from syslogd@ttip at Apr  3 14:54:46 ...
 kernel:[ 1033.088189] BUG: workqueue lockup - pool cpus=8 node=0
flags=0x0 nice=0 stuck for 126s!

Message from syslogd@ttip at Apr  3 14:55:16 ...
 kernel:[ 1063.166574] BUG: workqueue lockup - pool cpus=8 node=0
flags=0x0 nice=0 stuck for 156s!

Message from syslogd@ttip at Apr  3 14:55:46 ...
 kernel:[ 1093.244982] BUG: workqueue lockup - pool cpus=8 node=0
flags=0x0 nice=0 stuck for 186s!

These messages occasionally lead to the machine/LDOM becoming unstable,
i.e. some processes lock up.

filtered dmesg output:

ttip$ dmesg  | egrep -i "cpu|smp"
[0.73] Linux version 4.16.0-05456-g17dec0a94915 (mator@ttip)
(gcc version 7.3.0 (Debian 7.3.0-14)) #659 SMP Wed Apr 4 12:16:32 MSK
2018
[0.037199] PLATFORM: max-cpus [1024]
[0.194415] CPU CAPS: [flush,stbar,swap,muldiv,v9,blkinit,n2,mul32]
[0.194525] CPU CAPS: [div32,v8plus,popc,vis,vis2,ASIBlkInit,fmaf,vis3]
[0.194630] CPU CAPS: [hpc,ima,pause,cbcond,aes,des,kasumi,camellia]
[0.194731] CPU CAPS: [md5,sha1,sha256,sha512,mpmul,montmul,montsqr,crc32c]
[0.237948] percpu: Embedded 12 pages/cpu @(ptrval) s56584
r8192 d33528 u131072
[0.238199] pcpu-alloc: s56584 r8192 d33528 u131072 alloc=1*4194304
[0.238209] pcpu-alloc: [0] 000 001 002 003 004 005 006 007 008 009
010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026
027 028 029
030 031
[0.238363] pcpu-alloc: [0] 032 033 034 035 036 037 038 039 040 041
042 043 044 045 046 047 048 049 050 051 052 053 054 055 056 057 058
059 060 061
062 063
[0.238515] pcpu-alloc: [0] 064 065 066 067 068 069 070 071 072 073
074 075 076 077 078 079 080 081 082 083 084 085 086 087 088 089 090
091 092 093
094 095
[0.238668] pcpu-alloc: [0] 096 097 098 099 100 101 102 103 104 105
106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122
123 124 125
126 127
[0.238820] pcpu-alloc: [0] 128 129 130 131 132 133 134 135 136 137
138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154
155 156 157
158 159
[0.238973] pcpu-alloc: [0] 160 161 162 163 164 165 166 167 168 169
170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186
187 188 189
190 191
[0.239125] pcpu-alloc: [0] 192 193 194 195 196 197 198 199 200 201
202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218
219 220 221
222 223
[0.239278] pcpu-alloc: [0] 224 225 226 227 228 229 230 231 232 233
234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250
251 252 253
254 255
[0.239873] SUN4V: Mondo queue sizes [cpu(131072) dev(16384) r(8192) nr(256)]
[0.242373] log_buf_len individual max cpu contribution: 4096 bytes
[0.242438] log_buf_len total cpu_extra contributions: 1044480 bytes
[0.516414] smp: Bringing up secondary CPUs ...
[0.548006] smp: Brought up 1 node, 16 CPUs

Re: building debug version of klibc

2017-12-30 Thread Anatoly Pugachev

On Sat, Dec 30, 2017 at 3:14 PM,  <valdis.kletni...@vt.edu> wrote:
> On Sat, 30 Dec 2017 15:05:46 +0300, Anatoly Pugachev said:
>> On Sat, Dec 30, 2017 at 3:00 PM,  <valdis.kletni...@vt.edu> wrote:
>> > On Sat, 30 Dec 2017 13:54:05 +0300, Anatoly Pugachev said:
>> >> Hello!
>> >>
>> >> Can someone please help me in building debug version of klibc ?
>> >>
>> >> I've cloned git://git.kernel.org/pub/scm/libs/klibc/klibc.git  , but
>> >> failed to build it with debug info
>> >>
>> >> added "-g" to HOSTCFLAGS in Makefile, but
>
>> it's usual git kernel compile and install. And it's the first time I
>> started to get segfault from fstype.
>
> I missed where you went from klibc to building a new kernel, probably
> because you changed topics in mid-email.  Why were you building a new
> kernel for your host system just to get a klibc that had -g?

I'm testing the upstream git kernel on my hardware (sparc64 / ppc64 /
ia64).
This time, after installing the latest git kernel, I get a SIGSEGV on
sparc64 from a klibc utility (fstype), and I posted about it to the
mailing list.
To report it properly with a gdb backtrace, I needed to build klibc
(fstype) with debug info.
I'm not able to fix it myself, so hopefully someone else can look into
the issue (or at least I'm raising awareness for others who may hit it
as well).
It could be a compiler issue as well, since the files in the klibc.git
repo have not changed for about a year.
I was asking for help with building a debug version of klibc, and since
I was able to build it, we can close this issue for now (I'm going to
discuss it with the debian distribution maintainers first).
Thanks!

Re: building debug version of klibc

2017-12-30 Thread Anatoly Pugachev

On Sat, Dec 30, 2017 at 3:00 PM,  <valdis.kletni...@vt.edu> wrote:
> On Sat, 30 Dec 2017 13:54:05 +0300, Anatoly Pugachev said:
>> Hello!
>>
>> Can someone please help me in building debug version of klibc ?
>>
>> I've cloned git://git.kernel.org/pub/scm/libs/klibc/klibc.git  , but
>> failed to build it with debug info
>>
>> added "-g" to HOSTCFLAGS in Makefile, but
>
> Hint:  HOSTCFLAGS is applied to code that needs to run on the machine that's
> doing the build, not the target code.  So for instance, if I'm 
> cross-compiling on
> an x86_64   for an ARM target (which I do quite a bit, building Lede 
> router images
> for my wireless), HOSTCFLAGS is applied to any x86_64 utility code that gets
> built.  I don't know what code is in klibc, but an example in the kernel 
> source
> tree would be 'objtool' - that runs on the host system during the build, not 
> at runtime.
>
>> I started to get segfault in fstype:
>
>> linux-2.6$ make install
> ...
>>  DEPMOD  4.15.0-rc5-00149-g5aa90a845892
>> sh ./arch/sparc/boot/install.sh 4.15.0-rc5-00149-g5aa90a845892 
>> arch/sparc/boot/zImage \
>>System.map "/boot"
>
> What directory did you do that in?  It looks like you're trying to install a 
> whole
> new kernel image, not a new initramfs that has an updated klibc on it.

Valdis,

it's the usual git kernel compile and install. And it's the first time
I've got a segfault from fstype.

The git kernel is from
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

The local directory where I build it is ~/linux-2.6, with my own config.
This has worked fine more than 100 times.
So, I'm running "make modules_install && make install" in this directory,
which copies vmlinuz, System.map and the kernel config file to /boot,
generates the initrd there and updates the boot loader (grub2) config
file.

Re: building debug version of klibc

2017-12-30 Thread Anatoly Pugachev

On Sat, Dec 30, 2017 at 2:30 PM, Sam Ravnborg <s...@ravnborg.org> wrote:
> Hi Anatoly
>
> On Sat, Dec 30, 2017 at 01:54:05PM +0300, Anatoly Pugachev wrote:
>> Hello!
>>
>> Can someone please help me in building debug version of klibc ?
>>
>> I've cloned git://git.kernel.org/pub/scm/libs/klibc/klibc.git  , but
>> failed to build it with debug info
>>
>> added "-g" to HOSTCFLAGS in Makefile, but
>
> HOSTCFLAGS is used when building tools running on your build machine.
>
> Try something like this (untested, whitespace damaged):
>
> diff --git a/scripts/Kbuild.klibc b/scripts/Kbuild.klibc
> index f500d535..3e8124f7 100644
> --- a/scripts/Kbuild.klibc
> +++ b/scripts/Kbuild.klibc
> @@ -69,7 +69,7 @@ include $(srctree)/scripts/Kbuild.include
>  KLIBCREQFLAGS := $(call cc-option, -fno-stack-protector, ) \
>   $(call cc-option, -fwrapv, )
>  KLIBCARCHREQFLAGS :=
> -KLIBCOPTFLAGS :=
> +KLIBCOPTFLAGS := -g
>  KLIBCWARNFLAGS:= -W -Wall -Wno-sign-compare -Wno-unused-parameter
>  KLIBCSHAREDFLAGS  :=
>  KLIBCBITSIZE  :=
>
> If you use make V=1 then you should be able to see the
> full gcc command line, where -g should be included with
> the above fix.


Sam, thanks!
I did notice that later as well, but I've changed KLIBCCFLAGS to
include "-g" and changed strip to echo:

mator@ttip:~/klibc$ git diff
diff --git a/scripts/Kbuild.klibc b/scripts/Kbuild.klibc
index f500d535..40cbfd60 100644
--- a/scripts/Kbuild.klibc
+++ b/scripts/Kbuild.klibc
@@ -74,7 +74,7 @@ KLIBCWARNFLAGS:= -W -Wall -Wno-sign-compare -Wno-unused-parameter
 KLIBCSHAREDFLAGS  :=
 KLIBCBITSIZE  :=
 KLIBCLDFLAGS  :=
-KLIBCCFLAGS   :=
+KLIBCCFLAGS   := -g

 # Defaults for arch to override
 KLIBCARCHINCFLAGS = -I$(KLIBCKERNELOBJ)/arch/$(KLIBCARCH)/include
@@ -99,7 +99,7 @@ KLIBCAR  := $(AR)
 klibc-ar = $(KLIBCAR) $(if $(KBUILD_REPRODUCIBLE),$(2),$(1))

 KLIBCRANLIB  := $(call klibc-ar,s,Ds)
-KLIBCSTRIP   := $(STRIP)
+KLIBCSTRIP   := echo
 KLIBCNM  := $(NM)
 KLIBCOBJCOPY := $(OBJCOPY)
 KLIBCOBJDUMP := $(OBJDUMP)
@@ -126,7 +126,7 @@ KLIBCCPPFLAGS+= $(KLIBCDEFS)
 KLIBCCFLAGS  += $(KLIBCCPPFLAGS) $(KLIBCREQFLAGS) $(KLIBCARCHREQFLAGS)  \
 $(KLIBCOPTFLAGS) $(KLIBCWARNFLAGS)
 KLIBCAFLAGS  += -D__ASSEMBLY__ $(KLIBCCFLAGS)
-KLIBCSTRIPFLAGS  += --strip-all -R .comment -R .note
+#KLIBCSTRIPFLAGS  += --strip-all -R .comment -R .note

 KLIBCLIBGCC_DEF  := $(shell $(KLIBCCC) $(KLIBCCFLAGS) --print-libgcc)
 KLIBCLIBGCC ?= $(KLIBCLIBGCC_DEF)
mator@ttip:~/klibc$

This helped me produce an executable with debug info and get a stack trace:

(gdb) file ./usr/kinit/fstype/static/fstype
Reading symbols from ./usr/kinit/fstype/static/fstype...done.
(gdb) run
Starting program: /home/mator/klibc/usr/kinit/fstype/static/fstype

Program received signal SIGSEGV, Segmentation fault.
__syscall_common () at usr/klibc/arch/sparc64/syscall.S:15
15  st  %o0,[%g4]
(gdb) bt
#0  __syscall_common () at usr/klibc/arch/sparc64/syscall.S:15
#1  0x001010d4 in identify_fs ()
#2  0x001001f0 in main ()
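
A quick way to confirm that a rebuilt binary kept its symbols is to look at the tail of the `file` output, as in the listings in this thread. The sketch below is only an illustration: `strip_state` is a hypothetical helper name (not part of klibc), and it does nothing more than classify one line of `file` output.

```shell
#!/bin/sh
# Hypothetical helper, not part of klibc: classify one line of `file` output.
# Prints "stripped", "not stripped", or "unknown".
strip_state() {
    case "$1" in
        *"not stripped"*) echo "not stripped" ;;   # must be checked first
        *stripped*)       echo "stripped" ;;
        *)                echo "unknown" ;;
    esac
}

# Example use against the rebuilt static fstype (path as used in this thread):
#   strip_state "$(file ./usr/kinit/fstype/static/fstype)"
```

The "not stripped" pattern has to come before the plain "stripped" one, since the second pattern would match both strings.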


PS: added sparclinux@vger , first thread message is
http://www.zytor.com/pipermail/klibc/2017-December/003962.html

building debug version of klibc

2017-12-30 Thread Anatoly Pugachev

Hello!

Can someone please help me in building debug version of klibc ?

I've cloned git://git.kernel.org/pub/scm/libs/klibc/klibc.git  , but
failed to build it with debug info

added "-g" to HOSTCFLAGS in Makefile, but

$ make -j KLIBCKERNELSRC=`pwd`/../linux-2.6/usr

still strips every debug symbol, and I failed to change
scripts/Kbuild.klibc and the Makefile to remove the strip usage

klibc$ find . -name fstype | xargs file
./usr/kinit/fstype:   directory
./usr/kinit/fstype/static/fstype: ELF 64-bit MSB executable, SPARC V9,
relaxed memory ordering, version 1 (SYSV), statically linked, stripped
./usr/kinit/fstype/shared/fstype: ELF 64-bit MSB executable, SPARC V9,
relaxed memory ordering, version 1 (SYSV), statically linked,
interpreter /lib/klibc-M67ne2AU3wnuYln_9h2L1vfH5J0.so, stripped



I started to get segfault in fstype:

linux-2.6$ make install
...
  DEPMOD  4.15.0-rc5-00149-g5aa90a845892
sh ./arch/sparc/boot/install.sh 4.15.0-rc5-00149-g5aa90a845892
arch/sparc/boot/zImage \
System.map "/boot"
run-parts: executing /etc/kernel/postinst.d/apt-auto-removal
4.15.0-rc5-00149-g5aa90a845892
/boot/vmlinuz-4.15.0-rc5-00149-g5aa90a845892
run-parts: executing /etc/kernel/postinst.d/initramfs-tools
4.15.0-rc5-00149-g5aa90a845892
/boot/vmlinuz-4.15.0-rc5-00149-g5aa90a845892
update-initramfs: Generating /boot/initrd.img-4.15.0-rc5-00149-g5aa90a845892
Segmentation fault
run-parts: executing /etc/kernel/postinst.d/zz-update-grub
4.15.0-rc5-00149-g5aa90a845892
/boot/vmlinuz-4.15.0-rc5-00149-g5aa90a845892

Dec 30 12:51:06 ttip kernel: fstype[162686]: segfault at 38 ip
8001069c (rpc 80004820) sp 07feffdf53a1 error 1 in
klibc-g_9mplOvk_73CeIA8YN-t9vhxyc.so[8000+14000]

linux-2.6$ gdb -q -c core.164896
[New LWP 164896]
Core was generated by `/usr/lib/klibc/bin/fstype /dev/vdiska2'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x8001069c in ?? ()

linux-2.6$ find /usr -name fstype | xargs file
/usr/lib/klibc/bin/fstype: ELF 64-bit MSB executable, SPARC V9,
relaxed memory ordering, version 1 (SYSV), statically linked,
interpreter /lib/klibc-g_9mplOvk_73CeIA8YN-t9vhxyc.so, stripped

linux-2.6# file -s /dev/vdiska2
/dev/vdiska2: Linux rev 1.0 ext4 filesystem data,
UUID=f2eda779-5310-4af2-b48a-b43db51c0961 (needs journal recovery)
(extents) (64bit) (large files) (huge files)

$ dpkg -S /usr/lib/klibc/bin/fstype
klibc-utils: /usr/lib/klibc/bin/fstype

$ dpkg -l klibc-utils
||/ Name         Version   Architecture  Description
+++-============-=========-=============-===============================================
ii  klibc-utils  2.0.4-10  sparc64       small utilities built with klibc for early boot


$ apt show klibc-utils
Package: klibc-utils
Version: 2.0.4-10
Priority: optional
Section: libs
Source: klibc
Maintainer: maximilian attems 
Installed-Size: 522 kB
Depends: libklibc (= 2.0.4-10)
Breaks: initramfs-tools (<< 0.123~)
Homepage: https://git.kernel.org/cgit/libs/klibc/klibc.git
Download-Size: 107 kB
APT-Manual-Installed: no
APT-Sources: http://ftp.ports.debian.org/debian-ports unstable/main
sparc64 Packages
Description: small utilities built with klibc for early boot
 This package contains a collection of programs that are linked
 against klibc. These duplicate some of the functionality of a
 regular Linux toolset, but are typically much smaller than their
 full-function counterparts.  They are intended for inclusion in
 initramfs images and embedded systems.

Re: grub2 with SPARC support available for testing

2017-11-08 Thread Anatoly Pugachev

On Thu, Feb 9, 2017 at 9:44 PM, Eric Snowberg  wrote:
>> On Feb 8, 2017, at 6:23 AM, John Paul Adrian Glaubitz 
>>  wrote:
>> On 01/23/2017 12:40 AM, John Paul Adrian Glaubitz wrote:
>>> I just uploaded grub2_2.02~beta3-3+sparc64 to Debian "unreleased" which
>>> contains an additional set of 15 patches by Eric Snowberg (CC'ed) which
>>> improve SPARC in grub2 and add support for modern SPARC hardware through
>>> the SPARC T7.
>>
>> I uploaded grub2_2.02~beta3-4+sparc64.1 to "unreleased" today which is
>> based on Eric's latest sparc-next-v2 branch [1]. This new version fixes
>> the build problems on non-sparc* targets. I'm not sure if it addresses
>> other issues, but Eric can maybe comment on this.
>
> I put together a wiki to help document the installation process:
>
> https://github.com/esnowberg/grub2-sparc/wiki

JFYI

I installed another LDOM yesterday; ports has version 2.02-2+sparc64.1
of the grub2 package. I used it to install grub (still on a disk with a sun
partition table) and was unable to boot, with this error message:

# grub-install --skip-fs-probe --force /dev/vdiska
...
# reboot

{0} ok boot disk
NOTICE: Entering OpenBoot.
NOTICE: Fetching Guest MD from HV.
NOTICE: Starting additional cpus.
NOTICE: Initializing LDC services.
NOTICE: Probing PCI devices.
NOTICE: Finished PCI probing.

SPARC T5-2, No Keyboard
Copyright (c) 1998, 2017, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.38.8, 16. GB memory available, Serial #83424764.
Ethernet address 0:14:4f:f9:c:36, Host ID: 84f8f5fc.



Boot device: /virtual-devices@100/channel-devices@200/disk@0  File and args:
WARNING: Unsupported bootblk image, can not extract fcode

WARNING: Bootblk fcode extraction failed
GRUB Loading kernel

ERROR: Last Trap: Fast Data Access MMU Miss

I booted from cdrom into rescue mode, downloaded an old version of the grub2
package from [1], and installed grub with grub-install. After that I was able
to reboot into the OS.

1. http://snapshot.debian.org/archive/debian-ports/20170401T004207Z/pool-sparc64/main/g/grub2/

Re: Updated installer images

2017-09-10 Thread Anatoly Pugachev

On Sun, Sep 10, 2017 at 10:12 PM, Christoph Biedl
 wrote:
> John Paul Adrian Glaubitz wrote...
>
>> Please test and report back on the individual architecture
>> mailing lists
>
> So far, the ride for ppc64 has been *extremely* painful. This is not
> necessarily due to your efforts, but it feels a lot like nobody ever has
> tried to set up Debian on a G5 using netboot.
>
> So far (might be incomplete, and I'm both tired and fairly upset):
>
> * Any reasonable documentation on this anywhere? Not about how to set up
>   a DHCP/TFTP server, I've done this many times. But what about which files
>   are needed, and how to provide a netboot-adjusted yaboot.conf, and
>   mostly: How to make yaboot use it?
>
> * The OF bootloader needs two rounds to load yaboot.
>
> * yaboot should either get a decent on-line help or see bitrot.
>
> * yaboot's "conf /path/to/config" command, when initially using netboot,
>   happily ignores the file name but retrieves 01-xx-yy-xx-yy-xx-yy
>   using TFTP instead.
>
> * After a lot of trickery, the installer's vmlinux now gets loaded. At a
>   whopping 6 kbyte/sec (yes: six kilobytes). Just to remind you, kernel
>   and initrd take some 35 megabytes, and the G5 has already turned to
>   airplane mode. My neighbors will love me.
>
> This isn't getting anywhere useful soon.

I was able to netboot-install a sparc64 LDOM (but not with the latest sid
version, which is too big for OBP to load. There's also Debian bug #645657,
but somehow it got closed).

I also installed a ppc64 LPAR on a Power8 server; yaboot failed to install,
so I'm using grub2. That installation wasn't netboot, though, but the usual
iso/cdrom install.

I could probably try installing a test ppc64 LPAR with netboot just to
check how it goes, but I need to know where to get the netboot image,
since https://cdimage.debian.org/cdimage/ports/ does not have netboot
images.

Thanks

Re: qlogic fibrechannel

2017-09-07 Thread Anatoly Pugachev

On Thu, Sep 7, 2017 at 3:17 PM, Rick Leir  wrote:
> Hi all
> I am still hoping to get my Sun 2000 (vintage 2002) running with Debian. The
> problem was the Qlogic fibrechannel disk driver, so perhaps I can add in a
> different disk controller temporarily. What economical solution would you
> recommend? Perhaps a USB disk?

Solaris 10 and Solaris 11 list the Sun Blade 2000 as a supported system, so
you can choose a disk controller for your disks using the Solaris 10 /
Solaris 11 HCL [1] query page for "disk controller" [2], and then check
whether your chosen controller is supported by Linux kernel 4.12 (the latest
Debian unstable / sid kernel). It should be quite cheap on an "IT flea
market" or on eBay...

1. http://www.oracle.com/webfolder/technetwork/hcl/data/sol/index.html
2. http://www.oracle.com/webfolder/technetwork/hcl/data/sol/components/views/disk_controller_all_results.page1.html

Re: non-free firmware

2017-03-21 Thread Anatoly Pugachev

On Tue, Mar 21, 2017 at 11:25 AM, Rick Leir  wrote:
>
> On 2017-03-21 03:38 AM, Frans van Berckel wrote:
> Maybe I should hook up a SCSI disk, but what I really want is an SSD. This
> old post suggests you can use a  SATA->SCSI bridge:
> https://lists.freebsd.org/pipermail/freebsd-sparc64/2011-October/008055.html

I'm not sure whether it will work, but the Solaris 10 HCL [1] does list
the Sun Blade 2000 as supported, and looking at the supported "Disk
controller" devices, it could probably boot from a supported PCI disk
controller.
A quick look through the "disk controller" PCI device list [2] shows, for
example, the Adaptec ASH-1233 [3], which works with Linux [4] and costs
around $15 on eBay. There's also a bunch of PATA (IDE) SSDs on eBay.

This is just a suggestion; as I said, I'm not sure whether it will actually
work (be able to boot), but the budget for such a test would be under
$100 ($15 for the controller and some $30-$50 for a PATA SSD).

1. http://www.oracle.com/webfolder/technetwork/hcl/data/sol/index.html
2. http://www.oracle.com/webfolder/technetwork/hcl/data/sol/components/views/disk_controller_all_results.page1.html
3. http://www.oracle.com/webfolder/technetwork/hcl/data/components/details/adaptec/sol_10_03_05/999.html
4. https://www.redhat.com/archives/fedora-list/2004-August/msg03275.html

Re: Retrieving disk info from sunvdc using udevadm

2017-03-16 Thread Anatoly Pugachev

On Wed, Mar 15, 2017 at 1:14 AM, John Paul Adrian Glaubitz
 wrote:
> Hi!
>
> Some Debian users who are installing Debian for sparc64 in an LDOM run into
> the problem that the debian-installer is unable to detect the installation
> medium.
>
> Digging through the sources of the responsible debian-installer module, it
> turns out that d-i uses "udevadm info" to query information from all available
> block devices listed in /sys/block. To detect a CD-ROM, it looks for
> "ID_CDROM=1" or "ID_TYPE=cd".
>
> Unfortunately, this fails with sunvdc with a virtual CD-ROM drive as the data
> retrieved by "udevadm info" is very limited as compared to a standard PC with
> a physical CD-ROM drive.
>
> For comparison, on a SPARC T5, I get:
>
> /sys/block # udevadm info -q env -p /sys/block/vdiskd
> DEVLINKS=/dev/disk/by-uuid/2017-03-14-14-05-33-00 /dev/disk/by-label/Debian\x209.0\x20sparc64\x201
> DEVNAME=/dev/vdiskd
> DEVPATH=/devices/channel-devices/vdc-port-3-0/block/vdiskd
> DEVTYPE=disk
> ID_FS_LABEL=Debian_9.0_sparc64_1
> ID_FS_LABEL_ENC=Debian\x209.0\x20sparc64\x201
> ID_FS_TYPE=iso9660
> ID_FS_USAGE=filesystem
> ID_FS_UUID=2017-03-14-14-05-33-00
> ID_FS_UUID_ENC=2017-03-14-14-05-33-00
> ID_PART_TABLE_TYPE=sun
> MAJOR=254
> MINOR=24
> SUBSYSTEM=block
> USEC_INITIALIZED=634522
> /sys/block #
>
> Compare this to the output for a standard USB CD-ROM device on my laptop:
>
> glaubitz@ikarus:/sys/block$ udevadm info -q env -p /sys/block/sr0
> DEVLINKS=/dev/disk/by-path/pci-:00:14.0-usb-0:2:1.0-scsi-0:0:0:0 /dev/disk/by-id/usb-TSSTcorp_CDDVDW_SE-S084D_0j468695-0:0 /dev/dvdrw /dev/dvd /dev/cdrom /dev/cdrw
> DEVNAME=/dev/sr0
> DEVPATH=/devices/pci:00/:00:14.0/usb1/1-2/1-2:1.0/host4/target4:0:0/4:0:0:0/block/sr0
> DEVTYPE=disk
> ID_BUS=usb
> ID_CDROM=1
> ID_CDROM_CD=1
> ID_CDROM_CD_R=1
> ID_CDROM_CD_RW=1
> ID_CDROM_DVD=1
> ID_CDROM_DVD_PLUS_R=1
> ID_CDROM_DVD_PLUS_RW=1
> ID_CDROM_DVD_PLUS_R_DL=1
> ID_CDROM_DVD_R=1
> ID_CDROM_DVD_RAM=1
> ID_CDROM_DVD_RW=1
> ID_CDROM_MRW=1
> ID_CDROM_MRW_W=1
> ID_FOR_SEAT=block-pci-_00_14_0-usb-0_2_1_0-scsi-0_0_0_0
> ID_INSTANCE=0:0
> ID_MODEL=CDDVDW_SE-S084D
> ID_MODEL_ENC=CDDVDW\x20SE-S084D\x20
> ID_MODEL_ID=1836
> ID_PATH=pci-:00:14.0-usb-0:2:1.0-scsi-0:0:0:0
> ID_PATH_TAG=pci-_00_14_0-usb-0_2_1_0-scsi-0_0_0_0
> ID_REVISION=TS00
> ID_SERIAL=TSSTcorp_CDDVDW_SE-S084D_0j468695-0:0
> ID_SERIAL_SHORT=0j468695
> ID_TYPE=cd
> ID_USB_DRIVER=usb-storage
> ID_USB_INTERFACES=:080250:
> ID_USB_INTERFACE_NUM=00
> ID_VENDOR=TSSTcorp
> ID_VENDOR_ENC=TSSTcorp
> ID_VENDOR_ID=0e8d
> MAJOR=11
> MINOR=0
> SUBSYSTEM=block
> SYSTEMD_READY=0
> TAGS=:systemd:uaccess:seat:
> USEC_INITIALIZED=16198574878
> glaubitz@ikarus:/sys/block$
>
> Would it be possible to extend sunvdc to display additional fields? In
> particular, it would be very useful if sunvdc could indicate whether the
> virtual block device is a regular disk or a CD-ROM drive.


Adrian,

Wouldn't it be easier to patch d-i (userspace) to treat
"ID_FS_TYPE=iso9660" as a CD-ROM (in addition to "ID_CDROM=1"), instead
of patching the kernel (sunvdc.c) sources?
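
To make the idea concrete, here is a rough sketch of the userspace check. This is not the actual debian-installer code; `looks_like_cdrom` is a hypothetical name, and the function only assumes KEY=VALUE lines in the format printed by "udevadm info -q env".

```shell
#!/bin/sh
# Hypothetical sketch, not the real debian-installer logic.
# Reads "udevadm info -q env" style KEY=VALUE lines on stdin and prints
# "yes" if the device looks like installation media, "no" otherwise.
looks_like_cdrom() {
    result=no
    while IFS= read -r line; do
        case "$line" in
            # The first two are the properties d-i checks today; the third
            # is the proposed extra match for devices such as sunvdc disks.
            ID_CDROM=1|ID_TYPE=cd|ID_FS_TYPE=iso9660) result=yes ;;
        esac
    done
    echo "$result"
}

# Example:
#   udevadm info -q env -p /sys/block/vdiskd | looks_like_cdrom
```

One caveat with this approach: any block device carrying an iso9660 filesystem would match, not only a virtual CD-ROM, so the check is looser than "ID_CDROM=1".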

PS: running qemu on x86_64 as

$ qemu-system-sparc64  -hda hda.img  -cdrom debian-7.11.0-sparc-netinst.iso

and inside qemu :

# udevadm info -q env -p /sys/block/sr0
DEVLINKS=/dev/cdrom1 /dev/disk/by-id/ata-EQUMD_DVR-MO_MQ_3 /dev/disk/by-label/Debian\x207.11.0\x20sparc\x201 /dev/disk/by-path/platform-ffe2b5a8-pci-:00:05.0-scsi-1:0:0:0 /dev/dvd1
DEVNAME=/dev/sr0
DEVPATH=/devices/root/ffe2b5a8/pci:00/:00:05.0/host1/target1:0:0/1:0:0:0/block/sr0
DEVTYPE=disk
GENERATED=1
ID_ATA=1
ID_BUS=ata
ID_CDROM=1
ID_CDROM_DVD=1
ID_CDROM_MEDIA=1
ID_CDROM_MEDIA_CD=1
ID_CDROM_MEDIA_SESSION_COUNT=1
ID_CDROM_MEDIA_TRACK_COUNT=1
ID_CDROM_MEDIA_TRACK_COUNT_DATA=1
ID_CDROM_MRW=1
ID_CDROM_MRW_W=1
ID_FS_LABEL=Debian_7.11.0_sparc_1
ID_FS_LABEL_ENC=Debian\x207.11.0\x20sparc\x201
ID_FS_TYPE=iso9660
ID_FS_USAGE=filesystem
ID_MODEL=EQUMD_DVR-MO
ID_MODEL_ENC=EQUMD\x20DVR-MO\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20\x20
ID_PART_TABLE_TYPE=sun
ID_PATH=platform-ffe2b5a8-pci-:00:05.0-scsi-1:0:0:0
ID_PATH_TAG=platform-ffe2b5a8-pci-_00_05_0-scsi-1_0_0_0
ID_REVISION=.2+5
ID_SERIAL=EQUMD_DVR-MO_MQ_3
ID_SERIAL_SHORT=MQ_3
ID_TYPE=cd
MAJOR=11
MINOR=0
SUBSYSTEM=block
TAGS=:udev-acl:
UDEV_LOG=3
USEC_INITIALIZED=39425573

dmesg cut:

[   37.995639] udevd[43]: starting version 175
[   43.839280] SCSI subsystem initialized
[   44.074912] libata version 3.00 loaded.
[   44.168027] scsi0 : pata_cmd64x
[   44.174100] scsi1 : pata_cmd64x
[   44.175483] ata1: PATA max UDMA/33 cmd 0x1fe02008080 ctl 0x1fe02008100 bmdma 0x1fe02008280 irq 7
[   44.176309] ata2: PATA max UDMA/33 cmd 0x1fe02008180 ctl 0x1fe02008200 bmdma 0x1fe02008288 irq 7
[   44.197377] pata_cmd64x: active 10 recovery 10 setup 3.
[   44.197550] pata_cmd64x: active 10 recovery 10 setup 3.
[   44.366417] ata1.01: NODEV

Re: git kernel (4.9.0-rc3) hard lockup on cpu

2017-02-27 Thread Anatoly Pugachev

Another one

looking from hypervisor side, it's currently spinning at 100% load.

Console log:

[3242997.705804] fib_no_return.e[88919]: segfault at fff8000100045a20
ip fff800010095c180 (rpc fff800010095cfb4) sp fff8000100045b10 error
30002 in
libc-2.24.so[fff80001008dc000+15e000]
[3242998.037056] Kernel unaligned access at TPC[4a94b0] source_load+0x30/0x80
[3242998.037106] Kernel unaligned access at TPC[4b57f0]
find_busiest_group+0x190/0x9c0
[3242998.037145] Kernel unaligned access at TPC[4b57f4]
find_busiest_group+0x194/0x9c0
[3242998.037153] [ cut here ]
[3242998.037171] WARNING: CPU: 96 PID: 89282 at
kernel/sched/core.c:103 update_rq_clock+0x84/0xa0
[3242998.037173] Modules linked in: xt_tcpudp xt_multiport
xt_conntrack iptable_filter iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_n
at nf_conntrack tun flash n2_rng rng_core camellia_sparc64 des_sparc64
des_generic aes_sparc64 md5_sparc64 sha512_sparc64[3242998.037232]
Kernel una
ligned access at TPC[4b5810] find_busiest_group+0x1b0/0x9c0
[3242998.037242] Kernel unaligned access at TPC[4b581c]
find_busiest_group+0x1bc/0x9c0
[3242998.037333]  sha256_sparc64 sha1_sparc64 ip_tables x_tables
autofs4 ext4 crc16 jbd2 fscrypto mbcache btrfs xor zlib_deflate
raid6_pq crc32c_spa
rc64 sunvnet sunvdc
[3242998.037454] CPU: 96 PID: 89282 Comm:  Not tainted 4.9.0-rc5+ #2
[3242998.037490] Call Trace:
[3242998.037513] ---[ end trace cf2c87b49379299d ]---
[3242998.037532] [ cut here ]
[3242998.037560] WARNING: CPU: 96 PID: 89282 at
kernel/sched/sched.h:772 update_curr+0xe8/0x320
[3242998.037588] Modules linked in: xt_tcpudp xt_multiport
xt_conntrack iptable_filter iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_n
at nf_conntrack tun flash n2_rng rng_core camellia_sparc64 des_sparc64
des_generic aes_sparc64 md5_sparc64 sha512_sparc64 sha256_sparc64
sha1_sparc6
4 ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto mbcache btrfs
xor zlib_deflate raid6_pq crc32c_sparc64 sunvnet sunvdc
[3242998.037812] CPU: 96 PID: 89282 Comm:  Tainted: GW
4.9.0-rc5+ #2
[3242998.037852] Call Trace:
[3242998.037874] ---[ end trace cf2c87b49379299e ]---
[3242998.037894] [ cut here ]
[3242998.037918] WARNING: CPU: 96 PID: 89282 at
kernel/sched/sched.h:772 update_curr+0xe8/0x320
[3242998.037947] Modules linked in: xt_tcpudp xt_multiport
xt_conntrack iptable_filter iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack tun flash n2_rng
rng_core camellia_sparc64 des_sparc64 des_generic aes_sparc64
md5_sparc64 sha512_sparc64 sha256_sparc64 sha1_sparc64 ip_tables
x_tables autofs4 ext4 crc16 jbd2 fscrypto mbcache btrfs xor
zlib_deflate raid6_pq crc32c_sparc64 sunvnet sunvdc
[3242998.038178] CPU: 96 PID: 89282 Comm:  Tainted: GW
4.9.0-rc5+ #2
[3242998.038189] Unable to handle kernel paging request in mna handler
[3242998.038189]  at virtual address f80001006504d297
[3242998.038191] current->{active_,}mm->context = 0dd5
[3242998.038193] current->{active_,}mm->pgd = fff8001e0c85c000
[3242998.038195]   \|/  \|/
[3242998.038195]   "@'/ .. \`@"
[3242998.038195]   /_| \__/ |_\
[3242998.038195]  \__U_/
[3242998.038198] fib_no_sync.exe(89279): Oops [#1]
[3242998.038203] CPU: 45 PID: 89279 Comm: fib_no_sync.exe Tainted: G
 W   4.9.0-rc5+ #2
[3242998.038207] task: fff8001dc9af2900 task.stack: fff8001e15b9c000
[3242998.038211] TSTATE: 009911e01607 TPC: 007a32e4 TNPC:
007a32e8 Y: 0129Tainted: GW
[3242998.038223] TPC: 
[3242998.038227] g0: 0001 g1: 00dd0099 g2:
f8000100666860ff g3: f800010083ef20ff
[3242998.038230] Unable to handle kernel paging request in mna handler
[3242998.038230] g4: fff8001dc9af2900 g5: fff800207c7da000 g6:
fff8001e15b9c000 g7: f800010083ef20ff
[3242998.038231]  at virtual address f80001006504d297
[3242998.038232] o0: 0001 o1: f80001006504d297 o2:
0001 o3: 
[3242998.038234] current->{active_,}mm->context = 0dd5
[3242998.038236] o4: 0080 o5: 0080 sp:
fff8001e15b9ede1 ret_pc: 004cbb18
[3242998.038237] current->{active_,}mm->pgd = fff8001e0c85c000
[3242998.038249] RPC: <__lock_acquire+0x78/0x1ca0>
[3242998.038254] l0: fff8001dc9af2900 l1: 01a67400 l2:
00dd00b9 l3: 00cff400
[3242998.038255]   \|/  \|/
[3242998.038255]   "@'/ .. \`@"
[3242998.038255]   /_| \__/ |_\
[3242998.038255]  \__U_/
[3242998.038256] l4: f80001006504d0ff l5: 0001 l6:
 l7: 
[3242998.038258] fib_no_sync.exe(89308): Oops [#2]
[3242998.038259] i0: 00dd0099 i1:  i2:
 i3: 
[3242998.038262] CPU: 46 PID: 89308 Comm: fib_no_sync.exe Tainted: G
 W   4.9.0-rc5+ #2
[3242998.038264]

Re: grub2 with SPARC support available for testing

2017-02-24 Thread Anatoly Pugachev

On Thu, Feb 9, 2017 at 9:44 PM, Eric Snowberg  wrote:
> I put together a wiki to help document the installation process:
>
> https://github.com/esnowberg/grub2-sparc/wiki

Eric,

I tried this on two of my test LDOMs, one with a sun disk/partition table
and another with a gpt disk/partition table, as well as on an older sun4u
v215 physical machine (sun disk/partition table).

Can you please suggest how I should install on a gpt disk? Details are below.

The LDOM with the sun disk had no problems installing per the wiki guide, and
booted with grub fine:

# apt search grub2
grub2/unreleased,now 2.02~beta3-5+sparc64 sparc64 [installed]
  GRand Unified Bootloader, version 2 (dummy package)

# parted /dev/vdiska p
Model: Unknown (unknown)
Disk /dev/vdiska: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: sun
Disk Flags:

Number  Start   End SizeFile system  Flags
 1  0.00B   296MB   296MB   ext3 boot
 2  296MB   10.3GB  10.0GB  ext4
 4  10.3GB  10.7GB  436MB   btrfs

# cat /etc/default/grub
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="ro zswap.enabled=1 noresume"
GRUB_CMDLINE_LINUX=""
GRUB_DISABLE_LINUX_UUID=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_DISABLE_RECOVERY="true"
GRUB_PRELOAD_MODULES="iso9660"

# mkdir /boot/grub

# grub-mkconfig -o /boot/grub/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.10.0-06476-gbc49a7831b11
Found initrd image: /boot/initrd.img-4.10.0-06476-gbc49a7831b11
Found linux image: /boot/vmlinuz-4.9.0-2-sparc64-smp
Found initrd image: /boot/initrd.img-4.9.0-2-sparc64-smp
done

# grub-install --force --skip-fs-probe /dev/vdiska1
Installing for sparc64-ieee1275 platform.
grub-install: warning: Discarding improperly nested partition
(hostdisk//dev/vdiska,sun1,sun2).
grub-install: warning: Discarding improperly nested partition
(hostdisk//dev/vdiska,sun1,sun4).
grub-install: warning: Attempting to install GRUB to a disk with
multiple partition labels.  This is not supported yet..
grub-install: warning: Embedding is not possible.  GRUB can only be
installed in this setup by using blocklists.  However, blocklists are
UNRELIABLE and their use is discouraged..
Installation finished. No error reported.

# reboot

[3015439.460830] reboot: Restarting system
NOTICE: Entering OpenBoot.
NOTICE: Fetching Guest MD from HV.
NOTICE: Starting additional cpus.
NOTICE: Initializing LDC services.
NOTICE: Probing PCI devices.
NOTICE: Finished PCI probing.

SPARC T5-2, No Keyboard
Copyright (c) 1998, 2016, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.38.5, 32. GB memory available, Serial #83494642.
Ethernet address 0:14:4f:fa:6:f2, Host ID: 84fa06f2.

Boot device: vdisk1  File and args:
WARNING: Unsupported bootblk image, can not extract fcode

WARNING: Bootblk fcode extraction failed
GRUB Loading kernel

 GNU GRUB  version 2.02~beta3-5+sparc64

 ++
 | Debian GNU/Linux   |
 |*Advanced options for Debian GNU/Linux  |
 ||
 ||
 ||
 ||
 ||
 ||
 ||
 ||
 ||
 ||
 ++

  Use the ^ and v keys to select which entry is highlighted.
  Press enter to boot the selected OS, `e' to edit the commands
  before booting or `c' for a command-line.

On the other one, with the gpt disk, I was unable to install the grub boot block:

root@deb4g:/# parted /dev/vdiska p
Model: Unknown (unknown)
Disk /dev/vdiska: 161GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End Size   File system  Name  Flags
 1  1049kB  1000MB  999MB  ext3   boot, esp
 2  1001MB  161GB   160GB  ext4 2

root@deb4g:/# grub-install /dev/vdiska
Installing for sparc64-ieee1275 platform.
grub-install: warning: this GPT partition label contains no BIOS Boot
Partition; embedding won't be possible.
grub-install: warning: Embedding is not possible.  GRUB can only be
installed in this setup

Re: git kernel (4.9.0-rc3) hard lockup on cpu

2016-12-13 Thread Anatoly Pugachev

not a hard lockup, but soft one:

Message from syslogd@landau at Dec 13 01:31:08 ...
 kernel:[9977482.116040] NMI watchdog: BUG: soft lockup - CPU#0 stuck
for 23s! [rm:99074]
Dec 13 01:31:08 landau kernel: NMI watchdog: BUG: soft lockup - CPU#0
stuck for 23s! [rm:99074]
Dec 13 01:31:08 landau kernel: Modules linked in: tun xt_tcpudp
xt_multiport xt_conntrack iptable_filter iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack n2_rng flash rng_core
camellia_sparc64 des_sparc64 des_generic aes_sparc64 md5_sparc64
sha512_sparc64 sha256_sparc64 sha1_sparc64 ip_tables x_tables autofs4
ext4 crc16 jbd2 fscrypto mbcache btrfs xor zlib_deflate raid6_pq
crc32c_sparc64 sunvnet sunvdc
Dec 13 01:31:08 landau kernel: irq event stamp: 8623998
Dec 13 01:31:08 landau kernel: hardirqs last  enabled at (8623997):
Dec 13 01:31:08 landau kernel: [<00a7016c>]
_raw_spin_unlock_irqrestore+0x2c/0x80
Dec 13 01:31:08 landau kernel: hardirqs last disabled at (8623998):
Dec 13 01:31:08 landau kernel: [<00426be8>] sys_call_table+0x648/0x820
Dec 13 01:31:08 landau kernel: softirqs last  enabled at (8306350):
Dec 13 01:31:08 landau kernel: [<00a72ca0>] __do_softirq+0x220/0x5c0
Dec 13 01:31:08 landau kernel: softirqs last disabled at (8306305):
Dec 13 01:31:08 landau kernel: [<0042c1b4>]
do_softirq_own_stack+0x34/0x60
Dec 13 01:31:08 landau kernel: CPU: 0 PID: 99074 Comm: rm Not tainted
4.9.0-rc5+ #2
Dec 13 01:31:08 landau kernel: task: fff8000eead080e0 task.stack:
fff8000f0bdcc000
Dec 13 01:31:08 landau kernel: TSTATE: 11001604 TPC:
00a7017c TNPC: 00a70180 Y: 0021Not tainted
Dec 13 01:31:08 landau kernel: TPC: <_raw_spin_unlock_irqrestore+0x3c/0x80>
Dec 13 01:31:08 landau kernel: g0: 0fff g1:
 g2: 014e46e8 g3: 0003
Dec 13 01:31:08 landau kernel: g4: fff8000eead080e0 g5:
fff800207c23a000 g6: fff8000f0bdcc000 g7: ca934ba76367fbad
Dec 13 01:31:08 landau kernel: o0: 01ba7888 o1:
0003 o2: 007cd758 o3: fff8000eead080e0
Dec 13 01:31:08 landau kernel: o4: 0004 o5:
fff8000eead08970 sp: fff8000f0bdcee81 ret_pc: 00a7016c
Dec 13 01:31:08 landau kernel: RPC: <_raw_spin_unlock_irqrestore+0x2c/0x80>
Dec 13 01:31:08 landau kernel: l0:  l1:
 l2: fff8000f0c931810 l3: 00d00400
Dec 13 01:31:08 landau kernel: l4: 000e l5:
0001 l6:  l7: 0008
Dec 13 01:31:08 landau kernel: i0: 01ba7888 i1:
 i2: 00c78800 i3: 01b38648
Dec 13 01:31:08 landau kernel: i4:  i5:
01ba7888 i6: fff8000f0bdcef31 i7: 007cd758
Dec 13 01:31:08 landau kernel: I7: 
Dec 13 01:31:08 landau kernel: Call Trace:
Dec 13 01:31:08 landau kernel:  [007cd758]
debug_object_activate+0xd8/0x240
Dec 13 01:31:08 landau kernel:  [004f5700]
__call_rcu.constprop.61+0x20/0x6e0
Dec 13 01:31:08 landau kernel:  [004f5e5c] call_rcu_sched+0x1c/0x40
Dec 13 01:31:08 landau kernel:  [00653ea0] dentry_free+0x40/0xc0
Dec 13 01:31:08 landau kernel:  [00654060] __dentry_kill+0x140/0x1c0
Dec 13 01:31:08 landau kernel:  [00654cd4]
shrink_dentry_list+0x114/0x5a0
Dec 13 01:31:08 landau kernel:  [00655220]
shrink_dcache_parent+0x20/0x80
Dec 13 01:31:08 landau kernel:  [006494b8] vfs_rmdir+0xb8/0x160
Dec 13 01:31:08 landau kernel:  [0064b2b4] do_rmdir+0x1f4/0x220
Dec 13 01:31:08 landau kernel:  [0064bdf8] SyS_unlinkat+0x38/0x60
Dec 13 01:31:08 landau kernel:  [00406234] linux_sparc_syscall+0x34/0x44

Re: Segmentation faults with aptitude and apt-get on some packages

2016-11-30 Thread Anatoly Pugachev

> $ apt update
> $ apt-get build-dep linux
> $ cd /usr/src/
> $ wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.8.11.tar.xz
> $ tar xf linux-4.8.11.tar.xz
> $ cd linux-4.8.11
> $ cp -av /boot/config-$(uname -r)
> $ make oldconfig (Just always press Enter until the prompt comes back)

Or, instead of "make oldconfig", use "make olddefconfig", which takes the
default for every new kernel option without prompting.
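
As a rough sketch of that step (assuming the current directory is the unpacked kernel tree and that the running kernel's config lives in /boot, as on Debian), something like the following would do. `seed_config` is a hypothetical helper name, and the MAKE override is only there so the make command can be swapped out.

```shell
#!/bin/sh
# Sketch only: seed .config from an existing config and fill in defaults
# for any new options without prompting.
# Usage: seed_config [path-to-config]
# Assumption: defaults to /boot/config-$(uname -r), as shipped by Debian kernels.
seed_config() {
    src="${1:-/boot/config-$(uname -r)}"
    if [ ! -r "$src" ]; then
        echo "seed_config: cannot read $src" >&2
        return 1
    fi
    # Copy the old config into the tree, then let kbuild pick defaults.
    cp -v "$src" .config && ${MAKE:-make} olddefconfig
}

# Example, run from inside the kernel source tree:
#   seed_config
```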

Re: git kernel (4.9.0-rc3) hard lockup on cpu

2016-11-07 Thread Anatoly Pugachev

On Sun, Nov 6, 2016 at 2:56 PM, Anatoly Pugachev <mator...@gmail.com> wrote:
> Hello!
>
> We've got kernel cpu lockup after running for about 3.5 days
> (machine/ldom is a buildd instance for debian ports for sparc64 arch).
>
> Kernel is built from git 0c183d92b20b5c84ca655b45ef57b3318b83eb9e
>
> Nov 06 13:17:34 landau kernel: Watchdog detected hard LOCKUP on cpu 1
> Nov 06 13:17:34 landau kernel: [ cut here ]
> Nov 06 13:17:34 landau kernel: WARNING: CPU: 1 PID: 45713 at
> arch/sparc/kernel/nmi.c:80 perfctr_irq+0x2ec/0x340
> Nov 06 13:17:34 landau kernel: [478B blob data]
> Nov 06 13:17:34 landau kernel: CPU: 1 PID: 45713 Comm: llvm-nm Not
> tainted 4.9.0-rc3+ #1
> Nov 06 13:17:34 landau kernel: Call Trace:
> Nov 06 13:17:34 landau kernel:  [004670a0] __warn+0xc0/0xe0
> Nov 06 13:17:34 landau kernel:  [004670f4] warn_slowpath_fmt+0x34/0x60
> Nov 06 13:17:34 landau kernel:  [009943ec] perfctr_irq+0x2ec/0x340
> Nov 06 13:17:34 landau kernel:  [004209f4] tl0_irq15+0x14/0x20
> Nov 06 13:17:34 landau kernel:  [00993afc]
> _raw_spin_lock_irqsave+0x1c/0x40
> Nov 06 13:17:34 landau kernel:  [0057cd08] 
> pagevec_lru_move_fn+0x68/0xe0
> Nov 06 13:17:34 landau kernel:  [0057e0f0] 
> lru_add_drain_cpu+0x110/0x120
> Nov 06 13:17:34 landau kernel:  [0057e38c] lru_add_drain+0xc/0x20
> Nov 06 13:17:34 landau kernel:  [005a8bb8] unmap_region+0x18/0xc0
> Nov 06 13:17:34 landau kernel:  [005aa7c4] do_munmap+0x1e4/0x380
> Nov 06 13:17:34 landau kernel:  [005ab9d0] mmap_region+0xf0/0x5a0
> Nov 06 13:17:34 landau kernel:  [005ac1f4] do_mmap+0x374/0x400
> Nov 06 13:17:34 landau kernel:  [0058e1a0] vm_mmap_pgoff+0x60/0xa0
> Nov 06 13:17:34 landau kernel:  [005a9f64] SyS_mmap_pgoff+0xa4/0x240
> Nov 06 13:17:34 landau kernel:  [0042efd4] SyS_mmap+0x54/0x80
> Nov 06 13:17:34 landau kernel:  [004061f4] 
> linux_sparc_syscall+0x34/0x44
> Nov 06 13:17:34 landau kernel: ---[ end trace 9fdbc358eb6eb335 ]---

another one:

Nov 07 09:01:32 landau: INFO: task kworker/42:2:239256 blocked for
more than 120 seconds.
Nov 07 09:01:32 landau:   Not tainted 4.9.0-rc3+ #1
Nov 07 09:01:32 landau: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 07 09:01:32 landau: kworker/42:2D 00992e70 0
239256  2 0x0700
Nov 07 09:01:32 landau: Workqueue: cgroup_destroy css_free_work_fn
Nov 07 09:01:32 landau: Call Trace:
Nov 07 09:01:32 landau:  [00990304] schedule+0x24/0xa0
Nov 07 09:01:32 landau:  [00992e70] schedule_timeout+0x210/0x360
Nov 07 09:01:32 landau:  [00990d74] wait_for_common+0x94/0x140
Nov 07 09:01:32 landau:  [00990e38] wait_for_completion+0x18/0x40
Nov 07 09:01:32 landau:  [004cefb4] _rcu_barrier+0x194/0x1e0
Nov 07 09:01:32 landau:  [004cf030] rcu_barrier+0x10/0x20
Nov 07 09:01:32 landau:  [00597734] release_caches+0x54/0x80
Nov 07 09:01:32 landau:  [005977f4] memcg_destroy_kmem_caches+0x94/0xe0
Nov 07 09:01:32 landau:  [005dbfe0] mem_cgroup_css_free+0xe0/0x140
Nov 07 09:01:32 landau:  [004fd9b4] css_free_work_fn+0x34/0x420
Nov 07 09:01:32 landau:  [0048172c] process_one_work+0x16c/0x460
Nov 07 09:01:32 landau:  [00481b40] worker_thread+0x120/0x520
Nov 07 09:01:32 landau:  [0048804c] kthread+0xac/0xe0
Nov 07 09:01:32 landau:  [00406044] ret_from_fork+0x1c/0x2c
Nov 07 09:01:32 landau:  []   (null)
Nov 07 09:02:38 landau: Watchdog detected hard LOCKUP on cpu 25
Nov 07 09:02:38 landau: ------------[ cut here ]------------
Nov 07 09:02:38 landau: WARNING: CPU: 25 PID: 245304 at
arch/sparc/kernel/nmi.c:80 perfctr_irq+0x2ec/0x340
Nov 07 09:02:38 landau: [478B blob data]
Nov 07 09:02:38 landau: CPU: 25 PID: 245304 Comm: cc1 Not tainted 4.9.0-rc3+ #1
Nov 07 09:02:38 landau: Call Trace:
Nov 07 09:02:38 landau:  [004670a0] __warn+0xc0/0xe0
Nov 07 09:02:38 landau:  [004670f4] warn_slowpath_fmt+0x34/0x60
Nov 07 09:02:38 landau:  [009943ec] perfctr_irq+0x2ec/0x340
Nov 07 09:02:38 landau:  [004209f4] tl0_irq15+0x14/0x20
Nov 07 09:02:38 landau: ---[ end trace ef8661be81a54612 ]---

git kernel (4.9.0-rc3) hard lockup on cpu

2016-11-06 Thread Anatoly Pugachev

Hello!

We've got kernel cpu lockup after running for about 3.5 days
(machine/ldom is a buildd instance for debian ports for sparc64 arch).

Kernel is built from git 0c183d92b20b5c84ca655b45ef57b3318b83eb9e

Nov 06 13:17:34 landau kernel: Watchdog detected hard LOCKUP on cpu 1
Nov 06 13:17:34 landau kernel: ------------[ cut here ]------------
Nov 06 13:17:34 landau kernel: WARNING: CPU: 1 PID: 45713 at
arch/sparc/kernel/nmi.c:80 perfctr_irq+0x2ec/0x340
Nov 06 13:17:34 landau kernel: [478B blob data]
Nov 06 13:17:34 landau kernel: CPU: 1 PID: 45713 Comm: llvm-nm Not
tainted 4.9.0-rc3+ #1
Nov 06 13:17:34 landau kernel: Call Trace:
Nov 06 13:17:34 landau kernel:  [004670a0] __warn+0xc0/0xe0
Nov 06 13:17:34 landau kernel:  [004670f4] warn_slowpath_fmt+0x34/0x60
Nov 06 13:17:34 landau kernel:  [009943ec] perfctr_irq+0x2ec/0x340
Nov 06 13:17:34 landau kernel:  [004209f4] tl0_irq15+0x14/0x20
Nov 06 13:17:34 landau kernel:  [00993afc]
_raw_spin_lock_irqsave+0x1c/0x40
Nov 06 13:17:34 landau kernel:  [0057cd08] pagevec_lru_move_fn+0x68/0xe0
Nov 06 13:17:34 landau kernel:  [0057e0f0] lru_add_drain_cpu+0x110/0x120
Nov 06 13:17:34 landau kernel:  [0057e38c] lru_add_drain+0xc/0x20
Nov 06 13:17:34 landau kernel:  [005a8bb8] unmap_region+0x18/0xc0
Nov 06 13:17:34 landau kernel:  [005aa7c4] do_munmap+0x1e4/0x380
Nov 06 13:17:34 landau kernel:  [005ab9d0] mmap_region+0xf0/0x5a0
Nov 06 13:17:34 landau kernel:  [005ac1f4] do_mmap+0x374/0x400
Nov 06 13:17:34 landau kernel:  [0058e1a0] vm_mmap_pgoff+0x60/0xa0
Nov 06 13:17:34 landau kernel:  [005a9f64] SyS_mmap_pgoff+0xa4/0x240
Nov 06 13:17:34 landau kernel:  [0042efd4] SyS_mmap+0x54/0x80
Nov 06 13:17:34 landau kernel:  [004061f4] linux_sparc_syscall+0x34/0x44
Nov 06 13:17:34 landau kernel: ---[ end trace 9fdbc358eb6eb335 ]---

Re: Regression with 4.7.2 on sun4u

2016-10-21 Thread Anatoly Pugachev

On Fri, Oct 21, 2016 at 12:12 PM, Anatoly Pugachev <mator...@gmail.com> wrote:
> On Wed, Sep 7, 2016 at 1:01 PM, Anatoly Pugachev <mator...@gmail.com> wrote:
>> On Wed, Sep 7, 2016 at 12:22 PM, John Paul Adrian Glaubitz
>> <glaub...@physik.fu-berlin.de> wrote:
>>> Hello!
>>>
>>> After kernel 4.7.2 entered Debian unstable, I decided to upgrade the 
>>> buildds and ran into an
>>> apparent regression with the 4.7.x kernels on sun4u machines:
>>
>> It's not only with sun4u, we're getting kernel OOPS on sun4v as well:
>
> debian packaged 4.7.6 kernel, machine is a LDOM on T5-2 server, OOPS
> after kernel boot within a few minutes:


Reproduced with the latest git kernel, 4.9.0-rc1+ (v4.9-rc1-148-g6f33d645).
The machine boots fine: I can log in as an unprivileged user (via ssh), compile
and install a kernel, run sudo and install packages (apt upgrade), and
apache, mysql and the other daemons started at boot all work. But if I try to
log in as root via ssh, the kernel throws an oops / illegal instruction.

Any idea how to debug this?

otherhost$ ssh ttip -l root -v
...
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessi...@openssh.com
debug1: Entering interactive session.
Write failed: Broken pipe
$

I can run strace -f -p $pid_of_sshd, but I'm not sure it would help.

URL version => http://paste.debian.net/plain/884751
kernel config => http://paste.debian.net/plain/884806

NOTICE: Entering OpenBoot.
NOTICE: Fetching Guest MD from HV.
NOTICE: Starting additional cpus.
NOTICE: Initializing LDC services.
NOTICE: Probing PCI devices.
NOTICE: Finished PCI probing.

SPARC T5-2, No Keyboard
Copyright (c) 1998, 2016, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.38.5, 32. GB memory available, Serial #83494642.
Ethernet address 0:14:4f:fa:6:f2, Host ID: 84fa06f2.



Boot device: vdisk1  File and args:
SILO Version 1.4.14
boot:
Allocated 64 Megs of memory at 0x4000 for kernel
Uncompressing image...
Loaded kernel version 4.9.0
Loading initial ramdisk (13616359 bytes at 0x7400 phys, 0x40C0 virt)...

[0.00] PROMLIB: Sun IEEE Boot Prom 'OBP 4.38.5 2016/06/22 19:36'
[0.00] PROMLIB: Root node compatible: sun4v
[0.00] Linux version 4.9.0-rc1+ (mator@ttip) (gcc version
6.2.0 20161010 (Debian 6.2.0-6+sparc64) ) #19 SMP Fri Oct 21 14:47:01
MSK 2016
[0.00] bootconsole [earlyprom0] enabled
[0.00] ARCH: SUN4V
[0.00] Ethernet address: 00:14:4f:fa:06:f2
[0.00] MM: PAGE_OFFSET is 0xfff8 (max_phys_bits == 47)
[0.00] MM: VMALLOC [0x0001 --> 0x0006]
[0.00] MM: VMEMMAP [0x0006 --> 0x000c]
[0.00] Kernel: Using 3 locked TLB entries for main kernel image.
[0.00] Remapping the kernel... [0.00] done.
[0.00] OF stdout device is: /virtual-devices@100/console@1
[0.00] PROM: Built device tree with 67418 bytes of memory.
[0.00] MDESC: Size is 29648 bytes.
[0.00] PLATFORM: banner-name [SPARC T5-2]
[0.00] PLATFORM: name [ORCL,SPARC-T5-2]
[0.00] PLATFORM: hostid [84fa06f2]
[0.00] PLATFORM: serial# [0035260e]
[0.00] PLATFORM: stick-frequency [3b9aca00]
[0.00] PLATFORM: mac-address [144ffa06f2]
[0.00] PLATFORM: watchdog-resolution [1000 ms]
[0.00] PLATFORM: watchdog-max-timeout [3153600 ms]
[0.00] PLATFORM: max-cpus [1024]
[0.00] Top of RAM: 0x82f94e000, Total RAM: 0x7ff386000
[0.00] Memory hole size: 773MB
[0.00] Allocated 24576 bytes for kernel page tables.
[0.00] Zone ranges:
[0.00]   Normal   [mem 0x3040-0x00082f94dfff]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x3040-0x6feb]
[0.00]   node   0: [mem 0x6ff4-0x6ff47fff]
[0.00]   node   0: [mem 0x7000-0x00082f8b3fff]
[0.00]   node   0: [mem 0x00082f944000-0x00082f94dfff]
[0.00] Initmem setup node 0 [mem 0x3040-0x00082f94dfff]
[0.00] Booting Linux...
[0.00] CPU CAPS: [flush,stbar,swap,muldiv,v9,blkinit,n2,mul32]
[0.00] CPU CAPS: [div32,v8plus,popc,vis,vis2,ASIBlkInit,fmaf,vis3]
[0.00] CPU CAPS: [hpc,ima,pause,cbcond,aes,des,kasumi,camellia]
[0.00] CPU CAPS: [md5,sha1,sha256,sha512,mpmul,montmul,montsqr,crc32c]
[0.00] percpu: Embedded 10 pages/cpu @fff800082d00 s43096
r8192 d30632 u131072
[0.00] SUN4V: Mondo queue sizes [cpu(131072) dev(16384) r(8192) nr(256)]
[0.00] Built 1 zonelists in Zone order, mobility grouping on.
Total pages: 4155855
[0.00] Kernel command line: root=/dev/vdiska2 ro
zswap.enabled=1 noresume
[0.00] log_buf_len individual max cpu contribution: 4096 bytes
[0.00] log_buf_len total cpu_

Re: Regression with 4.7.2 on sun4u

2016-10-21 Thread Anatoly Pugachev

On Wed, Sep 7, 2016 at 1:01 PM, Anatoly Pugachev <mator...@gmail.com> wrote:
> On Wed, Sep 7, 2016 at 12:22 PM, John Paul Adrian Glaubitz
> <glaub...@physik.fu-berlin.de> wrote:
>> Hello!
>>
>> After kernel 4.7.2 entered Debian unstable, I decided to upgrade the buildds 
>> and ran into an
>> apparent regression with the 4.7.x kernels on sun4u machines:
>
> It's not only with sun4u, we're getting kernel OOPS on sun4v as well:

debian packaged 4.7.6 kernel, machine is a LDOM on T5-2 server, OOPS
after kernel boot within a few minutes:

[  OK  ] Started Update UTMP about System Runlevel Changes.

Debian GNU/Linux stretch/sid ttip console

ttip login: [5435944.506755] systemd-journald[334]: File
/var/log/journal/c02366aaa6e44182ba0caa130d880aac/user-1000.journal
corrupted or uncleanly shut down, renaming and replacing.
[5435988.433976]   \|/  \|/
[5435988.433976]   "@'/ .. \`@"
[5435988.433976]   /_| \__/ |_\
[5435988.433976]  \__U_/
[5435988.434000] systemd(1): Kernel illegal instruction [#1]
[5435988.434008] CPU: 0 PID: 1 Comm: systemd Not tainted
4.7.0-1-sparc64-smp #1 Debian 4.7.6-1
[5435988.434016] task: fff8000815f43620 ti: fff8000815f44000 task.ti:
fff8000815f44000
[5435988.434023] TSTATE: 004411001603 TPC: 005c2a9c TNPC:
005c2aa0 Y: Not tainted
[5435988.434039] TPC: <__kmalloc_track_caller+0x13c/0x200>
[5435988.434044] g0: fff800082c3e6000 g1: 0040 g2:
 g3: 0001
[5435988.434051] g4: fff8000815f43620 g5: fff800082c3e6000 g6:
fff8000815f44000 g7: 00636500
[5435988.434057] o0:  o1: 03ff o2:
 o3: 
[5435988.434063] o4: 00b0d450 o5: 00b0d400 sp:
fff8000815f46f01 ret_pc: 005c2a94
[5435988.434075] RPC: <__kmalloc_track_caller+0x134/0x200>
[5435988.434082] l0: 4000 l1: fff8304020e0 l2:
000612208db8 l3: 
[5435988.434091] l4: fff800082d00de68 l5: 000612208dd8 l6:
 l7: fff8000100e9a000
[5435988.434101] i0: fff8304020e0 i1: 024000c0 i2:
00585ffc i3: 024000c0
[5435988.434110] i4: 000b i5: 024000c0 i6:
fff8000815f46fb1 i7: 00585f88
[5435988.434127] I7: <kstrdup+0x28/0x60>
[5435988.434132] Call Trace:
[5435988.434140]  [00585f88] kstrdup+0x28/0x60
[5435988.434148]  [00585ffc] kstrdup_const+0x3c/0x60
[5435988.434158]  [00657b10] __kernfs_new_node+0x10/0xc0
[5435988.434165]  [00658d64] kernfs_new_node+0x24/0x60
[5435988.434173]  [0065913c] kernfs_create_dir_ns+0x1c/0x80
[5435988.434182]  [004fb864] cgroup_mkdir+0x1c4/0x2c0
[5435988.434189]  [00658cbc] kernfs_iop_mkdir+0x5c/0xa0
[5435988.434198]  [005e7a78] vfs_mkdir+0xd8/0x160
[5435988.434205]  [005ed4fc] SyS_mkdirat+0xdc/0x160
[5435988.434212]  [005ed598] SyS_mkdir+0x18/0x40
[5435988.434223]  [004061f4] linux_sparc_syscall+0x34/0x44
[5435988.434229] Disabling lock debugging due to kernel taint
[5435988.434237] Caller[00585f88]: kstrdup+0x28/0x60
[5435988.434245] Caller[00585ffc]: kstrdup_const+0x3c/0x60
[5435988.434252] Caller[00657b10]: __kernfs_new_node+0x10/0xc0
[5435988.434259] Caller[00658d64]: kernfs_new_node+0x24/0x60
[5435988.434266] Caller[0065913c]: kernfs_create_dir_ns+0x1c/0x80
[5435988.434273] Caller[004fb864]: cgroup_mkdir+0x1c4/0x2c0
[5435988.434281] Caller[00658cbc]: kernfs_iop_mkdir+0x5c/0xa0
[5435988.434288] Caller[005e7a78]: vfs_mkdir+0xd8/0x160
[5435988.434295] Caller[005ed4fc]: SyS_mkdirat+0xdc/0x160
[5435988.434302] Caller[005ed598]: SyS_mkdir+0x18/0x40
[5435988.434309] Caller[004061f4]: linux_sparc_syscall+0x34/0x44
[5435988.434316] Caller[fff80001001ef870]: 0xfff80001001ef870
[5435988.434322] Instruction DUMP: a018  400eed9b  0100
<3fed> 0100  106fffc3  0100  c611a036  05002be5
[5435988.435227]   \|/  \|/
[5435988.435227]   "@'/ .. \`@"
[5435988.435227]   /_| \__/ |_\
[5435988.435227]  \__U_/
[5435988.435250] systemd(1): Kernel illegal instruction [#2]
[5435988.435259] CPU: 0 PID: 1 Comm: systemd Tainted: G  D
4.7.0-1-sparc64-smp #1 Debian 4.7.6-1
[5435988.435273] task: fff8000815f43620 ti: fff8000815f44000 task.ti:
fff8000815f44000
[5435988.435285] TSTATE: 004411001602 TPC: 005c30a0 TNPC:
005c30a4 Y: 00198519Tainted: G  D
[5435988.435300] TPC: <__kmalloc+0x140/0x200>
[5435988.435309] g0: 00b4bc00 g1: 0040 g2:
 g3: 0800ad84
[5435988.435322] g4: fff8000815f43620 g5: fff800082c3e6000 g6:
fff8000815f44000 g7: 
[5435988.435336] o0:  o1: 03ff

Re: silo: FTBFS on sparc64

2016-10-02 Thread Anatoly Pugachev

Package: silo
Followup-For: Bug #730478
User: debian-sparc@lists.debian.org
Usertags: sparc64
X-Debbugs-Cc: debian-sparc@lists.debian.org

Hello!

Can we please close this bug report, since there is a working silo package in
the unreleased suite for sparc64?

# apt show silo
Package: silo
Version: 1.4.14+git20141019-5
Priority: important
Section: admin
Maintainer: Debootloaders SILO Maintainers Team 

Installed-Size: 295 kB
Depends: libc6 (>= 2.3)
Replaces: sparc-utils (<< 1.8-1)
Download-Size: 133 kB
APT-Manual-Installed: yes
APT-Sources: http://ftp.ports.debian.org/debian-ports unreleased/main sparc64 
Packages
Description: Sparc Improved LOader
 Like LILO or MILO, but for SPARC.
 This is the program you need to use if you plan to boot SPARC/Linux
 via a hard drive, floppy or CDROM.  It installs to the boot block of
 your system and will allow for booting of Linux, Solaris, and SunOS.




Thanks.

Re: Regression with 4.7.2 on sun4u

2016-09-07 Thread Anatoly Pugachev

On Wed, Sep 7, 2016 at 12:22 PM, John Paul Adrian Glaubitz wrote:
> Hello!
>
> After kernel 4.7.2 entered Debian unstable, I decided to upgrade the buildds 
> and ran into an
> apparent regression with the 4.7.x kernels on sun4u machines:

It's not only with sun4u, we're getting kernel OOPS on sun4v as well:

landau login: [860301.509777]   \|/  \|/
[860301.509777]   "@'/ .. \`@"
[860301.509777]   /_| \__/ |_\
[860301.509777]  \__U_/
[860301.509801] systemd-journal(1059): Kernel illegal instruction [#1]
[860301.509807]   \|/  \|/
[860301.509807]   "@'/ .. \`@"
[860301.509807]   /_| \__/ |_\
[860301.509807]  \__U_/
[860301.509814] CPU: 88 PID: 1059 Comm: systemd-journal Not tainted
4.7.0-1-sparc64-smp #1 Debian 4.7.2-1
[860301.509817] task: fff800201eb3a4c0 ti: fff800201e83c000 task.ti:
fff800201e83c000
[860301.509819] TSTATE: 004411001600 TPC: 005c27dc TNPC:
005c27e0 Y: 0004Not tainted
[860301.509830] TPC: <__kmalloc_track_caller+0x13c/0x200>
[860301.509832] g0: 000231c55926 g1: 0040 g2:
 g3: c000
[860301.509834] g4: fff800201eb3a4c0 g5: fff800207cae6000 g6:
fff800201e83c000 g7: 6700
[860301.509837] o0:  o1: 03ff o2:
fff800201e83fd98 o3: 4040
[860301.509838] o4:  o5: fff800201e83fbb8 sp:
fff800201e83edb1 ret_pc: 005c27d4
[860301.509841] RPC: <__kmalloc_track_caller+0x134/0x200>
[860301.509843] l0:  l1: fff8804024a0 l2:
 l3: 
[860301.509845] l4:  l5:  l6:
 l7: fff80001009fc000
[860301.509847] i0: fff8804024a0 i1: 024106c0 i2:
0085333c i3: 024106c0
[860301.509849] i4: 01c0 i5: 024106c0 i6:
fff800201e83ee61 i7: 00853280
[860301.509855] I7: <__kmalloc_reserve.isra.5+0x20/0x80>
[860301.509856] Call Trace:
[860301.509859]  [00853280] __kmalloc_reserve.isra.5+0x20/0x80
[860301.509861]  [0085333c] __alloc_skb+0x5c/0x180
[860301.509864]  [008534a4] alloc_skb_with_frags+0x44/0x1e0
[860301.509873]  [0084ddcc] sock_alloc_send_pskb+0x1ec/0x220
[860301.509883]  [009219cc] unix_dgram_sendmsg+0x12c/0x600
[860301.509887]  [008485dc] sock_sendmsg+0x3c/0x80
[860301.509890]  [00849010] ___sys_sendmsg+0x250/0x260
[860301.509893]  [00849f94] __sys_sendmsg+0x34/0x80
[860301.509895]  [0084a000] SyS_sendmsg+0x20/0x40
[860301.509904]  [004061f4] linux_sparc_syscall+0x34/0x44
[860301.509906] Disabling lock debugging due to kernel taint

The full boot log is attached as well.


sun4v-4.7.2-boot.log.gz
Description: GNU Zip compressed data

Re: [sparc64] sigbus in e2fsck

2016-08-30 Thread Anatoly Pugachev

On Tue, Aug 30, 2016 at 10:16 PM, Theodore Ts'o <ty...@mit.edu> wrote:
> On Tue, Aug 30, 2016 at 06:12:39PM +0300, Anatoly Pugachev wrote:
>>
>> (gdb) p bh->b_data
>> $1 = 
>> "\300;9\230\000\000\000\005\000\000\253\204\000\000\000\070\000\000\000\000\000\000$\022\000\000\000\000\000\000$<\000\000\000\000\000\000$\270\000\000\000\000\000\000$]\000\000\000\000\000\000$\024",
>> '\000' 
>> (gdb) p offset
>> $2 = 16
>> (gdb) p *bh->b_data
>> $3 = -64 '\300'
>> (gdb) p *(bh->b_data+offset)
>> $6 = 0 '\000'
>
> Can you give us "p &bh->b_data" (so we can get the starting address of
> b_data to make sure it's aligned) and "p offset" (so we can check and
> make sure offset is sane)?

(gdb) p &bh->b_data
$7 = (char (*)[1024]) 0x2e9b9c
(gdb) p offset
$8 = 16
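
The two values above are enough to check the alignment by hand. A minimal sketch, assuming the faulting load is 8 bytes wide (the __u64 cast at recovery.c:866); misalignment() is a hypothetical helper for illustration, not part of e2fsprogs:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Remainder of (base + offset) modulo the access width.
 * A result of 0 means the access would be naturally aligned.
 * Hypothetical helper, named here only for illustration.
 */
static unsigned misalignment(uintptr_t base, size_t offset, size_t width)
{
    return (unsigned)((base + offset) % width);
}
```

With base 0x2e9b9c, offset 16 and width 8 this returns 4 (and 0 for width 4), so the address is 4-byte aligned but not 8-byte aligned, which would be consistent with the SIGBUS on sparc64.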

Re: [sparc64] sigbus in e2fsck

2016-08-30 Thread Anatoly Pugachev

On Tue, Aug 30, 2016 at 5:58 PM, Theodore Ts'o <ty...@mit.edu> wrote:
> On Tue, Aug 30, 2016 at 02:56:45PM +0200, John Paul Adrian Glaubitz wrote:
>> On 08/30/2016 02:42 PM, Anatoly Pugachev wrote:
>> > ../../e2fsck/recovery.c:866
>> > 866 blocknr = ext2fs_be64_to_cpu(* ((__u64
>> > *) (bh->b_data+offset)));
>>
>> The reason is that this expression casts "char * b_data" [1] to u64 [2],
>> which provokes an unaligned access. Since such expressions are often inevitable,
>> it's probably best to modify the conversion macros in bitops.h [3] to be
>> safe against unaligned accesses.
>
> I don't think that's it.  b_data is a 4k buffer and should be 8-byte
> aligned.  For a file system with 64-bit blocks (which you presumably
> have since we're on the be64 path as shown in your debugger output)
> the offset is initially set to 16, and is incremented in chunks of 8
> bytes.  So there shouldn't be any unaligned access.
>
> Since you are able to provoke this in a debugger, can you have gdb
> print out the value of bh->b_data and offset, so we can be sure what's
> going on?

(gdb) p bh->b_data
$1 = 
"\300;9\230\000\000\000\005\000\000\253\204\000\000\000\070\000\000\000\000\000\000$\022\000\000\000\000\000\000$<\000\000\000\000\000\000$\270\000\000\000\000\000\000$]\000\000\000\000\000\000$\024",
'\000' 
(gdb) p offset
$2 = 16
(gdb) p *bh->b_data
$3 = -64 '\300'
(gdb) p *(bh->b_data+offset)
$6 = 0 '\000'
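
For reference, one common way to avoid an unaligned 64-bit load is to read the value one byte at a time instead of casting the pointer. This is a minimal sketch and not the e2fsprogs fix; load_be64() is a hypothetical name:

```c
#include <stdint.h>

/*
 * Read a big-endian 64-bit value from a buffer with any alignment.
 * Each byte is loaded separately, so no 8-byte aligned access is needed.
 * Hypothetical helper, not an e2fsprogs function.
 */
static uint64_t load_be64(const void *p)
{
    const unsigned char *b = p;
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | b[i];   /* most significant byte comes first */
    return v;
}
```

Applied to the dump above, the eight bytes at offset 16 are six \000 bytes followed by '$' (0x24) and \022 (0x12), so the value read would be 0x2412.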

[sparc64] sigbus in e2fsck

2016-08-30 Thread Anatoly Pugachev

Hello!

I'm getting SIGBUS (unaligned access?) in fsck.ext4 / e2fsck running
on debian sparc64 sid/unstable linux:

root@ttip:/home/mator/e2fsprogs# git describe
v1.43.1-14-g9a23fa8
root@ttip:/home/mator/e2fsprogs# git remote -v
origin  git://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git (fetch)
root@ttip:/home/mator/e2fsprogs/build/e2fsck# gdb -q
(gdb) file ./e2fsck
Reading symbols from ./e2fsck...done.
(gdb) set args /dev/vdiskc2
(gdb) run
Starting program: /home/mator/e2fsprogs/build/e2fsck/e2fsck /dev/vdiskc2
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc64-linux-gnu/libthread_db.so.1".
e2fsck 1.43.1 (08-Jun-2016)
/dev/vdiskc2: recovering journal

Program received signal SIGBUS, Bus error.
0x00142f54 in scan_revoke_records (journal=0x2decd0,
bh=0x2e9b70, sequence=36855, info=0x7feef68) at
../../e2fsck/recovery.c:866
866                     blocknr = ext2fs_be64_to_cpu(* ((__u64 *) (bh->b_data+offset)));
(gdb) bt
#0  0x00142f54 in scan_revoke_records (journal=0x2decd0,
bh=0x2e9b70, sequence=36855, info=0x7feef68) at
../../e2fsck/recovery.c:866
#1  0x00142bf8 in do_one_pass (journal=0x2decd0,
info=0x7feef68, pass=PASS_REVOKE) at ../../e2fsck/recovery.c:767
#2  0x00141c54 in journal_recover (journal=0x2decd0) at
../../e2fsck/recovery.c:273
#3  0x00139570 in recover_ext3_journal (ctx=0x2d1000) at
../../e2fsck/journal.c:940
#4  0x00139750 in e2fsck_run_ext3_journal (ctx=0x2d1000) at
../../e2fsck/journal.c:978
#5  0x001152d0 in main (argc=2, argv=0x7fef6e8) at
../../e2fsck/unix.c:1678
(gdb)

It should be the same SIGBUS that I see during Debian boot (log trimmed):

[952000.246726] sunvnet: eth0: PORT ( remote-mac 00:14:4f:f8:38:39 )
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... done.
Begin: Will now check root file system ... fsck from util-linux 2.28.1
[/sbin/fsck.ext4 (1) -- /dev/vdiska2] fsck.ext4 -a -C0 /dev/vdiska2
/dev/vdiska2: recovering journal
Signal (10) SIGBUS si_code=BUS_ADRALN fault addr=0x1164a8c
fsck.ext4(+0x313f0)[0x10313f0]
fsck exited with status code 8
done.
Warning: File system check failed but did not detect errors
[952005.583237] EXT4-fs (vdiska2): INFO: recovery required on readonly
filesystem
[952005.583350] EXT4-fs (vdiska2): write access will be enabled during recovery
[952005.666515] EXT4-fs (vdiska2): recovery complete
[952005.680055] EXT4-fs (vdiska2): mounted filesystem with ordered
data mode. Opts: (null)
done.
Begin: Running /scripts/local-bottom ... done.
Begin: Running /scripts/init-bottom ... done.
[952005.743934] ip_tables: (C) 2000-2006 Netfilter Core Team

Re: [sparc64] fio bus error

2016-08-22 Thread Anatoly Pugachev

On Mon, Aug 22, 2016 at 8:32 PM, Jens Axboe <ax...@kernel.dk> wrote:
> On 08/20/2016 09:19 AM, Anatoly Pugachev wrote:
>> Thread 2 "fio" received signal SIGBUS, Bus error.
>> [Switching to Thread 0x80011025b910 (LWP 15753)]
>> calc_log_samples () at stat.c:2461
> >> 2461                    tmp = add_bw_samples(td, &now);
>> (gdb) bt
>> #0  calc_log_samples () at stat.c:2461
>> #1  0x0018e944 in helper_thread_main (data=0x800100ac5670)
>> at helper_thread.c:121
>> #2  0x80010063ba04 in start_thread (arg=0x80011025b910) at
>> pthread_create.c:335
>> #3  0x800100944f58 in __thread_start () at
>> ../sysdeps/unix/sysv/linux/sparc/sparc64/clone.S:93
>> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
>> (gdb)
>
>
> Does this help?
>
> diff --git a/stat.c b/stat.c
> index 552d88dde067..74c2686c660c 100644
> --- a/stat.c
> +++ b/stat.c
> @@ -2457,12 +2457,12 @@ int calc_log_samples(void)
> next = min(td->o.iops_avg_time, td->o.bw_avg_time);
> continue;
> }
> -   if (!per_unit_log(td->bw_log)) {
> +   if (td->bw_log && !per_unit_log(td->bw_log)) {
> tmp = add_bw_samples(td, &now);
> if (tmp < next)
> next = tmp;
> }
> -   if (!per_unit_log(td->iops_log)) {
> +   if (td->iops_log && !per_unit_log(td->iops_log)) {
> tmp = add_iops_samples(td, &now);
> if (tmp < next)
> next = tmp;
>


Jens,

yes, this patch fixed sigbus. Thanks.
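
For anyone reading along, the shape of the change is an extra pointer check in front of the existing condition. A minimal sketch with hypothetical types; it assumes a helper that returns false when no log is configured, which this thread does not show, so treat that part as an assumption rather than fio's actual per_unit_log():

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a log object; not fio's struct io_log. */
struct log_sketch {
    bool per_unit;
};

/* ASSUMPTION: the helper reports false when there is no log at all. */
static bool per_unit_log_sketch(const struct log_sketch *log)
{
    return log && log->per_unit;
}

/* Condition before the patch: also true when log is NULL. */
static bool unguarded(const struct log_sketch *log)
{
    return !per_unit_log_sketch(log);
}

/* Condition after the patch: a missing log is skipped. */
static bool guarded(const struct log_sketch *log)
{
    return log && !per_unit_log_sketch(log);
}
```

Under that assumption, unguarded(NULL) is true while guarded(NULL) is false, so the sampling call is no longer reached for a job that has no bandwidth or IOPS log.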

[sparc64] fio bus error

2016-08-20 Thread Anatoly Pugachev

Hello!

I'm getting a bus error on sparc64 Debian sid Linux with fio compiled from git:

mator@nvg5120:~/fio.git$ git describe
fio-2.13-77-gd1f6fca

mator@nvg5120:~/fio.git$ cat /tmp/test.fio
[global]
bs=8k
iodepth=16
iodepth_batch=8
randrepeat=1
size=1m
directory=/home/mator/fio.dir
numjobs=5
[job1]
ioengine=sync
bs=1k
direct=1
rw=randread
filename=file1:file2


mator@nvg5120:~/fio.git$ gdb -q
(gdb) file ./fio
Reading symbols from ./fio...done.
(gdb) set args /tmp/test.fio
(gdb) run
Starting program: /home/mator/fio.git/fio /tmp/test.fio
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc64-linux-gnu/libthread_db.so.1".
job1: (g=0): rw=randread, bs=1K-1K/1K-1K/1K-1K, ioengine=sync, iodepth=16
...
fio-2.13-77-gd1f6f
[New Thread 0x80011025b910 (LWP 15753)]
Starting 5 processes

Thread 2 "fio" received signal SIGBUS, Bus error.
[Switching to Thread 0x80011025b910 (LWP 15753)]
calc_log_samples () at stat.c:2461
2461                    tmp = add_bw_samples(td, &now);
(gdb) bt
#0  calc_log_samples () at stat.c:2461
#1  0x0018e944 in helper_thread_main (data=0x800100ac5670)
at helper_thread.c:121
#2  0x80010063ba04 in start_thread (arg=0x80011025b910) at
pthread_create.c:335
#3  0x800100944f58 in __thread_start () at
../sysdeps/unix/sysv/linux/sparc/sparc64/clone.S:93
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

[sparc64] git kernel TPC and OOPS after enabling CONFIG_DEBUG_SLAB

2016-08-16 Thread Anatoly Pugachev

Hello!

I'm getting kernel (git describe v4.8-rc2-6-g3684b03) TPC and OOPS
after enabling CONFIG_DEBUG_SLAB=y on my test sparc64 debian sid
machine.
I wasn't able to trace back when it was introduced, but the 4.6 and 4.7
kernels are also affected.

Boot log (dmesg):


[0.00] PROMLIB: Sun IEEE Boot Prom 'OBP 4.33.6.g 2016/03/11 06:05'
[0.00] PROMLIB: Root node compatible: sun4v
[0.00] Linux version 4.8.0-rc2+ (mator@nvg5120) (gcc version
6.1.1 20160802 (Debian 6.1.1-11) ) #73 SMP Tue Aug 16 14:15:35 MSK
2016
[0.00] debug: skip boot console de-registration.
[0.00] bootconsole [earlyprom0] enabled
[0.00] ARCH: SUN4V
[0.00] Ethernet address: 00:14:4f:ac:4a:18
[0.00] MM: PAGE_OFFSET is 0x8000 (max_phys_bits == 39)
[0.00] MM: VMALLOC [0x0001 --> 0x6000]
[0.00] MM: VMEMMAP [0x6000 --> 0xc000]
[0.00] Kernel: Using 2 locked TLB entries for main kernel image.
[0.00] Remapping the kernel... done.
[0.00] OF stdout device is: /virtual-devices@100/console@1
[0.00] PROM: Built device tree with 195069 bytes of memory.
[0.00] MDESC: Size is 61728 bytes.
[0.00] PLATFORM: banner-name [SPARC Enterprise T5120]
[0.00] PLATFORM: name [SUNW,SPARC-Enterprise-T5120]
[0.00] PLATFORM: hostid [84ac4a18]
[0.00] PLATFORM: serial# [00ab4130]
[0.00] PLATFORM: stick-frequency [457646c0]
[0.00] PLATFORM: mac-address [144fac4a18]
[0.00] PLATFORM: watchdog-resolution [1000 ms]
[0.00] PLATFORM: watchdog-max-timeout [3153600 ms]
[0.00] PLATFORM: max-cpus [64]
[0.00] Top of RAM: 0x3ffb16000, Total RAM: 0x3f76ac000
[0.00] Memory hole size: 132MB
[0.00] Allocated 16384 bytes for kernel page tables.
[0.00] Zone ranges:
[0.00]   Normal   [mem 0x0840-0x0003ffb15fff]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x0840-0x0003ffa89fff]
[0.00]   node   0: [mem 0x0003ffa9a000-0x0003ffaadfff]
[0.00]   node   0: [mem 0x0003ffb08000-0x0003ffb15fff]
[0.00] Initmem setup node 0 [mem 0x0840-0x0003ffb15fff]
[0.00] On node 0 totalpages: 2079574
[0.00]   Normal zone: 18278 pages used for memmap
[0.00]   Normal zone: 0 pages reserved
[0.00]   Normal zone: 2079574 pages, LIFO batch:15
[0.00] Booting Linux...
[0.00] CPU CAPS: [flush,stbar,swap,muldiv,v9,blkinit,n2,mul32]
[0.00] CPU CAPS: [div32,v8plus,popc,vis,vis2,ASIBlkInit]
[0.00] percpu: Embedded 9 pages/cpu @8003ff00 s34200
r8192 d31336 u131072
[0.00] pcpu-alloc: s34200 r8192 d31336 u131072 alloc=1*4194304
[0.00] pcpu-alloc: [0] 00 01 02 03 04 05 06 07 08 09 10 11 12
13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
[0.00] pcpu-alloc: [0] 32 33 34 35 36 37 38 39 40 41 42 43 44
45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
[0.00] SUN4V: Mondo queue sizes [cpu(8192) dev(16384) r(8192) nr(256)]
[0.00] Built 1 zonelists in Zone order, mobility grouping on.
Total pages: 2061296
[0.00] Kernel command line: root=/dev/mapper/vg1-root ro
zswap.enabled=1 keep_bootcon console=ttyS0 noresume
[0.00] log_buf_len individual max cpu contribution: 4096 bytes
[0.00] log_buf_len total cpu_extra contributions: 258048 bytes
[0.00] log_buf_len min size: 131072 bytes
[0.00] log_buf_len: 524288 bytes
[0.00] early log buf free: 127576(97%)
[0.00] PID hash table entries: 4096 (order: 2, 32768 bytes)
[0.00] Dentry cache hash table entries: 2097152 (order: 11,
16777216 bytes)
[0.00] Inode-cache hash table entries: 1048576 (order: 10,
8388608 bytes)
[0.00] Sorting __ex_table...
[0.00] Memory: 16433256K/16636592K available (4999K kernel
code, 481K rwdata, 1320K rodata, 304K init, 984K bss, 203336K
reserved, 0K cma-reserved)
[0.00] Hierarchical RCU implementation.
[0.00]  Build-time adjustment of leaf fanout to 64.
[0.00]  RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=64.
[0.00] RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=64
[0.00] NR_IRQS:2048 nr_irqs:2048 1
[0.00] SUN4V: Using IRQ API major 1, cookie only virqs disabled
[1349360.944214] clocksource: stick: mask: 0x
max_cycles: 0x10cc5ac4c8a, max_idle_ns: 440795218862 ns
[1349360.950803] clocksource: mult[dbabc5] shift[24]
[1349360.952989] clockevent: mult[952b25d1] shift[31]
[1349360.956004] Console: colour dummy device 80x25
[1349361.068108] Calibrating delay using timer specific routine..
2336.45 BogoMIPS (lpj=4672908)
[1349361.069084] pid_max: default: 65536 minimum: 512
[1349361.071178] Security Framework initialized
[1349361.071509] Yama:

[sparc64] qtcreator: segfaults on start

2016-08-04 Thread Anatoly Pugachev

Package: qtcreator
Version: 4.0.2-1
Severity: normal
User: debian-sparc@lists.debian.org
Usertags: sparc64
X-Debbugs-Cc: debian-sparc@lists.debian.org


Dear Maintainer,

   * What led up to the situation?

mator@nvg5120:~$ /usr/bin/qtcreator
failed to get the current screen resources
Qt: couldn't get core keyboard device info
Segmentation fault (core dumped)

mator@nvg5120:~$ gdb /usr/bin/qtcreator

(gdb) run
Starting program: /usr/bin/qtcreator
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc64-linux-gnu/libthread_db.so.1".
[New Thread 0x8001099fd910 (LWP 27814)]
failed to get the current screen resources
Qt: couldn't get core keyboard device info

Thread 1 "qtcreator" received signal SIGSEGV, Segmentation fault.
xcb_setup_vendor_end (R=R@entry=0x0) at xproto.c:869
869 xproto.c: No such file or directory.
(gdb) bt
#0  xcb_setup_vendor_end (R=R@entry=0x0) at xproto.c:869
#1  0x800104297140 in xcb_setup_pixmap_formats_iterator (R=R@entry=0x0) at 
xproto.c:892
#2  0x80010429717c in xcb_setup_roots_iterator (R=0x0) at xproto.c:909
#3  0x800109e57f6c in dri2_x11_add_configs_for_visuals 
(dri2_dpy=dri2_dpy@entry=0x2591f0, disp=disp@entry=0x258880, 
supports_preserved=supports_preserved@entry=true)
at ../../../src/egl/drivers/dri2/platform_x11.c:739
#4  0x800109e58fec in dri2_initialize_x11_swrast (disp=disp@entry=0x258880, 
drv=<optimized out>) at ../../../src/egl/drivers/dri2/platform_x11.c:1213
#5  0x800109e597cc in dri2_initialize_x11 (drv=<optimized out>, 
disp=0x258880) at ../../../src/egl/drivers/dri2/platform_x11.c:1493
#6  0x800109e51f18 in _eglMatchAndInitialize (dpy=0x258880) at 
../../../src/egl/main/egldriver.c:261
#7  0x800109e5201c in _eglMatchDriver (dpy=dpy@entry=0x258880, 
test_only=test_only@entry=0) at ../../../src/egl/main/egldriver.c:292
#8  0x800109e4ad98 in eglInitialize (dpy=0x258880, major=0x7fee380, 
minor=0x7fee384) at ../../../src/egl/main/eglapi.c:484
#9  0x800109d335e0 in ?? () from 
/usr/lib/sparc64-linux-gnu/qt5/plugins/xcbglintegrations/libqxcb-egl-integration.so
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) 

-- System Information:
Debian Release: stretch/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: sparc64

Kernel: Linux 4.7.0-rc7-sparc64-smp (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages qtcreator depends on:
ii  libbotan-1.10-1                        1.10.12-1.1
ii  libc6                                  2.23-4
ii  libclang1-3.6                          1:3.6.2-3+b2
ii  libgcc1                                1:6.1.1-10
ii  libqbscore1                            1.5.1+dfsg-1
ii  libqbsqtprofilesetup1                  1.5.1+dfsg-1
ii  libqt5concurrent5                      5.6.1+dfsg-3
ii  libqt5core5a                           5.6.1+dfsg-3
ii  libqt5designer5                        5.6.1-2
ii  libqt5designercomponents5              5.6.1-2
ii  libqt5gui5                             5.6.1+dfsg-3
ii  libqt5help5                            5.6.1-2
ii  libqt5network5                         5.6.1+dfsg-3
ii  libqt5printsupport5                    5.6.1+dfsg-3
ii  libqt5qml5 [qtdeclarative-abi-5-6-0]   5.6.1-5
ii  libqt5quick5                           5.6.1-5
ii  libqt5quickwidgets5                    5.6.1-5
ii  libqt5sql5                             5.6.1+dfsg-3
ii  libqt5sql5-sqlite                      5.6.1+dfsg-3
ii  libqt5webkit5                          5.6.1+dfsg-4+b1
ii  libqt5widgets5                         5.6.1+dfsg-3
ii  libqt5xml5                             5.6.1+dfsg-3
ii  libstdc++6                             6.1.1-10
ii  qml-module-qtquick-controls            5.6.1-2
ii  qml-module-qtquick2                    5.6.1-5
ii  qtchooser                              58-gfab25f1-1
ii  qtcreator-data                         4.0.2-1

Versions of packages qtcreator recommends:
ii  clang                        1:3.6-33+b1
ii  gdb                          7.11.1-2
ii  make                         4.1-9
ii  qmlscene                     5.6.1-5
ii  qt5-doc                      5.6.1-1
ii  qt5-qmltooling-plugins       5.6.1-5
ii  qtbase5-dev-tools            5.6.1+dfsg-3
ii  qtcreator-doc                4.0.2-1
ii  qtdeclarative5-dev-tools     5.6.1-5
ii  qttools5-dev-tools           5.6.1-2
ii  qttranslations5-l10n         5.6.1-2
ii  qtxmlpatterns5-dev-tools     5.6.1-2
ii  xterm [x-terminal-emulator]  325-1

Versions of packages qtcreator suggests:
pn  cmake          <none>
ii  g++            4:6.1.1-1
ii  git            1:2.8.1-1
pn  kdelibs5-data  <none>
ii  subversion     1.9.4-2

-- no debconf information

Re: btrfs on sparc64 results in kernel stack trace in 1 minute test

2016-07-30 Thread Anatoly Pugachev

On Sat, Jul 30, 2016 at 12:52 AM, Jeff Mahoney <je...@suse.com> wrote:
>> On Jul 29, 2016, at 5:11 PM, Anatoly Pugachev <mator...@gmail.com> wrote:
>> and in logs:
>>
>> Jul 30 00:05:48 nvg5120 kernel: BTRFS info (device loop0): inode
>> 227514 still on the orphan list
>> Jul 30 00:06:01 nvg5120 kernel: ------------[ cut here ]------------
>> Jul 30 00:06:01 nvg5120 kernel: WARNING: CPU: 36 PID: 3110 at
>> fs/btrfs/inode.c:3215 btrfs_orphan_commit_root+0x188/0x1a0 [btrfs]
>> Jul 30 00:06:02 nvg5120 kernel: Modules linked in: loop btrfs
>> zlib_deflate sg e1000e ptp pps_core n2_crypto(+) flash sha256_generic
>> des_generic n2_rng rng_core sunrpc autofs4 ext4 crc16 jbd2 mbcache
>> raid10 raid456 libcrc32c crc32c_generic async_raid6_recov async_memcpy
>> async_pq raid6_pq async_xor xor async_tx raid0 multipath linear dm_mod
>> raid1 md_mod sd_mod mptsas scsi_transport_sas mptscsih scsi_mod
>> mptbase
>> Jul 30 00:06:02 nvg5120 kernel: CPU: 36 PID: 3110 Comm:
>> btrfs-transacti Tainted: G  D 4.7.0+ #51
>> Jul 30 00:06:02 nvg5120 kernel: Call Trace:
>> Jul 30 00:06:02 nvg5120 kernel:  [00463e44] __warn+0xa4/0xc0
>> Jul 30 00:06:02 nvg5120 kernel:  [10a2ae48]
>> btrfs_orphan_commit_root+0x188/0x1a0 [btrfs]
>> Jul 30 00:06:02 nvg5120 kernel:  [10a214c0]
>> commit_fs_roots+0xa0/0x180 [btrfs]
>> Jul 30 00:06:02 nvg5120 kernel:  [10a242d0]
>> btrfs_commit_transaction+0x4b0/0xd00 [btrfs]
>> Jul 30 00:06:02 nvg5120 kernel:  [10a1cc30]
>> transaction_kthread+0xf0/0x1c0 [btrfs]
>> Jul 30 00:06:02 nvg5120 kernel:  [00480ff0] kthread+0xb0/0xe0
>> Jul 30 00:06:02 nvg5120 kernel:  [00406044] ret_from_fork+0x1c/0x2c
>> Jul 30 00:06:02 nvg5120 kernel:  []   (null)
>> Jul 30 00:06:02 nvg5120 kernel: ---[ end trace ee8374e54a090229 ]---
>>
> This is tainted D, which means there's an Oops above this in the log.  Can 
> you provide that?
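
For reference, the letters after "Tainted:" each name one condition the
kernel has recorded. A minimal sketch of decoding the letters that appear
in this thread, assuming the helper names taint_meaning and
had_earlier_oops (hypothetical, not kernel functions) and listing only a
subset of the flags:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Meaning of a few taint letters as printed after "Tainted:".
 * Only a subset is listed here; unknown letters return NULL. */
static const char *taint_meaning(char flag)
{
    switch (flag) {
    case 'G': return "only GPL-compatible modules loaded";
    case 'P': return "proprietary module loaded";
    case 'D': return "kernel died recently (earlier oops or BUG)";
    case 'W': return "a kernel warning was issued earlier";
    case 'E': return "unsigned module loaded";
    default:  return NULL;
    }
}

/* Return 1 if the taint string (e.g. "G  D ") reports an earlier oops. */
static int had_earlier_oops(const char *taint)
{
    assert(taint != NULL);
    return strchr(taint, 'D') != NULL;
}
```

With the string from the trace above, had_earlier_oops("G  D ") returns 1,
which is why the first oops in the log is the one worth looking at.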


Jeff,

that is another kernel oops, which I still need to investigate:

Jul 29 21:25:35 nvg5120 kernel: e1000e :09:00.1 enp9s0f1: renamed from eth3
Jul 29 21:25:35 nvg5120 systemd-udevd[1488]: worker [1654] terminated
by signal 9 (Killed)
Jul 29 21:25:35 nvg5120 systemd-udevd[1488]: worker [1654] failed
while handling '/devices/root/f0283a50/f028681c'
Jul 29 21:25:36 nvg5120 systemd[1]: Found device ST914602SSUN146G 1.
Jul 29 21:25:40 nvg5120 kernel: e1000e :08:00.1 enp8s0f1: renamed from eth1
Jul 29 21:25:40 nvg5120 kernel: n2_crypto: md5 alg registration failed
Jul 29 21:25:40 nvg5120 kernel: n2cp f028681c:
/virtual-devices@100/n2cp@7: Unable to register algorithms.
Jul 29 21:25:40 nvg5120 kernel: sha1_sparc64: sparc64 sha1 opcode not available.
Jul 29 21:25:40 nvg5120 kernel: n2cp: probe of f028681c failed with error -22
Jul 29 21:25:40 nvg5120 kernel: n2_crypto: Found NCP at
/virtual-devices@100/ncp@6
Jul 29 21:25:40 nvg5120 kernel: n2_crypto: Registered NCS HVAPI version 2.0
Jul 29 21:25:40 nvg5120 kernel: Kernel unaligned access at TPC[577b68]
kmem_cache_alloc+0xa8/0x1a0
Jul 29 21:25:40 nvg5120 kernel: Unable to handle kernel paging request
in mna handler
Jul 29 21:25:40 nvg5120 kernel:  at virtual address 6b6aeb6f69f2cb6b
Jul 29 21:25:41 nvg5120 kernel: current->{active_,}mm->context =
07a2
Jul 29 21:25:41 nvg5120 kernel: current->{active_,}mm->pgd = 8003e9c72000
Jul 29 21:25:41 nvg5120 kernel:   \|/  \|/
  "@'/ .. \`@"
  /_| \__/ |_\
 \__U_/
Jul 29 21:25:41 nvg5120 kernel: systemd-udevd(1654): Oops [#1]
Jul 29 21:25:41 nvg5120 kernel: CPU: 56 PID: 1654 Comm: systemd-udevd
Not tainted 4.7.0+ #51
Jul 29 21:25:41 nvg5120 kernel: task: 8003ecf90a20 ti:
8003edcd4000 task.ti: 8003edcd4000
Jul 29 21:25:41 nvg5120 kernel: TSTATE: 004411e01605 TPC:
00577b68 TNPC: 00577b6c Y: Not tainted
Jul 29 21:25:41 nvg5120 kernel: TPC: <kmem_cache_alloc+0xa8/0x1a0>
Jul 29 21:25:41 nvg5120 kernel: g0:  g1:
6b6b6b6b6b6b6b6b g2:  g3: 
Jul 29 21:25:41 nvg5120 kernel: g4: 8003ecf90a20 g5:
8003fe876000 g6: 8003edcd4000 g7: cee0
Jul 29 21:25:41 nvg5120 kernel: o0:  o1:
03ff o2:  o3: 8003eee883c0
Jul 29 21:25:41 nvg5120 kernel: o4: 0080 o5:
0011 sp: 8003edcd6b51 ret_pc: 00577b34
Jul 29 21:25:41 nvg5120 kernel: RPC: <kmem_cache_alloc+0x74/0x1a0>
Jul 29 21:25:41 nvg5120 kernel: l0: 8003ffa28040 l1:
8003ffa28030 l2: d5c0 l3: 009f4800
Jul 29 21:25:41 nvg5120 kernel: l4:  l5:
000

Re: btrfs on sparc64 results in kernel stack trace in 1 minute test

2016-07-29 Thread Anatoly Pugachev

On Thu, Jul 14, 2016 at 1:29 PM, Filipe Manana <fdman...@gmail.com> wrote:
> On Thu, Jul 14, 2016 at 11:08 AM, Anatoly Pugachev <mator...@gmail.com> wrote:
>> Hi!
>>
>> I'm using git (describe, v4.7-rc7-16-gcf875cc) kernel,
>> with patch "fix extent buffer bitmap tests on big-endian systems", see
>> [1] (to be able to load/use btrfs module)
>>
>> and the btrfs filesystem goes read-only, with a kernel stack trace,
>> within 1 minute of starting to copy files to it.
>
> We've seen this happening on arm64 as well, and it's currently being
> investigated.

update,

I can't reproduce the same trace on the 4.7.0+ kernel (v4.7-0-g523d939) with
the "big endian" patch [1] and btrfs-progs 4.7.
After about 50 minutes of copying in a loop, I got:

mator@nvg5120:~$ cnt=0; while true; do let cnt++; echo -n "$cnt ";
date; sleep 2; rm -rf /mnt/1/testdir; for i in  linux-2.6 gcc-6.1.0
v7.4.1a; do echo -n "$i "; rsync -a $i /mnt/1/testdir; done; done
1 Fri Jul 29 23:16:55 MSK 2016
linux-2.6 gcc-6.1.0 v7.4.1a 2 Fri Jul 29 23:34:18 MSK 2016
linux-2.6 gcc-6.1.0 v7.4.1a 3 Fri Jul 29 23:57:13 MSK 2016
rm: cannot remove '/mnt/1/testdir/linux-2.6/drivers/nvme': Directory not empty

and in logs:

Jul 30 00:05:48 nvg5120 kernel: BTRFS info (device loop0): inode
227514 still on the orphan list
Jul 30 00:06:01 nvg5120 kernel: ------------[ cut here ]------------
Jul 30 00:06:01 nvg5120 kernel: WARNING: CPU: 36 PID: 3110 at
fs/btrfs/inode.c:3215 btrfs_orphan_commit_root+0x188/0x1a0 [btrfs]
Jul 30 00:06:02 nvg5120 kernel: Modules linked in: loop btrfs
zlib_deflate sg e1000e ptp pps_core n2_crypto(+) flash sha256_generic
des_generic n2_rng rng_core sunrpc autofs4 ext4 crc16 jbd2 mbcache
raid10 raid456 libcrc32c crc32c_generic async_raid6_recov async_memcpy
async_pq raid6_pq async_xor xor async_tx raid0 multipath linear dm_mod
raid1 md_mod sd_mod mptsas scsi_transport_sas mptscsih scsi_mod
mptbase
Jul 30 00:06:02 nvg5120 kernel: CPU: 36 PID: 3110 Comm:
btrfs-transacti Tainted: G  D 4.7.0+ #51
Jul 30 00:06:02 nvg5120 kernel: Call Trace:
Jul 30 00:06:02 nvg5120 kernel:  [00463e44] __warn+0xa4/0xc0
Jul 30 00:06:02 nvg5120 kernel:  [10a2ae48]
btrfs_orphan_commit_root+0x188/0x1a0 [btrfs]
Jul 30 00:06:02 nvg5120 kernel:  [10a214c0]
commit_fs_roots+0xa0/0x180 [btrfs]
Jul 30 00:06:02 nvg5120 kernel:  [10a242d0]
btrfs_commit_transaction+0x4b0/0xd00 [btrfs]
Jul 30 00:06:02 nvg5120 kernel:  [10a1cc30]
transaction_kthread+0xf0/0x1c0 [btrfs]
Jul 30 00:06:02 nvg5120 kernel:  [00480ff0] kthread+0xb0/0xe0
Jul 30 00:06:02 nvg5120 kernel:  [00406044] ret_from_fork+0x1c/0x2c
Jul 30 00:06:02 nvg5120 kernel:  []   (null)
Jul 30 00:06:02 nvg5120 kernel: ---[ end trace ee8374e54a090229 ]---


[1]. http://www.spinics.net/lists/linux-btrfs/msg57193.html

Re: [sparc64] mkfs.btrfs bus error / align issue?

2016-07-29 Thread Anatoly Pugachev

On Fri, Jul 29, 2016 at 3:41 PM, David Sterba <dste...@suse.cz> wrote:
> On Thu, Jul 28, 2016 at 11:34:58PM +0300, Anatoly Pugachev wrote:
>> well, I think mkfs.btrfs is fixed; I just tested it with:
>
> Good news, thanks.
>
> quick stats of the TPC messages:
>
>  23 __btrfs_map_block+0x36c/0x1180
>   9 __remove_rbio_from_cache+0x38/0x140
>   6 lock_stripe_add+0xb0/0x360
>   4 __btrfs_map_block+0x3d4/0x1180
>   3 __btrfs_map_block+0xca0/0x1180
>
> running this in 'gdb btrfs.ko' for each of the addresses should tell us
> what the locations are:
>
> gdb> l *(__btrfs_map_block+0x36c)
> ...

Installed fresh btrfs-progs from git:
mator@nvg5120:~/btrfs-progs$ git describe --long
v4.7-0-g9d2ea01

Recompiled the kernel with debug info and ran xfstests ./check 'btrfs/06?' again:

mator@nvg5120:~/linux-2.6$ git describe --long
v4.7-0-g523d939

The kernel is patched with [1] to allow loading the btrfs module on
big-endian systems (I am not sure whether current linux kernel git already
includes this patch; I checked out the plain v4.7 tag, which is 5 days old).

root@nvg5120:/home/mator/xfstests# ./check 'btrfs/06?'
FSTYP -- btrfs
PLATFORM  -- Linux/sparc64 nvg5120 4.7.0+
MKFS_OPTIONS  -- /dev/loop0
MOUNT_OPTIONS -- /dev/loop0 /mnt/scratch

btrfs/060 156s
btrfs/061 182s
btrfs/062 312s
btrfs/063 162s
btrfs/064 152s
btrfs/065 61s
btrfs/066 65s
btrfs/067 158s
btrfs/068 74s
btrfs/069 65s
Ran: btrfs/060 btrfs/061 btrfs/062 btrfs/063 btrfs/064 btrfs/065
btrfs/066 btrfs/067 btrfs/068 btrfs/069
Passed all 10 tests

$ journalctl -b -k | awk '/TPC/{print $11}' | sort | uniq -c | sort -n
  4 __btrfs_map_block+0xa10/0x1100
  5 lock_stripe_add+0xb0/0x340
  7 __btrfs_map_block+0x9d4/0x1100
  9 __remove_rbio_from_cache+0x30/0x140
 29 __btrfs_map_block+0x96c/0x1100


$ gdb -q /lib/modules/4.7.0+/kernel/fs/btrfs/btrfs.ko
Reading symbols from /lib/modules/4.7.0+/kernel/fs/btrfs/btrfs.ko...done.
(gdb) l *(__btrfs_map_block+0x96c)
0x8498c is in __btrfs_map_block (fs/btrfs/volumes.c:5615).
5610        div_u64_rem(stripe_nr, num_stripes, &rot);
5611
5612        /* Fill in the logical address of each stripe */
5613        tmp = stripe_nr * nr_data_stripes(map);
5614        for (i = 0; i < nr_data_stripes(map); i++)
5615            bbio->raid_map[(i+rot) % num_stripes] =
5616                em->start + (tmp + i) * map->stripe_len;
5617
5618        bbio->raid_map[(i+rot) % map->num_stripes] = RAID5_P_STRIPE;
5619        if (map->type & BTRFS_BLOCK_GROUP_RAID6)
(gdb) l *(__btrfs_map_block+0x9d4)
0x849f4 is in __btrfs_map_block (fs/btrfs/volumes.c:5618).
5613        tmp = stripe_nr * nr_data_stripes(map);
5614        for (i = 0; i < nr_data_stripes(map); i++)
5615            bbio->raid_map[(i+rot) % num_stripes] =
5616                em->start + (tmp + i) * map->stripe_len;
5617
5618        bbio->raid_map[(i+rot) % map->num_stripes] = RAID5_P_STRIPE;
5619        if (map->type & BTRFS_BLOCK_GROUP_RAID6)
5620            bbio->raid_map[(i+rot+1) % num_stripes] =
5621                RAID6_Q_STRIPE;
5622    }
(gdb) l *(__btrfs_map_block+0xa10)
0x84a30 is in __btrfs_map_block (fs/btrfs/volumes.c:5620).
5615            bbio->raid_map[(i+rot) % num_stripes] =
5616                em->start + (tmp + i) * map->stripe_len;
5617
5618        bbio->raid_map[(i+rot) % map->num_stripes] = RAID5_P_STRIPE;
5619        if (map->type & BTRFS_BLOCK_GROUP_RAID6)
5620            bbio->raid_map[(i+rot+1) % num_stripes] =
5621                RAID6_Q_STRIPE;
5622    }
5623
5624    if (rw & REQ_DISCARD) {
(gdb) l *(lock_stripe_add+0xb0)
0xe0370 is in lock_stripe_add (fs/btrfs/raid56.c:685).
680     int walk = 0;
681
682     spin_lock_irqsave(&h->lock, flags);
683     list_for_each_entry(cur, &h->hash_list, hash_list) {
684         walk++;
685         if (cur->bbio->raid_map[0] == rbio->bbio->raid_map[0]) {
686             spin_lock(&cur->bio_list_lock);
687
688             /* can we steal this cached rbio's pages? */
689             if (bio_list_empty(&cur->bio_list) &&
(gdb) l *(__remove_rbio_from_cache+0x30)
0xdfe30 is in __remove_rbio_from_cache (include/linux/spinlock.h:302).
297         raw_spin_lock_init(&(_lock)->rlock);    \
298     } while (0)
299
300     static __always_inline void spin_lock(spinlock_t *lock)
301     {
302         raw_spin_lock(&lock->rlock);
303     }
304
305     static __always_inline void spin_lock_bh(spinlock_t *lock)
306     {


Thanks.

[1]. http://www.spinics.net/lists/linux-btrfs/msg57193.html

Re: [sparc64] mkfs.btrfs bus error / align issue?

2016-07-29 Thread Anatoly Pugachev

On Thu, Jul 28, 2016 at 11:34 PM, Anatoly Pugachev <mator...@gmail.com> wrote:
> On Thu, Jul 28, 2016 at 9:04 PM, David Sterba <dste...@suse.cz> wrote:
>> On Thu, Jul 28, 2016 at 04:28:41PM +0200, John Paul Adrian Glaubitz wrote:
>>> On 07/28/2016 04:25 PM, John Paul Adrian Glaubitz wrote:
>>> > On 07/28/2016 04:01 PM, Anatoly Pugachev wrote:
>>> >> Program received signal SIGBUS, Bus error.
>>> >> 0x00177dfc in raid6_gen_syndrome (disks=4, bytes=65536,
>>> >> ptrs=0x2c4510) at raid6.c:87
>>> >> 87  wq0 = wp0 = *(unative_t *)&dptr[z0][d+0*NSIZE];
>>> >
>>> > That should be easy to fix. Just make the R values aligned with the
>>> > appropriate get_aligned functions, see David's previous commit [1]:
>>>
>>> Argh, those are called get_UNaligned_*, not get_aligned_*.
>>>
>>> > There are more lines in raid6.c which need the same fix, basically 
>>> > everything
>>> > with * (unative_t *).
>>>
>>> Oh, and you will somehow need to guard this with #if BITS_PER_LONG == 64 ...
>>> #else ... #endif respectively since you need to use different versions
>>> (64 vs. 32) of get_unaligned_* depending on the size of unative_t.
>>
>> And I've fixed it that way, now pushed to devel ("btrfs-progs: fix
>> unaligned access in raid6 calculations" [1]). Would be great if you or
>> Anatoly can test it so I can add it to the 4.7 release (ETA tomorrow).
>
> David,
> well, I think mkfs.btrfs is fixed; I just tested it with:
> root@nvg5120:/home/mator/xfstests# ./check 'btrfs/06?'
> FSTYP -- btrfs
> PLATFORM  -- Linux/sparc64 nvg5120 4.7.0+
> MKFS_OPTIONS  -- /dev/loop0
> MOUNT_OPTIONS -- /dev/loop0 /mnt/scratch
>
> btrfs/060 145s
> btrfs/061 158s
> btrfs/062 288s
> btrfs/063 141s
> btrfs/064 129s
> btrfs/065 44s
> btrfs/066 46s
> btrfs/067 - output mismatch (see /home/mator/xfstests/results//btrfs/067.out.bad)
> --- tests/btrfs/067.out 2016-07-20 12:12:21.772228422 +0300
> +++ /home/mator/xfstests/results//btrfs/067.out.bad 2016-07-28 22:54:00.059192629 +0300
> @@ -1,2 +1,3 @@
>  QA output created by 067
>  Silence is golden
> +Scrub find errors in "-m single -d single" test
> ...
> (Run 'diff -u tests/btrfs/067.out /home/mator/xfstests/results//btrfs/067.out.bad'  to see the entire diff)
> btrfs/068 57s
> btrfs/069 45s
> Ran: btrfs/060 btrfs/061 btrfs/062 btrfs/063 btrfs/064 btrfs/065
> btrfs/066 btrfs/067 btrfs/068 btrfs/069
> Failures: btrfs/067
> Failed 1 of 10 tests
>
>
> Previously (before the mkfs.btrfs fix), all of the 06? tests failed.
>
> Starting from "tests/btrfs/064", the kernel started to log a lot of TPC
> (Trap Program Counter register) messages.
>
> I put the results of this test on a webserver [1].
> The output of journalctl -b (from boot) with the TPC messages is at [2].
>
> I am not sure what we need to do about the sparc64 btrfs module TPC messages.
> Should I file a kernel bugzilla report?
>
> Thanks.
>
> [1] http://u163.east.ru/btrfs/xfstests-btrfs-06x-results.tar.gz
> [2] http://u163.east.ru/btrfs/kernel-4.7.0+-logs-xfstests-06x.txt.gz
>
> PS: my xfstests setup is the following:
>
> # mount tmpfs -t tmpfs -o size=13g /ramdisk/
> /ramdisk# for i in 1 2 3 4 5 6; do fallocate -l 1g scratch${i}; done
> /ramdisk# fallocate -l 4g testvol1
>
> /ramdisk# for i in *; do losetup -f $i; done
> /home/mator/xfstests# losetup
> NAME       SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE         DIO
> /dev/loop0         0      0         0  0 /ramdisk/scratch1   0
> /dev/loop1         0      0         0  0 /ramdisk/scratch2   0
> /dev/loop2         0      0         0  0 /ramdisk/scratch3   0
> /dev/loop3         0      0         0  0 /ramdisk/scratch4   0
> /dev/loop4         0      0         0  0 /ramdisk/scratch5   0
> /dev/loop5         0      0         0  0 /ramdisk/scratch6   0
> /dev/loop6         0      0         0  0 /ramdisk/testvol1   0
>
> # mkfs.btrfs /dev/loop6
> btrfs-progs v4.6.1-66-g4367e35
> See http://btrfs.wiki.kernel.org for more information.
>
> Performing full device TRIM (4.00GiB) ...
> Label:              (null)
> UUID:               6a4d5918-adfe-469c-8454-9b28545b88bc
> Node size:          16384
> Sector size:        8192
> Filesystem size:    4.00GiB
> Block group profiles:
>   Data:             single            8.00MiB
>   Metadata:         DUP             204.75MiB
>   System:           DUP               8.00MiB
> SSD detected:

Re: [sparc64] mkfs.btrfs bus error / align issue?

2016-07-28 Thread Anatoly Pugachev

On Thu, Jul 28, 2016 at 9:04 PM, David Sterba <dste...@suse.cz> wrote:
> On Thu, Jul 28, 2016 at 04:28:41PM +0200, John Paul Adrian Glaubitz wrote:
>> On 07/28/2016 04:25 PM, John Paul Adrian Glaubitz wrote:
>> > On 07/28/2016 04:01 PM, Anatoly Pugachev wrote:
>> >> Program received signal SIGBUS, Bus error.
>> >> 0x00177dfc in raid6_gen_syndrome (disks=4, bytes=65536,
>> >> ptrs=0x2c4510) at raid6.c:87
>> >> 87  wq0 = wp0 = *(unative_t *)&dptr[z0][d+0*NSIZE];
>> >
>> > That should be easy to fix. Just make the R values aligned with the
>> > appropriate get_aligned functions, see David's previous commit [1]:
>>
>> Argh, those are called get_UNaligned_*, not get_aligned_*.
>>
>> > There are more lines in raid6.c which need the same fix, basically 
>> > everything
>> > with * (unative_t *).
>>
>> Oh, and you will somehow need to guard this with #if BITS_PER_LONG == 64 ...
>> #else ... #endif respectively since you need to use different versions
>> (64 vs. 32) of get_unaligned_* depending on the size of unative_t.
>
> And I've fixed it that way, now pushed to devel ("btrfs-progs: fix
> unaligned access in raid6 calculations" [1]). Would be great if you or
> Anatoly can test it so I can add it to the 4.7 release (ETA tomorrow).


David,

well, I think mkfs.btrfs is fixed; I just tested it with:

root@nvg5120:/home/mator/xfstests# ./check 'btrfs/06?'
FSTYP -- btrfs
PLATFORM  -- Linux/sparc64 nvg5120 4.7.0+
MKFS_OPTIONS  -- /dev/loop0
MOUNT_OPTIONS -- /dev/loop0 /mnt/scratch

btrfs/060 145s
btrfs/061 158s
btrfs/062 288s
btrfs/063 141s
btrfs/064 129s
btrfs/065 44s
btrfs/066 46s
btrfs/067 - output mismatch (see /home/mator/xfstests/results//btrfs/067.out.bad)
--- tests/btrfs/067.out 2016-07-20 12:12:21.772228422 +0300
+++ /home/mator/xfstests/results//btrfs/067.out.bad 2016-07-28 22:54:00.059192629 +0300
@@ -1,2 +1,3 @@
 QA output created by 067
 Silence is golden
+Scrub find errors in "-m single -d single" test
...
(Run 'diff -u tests/btrfs/067.out /home/mator/xfstests/results//btrfs/067.out.bad'  to see the entire diff)
btrfs/068 57s
btrfs/069 45s
Ran: btrfs/060 btrfs/061 btrfs/062 btrfs/063 btrfs/064 btrfs/065
btrfs/066 btrfs/067 btrfs/068 btrfs/069
Failures: btrfs/067
Failed 1 of 10 tests


Previously (before the mkfs.btrfs fix), all of the 06? tests failed.

Starting from "tests/btrfs/064", the kernel started to log a lot of TPC
(Trap Program Counter register) messages.

I put the results of this test on a webserver [1].
The output of journalctl -b (from boot) with the TPC messages is at [2].

I am not sure what we need to do about the sparc64 btrfs module TPC messages.
Should I file a kernel bugzilla report?

Thanks.

[1] http://u163.east.ru/btrfs/xfstests-btrfs-06x-results.tar.gz
[2] http://u163.east.ru/btrfs/kernel-4.7.0+-logs-xfstests-06x.txt.gz

PS: my xfstests setup is the following:

# mount tmpfs -t tmpfs -o size=13g /ramdisk/
/ramdisk# for i in 1 2 3 4 5 6; do fallocate -l 1g scratch${i}; done
/ramdisk# fallocate -l 4g testvol1

/ramdisk# for i in *; do losetup -f $i; done
/home/mator/xfstests# losetup
NAME       SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE         DIO
/dev/loop0         0      0         0  0 /ramdisk/scratch1   0
/dev/loop1         0      0         0  0 /ramdisk/scratch2   0
/dev/loop2         0      0         0  0 /ramdisk/scratch3   0
/dev/loop3         0      0         0  0 /ramdisk/scratch4   0
/dev/loop4         0      0         0  0 /ramdisk/scratch5   0
/dev/loop5         0      0         0  0 /ramdisk/scratch6   0
/dev/loop6         0      0         0  0 /ramdisk/testvol1   0

# mkfs.btrfs /dev/loop6
btrfs-progs v4.6.1-66-g4367e35
See http://btrfs.wiki.kernel.org for more information.

Performing full device TRIM (4.00GiB) ...
Label:              (null)
UUID:               6a4d5918-adfe-469c-8454-9b28545b88bc
Node size:          16384
Sector size:        8192
Filesystem size:    4.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         DUP             204.75MiB
  System:           DUP               8.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1     4.00GiB  /dev/loop6

root@nvg5120:/home/mator/xfstests# cat local.config
export TEST_DEV=/dev/loop6
export TEST_DIR=/fst
export SCRATCH_DEV_POOL="/dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
/dev/loop4 /dev/loop5"
export SCRATCH_MNT=/mnt/scratch

Re: [sparc64] mkfs.btrfs bus error / align issue?

2016-07-28 Thread Anatoly Pugachev

On Thu, Jul 28, 2016 at 3:24 PM, David Sterba <dste...@suse.cz> wrote:
> On Thu, Jul 28, 2016 at 02:09:03PM +0200, John Paul Adrian Glaubitz wrote:
>> Hi David!
>>
>> On 07/28/2016 01:58 PM, Anatoly Pugachev wrote:
>> >> Can you please test with the current 'devel' branch? Fixed by the patch
>> >> "btrfs-progs: fix unaligned access calculating raid56 data" (depends on
>> >> another patch in devel). Thanks.
>>
>> Are you sure you pushed these changes? I don't see them in the devel branch 
>> [1].
>
> Oh shame, of course not. Now pushed.

David,

after checkout of devel tree:

mator@nvg5120:~/devel$ git describe
v4.6.1-64-g6d1564c
mator@nvg5120:~/devel$ git remote -v
origin  git://repo.or.cz/btrfs-progs-unstable/devel.git (fetch)
origin  git://repo.or.cz/btrfs-progs-unstable/devel.git (push)
mator@nvg5120:~/devel$ git branch
* devel
  master


a new faulting location turned up:

Reading symbols from /opt/btrfs/bin/mkfs.btrfs...done.
(gdb) set args -f -draid6 -mraid6 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
(gdb) run
Starting program: /opt/btrfs/bin/mkfs.btrfs -f -draid6 -mraid6
/dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc64-linux-gnu/libthread_db.so.1".
btrfs-progs v4.6.1-64-g6d1564c
See http://btrfs.wiki.kernel.org for more information.

ERROR: superblock checksum mismatch
ERROR: superblock checksum mismatch
ERROR: superblock checksum mismatch
Performing full device TRIM (2.00GiB) ...
Performing full device TRIM (2.00GiB) ...
Performing full device TRIM (2.00GiB) ...
Performing full device TRIM (2.00GiB) ...

Program received signal SIGBUS, Bus error.
0x00177dfc in raid6_gen_syndrome (disks=4, bytes=65536,
ptrs=0x2c4510) at raid6.c:87
87  wq0 = wp0 = *(unative_t *)&dptr[z0][d+0*NSIZE];
(gdb) bt
#0  0x00177dfc in raid6_gen_syndrome (disks=4, bytes=65536,
ptrs=0x2c4510) at raid6.c:87
#1  0x0015e174 in write_raid56_with_parity (info=0x2b37b0,
eb=0x2c9fe0, multi=0x2c4870, stripe_len=65536, raid_map=0x2c4570)
at volumes.c:2151
#2  0x00119bd0 in write_and_map_eb (trans=0x2ce250,
root=0x2c9d80, eb=0x2c9fe0) at disk-io.c:426
#3  0x00119f14 in write_tree_block (trans=0x2ce250,
root=0x2c9d80, eb=0x2c9fe0) at disk-io.c:459
#4  0x0011a54c in __commit_transaction (trans=0x2ce250,
root=0x2c9d80) at disk-io.c:562
#5  0x0011a858 in btrfs_commit_transaction (trans=0x2ce250,
root=0x2c9d80) at disk-io.c:598
#6  0x001a52e8 in main (argc=8, argv=0x7fef698) at mkfs.c:1809
(gdb)
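
The faulting statement at raid6.c:87 casts a byte pointer to unative_t *
and dereferences it; sparc64 raises SIGBUS when that address is not a
multiple of the word size. A minimal sketch of the usual remedy, assuming
the names word_t, load_u32, load_u64, load_word and word_at (all
hypothetical; this is not the btrfs-progs get_unaligned_* code): read each
word through memcpy, and pick the 32- or 64-bit variant with a
preprocessor guard on the size of long.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* memcpy has no alignment requirement, so these loads are safe at any
 * address, unlike a direct *(uint64_t *)p on a strict-alignment CPU. */
static inline uint64_t load_u64(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

static inline uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Select the native word, mirroring a BITS_PER_LONG == 64 guard. */
#if ULONG_MAX > 0xffffffffUL
typedef uint64_t word_t;
#define load_word(p) load_u64(p)
#else
typedef uint32_t word_t;
#define load_word(p) load_u32(p)
#endif

/* Fetch word number idx from a byte buffer that may start at any address. */
static inline word_t word_at(const uint8_t *buf, size_t buf_len, size_t idx)
{
    assert((idx + 1) * sizeof(word_t) <= buf_len); /* stay inside buf */
    return load_word(buf + idx * sizeof(word_t));
}
```

Calling word_at(buf + 1, len, 0) on a buffer that starts one byte past an
aligned address is exactly the case that traps with the cast and works
with the copy.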

Re: [sparc64] mkfs.btrfs bus error / align issue?

2016-07-28 Thread Anatoly Pugachev

On Thu, Jul 28, 2016 at 12:44 PM, David Sterba <dste...@suse.cz> wrote:
> On Wed, Jul 27, 2016 at 09:56:09PM +0200, David Sterba wrote:
>> On Wed, Jul 27, 2016 at 04:59:27PM +0300, Anatoly Pugachev wrote:
>> > Hello!
>> >
>> > While running the xfstests suite I got a mkfs.btrfs bus error in the
>> > logs; debugging it shows the following:
>> >
>> > Program received signal SIGBUS, Bus error.
>> > 0x0015e160 in write_raid56_with_parity (info=0x2b17b0,
>> > eb=0x2c7fe0, multi=0x2c2870, stripe_len=65536, raid_map=0x2c2570) at
>> > volumes.c:2156
>> > 2156            *(unsigned long *)(p_eb->data + i) ^=
>>
>> Yeah, clear unaligned access. We have helpers for that, so I'll fix it. I was
>> looking for a way to simulate and catch that on x86 or at least let gcc
>> warn but no such thing seems to exist. Which means we might accidentally
>> introduce that in the future.
>
> Can you please test with the current 'devel' branch? Fixed by the patch
> "btrfs-progs: fix unaligned access calculating raid56 data" (depends on
> another patch in devel). Thanks.

David,

but where do I get the devel branch of btrfs-progs?
I just tried git://repo.or.cz/btrfs-progs-unstable/devel.git, but
this is still the last commit in it:

mator@nvg5120:~/1/devel$ git log -1 --pretty=format:"%h %s, %an, %ad"
--date=short
40650bf Btrfs progs v4.6.1, David Sterba, 2016-06-24

So which git repo should I pull to get the latest development patches?

Thanks.

[sparc64] mkfs.btrfs bus error / align issue?

2016-07-27 Thread Anatoly Pugachev

Hello!

While running the xfstests suite I got a mkfs.btrfs bus error in the
logs; debugging it shows the following:

mator@nvg5120:~/btrfs-progs$ git log -1 --oneline
40650bf Btrfs progs v4.6.1

root@nvg5120:/home/mator/xfstests# gdb
GNU gdb (Debian 7.11.1-2) 7.11.1
(gdb) file /opt/btrfs/bin/mkfs.btrfs
Reading symbols from /opt/btrfs/bin/mkfs.btrfs...done.
(gdb) set args -f -draid5 -mraid5 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
(gdb) run
Starting program: /opt/btrfs/bin/mkfs.btrfs -f -draid5 -mraid5
/dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc64-linux-gnu/libthread_db.so.1".
btrfs-progs v4.6.1
See http://btrfs.wiki.kernel.org for more information.

ERROR: superblock checksum mismatch
ERROR: superblock checksum mismatch
ERROR: superblock checksum mismatch
Performing full device TRIM (2.00GiB) ...
Performing full device TRIM (2.00GiB) ...
Performing full device TRIM (2.00GiB) ...
Performing full device TRIM (2.00GiB) ...

Program received signal SIGBUS, Bus error.
0x0015e160 in write_raid56_with_parity (info=0x2b17b0,
eb=0x2c7fe0, multi=0x2c2870, stripe_len=65536, raid_map=0x2c2570) at
volumes.c:2156
2156            *(unsigned long *)(p_eb->data + i) ^=
(gdb) bt
#0  0x0015e160 in write_raid56_with_parity (info=0x2b17b0,
eb=0x2c7fe0, multi=0x2c2870, stripe_len=65536, raid_map=0x2c2570)
at volumes.c:2156
#1  0x00119b30 in write_and_map_eb (trans=0x2cc250,
root=0x2c7d80, eb=0x2c7fe0) at disk-io.c:426
#2  0x00119e74 in write_tree_block (trans=0x2cc250,
root=0x2c7d80, eb=0x2c7fe0) at disk-io.c:459
#3  0x0011a4ac in __commit_transaction (trans=0x2cc250,
root=0x2c7d80) at disk-io.c:562
#4  0x0011a7b8 in btrfs_commit_transaction (trans=0x2cc250,
root=0x2c7d80) at disk-io.c:598
#5  0x001a2b04 in main (argc=8, argv=0x7fef698) at mkfs.c:1786
(gdb)

Can someone help please? Thanks.

PS: the /dev/loop devices are backed by files on a ramdisk:

# mount tmpfs -t tmpfs -o size=12g /ramdisk
# fallocate -l 3.9G /ramdisk/testvol
# for i in 1 2 3 4; do fallocate -l 2G /ramdisk/scratch${i} ; done
# ls -lh /ramdisk/
total 12G
-rw-r--r-- 1 root root 2.0G Jul 27 16:16 scratch1
-rw-r--r-- 1 root root 2.0G Jul 27 16:16 scratch2
-rw-r--r-- 1 root root 2.0G Jul 27 16:16 scratch3
-rw-r--r-- 1 root root 2.0G Jul 27 16:16 scratch4
-rw-r--r-- 1 root root 3.9G Jul 27 16:15 testvol

# for i in /ramdisk/*; do echo -n "$i : "; losetup -f --show $i; done
/ramdisk/scratch1 : /dev/loop0
/ramdisk/scratch2 : /dev/loop1
/ramdisk/scratch3 : /dev/loop2
/ramdisk/scratch4 : /dev/loop3
/ramdisk/testvol : /dev/loop4
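
The statement that faults at volumes.c:2156 XORs one unsigned long at a
time into the parity buffer through a cast pointer. A minimal sketch of
the same XOR done without the cast, assuming a hypothetical helper named
xor_into_parity (this is not the fix that went into btrfs-progs):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* XOR len bytes of data into parity, one native word at a time.
 * Neither pointer needs to be word-aligned: each word goes through
 * memcpy rather than a cast such as *(unsigned long *)(parity + i). */
static void xor_into_parity(uint8_t *parity, const uint8_t *data, size_t len)
{
    assert(len % sizeof(unsigned long) == 0); /* whole words only */
    for (size_t i = 0; i < len; i += sizeof(unsigned long)) {
        unsigned long p, d;

        memcpy(&p, parity + i, sizeof(p));
        memcpy(&d, data + i, sizeof(d));
        p ^= d;
        memcpy(parity + i, &p, sizeof(p));
    }
}
```

Starting from a zeroed parity buffer and calling this once per data
stripe leaves the byte-wise XOR of all stripes in the buffer, whatever
the alignment of the buffers.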

Re: Booting an LDOM (can't find kernel)

2016-07-26 Thread Anatoly Pugachev

On Tue, Jul 26, 2016 at 10:13 PM,   wrote:
> When in rescue mode, i'm not able to install openssh-server due to the issue
> with the repo's keyring which was described earlier. I appreciate all the
> responses so far.

https://wiki.debian.org/SecureApt#How_to_find_and_add_a_key

your key would probably be:

$ sudo apt-key list | grep -A3 705A2CE1
pub   rsa4096/705A2CE1 2016-01-24 [SC] [expires: 2017-02-01]
  Key fingerprint = 69DD B056 0EA8 6E87 E835  99B3 B4C8 6482 705A 2CE1
uid [ unknown] Debian Ports Archive Automatic Signing Key
(2016) 

So there are a few more steps: boot into rescue, chroot, bring up
networking, install the ports key, install ssh, and enable the
console-getty and ssh services.

Re: Booting an LDOM (can't find kernel)

2016-07-25 Thread Anatoly Pugachev

On Mon, Jul 25, 2016 at 1:55 PM,   wrote:
> On Mon, Jul 25, 2016 at 06:56:35AM +0200, John Paul Adrian Glaubitz wrote:
>>
>> It's panicking because you specified the wrong root device:
>>
>> [90779.943892] EXT4-fs (vdiska1): mounting ext3 file system using the ext4 
>> subsystem
>> [90779.951847] EXT4-fs (vdiska1): mounted filesystem with ordered data mode. 
>> Opts: (null)
>> [90780.025922] Kernel panic - not syncing: Attempted to kill init! 
>> exitcode=0x0200
>> [90780.025922]
>> [90780.026642] CPU: 0 PID: 1 Comm: init Tainted: GE   
>> 4.5.0-2-sparc64-smp #1 Debian 4.5.2-1
>> [90780.027297] Call Trace:
>> [90780.027511]  [0055777c] panic+0xdc/0x260
>> [90780.027860]  [0046a270] do_exit+0xb30/0xb40
>> [90780.028224]  [0046a330] do_group_exit+0x30/0xc0
>> [90780.029623]  [0046a3dc] SyS_exit_group+0x1c/0x40
>> [90780.030027]  [00406294] linux_sparc_syscall+0x34/0x44
>> [90780.031618] Press Stop-A (L1-A) to return to the boot prom
>> [90780.032028] ---[ end Kernel panic - not syncing: Attempted to kill init! 
>> exitcode=0x0200
>> [90780.032028]
>>
>> You specified vdiska1 which is your boot part:
>>
>> > Kernel command line: root=/dev/vdiska1 ro console=ttyS0,9600,8n1
>>
>> /dev/vdiska1 is the boot partition for SILO, not the root filesystem.
>
> There's only two partitions: vdiska1 which contains everything and vdiska2
> which is swap.
>

please make a small ext2/ext3 partition (I suggest at least 300 MB in
size) at the beginning of the disk, so that /boot is your /dev/sda1,
then create everything else (root fs, swap, etc.). This works for me (in
LDOMs and on physical servers).

btrfs on sparc64 results in kernel stack trace in 1 minute test

2016-07-14 Thread Anatoly Pugachev

Hi!

I'm using git (describe, v4.7-rc7-16-gcf875cc) kernel,
with patch "fix extent buffer bitmap tests on big-endian systems", see
[1] (to be able to load/use btrfs module)

and the btrfs filesystem goes read-only, with a kernel stack trace,
within 1 minute of starting to copy files to it.

Here are my steps to reproduce:

create a ramdisk and file on it

root@nvg5120:# mount -t tmpfs tmpfs -o size=8G /ramdisk
root@nvg5120:# dd if=/dev/zero of=/ramdisk/disk0 bs=1M count=7000

create btrfs filesystem

root@nvg5120:/home/mator/btrfs-progs# ./mkfs.btrfs /ramdisk/disk0
btrfs-progs v4.6.1
See http://btrfs.wiki.kernel.org for more information.

Label:              (null)
UUID:               81500fe0-da01-44dd-8fa6-d43646dd4916
Node size:          16384
Sector size:        8192
Filesystem size:    6.84GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         DUP             358.00MiB
  System:           DUP              12.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1     6.84GiB  /ramdisk/disk0


mount it and start to copy files:

root@nvg5120:/home/mator/btrfs-progs# mount /ramdisk/disk0 /mnt
root@nvg5120:/home/mator/btrfs-progs# mkdir /mnt/1
root@nvg5120:/home/mator/btrfs-progs# chown mator /mnt/1

mator@nvg5120:~$ cnt=0; while true; do let cnt++; echo -n "$cnt ";
date; sleep 2; rm -rf /mnt/1/testdir; rsync -a debian-installer
linux-2.6 gcc-6.1.0 v7.4.1a /mnt/1/testdir;  [ $? != 0 ] && break;
done; date
1 Thu Jul 14 12:37:39 MSK 2016
rsync: rename 
"/mnt/1/testdir/gcc-6.1.0/gcc/testsuite/g++.dg/cpp0x/.variadic13.C.g2qPwQ"
-> "gcc-6.1.0/gcc/testsuite/g++.dg/cpp0x/variadic13.C": No such file
or directory (2)
...
rsync: mkstemp 
"/mnt/1/testdir/gcc-6.1.0/gcc/testsuite/g++.dg/torture/.pr33134.C.Y2O2ac"
failed: Read-only file system (30)
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at
io.c(504) [generator=3.1.1]
rsync: [generator] write error: Broken pipe (32)
Thu Jul 14 12:38:37 MSK 2016


root@nvg5120:/home/mator/btrfs-progs# journalctl -k -f
-- Logs begin at Mon 2016-04-18 15:59:04 MSK. --
Jul 14 12:37:29 nvg5120 kernel: loop: module loaded
Jul 14 12:37:30 nvg5120 kernel: BTRFS: device fsid
81500fe0-da01-44dd-8fa6-d43646dd4916 devid 1 transid 5 /dev/loop0
Jul 14 12:37:30 nvg5120 kernel: BTRFS info (device loop0): disk space
caching is enabled
Jul 14 12:37:30 nvg5120 kernel: BTRFS info (device loop0): has skinny extents
Jul 14 12:37:30 nvg5120 kernel: BTRFS info (device loop0): flagging fs
with big metadata feature
Jul 14 12:37:30 nvg5120 kernel: BTRFS info (device loop0): creating UUID tree
Jul 14 12:38:32 nvg5120 kernel: ------------[ cut here ]------------
Jul 14 12:38:32 nvg5120 kernel: WARNING: CPU: 12 PID: 11815 at
fs/btrfs/inode.c:9832 btrfs_rename2+0x300/0x1300 [btrfs]
Jul 14 12:38:32 nvg5120 kernel: BTRFS: Transaction aborted (error -2)
Jul 14 12:38:33 nvg5120 kernel: Modules linked in: loop btrfs sg
n2_rng rng_core n2_crypto flash sha256_generic des_generic autofs4
ext4 crc16 jbd2 mbcache zlib_deflate raid10 raid456 async_raid6_recov
async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c
crc32c_generic raid0 multipath linear dm_mod raid1 md_mod sd_mod
mptsas scsi_transport_sas mptscsih scsi_mod mptbase e1000e ptp
pps_core [last unloaded: btrfs]
Jul 14 12:38:33 nvg5120 kernel: CPU: 12 PID: 11815 Comm: rsync
Tainted: GW   4.7.0-rc7+ #45
Jul 14 12:38:33 nvg5120 kernel: Call Trace:
Jul 14 12:38:33 nvg5120 kernel:  [004671c0] __warn+0xc0/0xe0
Jul 14 12:38:33 nvg5120 kernel:  [00467214] warn_slowpath_fmt+0x34/0x60
Jul 14 12:38:33 nvg5120 kernel:  [11a8c340]
btrfs_rename2+0x300/0x1300 [btrfs]
Jul 14 12:38:33 nvg5120 kernel:  [005e38f0] vfs_rename+0x630/0x980
Jul 14 12:38:33 nvg5120 kernel:  [005e9404] SyS_renameat2+0x484/0x500
Jul 14 12:38:33 nvg5120 kernel:  [005e94dc] SyS_rename+0x1c/0x40
Jul 14 12:38:33 nvg5120 kernel:  [004061f4]
linux_sparc_syscall+0x34/0x44
Jul 14 12:38:33 nvg5120 kernel: ---[ end trace 92caaac5f44fc009 ]---
Jul 14 12:38:34 nvg5120 kernel: BTRFS: error (device loop0) in
btrfs_rename:9832: errno=-2 No such entry
Jul 14 12:38:34 nvg5120 kernel: BTRFS info (device loop0): forced readonly

Thanks.

PS: I can provide access to the machine to debug this issue (as well as
access to the serial management console, in case it hangs).

1. http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg55792.html

Re: [PATCH] Btrfs: fix extent buffer bitmap tests on big-endian systems

2016-07-13 Thread Anatoly Pugachev

On Wed, Jul 13, 2016 at 2:21 AM, Omar Sandoval  wrote:
> From: Omar Sandoval 
>
> The in-memory bitmap code manipulates words and is therefore sensitive
> to endianness, while the extent buffer bitmap code addresses bytes and
> is byte-order agnostic. Because the byte addressing of the extent buffer
> bitmaps is equivalent to a little-endian in-memory bitmap, the extent
> buffer bitmap tests fail on big-endian systems.
>
> 34b3e6c92af1 ("Btrfs: self-tests: Fix extent buffer bitmap test fail on
> BE system") worked around another endianness bug in the tests but missed
> this one because ed9e4afdb055 ("Btrfs: self-tests: Execute page
> straddling test only when nodesize < PAGE_SIZE") disables this part of
> the test on ppc64. That change lost the original meaning of the test,
> however. We really want to test that an equivalent series of operations
> using the in-memory bitmap API and the extent buffer bitmap API produces
> equivalent results.
>
> To fix this, don't use memcmp_extent_buffer() or write_extent_buffer();
> do everything bit-by-bit.


Just tested the patched kernel: I am able to load the btrfs module and
mount the fs. Thanks a lot!

btrfs module does not load on sparc64

2016-07-07 Thread Anatoly Pugachev

Hi!

I compiled the Linux kernel (git version 4.7.0-rc6+) using my own kernel
config file, with the following options enabled:

CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y
CONFIG_BTRFS_DEBUG=y
CONFIG_BTRFS_ASSERT=y

and now I can't load btrfs module:

# modprobe btrfs
modprobe: ERROR: could not insert 'btrfs': Invalid argument


and in logs (and on console):

[1897399.942697] Btrfs loaded, crc32c=crc32c-generic, debug=on, assert=on
[1897400.024645] BTRFS: selftest: sectorsize: 8192  nodesize: 8192
[1897400.098089] BTRFS: selftest: Running btrfs free space cache tests
[1897400.175863] BTRFS: selftest: Running extent only tests
[1897400.241871] BTRFS: selftest: Running bitmap only tests
[1897400.307877] BTRFS: selftest: Running bitmap and extent tests
[1897400.380329] BTRFS: selftest: Running space stealing from bitmap to extent
[1897400.470517] BTRFS: selftest: Free space cache tests finished
[1897400.542875] BTRFS: selftest: Running extent buffer operation tests
[1897400.621710] BTRFS: selftest: Running btrfs_split_item tests
[1897400.692929] BTRFS: selftest: Running extent I/O tests
[1897400.757459] BTRFS: selftest: Running find delalloc tests
[1897401.082670] BTRFS: selftest: Running extent buffer bitmap tests
[1897401.161223] BTRFS: selftest: Setting straddling pages failed
[1897401.233661] BTRFS: selftest: Extent I/O tests finished


this is sparc64 sid/unstable debian:

# uname -a
Linux nvg5120 4.7.0-rc6+ #38 SMP Thu Jul 7 14:51:23 MSK 2016 sparc64 GNU/Linux

# getconf PAGESIZE
8192

PS: using btrfs-progs from the kdave repo,
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git ,
I'm able to create a fs, but unable to mount it:

root@nvg5120:/home/mator/btrfs-progs# ./mkfs.btrfs -f /dev/vg1/vol1
btrfs-progs v4.6.1
See http://btrfs.wiki.kernel.org for more information.

WARNING: failed to open /dev/btrfs-control, skipping device
registration: No such device
Label:              (null)
UUID:               ddd8a268-62e5-444c-9baf-6ba1b2d4448b
Node size:          16384
Sector size:        8192
Filesystem size:    15.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         DUP               1.01GiB
  System:           DUP              12.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1    15.00GiB  /dev/vg1/vol1


Can someone help please? Thanks.

Re: rsync seem to be broken on sparc64

2016-06-28 Thread Anatoly Pugachev

On Tue, Jun 28, 2016 at 6:43 AM,   wrote:
>>
>> I've traced this down a bit further.
>>
>> Kernel 3.18.26 is working but 3.19.0 is not. Git bisect traced it down
>> to this commit.
>>
>> e5a4b0bb803b39a36478451eae53a880d2663d5b is the first bad commit
>> commit e5a4b0bb803b39a36478451eae53a880d2663d5b
>
>
> here is the gist of that commit...
>
> https://lkml.org/lkml/2014/12/5/25
>
> here is the output of rsync when the error occurs.
>
> root@Magi-01:~# rsync -a /export/test/* /export/test2
> rsync: [sender] write error: Broken pipe (32)
> rsync error: error in socket IO (code 10) at io.c(820) [sender=3.1.1]
> root@Magi-01:~#

Alex,

I can't reproduce this on my bare-metal T5120 (sun4v) running Debian
sparc64 with kernel 4.7.0-rc4+ (git).

I used the following loop to repeat the rsync copy:

mator@nvg5120:~$ du -sh debian-installer  linux-2.6 v7.4.1a gcc-6.1.0
494M    debian-installer
2.6G    linux-2.6
1.2G    v7.4.1a
832M    gcc-6.1.0

mator@nvg5120:~$ cnt=0; while true; do let cnt++; echo $cnt; sleep 2;
rm -rf testdir; rsync -a debian-installer linux-2.6 gcc-6.1.0 v7.4.1a
testdir; done
1
2
3
4
5
6
7
8
9
10
^C

Can you please tell me what is in /export/test/*? Is it big files or small
files, and what is the directory structure?

Thanks.

Re: [sparc] niagara2 cpu, opcodes not available message?

2016-06-08 Thread Anatoly Pugachev

On Wed, Jun 8, 2016 at 8:30 PM, Anatoly Pugachev <mator...@gmail.com> wrote:
> Hello!
>
> Can someone please tell, why do we get a bunch of the following
> messages on niagara2 cpu hardware (SPARC Enterprise T5120, T5220,
> T5140, and T5240 servers)
>
> Asking, because I see the following lines on kernel boot (removing
> first field boot time stamp in cut):
>
> mator@nvg5120:~/linux-sparc-boot-logs/t5120$ grep opcode
> dmesg-4.7.0-rc2+.log  | cut -f2- -d' ' | sort | uniq -c
>   4 aes_sparc64: sparc64 aes opcodes not available.
>   7 camellia_sparc64: sparc64 camellia opcodes not available.
>  37 crc32c_sparc64: sparc64 crc32c opcode not available.
>   5 des_sparc64: sparc64 des opcodes not available.
>   4 md5_sparc64: sparc64 md5 opcode not available.
>   1 sha1_sparc64: sparc64 sha1 opcode not available.
>   2 sha256_sparc64: sparc64 sha256 opcode not available.
>   3 sha512_sparc64: sparc64 sha512 opcode not available.
>
> Can we probably remove this functionality/messages from niagara2 cpus,
> if it does not support it anyway?

Sorry, I wasn't clear at all. What I mean is: can we please change pr_info
to pr_debug in the xx_sparc64_mod_init() functions under arch/sparc/crypto/?
Thanks.

[sparc] niagara2 cpu, opcodes not available message?

2016-06-08 Thread Anatoly Pugachev

Hello!

Can someone please explain why we get a bunch of the following
messages on niagara2 CPU hardware (SPARC Enterprise T5120, T5220,
T5140, and T5240 servers)?

I'm asking because I see the following lines on kernel boot (the
leading boot timestamp field is removed with cut):

mator@nvg5120:~/linux-sparc-boot-logs/t5120$ grep opcode
dmesg-4.7.0-rc2+.log  | cut -f2- -d' ' | sort | uniq -c
  4 aes_sparc64: sparc64 aes opcodes not available.
  7 camellia_sparc64: sparc64 camellia opcodes not available.
 37 crc32c_sparc64: sparc64 crc32c opcode not available.
  5 des_sparc64: sparc64 des opcodes not available.
  4 md5_sparc64: sparc64 md5 opcode not available.
  1 sha1_sparc64: sparc64 sha1 opcode not available.
  2 sha256_sparc64: sparc64 sha256 opcode not available.
  3 sha512_sparc64: sparc64 sha512 opcode not available.

But linux kernel sources ( linux-2.6/arch/sparc/kernel/setup_64.c )
define crypto_hwcaps only for CPUs with the following capabilities:

static const char *crypto_hwcaps[] = {
	"aes", "des", "kasumi", "camellia", "md5", "sha1", "sha256",
	"sha512", "mpmul", "montmul", "montsqr", "crc32c",
};

and we don't have them in niagara2 cpu CAPS:

mator@nvg5120:~/linux-sparc-boot-logs/t5120$ grep CAPS dmesg-4.7.0-rc2+.log
[0.00] CPU CAPS: [flush,stbar,swap,muldiv,v9,blkinit,n2,mul32]
[0.00] CPU CAPS: [div32,v8plus,popc,vis,vis2,ASIBlkInit]

mator@nvg5120:~/linux-sparc-boot-logs/t5120$ egrep '^cpu|pmu' /proc/cpuinfo
cpu : UltraSparc T2 (Niagara2)
pmu : niagara2
cpucaps :
flush,stbar,swap,muldiv,v9,blkinit,n2,mul32,div32,v8plus,popc,vis,vis2,ASIBlkInit


Compare, for example, with a SPARC CPU which supports crypto (a T5 CPU;
landau is the machine name):

mator@landau:~$ grep CAPS dmesg-4.6.1.txt
[0.00] CPU CAPS: [flush,stbar,swap,muldiv,v9,blkinit,n2,mul32]
[0.00] CPU CAPS: [div32,v8plus,popc,vis,vis2,ASIBlkInit,fmaf,vis3]
[0.00] CPU CAPS: [hpc,ima,pause,cbcond,aes,des,kasumi,camellia]
[0.00] CPU CAPS: [md5,sha1,sha256,sha512,mpmul,montmul,montsqr,crc32c]

mator@landau:~$ egrep '^cpu|pmu' /proc/cpuinfo
cpu : UltraSparc T5 (Niagara5)
pmu : niagara5
cpucaps :
flush,stbar,swap,muldiv,v9,blkinit,n2,mul32,div32,v8plus,popc,vis,vis2,ASIBlkInit,fmaf,vis3,hpc,ima,pause,cbcond,aes,des,kasumi,camellia,md5,sha1,sha256,sha512,mpmul,montmul,montsqr,crc32c
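
The comparison between the two machines can be checked mechanically. The following is a minimal sketch that intersects the crypto_hwcaps names quoted above with the cpucaps strings copied from the two /proc/cpuinfo outputs; the function name crypto_caps is an assumption of the sketch, not something from the kernel sources.

```python
# Capability names copied from crypto_hwcaps[] as quoted in this message.
CRYPTO_HWCAPS = (
    "aes", "des", "kasumi", "camellia", "md5", "sha1", "sha256",
    "sha512", "mpmul", "montmul", "montsqr", "crc32c",
)

# cpucaps values copied from the /proc/cpuinfo output of the two machines.
NIAGARA2_CAPS = (
    "flush,stbar,swap,muldiv,v9,blkinit,n2,mul32,"
    "div32,v8plus,popc,vis,vis2,ASIBlkInit"
)
NIAGARA5_CAPS = (
    NIAGARA2_CAPS
    + ",fmaf,vis3,hpc,ima,pause,cbcond,aes,des,kasumi,camellia,"
    + "md5,sha1,sha256,sha512,mpmul,montmul,montsqr,crc32c"
)


def crypto_caps(cpucaps):
    """Return the crypto capabilities present in a comma-separated cpucaps string."""
    present = set(cpucaps.split(","))
    return [cap for cap in CRYPTO_HWCAPS if cap in present]


print("niagara2:", crypto_caps(NIAGARA2_CAPS))
print("niagara5:", crypto_caps(NIAGARA5_CAPS))
```

For the niagara2 string the result is an empty list, while the T5 string contains all twelve names, which matches what the dmesg output on each machine suggests.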

mator@landau:~$ grep opcode dmesg-4.6.1.txt
[8537574.887049] aes_sparc64: Using sparc64 aes opcodes optimized AES
implementation
[8537574.887611] crc32c_sparc64: Using sparc64 crc32c opcode optimized
CRC32C implementation
[8537576.577455] sha1_sparc64: Using sparc64 sha1 opcode optimized
SHA-1 implementation
[8537576.578928] sha256_sparc64: Using sparc64 sha256 opcode optimized
SHA-256/SHA-224 implementation
[8537576.580908] sha512_sparc64: Using sparc64 sha512 opcode optimized
SHA-512/SHA-384 implementation
[8537576.582964] md5_sparc64: Using sparc64 md5 opcode optimized MD5
implementation
[8537576.596984] des_sparc64: Using sparc64 des opcodes optimized DES
implementation
[8537576.600503] camellia_sparc64: Using sparc64 camellia opcodes
optimized CAMELLIA implementation


I don't understand why the niagara2 CPU gets the HWCAP_SPARC_CRYPTO flag if
it does not support it.
Can we remove this functionality (or these messages) on niagara2 CPUs,
if they do not support it anyway?

mator@nvg5120:~$ lsmod | grep -c sparc64
0

mator@landau:~$ lsmod | grep -c sparc64
9




Thanks.

kernel modules does not have signatures, so taints kernel

2016-06-01 Thread Anatoly Pugachev

Ben, hello!

Can you please tell me why we have this in the kernel config file:

CONFIG_MODULE_SIG=y
CONFIG_MODULE_SIG_KEY=""

so that loading any kernel module (checked on sid/unstable with kernels
linux-image-4.5.0-2-amd64 and linux-image-4.5.0-2-sparc64-smp) taints
the kernel:

on x86_64:

mator@windrunner:~$ dmesg | grep -i taint
[1.056795] fjes: module verification failed: signature and/or
required key missing - tainting kernel
root@windrunner:/home/mator# modinfo fjes
filename:       /lib/modules/4.5.0-2-amd64/kernel/drivers/net/fjes/fjes.ko
version:        1.0
license:        GPL
description:    FUJITSU Extended Socket Network Device Driver
author:         Taku Izumi
srcversion:     C09FB90B0DA9890395D27B8
alias:          acpi*:PNP0C02:*
depends:
intree:         Y
vermagic:       4.5.0-2-amd64 SMP mod_unload modversions
mator@windrunner:~$ cat /proc/sys/kernel/tainted
8192

[1] states that 8192 code is for "An unsigned module has been loaded
in a kernel supporting module signature."
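
As a small illustration of how the value in /proc/sys/kernel/tainted maps to individual flags, here is a sketch that decodes the bitmask. The bit positions and letters follow the kernel's taint documentation as I understand it (bits 0 to 15 only); the short descriptions are my own paraphrases, and the function name decode_taint is an assumption of the sketch.

```python
# (bit, letter, short description); descriptions are paraphrased.
TAINT_FLAGS = [
    (0, "P", "proprietary module loaded"),
    (1, "F", "module force-loaded"),
    (2, "S", "SMP-unsafe or out-of-spec hardware"),
    (3, "R", "module force-unloaded"),
    (4, "M", "machine check exception occurred"),
    (5, "B", "bad page state"),
    (6, "U", "taint requested by userspace"),
    (7, "D", "kernel died (OOPS or BUG)"),
    (8, "A", "ACPI table overridden"),
    (9, "W", "kernel warning issued"),
    (10, "C", "staging driver loaded"),
    (11, "I", "platform firmware workaround applied"),
    (12, "O", "out-of-tree module loaded"),
    (13, "E", "unsigned module loaded"),
    (14, "L", "soft lockup occurred"),
    (15, "K", "kernel live-patched"),
]


def decode_taint(value):
    """Return (letter, description) pairs for every taint bit set in `value`."""
    return [(letter, text) for bit, letter, text in TAINT_FLAGS
            if value & (1 << bit)]


print(decode_taint(8192))
```

The value 8192 is 1 << 13, so only the "E" flag is set, which is the same letter that appears in the "Tainted:" line of kernel warnings on such a system.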

on sparc64:

mator@nvg5120:~$ dmesg | grep taint
[1800486.552168] aes_sparc64: module verification failed: signature
and/or required key missing - tainting kernel
root@nvg5120:~# modinfo aes_sparc64
filename:       /lib/modules/4.5.0-2-sparc64-smp/kernel/arch/sparc/crypto/aes-sparc64.ko
alias:          crypto-aes
alias:          aes
description:    Rijndael (AES) Cipher Algorithm, sparc64 aes opcode accelerated
license:        GPL
alias:          of:NcpuT*Csun4vC*
alias:          of:NcpuT*Csun4v
depends:
intree:         Y
vermagic:       4.5.0-2-sparc64-smp SMP mod_unload modversions

Looking at the output of modinfo, there are no lines like these (taken
from a signed module as an example):

user$ modinfo usbcore | grep '^sig'
signer:         Modules
sig_key:        B0:3B:5E:DB:57:00:F9:D5:D7:85:EB:2D:6F:3E:19:D3:4A:20:20:5B
sig_hashalgo:   sha512

If module signing is only for Secure Boot on EFI [2], why do we have it on sparc64?

Thanks.

[1] https://www.kernel.org/doc/Documentation/sysctl/kernel.txt
[2] https://www.decadent.org.uk/ben/blog/experiments-with-signed-kernels-and-modules-in-debian.html

Re: booting sun sparc T5120 with "nosmp" kernel causes OOPS in n2_crypto module

2016-05-25 Thread Anatoly Pugachev

I tried to boot the git kernel (4.6.0+, git commit
ecc5fbd5ef472a4c659dc56a5739b3f041c0530c) with "nosmp" and got an n2_crypto
OOPS as well as an ext4 OOPS (unable to finish booting and mount the / fs).
Boot log:




SPARC Enterprise T5120, No Keyboard
Copyright (c) 1998, 2016, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.33.6.g, 16256 MB memory available, Serial #78400024.
Ethernet address 0:14:4f:ac:4a:18, Host ID: 84ac4a18.



Boot device: disk1  File and args: 
SILO Version 1.4.14
boot: 6
Allocated 64 Megs of memory at 0x4000 for kernel
Uncompressing image...
Loaded kernel version 4.6.0
Loading initial ramdisk (17830856 bytes at 0xC80 phys, 0x40C0 virt)...
|
[0.00] PROMLIB: Sun IEEE Boot Prom 'OBP 4.33.6.g 2016/03/11 06:05'
[0.00] PROMLIB: Root node compatible: sun4v
[0.00] Linux version 4.6.0+ (mator@nvg5120) (gcc version 5.3.1 20160509 
(Debian 5.3.1-19) ) #1 SMP Wed May 25 22:17:28 MSK 2016
[0.00] bootconsole [earlyprom0] enabled
[0.00] ARCH: SUN4V
[0.00] Ethernet address: 00:14:4f:ac:4a:18
[0.00] MM: PAGE_OFFSET is 0x8000 (max_phys_bits == 39)
[0.00] MM: VMALLOC [0x0001 --> 0x6000]
[0.00] MM: VMEMMAP [0x6000 --> 0xc000]
[0.00] Kernel: Using 3 locked TLB entries for main kernel image.
[0.00] Remapping the kernel... done.
[0.00] OF stdout device is: /virtual-devices@100/console@1
[0.00] PROM: Built device tree with 195069 bytes of memory.
[0.00] MDESC: Size is 61728 bytes.
[0.00] PLATFORM: banner-name [SPARC Enterprise T5120]
[0.00] PLATFORM: name [SUNW,SPARC-Enterprise-T5120]
[0.00] PLATFORM: hostid [84ac4a18]
[0.00] PLATFORM: serial# [00ab4130]
[0.00] PLATFORM: stick-frequency [457646c0]
[0.00] PLATFORM: mac-address [144fac4a18]
[0.00] PLATFORM: watchdog-resolution [1000 ms]
[0.00] PLATFORM: watchdog-max-timeout [3153600 ms]
[0.00] PLATFORM: max-cpus [64]
[0.00] Top of RAM: 0x3ffb16000, Total RAM: 0x3f76ac000
[0.00] Memory hole size: 132MB
[0.00] Allocated 16384 bytes for kernel page tables.
[0.00] Zone ranges:
[0.00]   Normal   [mem 0x0840-0x0003ffb15fff]
[0.00] Movable zone start for each node
[0.00] Early memory node ranges
[0.00]   node   0: [mem 0x0840-0x0003ffa89fff]
[0.00]   node   0: [mem 0x0003ffa9a000-0x0003ffaadfff]
[0.00]   node   0: [mem 0x0003ffb08000-0x0003ffb15fff]
[0.00] Initmem setup node 0 [mem 0x0840-0x0003ffb15fff]
[0.00] Booting Linux...
[0.00] CPU CAPS: [flush,stbar,swap,muldiv,v9,blkinit,n2,mul32]
[0.00] CPU CAPS: [div32,v8plus,popc,vis,vis2,ASIBlkInit]
[0.00] percpu: Embedded 10 pages/cpu @8003ff00 s37720 r8192 
d36008 u131072
[0.00] SUN4V: Mondo queue sizes [cpu(8192) dev(16384) r(8192) nr(256)]
[0.00] Built 1 zonelists in Zone order, mobility grouping on.  Total 
pages: 2061296
[0.00] Kernel command line: root=/dev/mapper/vg1-root ro nosmp
[0.00] log_buf_len individual max cpu contribution: 4096 bytes
[0.00] log_buf_len total cpu_extra contributions: 258048 bytes
[0.00] log_buf_len min size: 131072 bytes
[0.00] log_buf_len: 524288 bytes
[0.00] early log buf free: 127696(97%)
[0.00] PID hash table entries: 4096 (order: 2, 32768 bytes)
[0.00] Dentry cache hash table entries: 2097152 (order: 11, 16777216 
bytes)
[0.00] Inode-cache hash table entries: 1048576 (order: 10, 8388608 
bytes)
[0.00] Sorting __ex_table...
[0.00] Memory: 16429712K/16636592K available (5626K kernel code, 737K 
rwdata, 1408K rodata, 464K init, 750K bss, 206880K reserved, 0K cma-reserved)
[0.00] Hierarchical RCU implementation.
[0.00]  Build-time adjustment of leaf fanout to 64.
[0.00]  RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=64.
[0.00] RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=64
[0.00] NR_IRQS:2048 nr_irqs:2048 1
[0.00] SUN4V: Using IRQ API major 1, cookie only virqs disabled
[1319829.255759] clocksource: stick: mask: 0x max_cycles: 
0x10cc5ac4c8a, max_idle_ns: 440795218862 ns
[1319829.257796] clocksource: mult[dbabc5] shift[24]
[1319829.258492] clockevent: mult[952b25d1] shift[31]
[1319829.261610] Console: colour dummy device 80x25
[1319829.262298] console [tty0] enabled
[1319829.262589] bootconsole [earlyprom0] disabled
[0.00] PROMLIB: Sun IEEE Boot Prom 'OBP 4.33.6.g 2016/03/11 06:05'
[0.00] PROMLIB: Root node compatible: sun4v
[0.00] Linux version 4.6.0+ (mator@nvg5120) (gcc version 5.3.1 20160509 
(Debian 5.3.1-19) ) #1 SMP Wed May 25 22:17:28 MSK 2016
[0.00] bootconsole [earlyprom0] enabled
[0.00] ARCH: SUN4V
[

Re: booting sun sparc T5120 with "nosmp" kernel 4.5.4 causes OOPS in n2_crypto module

2016-05-24 Thread Anatoly Pugachev

(re-sent in plain text)

Hello!

I tried to boot the T5120 with the nosmp kernel option; it gives an OOPS in
the n2_crypto module:


May 24 13:11:48 nvg5120 kernel: Kernel command line:
root=/dev/mapper/vg1-root ro nosmp
...
May 24 13:11:48 nvg5120 kernel: Loading compiled-in X.509 certificates
May 24 13:11:48 nvg5120 kernel: Kernel unaligned access at TPC[739430]
mpi_read_buffer+0xd0/0x120
May 24 13:11:48 nvg5120 kernel: Loaded X.509 cert 'Debian Project: Ben
Hutchings: 008a018dca80932630'
May 24 13:11:48 nvg5120 kernel: rtc-sun4v rtc-sun4v: setting system
clock to 2016-05-24 10:11:26 UTC (1464084686)
May 24 13:11:48 nvg5120 kernel: aes_sparc64: module verification
failed: signature and/or required key missing - tainting kernel
May 24 13:11:48 nvg5120 kernel: aes_sparc64: sparc64 aes opcodes not available.
...
May 24 13:11:50 nvg5120 kernel: sha256_sparc64: sparc64 sha256 opcode
not available.
May 24 13:11:50 nvg5120 kernel: n2rng.c:v0.2 (July 27, 2011)
May 24 13:11:50 nvg5120 kernel: n2rng f0286a1c: Registered RNG HVAPI
major 2 minor 0
May 24 13:11:50 nvg5120 kernel: n2rng f0286a1c: Found single-unit RNG, units: 1
May 24 13:11:50 nvg5120 kernel: n2rng f0286a1c: Selftest passed on unit 0
May 24 13:11:50 nvg5120 kernel: n2rng f0286a1c: RNG ready
May 24 13:11:50 nvg5120 kernel: des_sparc64: sparc64 des opcodes not available.
May 24 13:11:50 nvg5120 kernel: des_sparc64: sparc64 des opcodes not available.
May 24 13:11:50 nvg5120 kernel: des_sparc64: sparc64 des opcodes not available.
May 24 13:11:50 nvg5120 kernel: sha1_sparc64: sparc64 sha1 opcode not available.
May 24 13:11:50 nvg5120 kernel: des_sparc64: sparc64 des opcodes not available.
May 24 13:11:50 nvg5120 kernel: n2_crypto: n2_crypto.c:v0.2 (July 28, 2011)
May 24 13:11:50 nvg5120 kernel: n2_crypto: Found N2CP at
/virtual-devices@100/n2cp@7
May 24 13:11:50 nvg5120 kernel: n2_crypto: Registered NCS HVAPI version 2.0
May 24 13:11:50 nvg5120 kernel: genirq: Flags mismatch irq 1. 
(cwq-0) vs.  (cwq-0)
May 24 13:11:50 nvg5120 kernel: ------------[ cut here ]------------
May 24 13:11:50 nvg5120 kernel: WARNING: CPU: 0 PID: 260 at
/build/linux-c06pcb/linux-4.5.4/kernel/irq/manage.c:1449
__free_irq+0xac/0x2a0()
May 24 13:11:50 nvg5120 kernel: Trying to free already-free IRQ 1
May 24 13:11:50 nvg5120 kernel: Modules linked in: n2_crypto(E+)
n2_rng(E+) sha512_sparc64(E+) rng_core(E) des_generic(E) autofs4(E)
ext4(E) ecb(E)
May 24 13:11:50 nvg5120 kernel: CPU: 0 PID: 260 Comm: systemd-udevd
Tainted: GE   4.5.0-2-sparc64-smp #1 Debian 4.5.4-1
May 24 13:11:50 nvg5120 kernel: Call Trace:
May 24 13:11:50 nvg5120 kernel:  [004669d0]
warn_slowpath_common+0x70/0xc0
May 24 13:11:50 nvg5120 kernel:  [00466a50] warn_slowpath_fmt+0x30/0x40
May 24 13:11:50 nvg5120 kernel:  [004bdd0c] __free_irq+0xac/0x2a0
May 24 13:11:50 nvg5120 kernel:  [004bdfa0] free_irq+0x40/0x80
May 24 13:11:50 nvg5120 kernel:  [10aae24c]
spu_list_destroy+0xec/0x100 [n2_crypto]
May 24 13:11:50 nvg5120 kernel:  [10aafc98]
spu_mdesc_scan+0x298/0x4a0 [n2_crypto]
May 24 13:11:50 nvg5120 kernel:  [10ab0204]
n2_crypto_probe+0x1a4/0x680 [n2_crypto]
May 24 13:11:50 nvg5120 kernel:  [007c95f4] platform_drv_probe+0x34/0xc0
May 24 13:11:50 nvg5120 kernel:  [007c708c]
driver_probe_device+0x24c/0x460
May 24 13:11:50 nvg5120 kernel:  [007c7328] __driver_attach+0x88/0xa0
May 24 13:11:50 nvg5120 kernel:  [007c497c] bus_for_each_dev+0x5c/0xa0
May 24 13:11:50 nvg5120 kernel:  [007c669c] driver_attach+0x1c/0x40
May 24 13:11:50 nvg5120 kernel:  [007c60b0] bus_add_driver+0x1f0/0x2a0
May 24 13:11:50 nvg5120 kernel:  [007c7db4] driver_register+0x74/0x120
May 24 13:11:50 nvg5120 kernel:  [007c97c4]
__platform_register_drivers+0x64/0x160
May 24 13:11:50 nvg5120 kernel:  [10ab6014] n2_init+0x14/0x24
[n2_crypto]
May 24 13:11:50 nvg5120 kernel: ---[ end trace 7aa1f0163177edff ]---
May 24 13:11:50 nvg5120 kernel: camellia_sparc64: sparc64 camellia opcodes
not available.


Full boot logs, "nosmp" and usual (smp) are in [1].

1. https://bugzilla.kernel.org/show_bug.cgi?id=118831

Re: bug#23296: parted 3.2 (libparted) , mklabel sun, bug, sparc64

2016-04-26 Thread Anatoly Pugachev

On Tue, Apr 19, 2016 at 1:42 AM, Phillip Susi <ps...@ubuntu.com> wrote:
> On 04/15/2016 09:35 AM, Anatoly Pugachev wrote:
>> You found a bug in GNU Parted! Here's what you have to do: 
>>
>> Assertion (bios_geom->cylinders == (PedSector) (dev->length /
>> cyl_size)) at ../../../libparted/labels/sun.c:190 in function
>> sun_alloc() failed.
>
> I don't know how the sun disklabel works, so it might blow up in a
> shower of sparks, but you might try commenting out that assert and see
> if it works.  See if you can manipulate the existing partition table
> too, before creating a new one.

I just tried creating a new partition and then removing it: that works.

I tried to compile parted.git, but it failed:

$ git clone git://git.savannah.gnu.org/parted.git
$ ./bootstrap
$ ./configure && make
...
make[4]: Entering directory '/home/mator/parted.git/libparted/labels'
  CC   aix.lo
aix.c: In function 'aix_label_magic_get':
aix.c:45:10: error: cast increases required alignment of target type
[-Werror=cast-align]
  return *(unsigned int *)label;
  ^
aix.c: In function 'aix_label_magic_set':
aix.c:51:3: error: cast increases required alignment of target type
[-Werror=cast-align]
  *(unsigned int *)label = magic_val;
   ^
cc1: all warnings being treated as errors
Makefile:1238: recipe for target 'aix.lo' failed
make[4]: *** [aix.lo] Error 1
make[4]: Leaving directory '/home/mator/parted.git/libparted/labels'
Makefile:1151: recipe for target 'all' failed
make[3]: *** [all] Error 2
...

mator@deb4g:~/parted.git$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/sparc64-linux-gnu/5/lto-wrapper
Target: sparc64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian
5.3.1-13' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++
--prefix=/usr --program-suffix=-5 --enable-shared
--enable-linker-build-id --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix --libdir=/usr/lib
--enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-libquadmath --enable-plugin --with-system-zlib
--disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-sparc64/jre
--enable-java-home
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-sparc64
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-sparc64
--with-arch-directory=sparc64
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc
--enable-multiarch --enable-targets=all --with-long-double-128
--enable-multilib --enable-checking=release --build=sparc64-linux-gnu
--host=sparc64-linux-gnu --target=sparc64-linux-gnu
Thread model: posix
gcc version 5.3.1 20160323 (Debian 5.3.1-13)

The only way I could compile parted was to pass
--disable-gcc-warnings to configure.
Creating a new partition and removing it works with the git-compiled parted.

Commenting out PED_ASSERT at libparted/labels/sun.c, line 189, and
compiling again allows creating sun/gpt/msdos labels.

(after msdos label has been created)

(parted) p
Model: SEAGATE ST914602SSUN146G (scsi)
Disk /dev/sda: 147GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:

Number  Start  End  Size  Type  File system  Flags

(parted) mklabel
New disk label type? Sun
Warning: The existing disk label on /dev/sda will be destroyed and all
data on this disk will be lost. Do you want to continue?
Yes/No? Yes
Warning: The disk has 562233 cylinders, which is greater than the
maximum of 65536.
(parted) p
Model: SEAGATE ST914602SSUN146G (scsi)
Disk /dev/sda: 147GB
Sector size (logical/physical): 512B/512B
Partition Table: sun
Disk Flags:

Number  Start  End  Size  File system  Flags

(parted)

Thanks for help.

parted 3.2 (libparted) , mklabel sun, bug, sparc64

2016-04-15 Thread Anatoly Pugachev

Hello!

I can't create a sun partition table with parted over an existing one on
a Debian sparc64 physical machine (Sun SPARC T5120):


root@nvg5120:~# parted /dev/sda
GNU Parted 3.2
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) unit co
(parted) print
Warning: The disk CHS geometry (17848,255,63) reported by the
operating system does not match the geometry stored on the disk label
(65535,16,273).
Ignore/Cancel? I
Model: SEAGATE ST914602SSUN146G (scsi)
Disk /dev/sda: 147GB
Sector size (logical/physical): 512B/512B
Partition Table: sun
Disk Flags:

Number  Start   End     Size    File system  Flags
 1      0.00B   16.1GB  16.1GB  sun-ufs      root
 2      16.1GB  33.3GB  17.2GB
 4      33.3GB  49.4GB  16.1GB  sun-ufs
 5      49.4GB  146GB   97.0GB  sun-ufs

(parted) unit s
(parted) print
Model: SEAGATE ST914602SSUN146G (scsi)
Disk /dev/sda: 286739329s
Sector size (logical/physical): 512B/512B
Partition Table: sun
Disk Flags:

Number  Start      End         Size        File system  Flags
 1      0s         31458335s   31458336s   sun-ufs      root
 2      31458336s  65013311s   33554976s
 4      65013312s  96471647s   31458336s   sun-ufs
 5      96471648s  285837551s  189365904s  sun-ufs

(parted)
(parted) mklabel sun
Warning: The existing disk label on /dev/sda will be destroyed and all
data on this disk will be lost. Do you want to continue?
Yes/No? Yes
Backtrace has 1 calls on stack:
  1: /lib/sparc64-linux-gnu/libparted.so.2(ped_assert+0x2c) [0x8001001344c4]

You found a bug in GNU Parted! Here's what you have to do:


Assertion (bios_geom->cylinders == (PedSector) (dev->length /
cyl_size)) at ../../../libparted/labels/sun.c:190 in function
sun_alloc() failed.

Aborted



PS: The OS is Debian sid/unstable (kernel 4.5, glibc 2.22) on sparc64.

Re: Debian SPARC on SPARC M7?

2016-04-10 Thread Anatoly Pugachev

On Sat, Apr 9, 2016 at 2:39 AM, Bryce  wrote:
> {0} ok boot bryce-deb
> ...
> SILO Version 1.4.14
> boot:
> Linux  LinuxOLD
>
> endless looping of
>
> Begin: Running /scripts/local-block ...   lvmetad is not active yet, using
> direct activation during sysinit
>   Volume group "bryce-deb-vg" not found
>   Cannot process volume group bryce-deb-vg
> done.
> ALERT!  /dev/mapper/bryce--deb--vg-root does not exist.  Dropping to a
> shell!
> Gave up waiting for root device.  Common problems:
>  - Boot args (cat /proc/cmdline)
>- Check rootdelay= (did the system wait long enough?)
>- Check root= (did the system wait for the right device?)
>  - Missing modules (cat /proc/modules; ls /dev)
>
>
> at a guess I think this is because the system is running in an ldm and hasn't
> caught on that it needs to load the sunvdc sunvnet modules
> that'll be a udev item

You're right, it's a module problem: they are not present in the
initramfs. To fix it, boot from the CD-ROM again into rescue mode and
add these two modules to /etc/initramfs-tools/modules:

root@deb4g:/tmp# grep -v ^# /etc/initramfs-tools/modules
sunvdc
sunvnet

and update the kernel initrd from the rescue shell with
# update-initramfs -u

Exit the rescue shell and reboot; it should boot OK now. Note that
there will be no login on the telnet console after boot unless it is
enabled from the rescue shell before rebooting:

root@debian:/# cd /etc/systemd/system/getty.target.wants
root@debian:/etc/systemd/system/getty.target.wants# ln -sf \
/lib/systemd/system/console-getty.service console-getty.service

Or, if you know the IP address of the installed Debian system, log in
via ssh and enable it with systemctl:

# systemctl enable console-getty

Re: Select and install software fails on t2000

2016-03-22 Thread Anatoly Pugachev

On Tue, Mar 22, 2016 at 12:45 AM, david  wrote:
> Hello,
>
> the select and install software step fails on my t2000
> ┌───────────────┤ [!!] Select and install software ├────────────────┐
> │                                                                   │
> │ Installation step failed                                          │
> │ An installation step failed. You can try to run the failing item  │
> │ again from the menu, or skip it and choose something else. The    │
> │ failing step is: Select and install software                      │
> │                                                                   │
> │                                                                   │
> │                                                                   │
> └───────────────────────────────────────────────────────────────────┘
>
> is this an known issue?



David,

can you please be more specific? Which ISO did you use for the
installation, and what is in the logs? You can run a shell from
within the installer and inspect the messages in /var/log/syslog.

Re: New sparc64 installation images - 2016-02-24

2016-03-05 Thread Anatoly Pugachev

On Sat, Mar 5, 2016 at 12:44 AM, John Paul Adrian Glaubitz
 wrote:
> On 02/26/2016 01:11 AM, waz0wski wrote:
>> - no login presented on ldom console -- I see init messages[3], but no login 
>> prompt. workaround: install openssh-server from chroot before reboot
>
> After another user on the list saw the same issue, I now realize your
> problem. You need to pass "console=ttyS0,115200,8n1" on the kernel
> command line to actually activate login on the serial console.
>
> This affects all versions of Debian, not just sparc64. Maybe we could
> change the defaults here.

I'm not sure about making it the system default for all architectures,
but we could make it the default on sparc64 somewhere in the
post-install scripts of debian-installer.

Re: New sparc64 installation images - 2016-02-24

2016-02-26 Thread Anatoly Pugachev

On Thu, Feb 25, 2016 at 7:11 PM, waz0wski  wrote:
>> There have been issues reported with missing sunvdc/sunvnet modules.
>
> I tested in an LDOM via Solaris 11.3 on a T5220 and had this issue.
>
> The sunvdc/sunvnet modules are not probed/loaded from the installer, nor in 
> the installed system.
> workaround: load modules from installer shell, post-install chroot into the 
> installed system, add modules to /etc/initramfs-tools/modules and run 
> update-initramfs -uv
> The Debian wiki mentions adding custom modules to the installer is 
> possible[1], and it seems that the modules used to be there[2]
>
>
> No other issues with the installer, auto-partitioning (EXT4), etc were 
> encountered
> Post-install, I see few other issues
>
> - no login presented on ldom console -- I see init messages[3], but no login 
> prompt. workaround: install openssh-server from chroot before reboot

enable and start console-getty.service via systemctl

> - device initialization / fdisk logic issue[4] (line 13)
> - n2_crypto failures[5] (possibly due to my LDOM config, debugging)

The only part of the kernel I have touched is the n2rng module, and
judging by [5] it works. I have no idea about n2cp, but we could
probably try to debug it.

> [1] https://wiki.debian.org/DebianInstaller/Modify/CustomKernel
> [2] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=504702
> [3] http://paste.debian.net/403612
> [4] http://paste.debian.net/403282
> [5] http://paste.debian.net/403616
>
>> On Feb 25, 2016, at 1:39 AM, John Paul Adrian Glaubitz 
>>  wrote:
>> On 02/24/2016 08:37 PM, John Paul Adrian Glaubitz wrote:
>>> I just created fresh sparc64 installation images [1].
>>
>> There have been issues reported with missing sunvdc/sunvnet modules.
>>
>> However, I need more detailed feedback in order to investigate into
>> this. Thus, it would be great that anyone who has actually tested
>> the images could provide some detailed feedback on this thread.
>>
>> The more information I have, the easier it becomes to address
>> these issues.

Re: New sparc64 installation images - 2016-02-24

2016-02-25 Thread Anatoly Pugachev

On Thu, Feb 25, 2016 at 8:18 PM, Bryce  wrote:
> Oh, I remember hitting this. I wrote this to get around it.
>
> [root@ca-qasparc10 rules.d]# pwd
> /lib/udev/rules.d
> [root@ca-qasparc10 rules.d]# cat 10-sunv.rules
> # Theory
> # Linux under solaris's ldm exposes a pile of /devices/channel-devices/v*
> # devices. The drivers should not be reloaded as that would likely
> # crash the system.
> # If the vio subsystem exists, we check for an environment var (sunv_ran)
> # if it doesn't exist or does not have the value '1' then we look for
> # a glob match for each driver,.. should we find one we set sunv_ran to
> # '1' permently using ':=' and load the associated module
> #
>
> SUBSYSTEM!="vio", ENV{sunv_ran}!="1", GOTO="vio_end"
> DEVPATH=="/devices/channel-devices/vnet-*", ENV{sunv_ran}:="1", RUN+="/sbin/modprobe -b sunvnet"
> DEVPATH=="/devices/channel-devices/vdc-*", ENV{sunv_ran}:="1", RUN+="/sbin/modprobe -b sunvdc"
> LABEL="vio_end"

I believe we need to write a patch for virt-what as well, using the sysfs
information from /devices/channel-devices/v*.
I just don't know how best to implement it: a simple bash check for a
directory, such as " -d /devices/channel-devices/v* ", or something
more specific.
Any thoughts are welcome.

Thanks.

libengine-pkcs11-openssl: Engine is installed at wrong location (sparc64 as well)

2016-02-25 Thread Anatoly Pugachev

Package: libengine-pkcs11-openssl
Version: 0.2.1-1
Followup-For: Bug #815004

Dear Maintainer,


sparc64 as well

mator@deb4g:~$ openssl speed rsa2048 -engine pkcs11
invalid engine "pkcs11"
1892278191086752:error:25066067:DSO support routines:DLFCN_LOAD:could not 
load the shared 
library:dso_dlfcn.c:187:filename(/usr/lib/sparc64-linux-gnu/openssl-1.0.2/engines/libpkcs11.so):
 /usr/lib/sparc64-linux-gnu/openssl-1.0.2/engines/libpkcs11.so: cannot open 
shared object file: No such file or directory
1892278191086752:error:25070067:DSO support routines:DSO_load:could not 
load the shared library:dso_lib.c:232:
1892278191086752:error:260B6084:engine routines:DYNAMIC_LOAD:dso not 
found:eng_dyn.c:465:
1892278191086752:error:2606A074:engine routines:ENGINE_by_id:no such 
engine:eng_list.c:390:id=pkcs11
1892278191086752:error:25066067:DSO support routines:DLFCN_LOAD:could not 
load the shared library:dso_dlfcn.c:187:filename(libpkcs11.so): libpkcs11.so: 
cannot open shared object file: No such file or directory
1892278191086752:error:25070067:DSO support routines:DSO_load:could not 
load the shared library:dso_lib.c:232:
1892278191086752:error:260B6084:engine routines:DYNAMIC_LOAD:dso not 
found:eng_dyn.c:465:
Doing 2048 bit private rsa's for 10s: 12688 2048 bit private RSA's in 10.00s


mator@deb4g:~$ dpkg -L libengine-pkcs11-openssl | grep libpkcs11.so
/usr/lib/ssl/engines/libpkcs11.so

-- System Information:
Debian Release: stretch/sid
  APT prefers unreleased
  APT policy: (500, 'unreleased'), (500, 'unstable')
Architecture: sparc64

Kernel: Linux 4.4.0-trunk-sparc64-smp (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages libengine-pkcs11-openssl depends on:
ii  libc6        2.21-9
ii  libp11-2     0.3.1-1
ii  libssl1.0.2  1.0.2f-2

libengine-pkcs11-openssl recommends no packages.

libengine-pkcs11-openssl suggests no packages.

-- no debconf information

Re: lvm2 on sparc64 = bus error

2016-02-11 Thread Anatoly Pugachev

On Tue, Feb 9, 2016 at 10:58 PM, John Paul Adrian Glaubitz
<glaub...@physik.fu-berlin.de> wrote:
>
> On 02/09/2016 08:15 PM, Anatoly Pugachev wrote:
> > continue from https://bugs.debian.org/809685
>
> You don't have to mention the previous bug report here, your
> message is automatically appended to the existing bug report
> the moment you CC the bug report's address :).
>
> > if I get lvm2 source from git , compile and try to run , there's no "bus 
> > error":
> > (...)
> > root@deb4g:/mnt/1/lvm2# tools/lvm version
> >   LVM version: 2.02.142(2)-git (2016-01-25)
> >   Library version: 1.02.116-git (2016-01-25)
> >   Driver version:  4.34.0
>
> Interesting. Can you post the version numbers for lvm2 taken from
> the Debian package? I have had a look at the lvm2 git repository
> and there don't seem be any big changes after 2.02.142 which
> could cause this issue. If we can pinpoint the change that fixed
> the bug, we could just cherry-pick the necessary patch.


The upstream commit with the fix:
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=0baf66a992fbac92fa2c30e9bb8e74a5535ff45a

Pulling the latest git source code for lvm2, compiling it with the Debian
./configure options (taken from the latest lvm2 package build log at
https://buildd.debian.org/status/fetch.php?pkg=lvm2&arch=sparc64&ver=2.02.141-2&stamp=1454711070
) and running it does not produce a bus error.

PS: Thanks to Zdenek Kabelac <zkabe...@redhat.com> for the fix.

lvm2 on sparc64 = bus error

2016-02-09 Thread Anatoly Pugachev


Continuing from https://bugs.debian.org/809685

Updated to 2.02.141-2, still "Bus error":

mator@deb4g:/mnt/1$ dpkg -l lvm2
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version      Architecture Description
+++-==============-============-============-=================================
ii  lvm2           2.02.141-2   sparc64      Linux Logical Volume Manager
mator@deb4g:/mnt/1$ /sbin/lvm lvs
  WARNING: Running as a non-root user. Functionality may be unavailable.
Bus error

If I get the lvm2 source from git, compile it, and run it, there is no "Bus error":

mator@deb4g:/mnt/1$ git clone https://git.fedorahosted.org/git/lvm2.git 
mator@deb4g:/mnt/1$ cd lvm2
mator@deb4g:/mnt/1/lvm2$ ./configure && make -j
mator@deb4g:/mnt/1/lvm2$ find . -name lvm
./tools/lvm
mator@deb4g:/mnt/1/lvm2$ sudo -s
root@deb4g:/mnt/1/lvm2# export LD_LIBRARY_PATH=./libdm
root@deb4g:/mnt/1/lvm2# tools/lvm lvs
root@deb4g:/mnt/1/lvm2# tools/lvm pvs
root@deb4g:/mnt/1/lvm2# tools/lvm pvcreate /dev/vdiske1 
  Physical volume "/dev/vdiske1" successfully created
root@deb4g:/mnt/1/lvm2# tools/lvm pvcreate /dev/vdiske2
  Physical volume "/dev/vdiske2" successfully created
root@deb4g:/mnt/1/lvm2# tools/lvm pvs
  PV           VG   Fmt  Attr PSize   PFree
  /dev/vdiske1      lvm2 ---  953.00m 953.00m
  /dev/vdiske2      lvm2 ---  953.00m 953.00m
root@deb4g:/mnt/1/lvm2# tools/lvm vgs
root@deb4g:/mnt/1/lvm2# tools/lvm vgcreate vg1 /dev/vdiske1 /dev/vdiske2
  Volume group "vg1" successfully created
root@deb4g:/mnt/1/lvm2# tools/lvm lvcreate -n vg1/lv1 -L100M
  Logical volume "lv1" created.
root@deb4g:/mnt/1/lvm2# mkfs.ext4 /dev/vg1/lv1
mke2fs 1.43-WIP (18-May-2015)
Creating filesystem with 102400 1k blocks and 25688 inodes
Filesystem UUID: ae62d8e9-37fb-4282-a536-3a529739817c
Superblock backups stored on blocks: 
8193, 24577, 40961, 57345, 73729

Allocating group tables: done
Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done 

root@deb4g:/mnt/1/lvm2# mkdir /lvtest
root@deb4g:/mnt/1/lvm2# mount /dev/vg1/lv1 /lvtest
root@deb4g:/mnt/1/lvm2# 
root@deb4g:/mnt/1/lvm2# tools/lvm version
  LVM version: 2.02.142(2)-git (2016-01-25)
  Library version: 1.02.116-git (2016-01-25)
  Driver version:  4.34.0

Bug#814028: btrfs-tools: btrfs filesystem usage - BUS error

2016-02-07 Thread Anatoly Pugachev

Package: btrfs-tools
Version: 4.4-1
Severity: normal
Tags: sparc64
X-Debbugs-Cc: debian-sparc@lists.debian.org

Dear Maintainer,

/mnt is mounted as btrfs:

mator@deb4g:/srv/1/linux-2.6$ findmnt /mnt
TARGET SOURCE   FSTYPE OPTIONS
/mnt   /dev/vdiskd1 btrfs  rw,relatime,space_cache,subvolid=5,subvol=/

The command "btrfs fi usage /mnt" works as expected when run as an
unprivileged user, but with elevated privileges it crashes with a bus error:

mator@deb4g:/srv/1/linux-2.6$ sudo btrfs fi usage /mnt
Bus error

Since I'm on unstable/sid, I built btrfs-progs from git and tried to reproduce it:

mator@deb4g:/srv/1$ git clone https://github.com/kdave/btrfs-progs.git
mator@deb4g:/srv/1$ cd btrfs-progs && ./autogen.sh && CFLAGS="-g" ./configure 
&& make -j
mator@deb4g:/srv/1/btrfs-progs$ ./btrfs --version
btrfs-progs v4.4
mator@deb4g:/srv/1/btrfs-progs$ ./btrfs fi usage /mnt
WARNING: cannot read detailed chunk info, RAID5/6 numbers will be incorrect, 
run as root
Overall:
    Device size:                   3.00GiB
    Device allocated:            331.00MiB
    Device unallocated:            2.67GiB
    Device missing:                3.00GiB
    Used:                        384.00KiB
    Free (estimated):              2.68GiB      (min: 1.34GiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:               16.00MiB      (used: 0.00B)
mator@deb4g:/srv/1/btrfs-progs$ sudo ./btrfs fi usage /mnt
Bus error   
mator@deb4g:/srv/1/btrfs-progs$ sudo -s
root@deb4g:/srv/1/btrfs-progs# ulimit -c unlimited
root@deb4g:/srv/1/btrfs-progs# ./btrfs fi usage /mnt
Bus error (core dumped)
root@deb4g:/srv/1/btrfs-progs# gdb -c core
Core was generated by `./btrfs fi usage /mnt'.
Program terminated with signal SIGUSR1, User defined signal 1.
#0  0x00174730 in ?? ()
(gdb) bt
#0  0x00174730 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)   
(gdb) set args fi usage /mnt  
(gdb) file ./btrfs
Load new symbol table from "./btrfs"? (y or n) y
Reading symbols from ./btrfs...done.
(gdb) run
Starting program: /srv/1/btrfs-progs/btrfs fi usage /mnt
BFD: /usr/lib/debug/.build-id/10/2220230fb152bed171674ffb66092972cf0276.debug: 
unable to initialize decompress status for section .debug_aranges
BFD: /usr/lib/debug/.build-id/10/2220230fb152bed171674ffb66092972cf0276.debug: 
unable to initialize decompress status for section .debug_aranges
warning: File 
"/usr/lib/debug/.build-id/10/2220230fb152bed171674ffb66092972cf0276.debug" has 
no build-id, file skipped
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc64-linux-gnu/libthread_db.so.1".
BFD: /usr/lib/debug/.build-id/27/97a1230a6c622a2d0362aace029b5fda6c3474.debug: 
unable to initialize decompress status for section .debug_aranges
BFD: /usr/lib/debug/.build-id/27/97a1230a6c622a2d0362aace029b5fda6c3474.debug: 
unable to initialize decompress status for section .debug_aranges
warning: File 
"/usr/lib/debug/.build-id/27/97a1230a6c622a2d0362aace029b5fda6c3474.debug" has 
no build-id, file skipped

Program received signal SIGBUS, Bus error.
0x00174730 in load_chunk_info (fd=3, info_ptr=0x7fef0f0, 
info_count=0x7fef0e4) at cmds-fi-usage.c:188
188 off += sh->len;
(gdb) bt
#0  0x00174730 in load_chunk_info (fd=3, info_ptr=0x7fef0f0, 
info_count=0x7fef0e4) at cmds-fi-usage.c:188
#1  0x00175dac in load_chunk_and_device_info (fd=3, 
chunkinfo=0x7fef0f0, chunkcount=0x7fef0e4, devinfo=0x7fef0e8, 
devcount=0x7fef0e0) at cmds-fi-usage.c:577
#2  0x00177418 in cmd_filesystem_usage (argc=2, argv=0x7fef6f8) at 
cmds-fi-usage.c:961
#3  0x0010996c in handle_command_group (grp=0x324560 
, argc=2, argv=0x7fef6f8) at btrfs.c:135
#4  0x0011197c in cmd_filesystem (argc=3, argv=0x7fef6f0) at 
cmds-filesystem.c:1294
#5  0x00109d54 in main (argc=3, argv=0x7fef6f0) at btrfs.c:243
(gdb) 


-- System Information:
Debian Release: stretch/sid
  APT prefers unreleased
  APT policy: (500, 'unreleased'), (500, 'experimental'), (500, 'unstable')
Architecture: sparc64

Kernel: Linux 4.4.0-trunk-sparc64-smp (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages btrfs-tools depends on:
ii  e2fslibs    1.43~WIP-2015-05-18-1
ii  libblkid1   2.27.1-3
ii  libc6       2.21-7
ii  libcomerr2  1.42.13-1
ii  liblzo2-2   2.08-1.2
ii  libuuid1    2.27.1-3
ii  zlib1g      1:1.2.8.dfsg-2+b1

btrfs-tools recommends no packages.

btrfs-tools suggests no packages.

-- no debconf information

Re: Bug#814028: btrfs-tools: btrfs filesystem usage - BUS error

2016-02-07 Thread Anatoly Pugachev

On Sun, Feb 7, 2016 at 8:50 PM, John Paul Adrian Glaubitz <
glaub...@physik.fu-berlin.de> wrote:

> Hi Anatoly!
>
> On 02/07/2016 06:42 PM, Anatoly Pugachev wrote:
> > Package: btrfs-tools
> > Version: 4.4-1
> > Severity: normal
>
> Thanks for looking at this!
>
> Would you mind reporting this bug upstream as well?
>
>
done, https://bugzilla.kernel.org/show_bug.cgi?id=112131

debian sparc page on debian.org

2016-01-30 Thread Anatoly Pugachev

Hello!

Can someone edit the https://www.debian.org/ports/sparc/ page so that the
sparc64 port paragraph points to the wiki page https://wiki.debian.org/Sparc64?
Or does anyone know how to edit pages on www.debian.org?

Thanks.

Bug#812928: udev: cdrom_id terminated by signal BUS

2016-01-27 Thread Anatoly Pugachev

Package: udev
Version: 228-4
Severity: minor
Tags: sparc64
X-Debbugs-Cc: debian-sparc@lists.debian.org


cdrom_id from the udev package dies with "Bus error".

It is run as follows (upon mounting a CD-ROM):

Jan 24 19:51:45 ristkon systemd-udevd[443]: Process 'cdrom_id --lock-media 
/dev/sr0' terminated by signal BUS.

Run manually from a shell:

mator@deb4g:~$ sudo /lib/udev/cdrom_id -l /dev/vdiskb
Bus error

Recompiled with debug info (from git):

mator@deb4g:~/systemd/src/udev/cdrom_id$ gdb -c core ./cdrom_id -l /dev/vdiskb
GNU gdb (Debian 7.10-1+b1) 7.10
Reading symbols from ./cdrom_id...done.
[New LWP 95672]
BFD: /usr/lib/debug/.build-id/b0/2fcd8d81fb0edf7ef519ca096cf5ad33cecb8b.debug: 
unable to initialize decompress status for section .debug_aranges
BFD: /usr/lib/debug/.build-id/b0/2fcd8d81fb0edf7ef519ca096cf5ad33cecb8b.debug: 
unable to initialize decompress status for section .debug_aranges
warning: File 
"/usr/lib/debug/.build-id/b0/2fcd8d81fb0edf7ef519ca096cf5ad33cecb8b.debug" has 
no build-id, file skipped
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc64-linux-gnu/libthread_db.so.1".
BFD: /usr/lib/debug/.build-id/27/97a1230a6c622a2d0362aace029b5fda6c3474.debug: 
unable to initialize decompress status for section .debug_aranges
BFD: /usr/lib/debug/.build-id/27/97a1230a6c622a2d0362aace029b5fda6c3474.debug: 
unable to initialize decompress status for section .debug_aranges
warning: File 
"/usr/lib/debug/.build-id/27/97a1230a6c622a2d0362aace029b5fda6c3474.debug" has 
no build-id, file skipped
BFD: /usr/lib/debug/.build-id/10/2220230fb152bed171674ffb66092972cf0276.debug: 
unable to initialize decompress status for section .debug_aranges
BFD: /usr/lib/debug/.build-id/10/2220230fb152bed171674ffb66092972cf0276.debug: 
unable to initialize decompress status for section .debug_aranges
warning: File 
"/usr/lib/debug/.build-id/10/2220230fb152bed171674ffb66092972cf0276.debug" has 
no build-id, file skipped
Core was generated by `./cdrom_id --lock-media /dev/vdiskb'.
Program terminated with signal SIGUSR1, User defined signal 1.
#0  0x0101b9b8 in initialize_srand () at src/basic/random-util.c:107
107 x ^= *(unsigned*) auxv;
(gdb) bt
#0  0x0101b9b8 in initialize_srand () at src/basic/random-util.c:107
#1  0x0100e2c0 in main (argc=3, argv=0x7feffcab728) at 
src/udev/cdrom_id/cdrom_id.c:916
(gdb) quit


If I comment out the initialize_srand() call at src/udev/cdrom_id/cdrom_id.c:916
and recompile, cdrom_id runs without a bus error.


-- Package-specific info:

-- System Information:
Debian Release: stretch/sid
  APT prefers unreleased
  APT policy: (500, 'unreleased'), (500, 'unstable')
Architecture: sparc64

Kernel: Linux 4.3.0-1-sparc64-smp (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages udev depends on:
ii  adduser                3.113+nmu3
ii  debconf [debconf-2.0]  1.5.58
ii  dpkg                   1.18.4
ii  libacl1                2.2.52-2
ii  libblkid1              2.27.1-1
ii  libc6                  2.21-7
ii  libkmod2               22-1
ii  libselinux1            2.4-3
ii  libudev1               228-4
ii  lsb-base               9.20160110
ii  procps                 2:3.3.11-3
ii  util-linux             2.27.1-1

udev recommends no packages.

udev suggests no packages.

Versions of packages udev is related to:
ii  systemd  228-4

-- debconf information:
  udev/title/upgrade:
  udev/new_kernel_needed: false
  udev/sysfs_deprecated_incompatibility:
  udev/reboot_needed:

Re: Bug#812928: udev: cdrom_id terminated by signal BUS

2016-01-27 Thread Anatoly Pugachev

On Thu, Jan 28, 2016 at 3:31 AM, Patrick Baggett <baggett.patr...@gmail.com>
wrote:
>
> On Wed, Jan 27, 2016 at 5:23 PM, Ben Hutchings <b...@decadent.org.uk> wrote:
> > Control: tag -1 moreinfo
> >
> > On Wed, 2016-01-27 at 23:54 +0100, Marco d'Itri wrote:
> >> Control: reassign -1 src:linux
> >> Control: found -1 4.3.0-1
> >> Control: retitle -1 getauxval(AT_RANDOM) broken on sparc64
> >>
> >> On Jan 27, Anatoly Pugachev <mator...@gmail.com> wrote:
> >>
> >> > Program terminated with signal SIGUSR1, User defined signal 1.
> >> > #0  0x0101b9b8 in initialize_srand () at src/basic/random-util.c:107
> >> > 107 x ^= *(unsigned*) auxv;
> >> > (gdb) bt
> >> Looks like getauxval(AT_RANDOM) returns garbage on sparc64:
> >>
> >> x = 0;
> >> auxv = (void*) getauxval(AT_RANDOM);
> >> if (auxv)
> >> x ^= *(unsigned*) auxv;
> >
> > There is no documented alignment guarantee for the AT_RANDOM bytes so I
> > think this caller is wrong to treat it as an array of unsigned int.
>
> Also, you can verify that from a debugger without changing the code,
> by printing the value of the pointer `auxv` and check if either of the
> lower two bits are set.
>
> > What happens if you change it to:
> >
> > if (auxv)
> > memcpy(&x, auxv, sizeof(x));
> >


I restored the original cdrom_id.c (with the initialize_srand() call),
recompiled with the memcpy() change, and ran it:

mator@deb4g:~/systemd$ sudo ./cdrom_id -l /dev/vdiskb
ID_CDROM=1

There's no SIGBUS. I don't know what the output should be, but this is probably fixed.
Thanks.

Re: debian-installer: cdrom vs. netboot

2016-01-24 Thread Anatoly Pugachev

On Sun, Jan 24, 2016 at 10:34 PM, rod  wrote:

> I re-ran this.
>
> Deleted all partitions on drive.
> Selected Guided use largest..
> Changed root from ext4 to ext3.
> now it has either hung at the 33% error or it's taking its time or ...
>
>
Rod, mkfs is stuck waiting for user input on the console (a yes/no
prompt). You can do the partitioning (choose not to format), open a console
before the install step, run mkfs by hand to create the filesystem, and
then go ahead and select the install system step.

Re: debian-installer: cdrom vs. netboot

2016-01-22 Thread Anatoly Pugachev

On Fri, Jan 22, 2016 at 1:59 PM, John Paul Adrian Glaubitz
 wrote:
> On 01/21/2016 11:14 PM, John Paul Adrian Glaubitz wrote:
>> In any case, I'm home now, my sparc64 machine is at work and I won't be
>> able to test until tomorrow. So anyone else here gets the honour to test
>> this image which should work much better now.
>
> Haven't tested it yet, but I am now working on an updated image which
> will contain updated silo and silo-installer packages. I have added
> Phil's patch [1] to silo to avoid the ext2 warnings and added sparc64
> as a detected architecture to the check.d/silo_check script in
> silo-installer. Not sure though whether this script is actually
> important.
>
> Will post the updated NETINST ISO once it has been built.

Adrian,

can you please include two more patches (which are not yet in upstream
silo.git): the GPT warning removal and the timeout patch for silo?

http://marc.info/?l=linux-sparc&m=145333545028306&w=2

Thanks.

Re: Bug#809815: [feature request] linux-image-4.3.0-1-sparc64-smp: tpm random module for linux LDOMs

2016-01-13 Thread Anatoly Pugachev

On Mon, Jan 11, 2016 at 3:08 AM, Ben Hutchings <b...@decadent.org.uk> wrote:
> On Thu, 2016-01-07 at 20:30 +0300, Anatoly Pugachev wrote:
>> Can you please suggest, what to do next? Close this bugreport as
>> invalid, and fill new one against n2_rng module in debian, or report
>> first to lkml? Thanks.
> [...]
>
> You should send this patch upstream (linux-cry...@vger.kernel.org and
> sparcli...@vger.kernel.org mailing lists).

Ben,
I submitted it to both mailing lists, and it is now in DaveM's processing
queue; see http://patchwork.ozlabs.org/project/sparclinux/list/?submitter=68078
As I said earlier, I'm not a kernel developer in any form, not even a
C/C++ programmer, so I'm not sure I would be able to answer any
objections to this patch.
But thanks anyway; perhaps someone else (the Oracle guys, with their
Linux for SPARC [L4S] project) will be able to get this patch into the
kernel.

Re: Bug#809815: [feature request] linux-image-4.3.0-1-sparc64-smp: tpm random module for linux LDOMs

2016-01-06 Thread Anatoly Pugachev

On Wed, Jan 6, 2016 at 5:21 PM, Anatoly Pugachev <mator...@gmail.com> wrote:
> On Wed, Jan 6, 2016 at 5:24 AM, Ben Hutchings <b...@decadent.org.uk> wrote:
>> Control: tag -1 moreinfo
>>
>> On Mon, 2016-01-04 at 13:48 +0300, Anatoly Pugachev wrote:
>>> Package: src:linux
>>> Version: 4.3.3-2
>>> Severity: wishlist
>>>
>>> Dear Maintainer,
>>>
>>> Can you please enable CONFIG_TCG_TPM (TPM security chip) and
>>> CONFIG_HW_RANDOM_TPM linux kernel config options (as modules), to
>>> enable hardware RNG device for use in LDOM (containers) of debian
>>> sparc64.
>>>
>>> Right now, there's no hardware RNG provider is available :
>> [...]
>>
>> Both of those are generic TPM code and won't help you without a driver
>> for the specific TPM that's present in LDOMs.
>>
>> I can't find any hint in the kernel source of which driver is needed
>> for an LDOM, even in the UEK patched source, so perhaps it is out-of-
>> tree?
>
> Ben, well,
>
> I'm going to build a generic (vanilla) kernel with this CONFIGs and
> test how it would work. Going to report back soon. Thanks.

Ben,

you were right, these modules do not help.

root@deb4g:/home/mator# lsmod | grep rng
tpm_rng                 1020  0
n2_rng                  6878  0
rng_core                8172  2 n2_rng,tpm_rng
root@deb4g:/home/mator# cat /sys/class/misc/hw_random/rng_available
tpm-rng

rngd still gives an error:

root@deb4g:/home/mator# rngd -f -r /dev/hwrng
error reading from entropy source:: No such device

I should probably report to the upstream kernel bugzilla that n2_rng
does not work.
OpenBSD says [1] it supports the device (starting with T1 and T2 processors),
and Solaris says [2] it supports it (from T2 through M6 processors,
including this machine's T5 CPU).

Running show-devs from the OpenBoot console for this LDOM, I can see that
the random-number-generator device is present:

{0} ok show-devs
/cpu@3
/cpu@2
/cpu@1
/cpu@0
/virtual-devices@100
/reboot-memory@0
/iscsi-hba
/virtual-memory
/memory@m0,3000
/aliases
/options
/openprom
/chosen
/packages
/virtual-devices@100/channel-devices@200
/virtual-devices@100/console@1
/virtual-devices@100/random-number-generator@e
/virtual-devices@100/flashprom@0
/virtual-devices@100/channel-devices@200/virtual-domain-service@0
/virtual-devices@100/channel-devices@200/pciv-communication@0
/virtual-devices@100/channel-devices@200/disk@1
/virtual-devices@100/channel-devices@200/disk@0
/virtual-devices@100/channel-devices@200/network@0
/iscsi-hba/disk
/openprom/client-services
/packages/vnet-helper-pkg
/packages/vdisk-helper-pkg
/packages/obp-tftp
/packages/kbd-translator
/packages/SUNW,asr
/packages/dropins
/packages/terminal-emulator
/packages/disk-label
/packages/deblocker
/packages/SUNW,builtin-drivers
{0} ok

But n2_rng does not see it. I'm going to test a more recent kernel
instead of 4.1.15. I chose the old 4.1.15 kernel because Oracle's
Linux for SPARC uses 4.1.8, and I wanted to test that first.
Compiling 4.4-rc8 right now...

Searching the web, I found [3], where the CPU is a T4 and the kernel is
4.3.0, but n2rng gives more messages on boot.

Sorry for the wrong feature request; please close this bug as invalid. Thanks.

1. http://undeadly.org/cgi?action=article&sid=20090201164147
2. http://prsync.com/oracle/solaris-random-number-generation-570469/
3. https://lkml.org/lkml/2015/10/30/678

Re: [feature request] linux-image-4.3.0-1-sparc64-smp: tpm random module for linux LDOMs

2016-01-04 Thread Anatoly Pugachev

On Mon, Jan 4, 2016 at 1:48 PM, Anatoly Pugachev <mator...@gmail.com> wrote:
> Package: src:linux
> Version: 4.3.3-2
> Severity: wishlist
>
> Dear Maintainer,
>
> Can you please enable CONFIG_TCG_TPM (TPM security chip) and
> CONFIG_HW_RANDOM_TPM linux kernel config options (as modules), to
> enable hardware RNG device for use in LDOM (containers) of debian
> sparc64.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=809815
