Re: thousands of kworker processes with 4.7.x and 4.8-rc*
On 2016-09-26 04:07, Nikolay Borisov wrote:

> Not sure if that's expected behaviour or not. Why don't you sample the
> stacks of some of those kworker processes to see if they are all
> executing a particular piece of work? That might help you narrow down
> where they originate from. Cat multiple /proc/$kworker-pid/stack files
> and see if a pattern emerges.

FYI, it was reproduced and bisected here (scroll to the bottom):

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1626564

Tomasz Chmielewski
https://lxadm.com
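Nikolay's sampling suggestion can be scripted. A minimal sketch that aggregates sampled stacks and counts repeated frames; the two sample stacks below are fabricated for illustration, and on a live box you would fill the directory from /proc/&lt;pid&gt;/stack as root:

```shell
# Aggregate kernel stacks of all kworker threads and count repeated
# frames; the most frequent frames hint at where the kworkers originate.
stackdir=$(mktemp -d)

# On a live system (as root) you would populate $stackdir like this:
#   for pid in $(pgrep kworker); do
#       cat /proc/$pid/stack > "$stackdir/$pid" 2>/dev/null
#   done
# For illustration, two fabricated stacks that share one frame:
printf '[<0>] worker_thread+0x11e/0x450\n[<0>] kthread+0xc9/0xe0\n' > "$stackdir/100"
printf '[<0>] worker_thread+0x11e/0x450\n[<0>] ret_from_fork+0x58/0x90\n' > "$stackdir/101"

# Count how often each frame appears across all sampled stacks:
cat "$stackdir"/* | sort | uniq -c | sort -rn > /tmp/kworker_frames.txt
head -n 3 /tmp/kworker_frames.txt
```

If one work item dominates, its function shows up at the top of the list.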
Re: thousands of kworker processes with 4.7.x and 4.8-rc*
On 2016-09-25 18:29, Tomasz Chmielewski wrote:

>> I'll try to bisect.
>
> OK, not a kernel regression, but some config change caused it. However,
> I'm not able to locate which change exactly.
>
> I'm attaching two configs which I've tried with 4.7.3 - one results in
> thousands of kworkers, and the other doesn't. Also included a diff
> between them. Any obvious changes I should try?

The problem is the allocator:

-CONFIG_SLUB=y
+CONFIG_SLAB=y

With SLUB, I'm getting a handful of kworker processes, as expected. With
SLAB, I'm getting thousands of kworker processes.

Not sure if that's expected behaviour or not.

Tomasz Chmielewski
https://lxadm.com
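The offending option can be isolated mechanically by diffing only the lines that differ between the two configs. A sketch with two tiny stand-in files (file names and contents are illustrative, not the actual attached configs):

```shell
# Two stand-in .config files; only the allocator choice differs.
cat > config.good <<'EOF'
CONFIG_SLUB=y
CONFIG_FOO=y
EOF
cat > config.bad <<'EOF'
CONFIG_SLAB=y
CONFIG_FOO=y
EOF

# Sort both so diff pairs up options regardless of ordering, then keep
# only the lines present in one file but not the other.
sort config.good > good.sorted
sort config.bad  > bad.sorted
diff good.sorted bad.sorted | grep '^[<>]' > config.diff.txt
cat config.diff.txt
```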
Re: thousands of kworker processes with 4.7.x and 4.8-rc*
On 2016-09-23 23:10, Mike Galbraith wrote:

>> I did some experiments to see when the problem first appeared.
>> Thousands of kworker processes start to show up in 4.7.0-rc5.
>>
>> kernel version | kworker count after boot
>> ---
>> 4.6.3          37
>> 4.6.4          47
>> 4.6.5          46
>> 4.6.6          49
>> 4.6.7          49
>> 4.7.0-rc1      46
>> 4.7.0-rc2      49
>> 4.7.0-rc3      45
>> 4.7.0-rc4      47
>> 4.7.0-rc5    1592
>
> Best bet would be to use 'git bisect' to locate the exact commit that
> caused this, and post the bisection result along with your config.
> AFAIK, nobody else is seeing this, is the kernel virgin source?

Yes, it's a kernel.org kernel.

I found some similar reports, though without much more info:

https://github.com/zfsonlinux/zfs/issues/5036 - kernel 4.7.2, initially
attributed to ZFS on Linux, but then reproduced without ZFS

https://github.com/systemd/systemd/issues/4069 - kernel 4.7.2

I'll try to bisect.

Tomasz Chmielewski
https://lxadm.com
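The 'git bisect' workflow Mike suggests can be automated with `git bisect run`, which marks each step good or bad from a test command's exit status. A toy sketch on a made-up repository (the real run would build and boot each kernel and count kworkers instead of grepping a file):

```shell
# Build a toy history of 5 commits where a regression appears in commit 4,
# then let `git bisect run` find the first bad commit automatically.
mkdir bisect-demo && cd bisect-demo
git init -q .
git config user.email demo@example.com
git config user.name demo

for i in 1 2 3 4 5; do
    if [ "$i" -ge 4 ]; then echo broken > state; else echo ok > state; fi
    git add state
    git commit -qm "commit $i"
done

git bisect start HEAD HEAD~4 > /dev/null      # bad = HEAD, good = 4 back
# Exit 0 means "good", non-zero means "bad":
git bisect run sh -c 'grep -q ok state' > /dev/null
git log -1 --format=%s refs/bisect/bad > ../first_bad.txt
git bisect reset > /dev/null
cd ..
cat first_bad.txt
```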
Re: thousands of kworker processes with 4.7.x and 4.8-rc*
On 2016-09-19 16:08, Tomasz Chmielewski wrote:

> On several servers running 4.7.x and 4.8-rc6/7 kernels I'm seeing
> thousands of kworker processes.
>
> # ps auxf | grep -c kworker
> 2104
>
> Load average goes into hundreds on a pretty much idle server (biggest
> CPU and RAM consumers are probably sshd with one user logged in and
> rsyslog writing ~1 line per minute):
>
> # uptime
>  06:58:56 up 26 min,  1 user,  load average: 146.11, 215.46, 105.70
>
> # uptime
>  06:59:48 up 26 min,  1 user,  load average: 305.20, 240.84, 120.25
>
> Sometimes seeing lots of them in "D" state:
>
> root 19474 0.0 0.0 0 0 ? D 06:54 0:00 \_ [kworker/0:208]
> root 19475 0.0 0.0 0 0 ? D 06:54 0:00 \_ [kworker/0:209]

I did some experiments to see when the problem first appeared. Thousands
of kworker processes start to show up in 4.7.0-rc5.

kernel version | kworker count after boot
---
4.6.3          37
4.6.4          47
4.6.5          46
4.6.6          49
4.6.7          49
4.7.0-rc1      46
4.7.0-rc2      49
4.7.0-rc3      45
4.7.0-rc4      47
4.7.0-rc5    1592
4.7.0-rc6    1714
4.7.0-rc7    1955
4.7.0        2088
4.7.1        1222
4.7.2        1699
4.7.3        1446
4.7.4        1781
4.8-rc1      (not tested)
4.8-rc2      2012
4.8-rc3      1696
4.8-rc4      1210
4.8-rc5      1890
4.8-rc6      1657
4.8-rc7      1647

Tomasz Chmielewski
https://lxadm.com
thousands of kworker processes with 4.7.x and 4.8-rc*
On several servers running 4.7.x and 4.8-rc6/7 kernels I'm seeing
thousands of kworker processes.

# ps auxf | grep -c kworker
2104

Load average goes into hundreds on a pretty much idle server (biggest CPU
and RAM consumers are probably sshd with one user logged in and rsyslog
writing ~1 line per minute):

# uptime
 06:58:56 up 26 min,  1 user,  load average: 146.11, 215.46, 105.70

# uptime
 06:59:48 up 26 min,  1 user,  load average: 305.20, 240.84, 120.25

Sometimes seeing lots of them in "D" state:

root 19474 0.0 0.0 0 0 ? D 06:54 0:00 \_ [kworker/0:208]
root 19475 0.0 0.0 0 0 ? D 06:54 0:00 \_ [kworker/0:209]
root 19477 0.0 0.0 0 0 ? D 06:54 0:00 \_ [kworker/0:211]
root 19480 0.0 0.0 0 0 ? D 06:54 0:00 \_ [kworker/0:214]
root 19483 0.0 0.0 0 0 ? D 06:54 0:00 \_ [kworker/0:217]
root 19485 0.0 0.0 0 0 ? D 06:54 0:00 \_ [kworker/0:219]
root 19486 0.0 0.0 0 0 ? D 06:54 0:00 \_ [kworker/0:220]
root 19487 0.0 0.0 0 0 ? D 06:54 0:00 \_ [kworker/0:221]
root 19492 0.0 0.0 0 0 ? D 06:54 0:00 \_ [kworker/0:226]
root 19533 0.0 0.0 0 0 ? D 06:54 0:00 \_ [kworker/4:257]

Is it a known issue? The server has 8 CPUs and 32 GB RAM.

Tomasz Chmielewski
https://lxadm.com
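As an aside, `ps auxf | grep -c kworker` typically counts the grep process itself as well (`pgrep -c kworker` avoids that on a live box). A self-contained sketch that parses a saved listing, counting kworkers and how many sit in uninterruptible "D" state, which is what drives the load average up on an otherwise idle machine (the listing below is an inlined sample, not real output):

```shell
# Count kworker threads and how many are in "D" (uninterruptible sleep)
# state from a ps listing; the sample is inlined so this runs anywhere.
cat > ps.sample <<'EOF'
root     19474  0.0  0.0  0  0 ?  D  06:54  0:00 [kworker/0:208]
root     19475  0.0  0.0  0  0 ?  D  06:54  0:00 [kworker/0:209]
root     19533  0.0  0.0  0  0 ?  S  06:54  0:00 [kworker/4:257]
root         1  0.0  0.0  0  0 ?  Ss 06:50  0:01 /sbin/init
EOF

total=$(grep -c '\[kworker' ps.sample)                      # all kworkers
dstate=$(awk '$8 == "D" && /\[kworker/' ps.sample | wc -l)  # D-state only
echo "kworkers: $total, in D state: $dstate" | tee kworker_count.txt
```

On a live system you would feed it `ps aux` output instead of ps.sample.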
4.0.5: WARNING: CPU: 3 PID: 249 at /home/kernel/COD/linux/mm/backing-dev.c:372 bdi_unregister+0x36/0x40()
Got this after stopping a RAID-1 array:

[  626.694737] md: md3 still in use.
[  626.694946] md: delaying resync of md3 until md2 has finished (they share one or more physical units)
[  628.256210] md3: detected capacity change from 388873344 to 0
[  628.256372] md: md3 stopped.
[  628.256383] md: unbind<sdb4>
[  628.274852] md: export_rdev(sdb4)
[  628.274909] md: unbind<sda4>
[  628.282856] md: export_rdev(sda4)
[  628.283246] [ cut here ]
[  628.283258] WARNING: CPU: 3 PID: 249 at /home/kernel/COD/linux/mm/backing-dev.c:372 bdi_unregister+0x36/0x40()
[  628.283261] Modules linked in: intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper eeepc_wmi asus_wmi ppdev sparse_keymap parport_pc cryptd video shpchp 8250_fintek lpc_ich lp tpm_infineon wmi mac_hid parport serio_raw btrfs pata_acpi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 ahci r8169 libahci raid0 mii pata_via multipath linear
[  628.283320] CPU: 3 PID: 249 Comm: kworker/3:1 Not tainted 4.0.5-040005-generic #201506061639
[  628.283322] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
[  628.283328] Workqueue: md_misc mddev_delayed_delete
[  628.283341] 0174 88040867bca8 817e4a5d 0007
[  628.283347] 88040867bce8 81079227 81adc895
[  628.283352] 880409197c00 88041f2da900
[  628.283365] Call Trace:
[  628.283374] [817e4a5d] dump_stack+0x45/0x57
[  628.283381] [81079227] warn_slowpath_common+0x97/0xe0
[  628.283386] [8107928a] warn_slowpath_null+0x1a/0x20
[  628.283390] [811a3f16] bdi_unregister+0x36/0x40
[  628.283397] [813955f8] del_gendisk+0x108/0x260
[  628.283402] [81648eec] md_free+0x4c/0x70
[  628.283408] [813b6b62] kobject_cleanup+0x82/0x1c0
[  628.283413] [813b69f0] kobject_put+0x30/0x70
[  628.283417] [81649c44] mddev_delayed_delete+0x34/0x40
[  628.283422] [81092204] process_one_work+0x144/0x490
[  628.283426] [81092c6e] worker_thread+0x11e/0x450
[  628.283431] [81092b50] ? create_worker+0x1f0/0x1f0
[  628.283436] [81098999] kthread+0xc9/0xe0
[  628.283442] [810988d0] ? flush_kthread_worker+0x90/0x90
[  628.283448] [817f1118] ret_from_fork+0x58/0x90
[  628.283453] [810988d0] ? flush_kthread_worker+0x90/0x90
[  628.283457] ---[ end trace 2c187f15cc11aca4 ]---

--
Tomasz Chmielewski
http://wpkg.org
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
kernel panic when writing to 32 GB SDHC card
I'm experiencing kernel panics when writing random data (say, rsyncing
data from a different host) to a SDHC card. Typically, this happens after
5 mins - 1 hour of such activity.

The trace contains __usb_hcd_giveback_urb and sg_complete, but
unfortunately I don't have it in text format. I've uploaded photo
captures here:

http://www.virtall.com/files/temp/usb/

Kernels where I saw the panic:

- 3.13.0-32.57 - Ubuntu 14.04 x86 kernel
- 3.15.6 - from kernel.org

Samsung Ativ laptop:

Manufacturer: SAMSUNG ELECTRONICS CO., LTD.
Product Name: 905S3G/906S3G/915S3G
Version: P06RBV

Bus 001 Device 002: ID 0bda:0129 Realtek Semiconductor Corp. RTS5129 Card Reader Controller

--
Tomasz Chmielewski
http://www.sslrack.com
BUG: unable to handle kernel NULL pointer dereference + panic on 3.2.11 (with various networking pointers, on Dell r720xd)
Integrated Memory Controller Target Address Decoder 1 (rev 07)
7f:0f.4 System peripheral: Intel Corporation Sandy Bridge Integrated Memory Controller Target Address Decoder 2 (rev 07)
7f:0f.5 System peripheral: Intel Corporation Sandy Bridge Integrated Memory Controller Target Address Decoder 3 (rev 07)
7f:0f.6 System peripheral: Intel Corporation Sandy Bridge Integrated Memory Controller Target Address Decoder 4 (rev 07)
7f:10.0 System peripheral: Intel Corporation Sandy Bridge Integrated Memory Controller Channel 0-3 Thermal Control 0 (rev 07)
7f:10.1 System peripheral: Intel Corporation Sandy Bridge Integrated Memory Controller Channel 0-3 Thermal Control 1 (rev 07)
7f:10.2 System peripheral: Intel Corporation Sandy Bridge Integrated Memory Controller ERROR Registers 0 (rev 07)
7f:10.3 System peripheral: Intel Corporation Sandy Bridge Integrated Memory Controller ERROR Registers 1 (rev 07)
7f:10.4 System peripheral: Intel Corporation Sandy Bridge Integrated Memory Controller Channel 0-3 Thermal Control 2 (rev 07)
7f:10.5 System peripheral: Intel Corporation Sandy Bridge Integrated Memory Controller Channel 0-3 Thermal Control 3 (rev 07)
7f:10.6 System peripheral: Intel Corporation Sandy Bridge Integrated Memory Controller ERROR Registers 2 (rev 07)
7f:10.7 System peripheral: Intel Corporation Sandy Bridge Integrated Memory Controller ERROR Registers 3 (rev 07)
7f:11.0 System peripheral: Intel Corporation Sandy Bridge DDRIO (rev 07)
7f:13.0 System peripheral: Intel Corporation Sandy Bridge R2PCIe (rev 07)
7f:13.1 Performance counters: Intel Corporation Sandy Bridge Ring to PCI Express Performance Monitor (rev 07)
7f:13.4 Performance counters: Intel Corporation Sandy Bridge QuickPath Interconnect Agent Ring Registers (rev 07)
7f:13.5 Performance counters: Intel Corporation Sandy Bridge Ring to QuickPath Interconnect Link 0 Performance Monitor (rev 07)
7f:13.6 System peripheral: Intel Corporation Sandy Bridge Ring to QuickPath Interconnect Link 1 Performance Monitor (rev 07)

--
Tomasz Chmielewski
Re: 3.5-rc7 - can no longer wake up from suspend to RAM
On 07/22/2012 12:48 PM, Roland Dreier wrote:

> Thanks Hugh.  I just went ahead and built 3.5 final, and suspend/resume
> look to be working again.  I'm not even going to try to understand how
> a timekeeping bug broke resume...

Yep, seems to be working fine here, too.

--
Tomasz Chmielewski
http://www.ptraveler.com
3.5-rc7 - can no longer wake up from suspend to RAM
s9 kernel: [57817.933344] iwlwifi :01:00.0: Radio type=0x0-0x3-0x1
Jul 18 01:34:50 s9 kernel: [57818.057158] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
Jul 18 01:34:50 s9 kernel: [57818.162539] r8169 :02:00.0: eth0: link down
Jul 18 01:34:50 s9 kernel: [57818.164080] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
Jul 18 01:34:59 s9 kernel: [57826.890665] wlan0: authenticate with 00:27:22:44:ee:4f
Jul 18 01:34:59 s9 kernel: [57826.914877] wlan0: send auth to 00:27:22:44:ee:4f (try 1/3)
Jul 18 01:34:59 s9 kernel: [57826.916329] wlan0: authenticated
Jul 18 01:34:59 s9 kernel: [57826.918532] wlan0: associate with 00:27:22:44:ee:4f (try 1/3)
Jul 18 01:34:59 s9 kernel: [57826.921970] wlan0: RX AssocResp from 00:27:22:44:ee:4f (capab=0x431 status=0 aid=5)
Jul 18 01:34:59 s9 kernel: [57826.921982] wlan0: associated
Jul 18 01:34:59 s9 kernel: [57826.929627] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready

--
Tomasz Chmielewski
http://www.ptraveler.com
Re: very poor ext3 write performance on big filesystems?
Chris Mason schrieb:

> On Tuesday 19 February 2008, Tomasz Chmielewski wrote:
>> Theodore Tso schrieb:
>> (...)
>>> The following ld_preload can help in some cases. Mutt has this hack
>>> encoded in for maildir directories, which helps.
>>
>> It doesn't work very reliably for me. For some reason, it sometimes
>> hangs (doesn't remove any files, rm -rf just stalls), or segfaults.
>
> You can go the low-tech route (assuming your file names don't have
> spaces in them)
>
> find . -printf "%i %p\n" | sort -n | awk '{print $2}' | xargs rm

Why should it make a difference? Does "find" find filenames/paths faster
than "rm -r"?

Or is "find once/remove once" faster than "find files/rm files/find
files/rm files/...", which I suppose "rm -r" does?

--
Tomasz Chmielewski
http://wpkg.org
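The point of the suggested pipeline is ordering, not lookup speed: `rm -r` unlinks files in directory order, while sorting by inode number makes the inode table and bitmap updates land mostly sequentially on disk, cutting seeks. A runnable sketch of the same pipeline on a small made-up tree (still assuming no whitespace in names):

```shell
# Create a small tree, then delete its files in inode-number order,
# exactly as the suggested pipeline does.
mkdir -p inode-demo
for i in 1 2 3; do echo x > "inode-demo/file$i"; done

# %i = inode number, %p = path; sort numerically by inode, keep the path.
find inode-demo -type f -printf '%i %p\n' | sort -n | awk '{print $2}' | xargs rm

find inode-demo -type f | wc -l    # no files left
```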
Re: very poor ext3 write performance on big filesystems?
Theodore Tso schrieb:
(...)
> The following ld_preload can help in some cases. Mutt has this hack
> encoded in for maildir directories, which helps.

It doesn't work very reliably for me. For some reason, it sometimes hangs
(doesn't remove any files, rm -rf just stalls), or segfaults.

As most of the ideas here in this thread assume (re)creating a new
filesystem from scratch - would perhaps playing with
/proc/sys/vm/dirty_ratio and /proc/sys/vm/dirty_background_ratio help a
bit?

--
Tomasz Chmielewski
http://wpkg.org
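For the writeback-tuning idea, the same knobs can be set persistently via sysctl. A sketch of such a fragment - the values are illustrative guesses, not recommendations from the thread; lower values start background writeback earlier and flush dirty pages in smaller, steadier bursts:

```
# /etc/sysctl.conf fragment (illustrative values only)
# equivalent to: echo 1 > /proc/sys/vm/dirty_background_ratio
vm.dirty_background_ratio = 1
vm.dirty_ratio = 5
```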
Re: very poor ext3 write performance on big filesystems?
Theodore Tso schrieb:

>> Are there better choices than ext3 for a filesystem with lots of
>> hardlinks? ext4, once it's ready? xfs?
>
> All filesystems are going to have problems keeping inodes close to
> directories when you have huge numbers of hard links.
>
> I'd really need to know exactly what kind of operations you were trying
> to do that were causing problems before I could say for sure. Yes, you
> said you were removing unneeded files, but how were you doing it? With
> rm -r of old hard-linked directories?

Yes, with rm -r.

> How big are the average files involved? Etc.

It's hard to estimate the average size of a file. I'd say there are not
many files bigger than 50 MB.

Basically, it's a filesystem where backups are kept. Backups are made
with BackupPC [1].

Imagine a full rootfs backup of 100 Linux systems. Instead of compressing
and writing "/bin/bash" 100 times for each separate system, we do it
once, and hardlink. Then, keep 40 copies back, and you have 4000
hardlinks. For individual or user files, the number of hardlinks will be
smaller of course.

The directories I want to remove usually have the structure of a "normal"
Linux rootfs, nothing special there (other than most of the files having
multiple hardlinks).

I noticed using writeback helps a tiny bit, but as dm and md don't
support write barriers, I'm not very eager to use it.

[1] http://backuppc.sf.net
http://backuppc.sourceforge.net/faq/BackupPC.html#some_design_issues

--
Tomasz Chmielewski
http://wpkg.org
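The degree of pooling described above can be measured directly: `find -links +1` lists names whose inode has more than one link. A small sketch (the directory and file names are made up for illustration):

```shell
# Three directory entries sharing one inode, like pooled backup files.
mkdir -p pool
echo data > pool/a
ln pool/a pool/b
ln pool/a pool/c

find pool -type f -links +1 | wc -l   # every name of a multi-linked inode
stat -c %h pool/a                     # link count of the shared inode
```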
Re: very poor ext3 write performance on big filesystems?
Theodore Tso schrieb:
(...)
>> What has helped a bit was to recreate the file system with
>> -O^dir_index; dir_index seems to cause more seeks.
>
> Part of it may have simply been recreating the filesystem, not
> necessarily removing the dir_index feature.

You mean, copy data somewhere else, mkfs a new filesystem, and copy data
back?

Unfortunately, doing it on a file level is not possible within a
reasonable amount of time. I tried to copy that filesystem once (when it
was much smaller) with "rsync -a -H", but after 3 days, rsync was still
building an index and hadn't copied any file.

Also, as files/hardlinks come and go, it would degrade again.

Are there better choices than ext3 for a filesystem with lots of
hardlinks? ext4, once it's ready? xfs?

--
Tomasz Chmielewski
http://wpkg.org
very poor ext3 write performance on big filesystems?
I have a 1.2 TB (of which 750 GB is used) filesystem which holds almost
200 million files. 1.2 TB doesn't make this filesystem that big, but 200
million files is a decent number.

Most of the files are hardlinked multiple times, some of them are
hardlinked thousands of times.

Recently I began removing some unneeded files (or hardlinks) and to my
surprise, it takes longer than I initially expected.

After the cache is emptied (echo 3 > /proc/sys/vm/drop_caches) I can
usually remove about 5-20 files with moderate performance. I see up to
5000 kB read/write from/to the disk, and "wa" reported by top is usually
20-70%.

After that, waiting for IO grows to 99%, and disk write speed is down to
50 kB/s - 200 kB/s (fifty - two hundred kilobytes/s).

Is it normal to expect the write speed to go down to only a few dozen
kilobytes/s? Is it because of that many seeks? Can it be somehow
optimized? The machine has loads of free memory, perhaps it could be
used better?

Also, writing big files is very slow - it takes more than 4 minutes to
write and sync a 655 MB file (so, a little bit more than 1 MB/s) -
fragmentation perhaps?

+ dd if=/dev/zero of=testfile bs=64k count=10000
10000+0 records in
10000+0 records out
655360000 bytes (655 MB) copied, 3,12109 seconds, 210 MB/s
+ sync
0.00user 2.14system 4:06.76elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+883minor)pagefaults 0swaps

# df -h
Filesystem    Size  Used Avail Use% Mounted on
/dev/sda      1,2T  697G  452G  61% /mnt/iscsi_backup

# df -i
Filesystem    Inodes  IUsed  IFree IUse% Mounted on
/dev/sda        154M    20M   134M   13% /mnt/iscsi_backup

--
Tomasz Chmielewski
http://wpkg.org
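The dd throughput figure alone is misleading, because dd mostly fills the page cache; the 4 minutes only show up at sync time. A small, reproducible variant folds the flush into dd itself via conv=fsync (sizes scaled way down so it runs anywhere):

```shell
# Write 16 x 64 KiB = 1 MiB and fsync before dd exits, so the reported
# time includes getting the data to disk, not just into the page cache.
dd if=/dev/zero of=testfile bs=64k count=16 conv=fsync 2> dd.log
cat dd.log
stat -c %s testfile    # size in bytes
```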
Re: very poor ext3 write performance on big filesystems?
Theodore Tso schrieb:

>> (...) What has helped a bit was to recreate the file system with -O ^dir_index; dir_index seems to cause more seeks.
>
> Part of it may have simply been recreating the filesystem, not necessarily removing the dir_index feature.

You mean, copy the data somewhere else, mkfs a new filesystem, and copy the data back?

Unfortunately, doing it on a file level is not possible within a reasonable amount of time. I tried to copy that filesystem once (when it was much smaller) with "rsync -a -H", but after 3 days, rsync was still building an index and hadn't copied any files. Also, as files/hardlinks come and go, it would degrade again.

Are there better choices than ext3 for a filesystem with lots of hardlinks? ext4, once it's ready? xfs?

-- Tomasz Chmielewski http://wpkg.org
Re: very poor ext3 write performance on big filesystems?
Theodore Tso schrieb:

>> Are there better choices than ext3 for a filesystem with lots of hardlinks? ext4, once it's ready? xfs?
>
> All filesystems are going to have problems keeping inodes close to directories when you have huge numbers of hard links. I'd really need to know exactly what kind of operations you were trying to do that were causing problems before I could say for sure. Yes, you said you were removing unneeded files, but how were you doing it? With rm -r of old hard-linked directories?

Yes, with rm -r.

> How big are the average files involved? Etc.

It's hard to estimate the average size of a file. I'd say there are not many files bigger than 50 MB.

Basically, it's a filesystem where backups are kept. Backups are made with BackupPC [1]. Imagine a full rootfs backup of 100 Linux systems. Instead of compressing and writing /bin/bash 100 times, once for each separate system, we do it once and hardlink. Then keep 40 copies back, and you have 4000 hardlinks. For individual or user files, the number of hardlinks will be smaller, of course.

The directories I want to remove usually have the structure of a normal Linux rootfs; nothing special there (other than most of the files having multiple hardlinks).

I noticed that using write-back caching helps a tiny bit, but as dm and md don't support write barriers, I'm not very eager to use it.

[1] http://backuppc.sf.net http://backuppc.sourceforge.net/faq/BackupPC.html#some_design_issues

-- Tomasz Chmielewski http://wpkg.org
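For readers unfamiliar with how this kind of hardlink pooling looks on disk, a minimal sketch (the file names under a temporary directory are placeholders, not BackupPC's actual pool layout): each extra hardlink just increments the inode's link count, which stat and find can report.

```shell
# Minimal demonstration of hardlink counts; paths are placeholders.
DIR=$(mktemp -d)
echo data > "$DIR/pool-file"
ln "$DIR/pool-file" "$DIR/backup1"   # same inode, link count goes up
ln "$DIR/pool-file" "$DIR/backup2"

LINKS=$(stat -c %h "$DIR/pool-file")
echo "$LINKS"                        # 3: pool-file, backup1, backup2

# list files with more than 2 links, as one might on the backup fs
find "$DIR" -type f -links +2 | wc -l
rm -rf "$DIR"
```

The expensive part of rm -r on such a filesystem is that unlinking each name means touching an inode that may live far from the directory, which is where the seek storm described above comes from.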
Re: currently active Linux kernel versions
Wagner Ferenc wrote:

> Which are the "currently active Linux kernel versions" at any point in time? The quote is taken from http://lkml.org/lkml/2008/2/11/29. Or more precisely: which are the "stable" versions I can depend on for a more or less critical server, those that have active security support or receive at least critical bugfixes? I know about the 2.6.2[34].y stable git trees, but I wonder how long those will receive attention (that is, security fixes). Can I find a written policy somewhere?

I would say:

a) the kernel your distro provides,

b) if you're not using a kernel provided by your distribution, the newest kernel from kernel.org (there are some older, still maintained kernels with security fixes, too).

-- Tomasz Chmielewski http://wpkg.org
Re: why kexec insists on syncing with recent kernels?
Randy Dunlap schrieb:

> (...) Even if you did -f, it must have shut down the network. I think somehow in the latest kernels there is some dependency on the network, and that's why not shutting down the network in this case is helping you.
>
> I'm seeing NFS mounts take forever to unmount (at shutdown/reboot). (forever => 1 hour ... or never completes) Is this similar to the problem that the OP is asking about?

Is it a diskless station? Even if not, just make sure you don't shut the network down before NFS is actually unmounted...?

-- Tomasz Chmielewski http://wpkg.org
Re: why kexec insists on syncing with recent kernels?
Vivek Goyal schrieb:

On Thu, Feb 07, 2008 at 03:13:30PM +0100, Tomasz Chmielewski wrote:

>> According to kernel/kexec.c:
>>
>>  * kexec does not sync, or unmount filesystems so if you need
>>  * that to happen you need to do that yourself.
>
> In the latest kexec code I do see it syncing. But it does not unmount the filesystems, so this comment looks partially wrong.

>> I saw this was true with the 2.6.18 kernel (i.e., it didn't sync), but kexec syncs with recent kernels (I checked 2.6.23.14 and 2.6.24):
>>
>> # kexec -e
>> md: stopping all md devices
>> sd 2:0:0:0: [sdb] Synchronizing SCSI cache
>
> Which kexec-tools are you using?

# kexec -v
kexec 1.101 released 15 February 2005

> Syncing is initiated by user space, so changing the kernel will not have any effect (as long as user space is the same). I think those messages are just printed by the kernel, so probably 2.6.18 did not print any message and 2.6.24 does.

Yes and no. I just booted 2.6.24 on a diskless system (Mandriva) I normally use with a 2.6.18 kernel, did kexec -e... And it executed the kernel immediately, without any syncing. On Debian, with the same 2.6.24 kernel, it does sync. So what user-space part does the syncing (and how to prevent it)?

(...)

>> The way kexec works now makes rebooting unreliable again:
>> - network interfaces are brought down,
>> - the kernel tries to sync - it never will, as we're booted off the network, which is down
>
> kexec has got an option, -x / --no-ifdown, which will not bring the network down. Try that: "kexec -e -x"

It does seem to help, thanks. Why does it have to be the last option specified? I tried the -f option before (don't call shutdown), but it didn't help.

>> Any ideas why kexec insists on syncing?
>
> To me it makes sense. Just making sure that cached changes make it to the file system before you boot into the new kernel. In the latest kexec-tools, I do see that sync() is done first and then network interfaces are brought down.
>
> Try the latest kexec-tools from: http://www.vergenet.net/~horms/linux/kexec/kexec-tools/testing/kexec-tools-testing-20071017-rc.tar.gz

Good to have a newer version, I'll try that, too.

-- Tomasz Chmielewski http://wpkg.org
why kexec insists on syncing with recent kernels?
According to kernel/kexec.c:

 * kexec does not sync, or unmount filesystems so if you need
 * that to happen you need to do that yourself.

I saw this was true with the 2.6.18 kernel (i.e., it didn't sync), but kexec syncs with recent kernels (I checked 2.6.23.14 and 2.6.24):

# kexec -e
md: stopping all md devices
sd 2:0:0:0: [sdb] Synchronizing SCSI cache

With kexec on 2.6.18, it was executing a loaded kernel immediately.

Generally, it's a good thing to sync before jumping into a new kernel, but it breaks my setup here after upgrading from 2.6.18 to 2.6.24. Why? I have a couple of diskless (iSCSI-boot) machines with a buggy BIOS (old Supermicro P4SBR/P4SBE) which randomly freeze after rebooting (the machine shuts down just fine, but instead of booting again and showing BIOS bootup messages etc., you just see a blank screen). Therefore, I use kexec as a workaround for this rebooting problem.

The way kexec works now makes rebooting unreliable again:
- network interfaces are brought down,
- the kernel tries to sync - it never will, as we're booted off the network, which is down

Any ideas why kexec insists on syncing?

-- Tomasz Chmielewski http://blog.wpkg.org
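The workaround described above can be sketched as a small reboot script. This is a sketch with hypothetical kernel/initrd paths: -l loads the new kernel, -e executes it, and -x (--no-ifdown) skips bringing network interfaces down, which matters on a network-booted machine. The bash -n line only checks the script's syntax, since actually running kexec requires root and a real kernel image.

```shell
# Hypothetical kexec-based reboot for a network-booted machine;
# /boot paths are placeholders. Written to a file and syntax-checked
# only, because executing it needs root.
cat > /tmp/kexec-reboot.sh <<'EOF'
#!/bin/sh
kexec -l /boot/vmlinuz --initrd=/boot/initrd.img \
      --append="$(cat /proc/cmdline)"
kexec -e -x    # -x / --no-ifdown: leave network interfaces up
EOF

OK=$(bash -n /tmp/kexec-reboot.sh && echo yes)
echo "$OK"
rm -f /tmp/kexec-reboot.sh
```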
What is the limit size of tmpfs /dev/shm ?
> Hello Kernel Users, is there a size limit for tmpfs for the /dev/shm filesystem? Normally its default size is set to 2 GB. Is it possible to create a 2 TB (terabyte) filesystem with tmpfs? Or is there a maximum size defined in the Linux kernel?

It depends on your arch. If it's 32-bit, it's limited to 16 TB:

# mount -o size=16383G -t tmpfs tmpfs /mnt/2
# df -h
(...)
tmpfs                  16T     0   16T   0% /mnt/2

# umount /mnt/2
# mount -o size=16385G -t tmpfs tmpfs /mnt/2
# df -h
(...)
tmpfs                 1.0G     0  1.0G   0% /mnt/2

So 16384G would mean the same as 0.

If you're on 64-bit, you would need really loads of storage and/or RAM to accumulate 16 EB:

# mount -t tmpfs -o size=171798691839G tmpfs /mnt/2
# df -h
(...)
tmpfs                  16E     0   16E   0% /mnt/2

-- Tomasz Chmielewski http://lists.wpkg.org
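The 16 TB limit and the wrap-to-zero at 16384G follow from tmpfs counting its size in pages with a 32-bit page index on 32-bit systems (assuming 4 KB pages, the common case); a quick shell arithmetic check of both numbers:

```shell
PAGE=4096
# On 32-bit, the page count is a 32-bit value: 2^32 pages * 4 KB = 16 TB.
LIMIT=$(( PAGE * (1 << 32) ))
echo "$(( LIMIT / (1 << 40) )) TB"

# size=16384G is exactly 2^32 pages, so a 32-bit page count wraps to 0,
# matching the observation above that 16384G behaves like 0.
PAGES=$(( 16384 * (1 << 30) / PAGE ))
echo "$(( PAGES % (1 << 32) ))"
```

The same reasoning with a 64-bit page index gives 2^64 bytes, i.e. the 16E figure df shows on 64-bit.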
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
FUJITA Tomonori schrieb:

On Tue, 05 Feb 2008 08:14:01 +0100 Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:

>>> James Bottomley schrieb: These are both features being independently worked on, are they not? Even if they weren't, the combination of the size of SCST in kernel plus the problem of having to find a migration path for the current STGT users still looks to me to involve the greater amount of work.
>>
>> I don't want to be mean, but does anyone actually use STGT in production? Seriously? In the latest development version of STGT, it's only possible to stop the tgtd target daemon using the KILL / 9 signal - which also means all iSCSI initiator connections are corrupted when the tgtd target daemon is started again (kernel upgrade, target daemon upgrade, server reboot etc.).
>
> I don't know what "iSCSI initiator connections are corrupted" means. But if you reboot a server, how can an iSCSI target implementation keep iSCSI TCP connections?

The problem with tgtd is that you can't start it (configured) in an "atomic" way. Usually, one will start tgtd and its configuration in a script (I replaced some parameters with "..." to make it shorter and more readable):

tgtd
tgtadm --op new ...
tgtadm --lld iscsi --op new ...

However, this won't work - tgtd goes immediately into the background while it is still starting, and the first tgtadm commands will fail:

# bash -x tgtd-start
+ tgtd
+ tgtadm --op new --mode target ...
tgtadm: can't connect to the tgt daemon, Connection refused
tgtadm: can't send the request to the tgt daemon, Transport endpoint is not connected
+ tgtadm --lld iscsi --op new --mode account ...
tgtadm: can't connect to the tgt daemon, Connection refused
tgtadm: can't send the request to the tgt daemon, Transport endpoint is not connected
+ tgtadm --lld iscsi --op bind --mode account --tid 1 ...
tgtadm: can't find the target
+ tgtadm --op new --mode logicalunit --tid 1 --lun 1 ...
tgtadm: can't find the target
+ tgtadm --op bind --mode target --tid 1 -I ALL
tgtadm: can't find the target
+ tgtadm --op new --mode target --tid 2 ...
+ tgtadm --op new --mode logicalunit --tid 2 --lun 1 ...
+ tgtadm --op bind --mode target --tid 2 -I ALL

OK, if tgtd takes longer to start, perhaps it's a good idea to sleep a second right after tgtd?

tgtd
sleep 1
tgtadm --op new ...
tgtadm --lld iscsi --op new ...

No, it is not a good idea - if tgtd listens on port 3260 *and* is not configured yet, any reconnecting initiator will fail, like below:

end_request: I/O error, dev sdb, sector 7045192
Buffer I/O error on device sdb, logical block 880649
lost page write due to I/O error on sdb
Aborting journal on device sdb.
ext3_abort called.
EXT3-fs error (device sdb): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
end_request: I/O error, dev sdb, sector 7045880
Buffer I/O error on device sdb, logical block 880735
lost page write due to I/O error on sdb
end_request: I/O error, dev sdb, sector 6728
Buffer I/O error on device sdb, logical block 841
lost page write due to I/O error on sdb
end_request: I/O error, dev sdb, sector 7045192
Buffer I/O error on device sdb, logical block 880649
lost page write due to I/O error on sdb
end_request: I/O error, dev sdb, sector 7045880
Buffer I/O error on device sdb, logical block 880735
lost page write due to I/O error on sdb
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data

Ouch.

So the only way to start/restart tgtd reliably is to do the same hacks that are needed with yet another iSCSI kernel implementation (IET): use iptables.

iptables (block iSCSI traffic)
tgtd
sleep 1
tgtadm --op new ...
tgtadm --lld iscsi --op new ...
iptables (unblock iSCSI traffic)

A bit ugly, isn't it? Having to tinker with a firewall in order to start a daemon is by no means a sign of a well-tested and mature project.
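Instead of a blind sleep 1, a small retry loop around the first tgtadm call would at least remove the startup race. This is a sketch: retry is a hypothetical helper, the tgtadm parameters stay elided as "..." exactly as in the post, and it does not solve the separate unconfigured-but-listening window that the iptables hack papers over.

```shell
# Hypothetical helper: retry a command until it succeeds,
# giving up after N attempts with a 1 s pause between tries.
retry() {
    attempts=$1; shift
    i=0
    until "$@"; do
        i=$((i + 1))
        [ "$i" -ge "$attempts" ] && return 1
        sleep 1
    done
}

# demo with a command that succeeds immediately
retry 5 true && echo "configured"

# usage sketch against tgtd (parameters elided as in the post):
#   tgtd
#   retry 10 tgtadm --op new ...
#   tgtadm --lld iscsi --op new ...
```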
That's why I asked how many people use STGT in a production environment - James was worried about a potential migration path for current users.

-- Tomasz Chmielewski http://wpkg.org
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
James Bottomley schrieb:

> These are both features being independently worked on, are they not? Even if they weren't, the combination of the size of SCST in kernel plus the problem of having to find a migration path for the current STGT users still looks to me to involve the greater amount of work.

I don't want to be mean, but does anyone actually use STGT in production? Seriously?

In the latest development version of STGT, it's only possible to stop the tgtd target daemon using the KILL / 9 signal - which also means all iSCSI initiator connections are corrupted when the tgtd target daemon is started again (kernel upgrade, target daemon upgrade, server reboot etc.).

Imagine you have to reboot all your NFS clients when you reboot your NFS server. Not only that - your data is probably corrupted, or at least the filesystem deserves checking...

-- Tomasz Chmielewski http://wpkg.org
Re: (ondemand) CPU governor regression between 2.6.23 and 2.6.24
Toralf Förster wrote:

> I use a 1-liner for a simple performance check: "time factor 819734028463158891". Here is the result for the new (Gentoo) kernel 2.6.24. With the ondemand governor I get:
>
> [EMAIL PROTECTED] ~/tmp $ time factor 819734028463158891
> 819734028463158891: 3 273244676154386297
>
> real    0m32.997s
> user    0m15.732s
> sys     0m0.014s
>
> With the ondemand governor the CPU runs at 600 MHz, whereas with the performance governor I get:
>
> [EMAIL PROTECTED] ~/tmp $ time factor 819734028463158891
> 819734028463158891: 3 273244676154386297
>
> real    0m10.893s
> user    0m5.444s
> sys     0m0.000s
>
> (~5.5 sec as I expected) b/c the CPU is set to 1.7 GHz. The ondemand governor of previous kernel versions however automatically increased the CPU speed from 600 MHz to 1.7 GHz. My system is a ThinkPad T41, I'll attach the .config

During the test, run top and watch your CPU usage. Does it go above 80% (the default for /sys/devices/system/cpu/cpu0/cpufreq/ondemand/up_threshold)?

The ondemand CPUfreq governor has a few tunables, described in Documentation/cpu-freq. One of them is up_threshold:

up_threshold: defines what the average CPU usage between the samplings of 'sampling_rate' needs to be for the kernel to make a decision on whether it should increase the frequency. For example, when it is set to its default value of '80', it means that between the checking intervals the CPU needs to be on average more than 80% in use to then decide that the CPU frequency needs to be increased.

What CPUfreq processor driver are you using?

I had a similar problem with CPUfreq and dm-crypt (slow reads) - more a setup problem than something kernel-related, see: http://blog.wpkg.org/2008/01/22/cpufreq-and-dm-crypt-performance-problems/

-- Tomasz Chmielewski
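The tunable mentioned above can be read straight from sysfs. A sketch: the path only exists when the ondemand governor is active on cpu0, so this falls back to the documented default of 80 when it is absent.

```shell
# Read ondemand's up_threshold; fall back to the documented default (80)
# if the governor is not active and the sysfs file does not exist.
F=/sys/devices/system/cpu/cpu0/cpufreq/ondemand/up_threshold
THRESHOLD=$(cat "$F" 2>/dev/null || echo 80)
echo "up_threshold: $THRESHOLD%"
```

If `time factor` keeps the CPU below this percentage between sampling intervals (for example because the work bounces between cores, or sampling is too coarse), ondemand will never decide to raise the frequency, which would explain the 600 MHz behaviour reported above.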
Re: PROBLEM: Celeron Core
Clock throttling is not likely to save your battery, unless you have tasks that are running at 100% CPU for an unlimited time or something, and you force your CPU to throttle. Normally most people have tasks that run and then the CPU idles - loading an email, displaying a web page, etc. Clock throttling will just make these tasks utilize the CPU for a longer time proportional to the amount of clock throttling and therefore negate any gains in battery usage. Aren't you forgetting about CPUfreq governors? Which means: use the maximum CPU frequency when the system is busy, throttle down (or lower the voltage) when the system is idle. So yes, throttling will save the battery. Besides, not all CPUs support power management (voltage control). IMO clock throttling (as opposed to the reduction of the frequency of an idle CPU) is only useful for preventing the CPU from overheating. And for reducing power on CPUs that can't do any power management other than throttling. For example, a server that doesn't crunch any numbers at night will certainly use less power when throttled. -- Tomasz Chmielewski http://wpkg.org
Re: kernel bugzilla is FPOS (was: Re: "buggy cmd640" message
Rafael J. Wysocki wrote: (...) * After each major kernel release bugzilla should send a kind request for retesting to all open bugs. Good idea, IMO. Another alternative would be to send such a request if a given bug had no activity for, say, 6 months. (...) * Last but not least our bugzilla just looks ugly (it is _very_ important, I feel disgusted each time I have to work with it, OTOH I love using gitweb - you get the idea). Well, that doesn't matter to me as long as it's useful. Any ideas how to improve that? ;-) Upgrade to Bugzilla 3.0.x. Its interface looks a bit better (and has a handful of useful features). (...) Hmm, what about switching to some proprietary bug tracking system just to talk Linus into writing a superior one? ;-) I think that we just have to get an idea of what exactly is needed. IOW, we need to know exactly how we're going to handle bugs as much as we needed to know exactly how we were going to handle the flow of changes. Perhaps it would be necessary to use a proprietary bug tracking system for some time for this purpose, but _maybe_ we can figure it out without anything like that. How do others track bugs for software projects? RedHat, Novell, IBM, others - anyone reading this thread? -- Tomasz Chmielewski http://wpkg.org
Re: Strage buffer behaviour
Denys Vlasenko wrote: On Sunday 11 November 2007 11:33, Tino Keitel wrote: The dd command reads 100 MB from each partition two times in a row. It looks like sda1 and sda2 are not buffered (the first 4 dd runs), but sda3 and sda4 are (the last 4 dd runs). The computer is a Mac mini with a 2.5" SATA hard disk. The first 2 partitions contain EFI and MacOS X, and are unused in Linux. The last 2 partitions are an ext3 partition for / and an LVM for the rest of the system. Any hints how the dd/buffering behaviour could be explained? The system was mostly idle, and the numbers are reproducible across reboots. IIRC only mounted partitions' reads are cached. Or, in general, those devices which the kernel actually "uses" (mounted, but also LVM, RAID, which don't have to be mounted to get cached). -- Tomasz Chmielewski http://lists.wpkg.org
Re: Possible 2.6.23 regression - Disappearing disk space
James Ausmus wrote: OK, false alarm, this is definitely a userspace (or a user... :) problem - had a 12GB .xsession-errors file that I had deleted but was still being held open - now I just have to determine why I have a 12GB .xsession-errors file... :( Surely, .xsession-errors was 12GB large because of a 2.6.23 regression ;) -- Tomasz Chmielewski http://blog.wpkg.org
Re: Possible 2.6.23 regression - Disappearing disk space
James Ausmus wrote: As a note - when I first see the issue, I have exactly 0 free space available on root, as per df - I then delete some random things in order to have enough free space to operate, which is why in my first df you see 55M available Perhaps some program still uses some (deleted) files. Here's how you can achieve a similar effect manually: # dd if=/dev/zero of=/file And on another terminal: # rm -f /file Now watch your space decreasing with "df -h", although "the file was deleted". Was it really? # lsof -n | grep /file dd 6406 root 1w REG 8,1 807791616 45 /file (deleted) That said, you might want to use lsof and search for "deleted" before concluding any further. -- Tomasz Chmielewski http://wpkg.org
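The deleted-but-still-open effect described above can also be demonstrated without filling the disk; a small Linux-specific sketch (it relies on /proc/$$/fd, so it works with procfs mounted):

```shell
#!/bin/sh
# Create a file, keep it open on fd 3, then unlink it.
tmp=$(mktemp)
exec 3>"$tmp"
echo "some data" >&3
rm -f "$tmp"

# The directory entry is gone, but the inode and its blocks survive
# until the last open descriptor goes away -- lsof shows it as (deleted),
# and the /proc symlink for the fd carries the same marker.
link=$(readlink /proc/$$/fd/3)
echo "$link"

# Closing the descriptor is what actually frees the space (df shrinks here).
exec 3>&-
```

This is exactly why df and du can disagree: df counts the still-allocated blocks of unlinked-but-open files, while du only sees what is reachable through directory entries.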
Re: 2.6.22.6: kernel BUG at fs/locks.c:171
] __sched_text_start+0x6e/0x5d5 Oct 3 10:14:12 tomek kernel: [c011bd52] do_exit+0xf1/0x720 Oct 3 10:14:12 tomek kernel: [c01052ab] die+0x1ce/0x1d6 Oct 3 10:14:12 tomek kernel: [c02be0af] do_trap+0x89/0xa2 Oct 3 10:14:12 tomek kernel: [c0105605] do_invalid_op+0x88/0x92 Oct 3 10:14:12 tomek kernel: [c02bde8a] error_code+0x6a/0x70 Oct 3 10:14:12 tomek kernel: [c0154edc] unmap_vmas+0x236/0x425 Oct 3 10:14:12 tomek kernel: [c0157a49] exit_mmap+0x68/0xf0 Oct 3 10:14:12 tomek kernel: [c0117553] mmput+0x1e/0x88 Oct 3 10:14:12 tomek kernel: [c011aa52] exit_mm+0xbb/0xc1 Oct 3 10:14:12 tomek kernel: [c011be51] do_exit+0x1f0/0x720 Oct 3 10:14:12 tomek kernel: [c011c3ef] sys_exit_group+0x0/0x11 Oct 3 10:14:12 tomek kernel: [c011c3fe] sys_exit_group+0xf/0x11 Oct 3 10:14:12 tomek kernel: [c0103da2] sysenter_past_esp+0x5f/0x99 Oct 3 10:14:12 tomek kernel: === -- Tomasz Chmielewski http://blog.wpkg.org
Re: software unplug and plug USB
Miguel schrieb: I also have a mobile phone connected, a gps and a keyboard, i do the following: [EMAIL PROTECTED]:~$ lsmod | grep usb usbserial 29928 1 pl2303 usbtouchscreen 8708 0 usbcore 130304 7 libusual,pl2303,usbserial,usbtouchscreen,ehci_hcd,uhci_hcd [EMAIL PROTECTED]:~$ rmmod pl2303 ERROR: Removing 'pl2303': Operation not permitted [EMAIL PROTECTED]:~$ sudo rmmod pl2303 [EMAIL PROTECTED]:~$ sudo rmmod usbserial Only then the device will be powered off. It's not enough: you also have to remove ehci_hcd / uhci_hcd / ohci_hcd (unless the method described in the link given by Jiri Slaby powers the device off, too). [EMAIL PROTECTED]:~$ modprobe usbserial FATAL: Error inserting usbserial (/lib/modules/2.6.17-11-386/kernel/drivers/usb/serial/usbserial.ko): Operation not permitted (reverse-i-search)`mod': cd backup_modules/ cd backup_modules/? Where do you keep your modules? -- Tomasz Chmielewski http://blog.wpkg.org
Re: software unplug and plug USB
Miguel schrieb: actually, I have a USB keyboard, could that be the problem? why? Because when you remove USB modules, USB devices, including USB keyboards, won't work... removing and adding modules doesn't solve my problem... The USB has several functions: modem and usb mass storage and only the storage one is recognized :( What modules exactly do you unload/load again, in what order, using which commands? -- Tomasz Chmielewski http://blog.wpkg.org
Re: software unplug and plug USB
I'm using a huawei E220 modem and occasionally it doesn't work and I must unplug and plug it again, run the connection scripts to make it work again. how can I unplug and plug a USB modem via software? If the modem doesn't have any other power source than USB, just removing the modem's and USB modules, and then modprobing them again should help. Just take care if you have it connected to the PC with a USB keyboard ;) -- Tomasz Chmielewski http://blog.wpkg.org
patch/option to wipe memory at boot?
Is there a patch or a boot option or something which wipes all available (physical) RAM at boot (or better, fills it with a fixed signature like 0xdeadbeef)? I'm getting phony ECC errors and I'd like to test whether they go away when the RAM is properly initialized. Also, I'd like to know exactly which parts of RAM are being used and which are untouched since boot (hence the 0xdeadbeef signature). If this patch/option doesn't exist, can anyone give me a hint as to where and how it would be best to add this? (I'm afraid I'm very ignorant as to how Linux sets up its RAM mapping.) I'm concerned about x86 and x86_64. PS: I'm not finicky: it's all right if a couple of megabytes at the bottom of RAM are not scrubbed (I'm more interested about the top gigabyte-or-so), especially if they're guaranteed to be used by the kernel. As a side note to what others said, you can always use initrd/initramfs to start your favourite program that wipes the memory... -- Tomasz Chmielewski http://blog.wpkg.org
Re: [DRIVER SUBMISSION] DRBD wants to go mainline
Jan Engelhardt wrote On Jul 22 2007 00:43, Lars Ellenberg wrote: Think of it as RAID1 over TCP. And what does it do better than raid1-over-NBD? (Which is already N-disk, and, logically, seems to support cluster filesystems) I don't know about DRBD, but NBD doesn't handle network disconnections at all (well, almost). So basically, disconnect a NBD-connected system for a while (switch, cabling problem, operator error etc.), and you need lots of effort, perhaps restarts, to get things back to a functioning state (devices offlined, kicked out etc.). I wouldn't call such a raid-over-NBD setup reliable. A better question would be: what does it do better than raid1-over-iSCSI? iSCSI can recover from disconnections very well when configured properly; but while a disconnection is in place, most of the system will just "hang/freeze" (that is, from the user perspective - the system will be waiting for the I/O to complete, until the systems are connected again). A brief reading of the "official DRBD FAQ" didn't give me an answer to that problem. -- Tomasz Chmielewski http://wpkg.org
Re: ext2 on flash memory
Jan Knutar wrote: On Wednesday 13 June 2007 16:48, DervishD wrote: But anyway the memory should last long. Even cheap flash memories with poor wear leveling (if any at all) usually last long. Given that I won't be writing continuously, wear shouldn't be a problem. I'm going to use this as a backup copy of my home. Of course, I can use a tarball too... I did a test on my Kingston DataTraveler recently, I didn't expect it to survive, but it did. I put reiserfs on it, and copied 394M of data in 200,000 files to it. Reiserfs was slow at writing, the device was probably doing a lot of work. ext2 was about 10X faster, but there was hardly any free space left at all at the end :) Considering it survived ReiserFS, I suspect it would last ages with ext2, especially for your backup purposes. I have a couple of years old USB stick, which was used for swap, and for compiling stuff natively on some small mipsel devices, and generally moving files back and forth a lot (ext3 + noatime). Still, it works just fine. -- Tomasz Chmielewski http://wpkg.org
Re: ext2 on flash memory
Jörn Engel schrieb: On Mon, 11 June 2007 13:53:00 +0200, Tomasz Chmielewski wrote: jffs2 only works on mtd devices, and that excludes pendrives, which are block devices. I know LogFS will work with block devices one day, but currently, it doesn't (and is not in the kernel yet as well). Actually, LogFS does work on block devices now. Performance on hard disks is quite bad. Hopefully that becomes untrue just as fast as your slightly outdated claim. :) Cool, does it mean we have the first Linux filesystem supporting compression, which can be used on USB-sticks (I don't count old ext2+compression patches)? :) -- Tomasz Chmielewski http://wpkg.org
Re: ext2 on flash memory
Eduard-Gabriel Munteanu wrote: (...) Your USB pendrive will wear faster if you use an inappropriate filesystem. Such filesystems require frequent writes and change their internal state often. This could be alleviated by COWing the filesystem somehow and flushing writes when you're finished. But the modifications will be lost if crashes occur. The filesystem structures will still change a lot and require big writes to update it. Really, why don't you try a more suitable fs for your pendrive, one that changes itself less than usual fs's? Hmm, are there any fs (read+write) alternatives for pendrives? jffs2 only works on mtd devices, and that excludes pendrives, which are block devices. I know LogFS will work with block devices one day, but currently, it doesn't (and is not in the kernel yet as well). Also, ext2 provides a nice feature other filesystems lack: xip, which is especially useful if a pendrive is used as a rootfs for a small device. -- Tomasz Chmielewski http://wpkg.org
Re: [2.6.21.1] SATA freeze
Fred Moyer wrote: Sounds like SMART is likely disabled on that drive. You can try doing "smartctl -s on /dev/sda" and see if that will turn it on. Sorry - that last post of mine was brain dead. Here's the one with (hopefully) useful data. app2 ~ # smartctl -d ata -a /dev/sda smartctl version 5.36 [x86_64-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen (...) -- -- -- -- -- -- -- 84 51 00 b5 c9 73 e0 Error: ICRC, ABRT at LBA = 0x0073c9b5 = 7588277 Commands leading to the command that caused the error were: CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name -- -- -- -- -- -- -- -- 25 00 20 96 c9 73 e0 00 01:25:42.886 READ DMA EXT b0 d0 01 00 4f c2 00 02 01:25:42.868 SMART READ DATA 35 00 08 ae b6 42 e0 00 01:25:42.456 WRITE DMA EXT b0 da 00 00 4f c2 00 00 01:25:42.430 SMART RETURN STATUS 35 00 08 60 81 04 e0 00 01:25:42.376 WRITE DMA EXT I was getting very similar SMART results, and kernel errors, when I used a PATA drive and SATA_VIA (no freezes or hangs though): SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB) sda: Write Protect is off sda: Mode Sense: 00 3a 00 00 ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 ata3.00: cmd b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 0 res 51/04:00:0b:ff:bf/00:00:00:00:00/00 Emask 0x1 (device error) ata3.00: configured for UDMA/100 ata3: EH complete SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 ata3.00: cmd b0/d0:01:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 512 in res 51/04:01:00:4f:c2/00:00:00:00:00/00 Emask 0x1 (device error) ata3.00: configured for UDMA/100 ata3: EH complete The problem was that I started smartd with wrong parameters: DEVICESCAN -a -o on -S on -s (S/../.././10|L/../../6/11) It was solved when I added "-d sat" to smartd parameters: DEVICESCAN -d sat -a -o on -S on -s (S/../.././10|L/../../6/11) From that time on, smartctl -a /dev/sda gives "normal" output, and no more strange 
kernel errors. Hopefully, it'll get fixed in smartmontools soon (or is fixed already, but not yet mainline). -- Tomasz Chmielewski http://wpkg.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
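The fix above amounts to one line in the smartd configuration. A sketch of the corresponding smartd.conf entry, with the flag meanings spelled out (the schedule regex is copied verbatim from the post; the file path is the usual default and may differ per distribution):

```
# /etc/smartd.conf (sketch; flags as used in the post)
# -d sat  : talk to the drive via the SCSI-to-ATA Translation (SAT) layer
#           instead of sending raw ATA commands through the bridge
# -a      : monitor all SMART attributes
# -o on   : enable automatic offline data collection
# -S on   : enable attribute autosave
# -s ...  : short self-test daily at 10:00, long self-test Saturdays at 11:00
DEVICESCAN -d sat -a -o on -S on -s (S/../.././10|L/../../6/11)
```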
Re: [RFC] LZO1X de/compression support
Bill Rugolsky Jr. wrote: I'm certainly missing something but what are the advantages of this code (over current gzip etc.), and what will be using it? Richard's patchset added it to the crypto library and wired it into the JFFS2 file system. We recently started using LZO in a userland UDP proxy to do stateless per-packet payload compression over a WAN link. With ~1000 octet packets, our particular data stream sees 60% compression with zlib, and 50% compression with (mini-)LZO, but LZO runs at ~5.6x the speed of zlib. IIRC, that translates into > 700Mbps on the input side on a 2 GHz Opteron, without any further tuning. Once LZO is in the kernel, I'd like to see it wired into IPComp. Unfortunately, last I checked only the "deflate" algorithm had an assigned compression parameter index (CPI), so one will have to use a private index until an official one is assigned. I also thought of using LZO compression for some of the diskless nodes which use iSCSI over 100 Mbit or slower. Certainly, a fast de/compression algorithm in the kernel could bring some new, innovative uses:
- there are talks about compressed filesystems (jffs2, reiser4, LogFS)
- why has no one thought of a compressed tmpfs (it should be way easier than a compressed on-disk filesystem, as we don't have to care about data recovery in the event of a failure)?
- using compression for networking (like Bill mentioned)
- compressed caching
- compressed suspend-to-disk images (should suspend/restore faster this way)
-- Tomasz Chmielewski http://wpkg.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
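The zlib half of the per-packet measurement above is easy to reproduce in a few lines. This is a sketch only: LZO has no standard-library binding, the payload here is synthetic HTTP-ish text rather than real WAN traffic, so the ratio and throughput will not match the 60% / >700 Mbps figures quoted in the post.

```python
import time
import zlib

# Synthetic ~1000-octet "packets"; real traffic compresses differently.
payload = (b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n" * 32)[:1000]
packets = [payload] * 10_000

start = time.perf_counter()
compressed = [zlib.compress(p, 6) for p in packets]  # default-ish zlib level
elapsed = time.perf_counter() - start

in_bytes = sum(len(p) for p in packets)
out_bytes = sum(len(c) for c in compressed)
ratio = 1 - out_bytes / in_bytes      # fraction of bytes saved
mbps = in_bytes * 8 / elapsed / 1e6   # input-side throughput, Mbit/s

print(f"saved {ratio:.0%} of bytes, {mbps:.0f} Mbit/s on the input side")
```

Swapping in an LZO binding (e.g. the third-party python-lzo package) in place of `zlib.compress` would give the other column of the comparison.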
Re: [PATCH] sata_mv: new exception handling (hotplug, NCQ framework)
Jeff Garzik schrieb: Below is a refresh of my on-going effort to convert sata_mv to the new exception handling framework. sata_mv is one of the last hold-outs, and its old-EH implementation blocks new features like hotplug and NCQ. It works for me on the one 50xx and one 60xx card I tested it on, but other testers reported regressions, which is why it is not yet upstream. Hi, If I'm correct, this patch won't make it into 2.6.22, and the first possible inclusion would be 2.6.23? Could you summarize what other regressions were reported? I can't find much information about sata_mv regressions on the linux-ide list (at least when looking at the subjects: lots of patches from you, and two reports from me). -- Tomasz Chmielewski - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
what does e2fsck's "non-contiguous" really say?
What does e2fsck's "non-contiguous" really say? I always thought it may give a clue about how a filesystem is fragmented. However, I had set up a filesystem on a 365 GB RAID-5 array: /dev/sdao 365G 195M 347G 1% /mnt/1 The filesystem contains only one directory (lost+found). I ran e2fsck on that filesystem, and it says "9.1% non-contiguous": # e2fsck -f part e2fsck 1.39 (29-May-2006) Pass 1: Checking inodes, blocks, and sizes Pass 2: Checking directory structure Pass 3: Checking directory connectivity Pass 4: Checking reference counts Pass 5: Checking group summary information part: 11/48594944 files (9.1% non-contiguous), 1574757/97187200 blocks "9.1% non-contiguous" - what meaning does it really have? -- Tomasz Chmielewski http://wpkg.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
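The percentage e2fsck prints is a per-file figure: the share of in-use inodes whose data blocks are not all contiguous, not a share of blocks or disk space. That makes the 9.1% above less alarming than it looks: with only 11 inodes in use on a freshly created filesystem, a single fragmented file is enough to produce it (which file is fragmented is an assumption here; a reserved or lost+found-related inode is a plausible candidate).

```python
# e2fsck prints "11/48594944 files (9.1% non-contiguous)".
# The percentage is non-contiguous files divided by files in use,
# so on a nearly empty filesystem one fragmented inode dominates.
files_in_use = 11
non_contiguous = 1  # assumption: a single fragmented inode

pct = 100 * non_contiguous / files_in_use
print(f"{pct:.1f}% non-contiguous")  # → 9.1% non-contiguous
```

So on a nearly empty filesystem the figure says almost nothing about real fragmentation; it only becomes meaningful once the file count is large.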
Re: [PATCH 00/16] raid acceleration and asynchronous offload api for 2.6.22
Tomasz Chmielewski schrieb: Ronen Shitrit wrote: The resync numbers you sent, looks very promising :) Do you have any performance numbers that you can share for these set of patches, which shows the Rd/Wr IO bandwidth. I have some simple tests made with hdparm, with the results I don't understand. We see hdparm results are fine if we access the whole device: thecus:~# hdparm -Tt /dev/sdd /dev/sdd: Timing cached reads: 392 MB in 2.00 seconds = 195.71 MB/sec Timing buffered disk reads: 146 MB in 3.01 seconds = 48.47 MB/sec But are 10 times worse (Timing buffered disk reads) when we access partitions: There seems to be another side effect when comparing the DMA engine in 2.6.17-iop1 to 2.6.21-iop1: network performance. For simple network tests, I use the "netperf" tool to measure network performance. With 2.6.17-iop1 and all DMA offloading options enabled (selectable in System type ---> IOP3xx Implementation Options --->), I get nearly 25 MB/s throughput. With 2.6.21-iop1 and all DMA offloading options enabled (moved to Device Drivers ---> DMA Engine support --->), I get only about 10 MB/s throughput. Additionally, on 2.6.21-iop1, I get lots of "dma_cookie < 0" printed by the kernel. -- Tomasz Chmielewski http://wpkg.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [PATCH 00/16] raid acceleration and asynchronous offload api for 2.6.22
Ronen Shitrit wrote: The resync numbers you sent, looks very promising :) Do you have any performance numbers that you can share for these set of patches, which shows the Rd/Wr IO bandwidth. I have some simple tests made with hdparm, with the results I don't understand. We see hdparm results are fine if we access the whole device: thecus:~# hdparm -Tt /dev/sdd /dev/sdd: Timing cached reads: 392 MB in 2.00 seconds = 195.71 MB/sec Timing buffered disk reads: 146 MB in 3.01 seconds = 48.47 MB/sec But are 10 times worse (Timing buffered disk reads) when we access partitions: thecus:/# hdparm -Tt /dev/sdc1 /dev/sdd1 /dev/sdc1: Timing cached reads: 396 MB in 2.01 seconds = 197.18 MB/sec Timing buffered disk reads: 16 MB in 3.32 seconds = 4.83 MB/sec /dev/sdd1: Timing cached reads: 394 MB in 2.00 seconds = 196.89 MB/sec Timing buffered disk reads: 16 MB in 3.13 seconds = 5.11 MB/sec Why is it so much worse? I used 2.6.21-iop1 patches from http://sf.net/projects/xscaleiop; right now I use 2.6.17-iop1, for which the results are ~35 MB/s when accessing a device (/dev/sdd) or a partition (/dev/sdd1). In kernel config, I enabled Intel DMA engines. The device I use is Thecus n4100, it is "Platform: IQ31244 (XScale)", and has 600 MHz CPU. -- Tomasz Chmielewski http://wpkg.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Add LZO1X compression support to the kernel
David Woodhouse wrote: On Wed, 2007-05-09 at 23:21 -0700, Andrew Morton wrote: Well that's attractive-looking code. It's compression code. I've never seen compression code look nice :) Why is this needed? What code plans to use it? I'm itching to use it in JFFS2. Richard claims a 10% boot time speedup and 40% improvement on file read speed, with only a slight drop in the file compression ratio (when compared to zlib). Would be interesting to have it as a shared base for planned-for-May-2020 "compressed tmpfs" (and, perhaps a filesystem with compression, which seems even harder to engineer properly). -- Tomasz Chmielewski - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Intel IXP4xx network drivers v.2 - Ethernet and HSS
On Wed, 9 May 2007 11:35:03 +0200, Marcus Better wrote: Lennert Buytenhek wrote: > Does that mean that the Debian ARM people have their heads so far > up their collective asses that they think that every form of change > is bad and are unable to accept that some forms of change might be > for the better? Well, I am not one of the Debian ARM people, just a user... and I do hope the EABI port becomes supported in the future! But in the meantime there is a crowd of users running Debian on consumer devices like the NSLU2, and they need a LE network driver. 1) Development _should_ happen in small individually-manageable steps. It's wrong to delay integration of the new IXP4xx eth driver just because it's not yet LE-compatible. True. 2) LE Debian/ARM users do have alternatives: they can use USB-Ethernet adapters, for instance. In the case of the Freecom FSG-3, that would be four USB-ethernet adapters. With the cost roughly half of the cost of the whole device. And all USB-ports occupied. Provided you don't use them for something else. Someone could ask "why has this device four mice connected?" :) (for someone who doesn't work much with computers, a USB-ISDN or USB-ethernet adapter looks just like a mouse). And yet another viable alternative is to use a totally different device which is fully supported under Linux or another system, right? :) -- Tomasz Chmielewski http://wpkg.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Intel IXP4xx network drivers v.2 - Ethernet and HSS
Krzysztof Halasa schrieb: Lennert Buytenhek <[EMAIL PROTECTED]> writes: There _is_ an ARM BE version of Debian. It's not an official port, but it's not maintained any worse than the 'official' LE ARM Debian port is. Hmm... That changes a bit. Perhaps we should forget about that LE thing then, and (at best) put that trivial workaround? Does using ixp4xx on LE have any other drawbacks than inferior network performance? And talking about network performance, what numbers are we talking about (LE vs BE; 30% performance hit on LE, more, or less)? -- Tomasz Chmielewski http://wpkg.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Intel IXP4xx network drivers v.2 - Ethernet and HSS
Alexey Zaytsev schrieb: On 5/8/07, Tomasz Chmielewski <[EMAIL PROTECTED]> wrote: Michael Jones wrote: >> +#ifndef __ARMEB__ >> +#warning Little endian mode not supported >> +#endif > Personally I'm less fussed about WAN / LE support. Anyone with any > sense will run ixp4xx boards doing such a specialised network > operation as BE. Also, NSLU2-Linux can't test this functionality with > our LE setup as we don't have this hardware on-board. You may just > want to declare a depends on ARMEB in Kconfig (with or without OR > (ARM || BROKEN) ) and have done with it - it's up to you. Christian Hohnstaedt's work did support LE though. Not all ixp4xx boards are by definition "doing such a specialised network operation". I was always curious, why do people want to run ixp4xx in LE mode? What are the benefits that outweigh the obvious performance degradation? I guess the main reason, at least for me, is that there is only one distro that properly supports LE ARM: Debian. It greatly simplifies management/administration of a higher number of devices, given the fact that Debian also supports other architectures (not just x86/64, sometimes PPC, like most distros do). Network performance is not always the most important factor. -- Tomasz Chmielewski http://wpkg.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Intel IXP4xx network drivers v.2 - Ethernet and HSS
Michael Jones wrote: +#ifndef __ARMEB__ +#warning Little endian mode not supported +#endif Personally I'm less fussed about WAN / LE support. Anyone with any sense will run ixp4xx boards doing such a specialised network operation as BE. Also, NSLU2-Linux can't test this functionality with our LE setup as we don't have this hardware on-board. You may just want to declare a depends on ARMEB in Kconfig (with or without OR (ARM || BROKEN) ) and have done with it - it's up to you. Christian Hohnstaedt's work did support LE though. Not all ixp4xx boards are by definition "doing such a specialised network operation". Krzysztof, why is LE not supported? Do you need access to ixp4xx that starts in LE mode? -- Tomasz Chmielewski http://wpkg.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Add a norecovery option to ext3/4?
Bill Davidsen wrote: I wonder what happens if the device is really read-only and the o/s tries to replay the journal as part of a r/o mount? I suspect the system will refuse totally with an i/o error, not what you want. I exported an ext3 volume via iSCSI, in read-only mode. It is mounted on the iSCSI target already, so it makes sense to mount it *really* read-only on the iSCSI initiator. It is accessible to the iSCSI initiator as /dev/sdat - let's try to use it: # mount /dev/sdat /mnt/1 mount: block device /dev/sdat is write-protected, mounting read-only mount: wrong fs type, bad option, bad superblock on /dev/sdat, missing codepage or other error In some cases useful info is found in syslog - try dmesg | tail or so # dmesg -c EXT3-fs: INFO: recovery required on readonly filesystem. EXT3-fs: write access unavailable, cannot proceed. So, no go. What about an ext2 mount? # mount -t ext2 /dev/sdat /mnt/1 mount: block device /dev/sdat is write-protected, mounting read-only mount: wrong fs type, bad option, bad superblock on /dev/sdat, missing codepage or other error In some cases useful info is found in syslog - try dmesg | tail or so # dmesg -c EXT2-fs: sdat: couldn't mount because of unsupported optional features (4). Still, we're not able to mount the partition. 
Now, we umount it on iSCSI target, and try to mount it again on the initiator - ext3 works: # mount /dev/sdat /mnt/1 mount: block device /dev/sdat is write-protected, mounting read-only ext2 works, too: # umount /mnt/1 # mount -t ext2 /dev/sdat /mnt/1 mount: block device /dev/sdat is write-protected, mounting read-only So, the only way to see data from an already mounted ext3 partition is (iSCSI target exports it to multiple iSCSI initiators): - export it read-only - umount it - mount it as ext2 - mount on iSCSI initiators as either ext2 or ext3 (will be forced read-only) Or: - export it as read-only - mount on iSCSI initiators as either ext2 or ext3 (will be forced read-only) - mount it as ext3 on iSCSI target Both ways can be certainly unwanted in some cases. Certainly, I would like the "norecovery" option. -- Tomasz Chmielewski http://wpkg.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Add a norecovery option to ext3/4?
Bill Davidsen wrote: I wonder what happens if the device is really read-only and the o/s tries to replay the journal as part of a r/o mount? I suspect the system will refuse totally with an i/o error, not what you want. I exported a ext3 volume via iSCSI, in read-only mode. It is mounted on the iSCSI target already, so it makes sense to mount it *really* read-only on iSCSI initiator. It is accessible to the iSCSI initiator as /dev/sdat - let's try to use it: # mount /dev/sdat /mnt/1 mount: block device /dev/sdat is write-protected, mounting read-only mount: wrong fs type, bad option, bad superblock on /dev/sdat, missing codepage or other error In some cases useful info is found in syslog - try dmesg | tail or so # dmesg -c EXT3-fs: INFO: recovery required on readonly filesystem. EXT3-fs: write access unavailable, cannot proceed. So, no go. What about ext2 mount? # mount -t ext2 /dev/sdat /mnt/1 mount: block device /dev/sdat is write-protected, mounting read-only mount: wrong fs type, bad option, bad superblock on /dev/sdat, missing codepage or other error In some cases useful info is found in syslog - try dmesg | tail or so # dmesg -c EXT2-fs: sdat: couldn't mount because of unsupported optional features (4). Still, we're not able to mount the partition. 
Now, we umount it on the iSCSI target, and try to mount it again on the
initiator - ext3 works:

# mount /dev/sdat /mnt/1
mount: block device /dev/sdat is write-protected, mounting read-only

ext2 works, too:

# umount /mnt/1
# mount -t ext2 /dev/sdat /mnt/1
mount: block device /dev/sdat is write-protected, mounting read-only

So, the only way to see data from an already mounted ext3 partition (the
iSCSI target exports it to multiple iSCSI initiators) is:

- export it read-only
- umount it
- mount it as ext2
- mount it on the iSCSI initiators as either ext2 or ext3 (will be forced
  read-only)

Or:

- export it as read-only
- mount it on the iSCSI initiators as either ext2 or ext3 (will be forced
  read-only)
- mount it as ext3 on the iSCSI target

Both ways can certainly be unwanted in some cases. So yes, I would like a
"norecovery" option.

-- 
Tomasz Chmielewski
http://wpkg.org
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
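For the record, ext3 does document a "noload" mount option that skips loading (and thus replaying) the journal, which comes close to the "norecovery" behaviour asked for here. A sketch of what the initiator-side mount would look like, reusing the device name from above; note that unreplayed journal transactions are simply ignored, so the data seen may be inconsistent:

```shell
# Mount read-only without touching the journal ("noload" per the ext3
# mount-option documentation); avoids the "recovery required on readonly
# filesystem" refusal, at the cost of possibly stale/inconsistent data.
mount -t ext3 -o ro,noload /dev/sdat /mnt/1
```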
Re: Linux 2.6.21
Adrian Bunk schrieb:

> On Sun, Apr 29, 2007 at 11:04:10PM +0200, Tomasz Chmielewski wrote:
>> Linus Torvalds wrote:
>>> On Sun, 29 Apr 2007, Adrian Bunk wrote:
>>>> The kernel Bugzilla currently contains 1600 open bugs.
>>> Adrian, why do you keep harping on this, and ignoring reality? Kernel
>>> bugzilla has 1600 open bugs BECAUSE IT SUCKS. How many of those are
>>> interesting and valid? How many of them are relevant? How many of
>>> them are duplicates?
>> And - how many of these bug reports has kernel's bugzilla ever
>> forwarded to lkml so that other people could see them? Is that number
>> zero (because kernel's bugzilla is configured this way)?
> Andrew forwards incoming Bugzilla bugs to the responsible maintainers
> (if there are any). If it is considered useful it shouldn't be a
> problem to automatically forward all incoming Bugzilla bugs to
> linux-kernel.

Why isn't it done yet?

If the bugs were already forwarded to linux-kernel (and perhaps, to
linux- when possible, too), we would save at least two days of this long
"Linux 2.6.21" thread...

I somehow feel that most people here dislike bugzilla because of
misconceptions - which only arose because bugzilla.kernel.org is
*really* misconfigured.

-- 
Tomasz Chmielewski
http://wpkg.org
Re: Linux 2.6.21
David Miller wrote:

> I reported a bug that eats people's hard disks due to a bug
> in the X.ORG PCI support code on sparc, NOBODY has fixed
> the bug in 2 years even though a full bugzilla entry with
> even a full patch fix is in there.

Well but at least they could find it again if they wanted. If you sent
it by email and it had gotten lost for some reason (nobody interested,
which seems to be the real issue here) then it would be lost forever.

WRONG!

If I had sent it to the main developer list the damn patch would be
applied by now.

Why didn't you do it then? Why didn't you send your patch to the main
developer? Wouldn't your problem be fixed if you did it?

WHY? BECAUSE EMAIL ENGAGES PEOPLE AND BUGZILLA DOES NOT!

-- 
Tomasz Chmielewski
http://wpkg.org