Re: ZFS root mount regression
Hi,

I am not sure how the original description leads to the conclusion that the problem is related to parallel mounting. From my point of view it sounds like the problem is that root pool mounting happens based on the pool name, not on the pool GUID, which would need to be passed from the loader. We have seen a problem like that ourselves when boot pool names collide. So I doubt it is a new problem; just nobody has got to fixing it yet.

On 20.07.2019 06:41, Eugene Grosbein wrote:
> CC'ing Alexander Motin who committed the change.
>
> 20.07.2019 1:21, Garrett Wollman wrote:
>
>> I recently upgraded several file servers from 11.2 to 11.3. All of
>> them boot from a ZFS pool called "tank" (the data is in a different
>> pool). In a couple of instances (which caused me to have to take a
>> late-evening 140-mile drive to the remote data center where they are
>> located), the servers crashed at the root mount phase. In one case,
>> it bailed out with error 5 (I believe that's [EIO]) to the usual
>> mountroot prompt. In the second case, the kernel panicked instead.
>>
>> The root cause (no pun intended) on both servers was a disk which was
>> supplied by the vendor with a label on it that claimed to be part of
>> the "tank" pool, and for some reason the 11.3 kernel was trying to
>> mount that (faulted) pool rather than the real one. The disks and
>> pool configuration were unchanged from 11.2 (and probably 11.1 as
>> well) so I am puzzled.
>>
>> Other than laboriously running "zpool labelclear -f /dev/somedisk" for
>> every piece of media that comes into my hands, is there anything else
>> I could have done to avoid this?
>
> Both 11.3-RELEASE announcement and Release Notes mention this:
>
>> The ZFS filesystem has been updated to implement parallel mounting.
>
> I strongly suggest reading Release documentation in case of troubles
> after upgrade, at least. Or better, read *before* updating.
>
> I guess this parallelism created some race for your case.
>
> Unfortunately, a way to fall back to sequential mounting seems undocumented.
> libzfs checks for ZFS_SERIAL_MOUNT environment variable to exist having any
> value.
> I'm not sure how you set it for mounting root, maybe it will use kenv,
> so try adding to /boot/loader.conf:
>
> ZFS_SERIAL_MOUNT=1
>
> Alexander should have more knowledge on this.
>
> And of course, attaching unrelated device having label conflicting
> with root pool is asking for trouble. Re-label it ASAP.
>

--
Alexander Motin
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
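To illustrate the name-versus-GUID point from the reply above, here is a minimal userland sketch. It is not the loader or kernel code: `struct pool_label`, `find_by_name()` and `find_by_guid()` are hypothetical names invented for this example, and the GUID values are made up.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical summary of one pool label found while tasting disks. */
struct pool_label {
	const char *name;	/* pool name stored in the label */
	uint64_t    guid;	/* pool GUID stored in the label */
};

/*
 * Name-based lookup: returns the first label whose name matches.
 * With two pools both called "tank" the result depends only on
 * discovery order, which is the failure mode described above.
 */
static const struct pool_label *
find_by_name(const struct pool_label *labels, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(labels[i].name, name) == 0)
			return (&labels[i]);
	return (NULL);
}

/* GUID-based lookup: unambiguous even when pool names collide. */
static const struct pool_label *
find_by_guid(const struct pool_label *labels, size_t n, uint64_t guid)
{
	for (size_t i = 0; i < n; i++)
		if (labels[i].guid == guid)
			return (&labels[i]);
	return (NULL);
}
```

With a stray vendor disk listed first, `find_by_name()` would pick the stray label, while `find_by_guid()` still finds the intended pool as long as the loader passes the GUID along.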
Re: about zfs and ashift and changing ashift on existing zpool
On 08.04.2019 20:21, Eugene Grosbein wrote:
> 09.04.2019 7:00, Kevin P. Neal wrote:
>
>>> My guess (given that only ada1 is reporting a blocksize mismatch) is that
>>> your disks reported a 512B native blocksize. In the absence of any
>>> override,
>>> ZFS will then build an ashift=9 pool.
>
> [skip]
>
>> smartctl 7.0 2018-12-30 r4883 [FreeBSD 11.2-RELEASE-p4 amd64] (local build)
>> Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
>>
>> === START OF INFORMATION SECTION ===
>> Vendor: SEAGATE
>> Product: ST2400MM0129
>> Revision: C003
>> Compliance: SPC-4
>> User Capacity: 2,400,476,553,216 bytes [2.40 TB]
>> Logical block size: 512 bytes
>> Physical block size: 4096 bytes
>
> Maybe it's time to prefer "Physical block size" over "Logical block size" in
> relevant GEOMs
> like GEOM_DISK, so upper levels such as ZFS would do the right thing
> automatically.

No. It is a bad idea. Changing the logical block size for existing disks will most likely break compatibility and make previously written data unreadable. ZFS already uses the physical block size when possible, that is, on pool creation or new vdev addition. When that is not possible (the pool was already created wrong) it just complains about it, so that the user knows the configuration is imperfect and should not expect full performance.

--
Alexander Motin
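For reference, ashift is just the base-2 logarithm of the block size, which is why a 512-byte report gives ashift=9 and a 4096-byte one gives ashift=12. A minimal sketch of that arithmetic follows; `ashift_for_block_size()` is a name made up for this example and is not a ZFS function.

```c
#include <stdint.h>

/*
 * Returns log2(block_size) for a power-of-two block size,
 * or -1 if the size is zero or not a power of two.
 */
static int
ashift_for_block_size(uint32_t block_size)
{
	int shift = 0;

	if (block_size == 0 || (block_size & (block_size - 1)) != 0)
		return (-1);
	while ((block_size >> shift) != 1)
		shift++;
	return (shift);
}
```

For the disk quoted above, the logical size (512) maps to 9 and the physical size (4096) maps to 12.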
Re: TSC timekeeping and cpu states
On 14.08.2017 18:38, Ian Smith wrote:
> On Mon, 14 Aug 2017 17:16:22 +1000, Aristedes Maniatis wrote:
> > On 14/8/17 3:08PM, Kevin Oberman wrote:
> > > Again, the documentation lags reality. The default was changed for
> > > 11.0. It is still conservative. In ALMOST all cases, Cmax will yield
> > > the best results. However, on large systems with many cores, Cmax
> > > will trigger very poor results, so the default is C2, just to be
> > > safe.
>
> Given it's a server, anything beyond C2 is likely not worth trying.
> OTOH, C2 is perhaps not worth avoiding; it's probably low latency and
> should result in lower power consumption, so heat, and unlikely to hurt.
>
> Or at least, I suspect that's the case .. cc'ing Alexander, as the wiki
> article you referenced was his doing, so he's among those best placed.

C-states controlled here are ACPI C-states, which have limited relation to real CPU C-states. There are systems where they map exactly, but there are also systems where ACPI C1/C2/C3 states map to CPU C1/C3/C6, so it is difficult to make general recommendations. The mapping can be roughly guessed by looking at the latency value (the last of three) reported in sysctl dev.cpu.0.cx_supported: 1 is usually CPU C1, 2+ is likely CPU C2, 100+ can be C3, 500+ can be C6, but all of that is very approximate and, I guess, depends on the BIOS writer's mood.

As for recommendations from me, I'd say that the CPU C2 state should not hurt in most cases, unless something is broken, but the benefit is rather small (often already covered by C1E enabled in BIOS); the CPU C3 state gives significant power saving, but can either hurt performance due to higher enter/exit latency or slightly improve it due to TurboBoost activation (which requires the CPU frequency to be set to its maximum value); CPU C6 is probably useful only for laptops, since it does not save that much power, while exit latency can be in the millisecond range.

> > > As far as possible TSC impact, I think older processors had TSC
> > > issues when not all cores ran with the same clock speed. That said,
> > > I am not remotely expert on such issues, so don't take this too
> > > seriously.
>
> I wasn't aware that FreeBSD could yet do different freqs on different
> cores? But I'm less expert than Kevin, and certainly behind the times.

On old CPUs the TSC frequency was tied to the CPU frequency and so could fluctuate with frequency changes. On modern CPUs it is always constant, equal to the base CPU frequency. As for different frequencies for different cores, IIRC ACPI allows that, but until recently neither FreeBSD nor the hardware could do it. I have a feeling I heard that some very new CPUs may allow it, but to be efficient it would require very tight cooperation between the power manager and the CPU scheduler, otherwise performance may suffer.

--
Alexander Motin
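The latency rule of thumb above can be written down as a small helper. This is only a sketch of the heuristic exactly as stated in the message (1, 2+, 100+, 500+); `guess_cpu_cstate()` is a hypothetical name, and the result is a guess, not something the hardware guarantees.

```c
/*
 * Guess the real CPU C-state behind an ACPI C-state from the latency
 * value reported in dev.cpu.0.cx_supported (the last of the three
 * numbers). Returns 1, 2, 3 or 6. Thresholds follow the rule of thumb
 * in the message above and are approximate by nature.
 */
static int
guess_cpu_cstate(unsigned int latency)
{
	if (latency >= 500)
		return (6);
	if (latency >= 100)
		return (3);
	if (latency >= 2)
		return (2);
	return (1);
}
```

So an ACPI state reported with a latency of 1 would be treated as CPU C1, while one reported with a latency of a few hundred would be treated as CPU C3.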
Re: Mega ZFS MFCs
Hi Mike,

On 27.07.2017 16:21, Mike Tancsa wrote:
> I noticed quite a few MFCs to RELENG_11 around zfs yesterday and today.
> First off, thank you for all these fixes/enhancements! Of the some 60
> MFCs, are there any particular ones to be more aware of when updating
> servers ?

The most complicated and invasive one to me looks to be r321610 "8021 ARC buf data scatter-ization". It took 5 fix commits to make it behave in head, but Andriy told me it should be good now, and I run it on my systems too.

> Are there any more to come, or is now a good time to test things out ?

I've merged all we had in head (except a couple of gptzfsboot commits that significantly increase its size and could break POLA). The next round will go to head first anyway, so stable/11 should probably be idle for a month at least and should be good for testing now.

--
Alexander Motin
Re: stable/11 debugging kernel unable to produce crashdump again
I guess the problem with g_raid_shutdown_post_sync in case of panic can be explained by the fact that it tries to write clean metadata in the regular (not dumping) way while the system is already in panic mode and there is no proper scheduling. Maybe it could just be bypassed in case of dumping (should be trivial and probably OK), or it could use g_raid_subdisk_kerneldump() in that case instead of normal GEOM I/O.

On 24.07.2017 20:03, Eugene Grosbein wrote:
> CCing mav@ as graid expert.
>
> On 24.07.2017 08:44, Mark Johnston wrote:
>
>>> Sadly, this time 11.1-STABLE r321371 SMP hangs instead of doing crashdump:
>>>
>>> - "call doadump" from DDB prompt works just fine;
>>> - "shutdown -r now" reboots the system without problems;
>>> - "sysctl debug.kdb.panic=1" triggers a panic just fine but system hangs
>>> just after showing uptime
>>> instead of continuing with crashdump generation; same if "real" panic
>>> occurs.
>>>
>>> Same for debug.minidump set to 1 or 0. How do I debug this?
>>
>> I'm not able to reproduce the problem in bhyve using r321401. Looking
>> at the code, the culprits might be cngrab(), or one of the
>> shutdown_post_sync eventhandlers. Since you're apparently able to see
>> the console output at the time of the panic, I guess it's probably the
>> latter. Could you try your test with the patch below applied? It'll
>> print a bunch of "entering post_sync"/"leaving post_sync" messages with
>> addresses that can be resolved using kgdb. That'll help determine where
>> we're getting stuck.
>>
>> Index: sys/sys/eventhandler.h
>> ===
>> --- sys/sys/eventhandler.h (revision 321401)
>> +++ sys/sys/eventhandler.h (working copy)
>> @@ -85,7 +85,11 @@
>>  _t = (struct eventhandler_entry_ ## name *)_ep; \
>>  CTR1(KTR_EVH, "eventhandler_invoke: executing %p", \
>>  (void *)_t->eh_func); \
>> +if (strcmp(__STRING(name), "shutdown_post_sync") == 0) \
>> +printf("entering post_sync %p\n", (void *)_t->eh_func); \
>>  _t->eh_func(_ep->ee_arg , ## __VA_ARGS__); \
>> +if (strcmp(__STRING(name), "shutdown_post_sync") == 0) \
>> +printf("leaving post_sync %p\n", (void *)_t->eh_func); \
>>  EHL_LOCK((list)); \
>>  } \
>>  } \
>>
>
> Thanks, this helped:
>
> $ addr2line -f -e kernel.debug 0x80919c00
> g_raid_shutdown_post_sync
> /home/src/sys/geom/raid/g_raid.c:2458
>
> That is GEOM_RAID's g_raid_shutdown_post_sync() that hangs if called just
> before
> crashdump generation but works just fine during normal system shutdown.
>
> I should note my graid's RAID1 is running in degraded state currently
> due to dead SSD module that does not respond. Here is part of boot log:
>
> ahcich5: AHCI reset: device not ready after 31000ms (tfd = 0080)
> ahcich5: Poll timeout on slot 2 port 0
> ahcich5: is cs 0004 ss rs 0004 tfd 80 serr
> cmd c217
> (aprobe2:ahcich5:0:0:0): NOP FLUSHQUEUE. ACB: 00 00 00 00 00 00 00 00 00 00
> 00 00
> (aprobe2:ahcich5:0:0:0): CAM status: Command timeout
> (aprobe2:ahcich5:0:0:0): Error 5, Retries exhausted
> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
> ahcich5: Poll timeout on slot 3 port 0
> ahcich5: is cs 0008 ss rs 0008 tfd 80 serr
> cmd c317
> (aprobe2:ahcich5:0:0:0): SOFT_RESET. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
> (aprobe2:ahcich5:0:0:0): CAM status: Command timeout
> (aprobe2:ahcich5:0:0:0): Error 5, Retries exhausted
> [skip]
> Trying to mount root from ufs:/dev/raid/r0s4a [rw,noatime]...
> Root mount waiting for: GRAID-Intel
> Root mount waiting for: GRAID-Intel
> Root mount waiting for: GRAID-Intel
> Root mount waiting for: GRAID-Intel
> Root mount waiting for: GRAID-Intel
> GEOM_RAID: Intel-c291fe96: Force array start due to timeout.
> GEOM_RAID: Intel-c291fe96: Disk ada0 state changed from NONE to ACTIVE.
> GEOM_RAID: Intel-c291fe96: Subdisk r0:0-ada0 state changed from NONE to STALE.
> GEOM_RAID: Intel-c291fe96: Array started.
> GEOM_RAID: Intel-c291fe96: Subdisk r0:0-ada0 state changed from STALE to
> ACTIVE.
> GEOM_RAID: Intel-c291fe96: Volume r0 state changed from STARTING to DEGRADED.
> GEOM_RAID: Intel-c291fe96: Provider raid/r0 for volume r0 created.
>

--
Alexander Motin
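The "just bypass it when dumping" option from the reply can be pictured with a tiny userland model. Everything here is a stand-in invented for illustration: `system_panicked`, `write_clean_metadata()` and `shutdown_post_sync_handler()` are hypothetical and do not correspond to the real GEOM_RAID code or to whatever fix was eventually committed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for "the system is in panic mode". */
static bool system_panicked = false;

/* Counts how many times the (simulated) metadata write was attempted. */
static int metadata_writes = 0;

/* Stand-in for a write that needs normal scheduling to complete. */
static void
write_clean_metadata(void)
{
	metadata_writes++;
}

/*
 * Sketch of a post-sync shutdown handler that skips regular I/O once
 * the system has panicked, so it cannot hang before the crash dump.
 */
static void
shutdown_post_sync_handler(void)
{
	if (system_panicked)
		return;
	write_clean_metadata();
}
```

The trade-off of this variant is that the array is left marked dirty after a panic, which seems acceptable since a panic is not a clean shutdown anyway.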
Re: ASM1062 AHCI timeouts, ppt(4) BAR aligning [Was: Re: svn commit: r309251 - head/sys/dev/ahci]
On 29.12.2016 10:35, Harry Schmalzbauer wrote:
> I'd like to report that this doesn't fix timeouts for me (applied to
> 11-stable).
>
> For example my REV120 works without problems on Intel-AHCI but not on
> ASM1062-AHCI.
> Even attaching gives different output. Both look fine at first:
> #cd0 at ahcich0 bus 0 scbus5 target 0 lun 0
> #cd0: Removable CD-ROM SCSI device
> #cd0: Serial Number 0C1E4D046E5DFF18
> #cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO
> 8192bytes)
>
> When attached to the Intel-AHCI, it's followed by
> +cd0: Attempt to query device size failed: NOT READY, Medium not present
> while attaching to ASM1062 it reads (!?)
> -cd0: 0MB (1 0 byte sectors)
>
> Then these timeouts occur:
> ahcich7: Timeout on slot 11 port 0
> ahcich7: is cs 0c00 ss rs 0c00 tfd 6051 serr
> cmd 0004cb17
> ahcich7: Timeout on slot 24 port 0
> ahcich7: is cs 0180 ss rs 0180 tfd 2051 serr
> cmd 0004d817
> ahcich7: Timeout on slot 6 port 0
> ahcich7: is cs 0060 ss rs 0060 tfd 2051 serr
> cmd 0004c617
> ahcich7: Timeout on slot 20 port 0
> ahcich7: is cs 0018 ss rs 0018 tfd 2051 serr
> cmd 0004d417
>
> Also IDENT (via camcontrol) "hangs" for 20 seconds, but finally succeeds.

I think the problem may be different in your case. The HBA still reports that the command is not completed by the device. Unfortunately I don't have those fancy drives to try, but I'll try to reproduce it with a regular CD drive when I get back home after the short New Year holidays.

> Btw: I already found out that extending ppt(4) to support unaligned base
> address register wouldn't be too easy.
> Initially I added that ASM1062 card to use it for bhyve(8) passthrough.
> Unfortunately that doesn't work:
> bhyve: passthru device 6/0/0 BAR 5: base 0xc3e1 or size 0x200 not
> page aligned
> That's the ASM1062:
> ppt0@pci0:6:0:0:class=0x010601 card=0x10601b21 chip=0x06121b21
> rev=0x01 hdr=0x00
> bar [10] = type I/O Port, range 32, base 0x5050, size 8, enabled
> bar [14] = type I/O Port, range 32, base 0x5040, size 4, enabled
> bar [18] = type I/O Port, range 32, base 0x5030, size 8, enabled
> bar [1c] = type I/O Port, range 32, base 0x5020, size 4, enabled
> bar [20] = type I/O Port, range 32, base 0x5000, size 32, enabled
> bar [24] = type Memory, range 32, base 0xc3e1, size 512, enabled

I believe it is a bhyve bug, since these values are just what the hardware reports. A BAR size of 512 bytes indeed does not align to 4K, but that is not our problem. :)

> Are there any recommendations for AHCI (SATA-PCIe) controller
> cards/chips that do work (both, for bhyve passthrough and also as plain
> AHCI provider)?

Please don't mix multiple unrelated questions in one email.

There are very few reasonable external AHCI controllers on the market now. I am not sure anything other than Marvell and ASmedia has been released at all in the last years since 6Gbps SATA came out. Marvell and ASmedia are probably worth each other, while later Marvell may be slightly better on functionality (number of ports and FBS PMP support), but they are both desktop products. If you need this in a server environment, think about a SAS adapter like LSI. Or just use on-board Intel AHCI, since it is probably the best in reliability you can get out of SATA.

--
Alexander Motin
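The condition behind the quoted bhyve error ("base ... or size ... not page aligned") can be sketched as follows. This is not bhyve's source: `bar_is_page_aligned()` and `SKETCH_PAGE_SIZE` are names made up for the example, and a 4K page size is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for this sketch (4K, as mentioned in the reply). */
#define SKETCH_PAGE_SIZE 4096u

/*
 * A memory BAR can be mapped into a guest page by page only if both
 * its base address and its size are multiples of the page size.
 */
static bool
bar_is_page_aligned(uint64_t base, uint64_t size)
{
	return ((base & (SKETCH_PAGE_SIZE - 1)) == 0 &&
	    (size & (SKETCH_PAGE_SIZE - 1)) == 0);
}
```

A 512-byte (0x200) BAR like the one reported by the ASM1062 fails this test regardless of its base address.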
Re: stable/10: high load average when box is idle
On 26.12.2015 17:09, Ian Smith wrote:
> Current hypothesis: some variable/s are getting improperly initialised
> at boot, but are (somehow?) getting properly re-initialised on changing
> cpuset to 1 then back to 2 cpus - though I've no idea how, or by what.

While this is an interesting hypothesis, I see no real ground for it in the code. My own explanation here, same as before, is in the area of event aliasing. HPET, due to its hardware limitations, is more prone to different synchronization effects than LAPIC. And those limitations are specific to the hardware configuration.

On modern hardware HPET may provide (up to 8) per-CPU MSI interrupts. This is the best case for everything, with minimal chances of aliasing (unless you have more than 8 logical cores). On older hardware it is typical to have HPET sharing a single interrupt line with some other device(s) and generating events for all CPUs from it. Interrupt line sharing tends to create a load of 1.0 due to counting its own interrupt thread. I've partially worked around that at some point, but aliasing possibilities are still there. Driving multiple CPUs from the same interrupt also creates aliasing, since different CPUs wake up close to each other and may count each other's load. Different CPU wakeup times from different sleep states and other sources of jitter may generate quite complicated but not really useful behavior patterns.

Happy holidays!

--
Alexander Motin
Re: Bug 204641 - 10.2 UNMAP/TRIM not available on a zfs zpool that uses iSCSI disks, backed on a zpool file target
On 18.11.2015 02:28, Steven Hartland wrote:
> On 17/11/2015 22:08, Christopher Forgeron wrote:
>> I just submitted this as a bug:
>>
>> ( https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204641 )
>>
>> ..but I thought I should bring it to the list's attention for more
>> exposure
>> - If that's a no-no, let me know, as I have a few others that are related
>> to this that I'd like to discuss.
> Having a quick flick through the code it looks like unmap is now only
> supported on dev backed and not file backed.
>
> I believe the following commit is the cause:
> https://svnweb.freebsd.org/base?view=revision=279005
>
> This was an MFC of:
> https://svnweb.freebsd.org/base?view=revision=278672
>
> I'm guessing this was an unintentional side effect mav?

As I have replied on the ticket: CTL never supported UNMAP on file-backed LUNs due to the lack of a respective API for hole punching on FreeBSD. At this time UNMAP works for ZVOLs in both device and file modes, and for raw devices.

--
Alexander Motin
Re: recent ZFS / CAM updates in RELENG_10?
Hi.

On 05.10.2015 16:17, Mike Tancsa wrote:
> I noticed a whole whack of MFCs to RELENG_10 for zfs and cam (thanks
> for all that!) Just wondering if there is more to come, or is this
> perhaps a good time to start testing with all these changes on a few non
> critical boxes ?

At this point I've merged all I planned. There are a few more recent ZFS commits in HEAD that are not merged, but they are not mine, so I leave them to their authors. So yes, I think now is a good time to start testing.

--
Alexander Motin
Re: VIMAGE kernel broken after 255541
Hi.

The change is reverted. Sorry.

On 14.09.2013 16:05, goran.lowkra...@ismobile.com wrote:
> Hi,
>
> After 255541 I can't compile a VIMAGE kernel:
>
> cc1: warnings being treated as errors
> /usr/src/sys/kern/sched_ule.c: In function 'cpu_search':
> /usr/src/sys/kern/sched_ule.c:638: warning: implicit declaration of function 'CPU_FFS'
> /usr/src/sys/kern/sched_ule.c:638: warning: nested extern declaration of 'CPU_FFS' [-Wnested-externs]
> *** [sched_ule.o] Error code 1
>
> Kernconf:
>
> VSERVER:
> #
> # VSERVER --A VIMAGE kernel configuration file for FreeBSD/amd64
> #
> # $FreeBSD: stable/9/sys/amd64/conf/XENHVM 239412 2012-08-20 11:34:49Z cperciva $
> #
> include SERVER
> ident VSERVER
>
> # VIMAGE config
> option VIMAGE
>
> SERVER:
> #
> # SERVER -- General server
> #
> include GENERIC
> ident SERVER
>
> # Update resources for PostgreSQL
> options SHMMAXPGS=65536
> options SEMMNI=40
> options SEMMNS=240
> options SEMUME=40
> options SEMMNU=120
>
> #
> # Compile with kernel debugger related code.
> #
> options KDB
> options KDB_TRACE
> options KDB_UNATTENDED
> options DDB
> #options INVARIANTS
> #options INVARIANT_SUPPORT
> #options WITNESS
> #options DEBUG_LOCKS
> #options DEBUG_VFS_LOCKS
>
> # Include Apple Talk support
> options NETATALK
>
> /glz

--
Alexander Motin
Re: GEOM RAID devd events
On 01.08.2013 12:36, Daniel O'Connor wrote:
> Hi,
> Does anyone know if graid generates devd events for 'interesting' RAID
> events? (eg array becoming degraded, rebuild progress completion, etc).
>
> I had a look and I couldn't find any devctl_notify* calls but perhaps
> they are hidden behind some GEOM calls.
>
> If there aren't, are there any plans to add some? I am happy to test, or
> even write if I can find some time.

GEOM RAID does not do anything special about devd now. I had no such plans, but it is probably not a bad idea if done well.

--
Alexander Motin
Re: GEOM RAID devd events
On 01.08.2013 13:27, Daniel O'Connor wrote:
> On 01/08/2013, at 19:56, Daniel O'Connor docon...@gsoft.com.au wrote:
>>> GEOM RAID does not do anything special about devd now. I had no such
>>> plans, but probably that is a not a bad idea if do it well.
>>
>> Do you have a recommendation for where I should start looking? (ie a
>> hint about where such a thing would go)
>
> After doing the reading I should have done before I sent my last message
> I see that g_raid_update_* look good candidates.

It would be nice to do it in a possibly more generic way, so that it is usable for other GEOM classes, such as MIRROR, MULTIPATH, etc. At least make the message formatting unified.

--
Alexander Motin
Re: Supermicro and FreeBSD 9.2 PRERELEASE make_dev_physpath_alias: WARNING
On 15.07.2013 14:10, Sergey Kandaurov wrote:
> On 15 July 2013 14:02, Johan Hendriks joh.hendr...@gmail.com wrote:
>> We use basic supermicro cases for our storage servers in combination
>> with a LSI 9211-8i controller in IT mode.
>>
>> Since 9.1 or shortly there after we get for every disk we attach to the
>> SAS backplane the following error.
>>
>> make_dev_physpath_alias: WARNING - Unable to alias gptid/abb586f5-da8d-11e2-aaaf-00259061b51a to enc@n500304800122877d/type@0/slot@f/elmdesc@Slot_15/gptid/abb586f5-da8d-11e2-aaaf-00259061b51a - path too long.
>>
>> I know it does not harm the operation, but every time the server boots
>> or when we add a disk i get a little scared when i see WARNINGS passing
>> by.
>>
>> Is there a way to suppress these WARNINGS, or is there something i can do
>> about it.
>
> This is because the name is longer than SPECNAMELEN.
> You barely can do anything with it.
>
> The warning is hidden under bootverbose in 10-CURRENT, and I think
> it should be merged to stable/9 before 9.2 release.
> Meantime you can manually apply this change:
> http://svnweb.freebsd.org/changeset/base/235899

Thank you for the reminder, sent MFC request.

--
Alexander Motin
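To see why this particular alias is rejected, it is enough to compare its length with the device name limit. The sketch below assumes a limit of 63 characters, which is what I believe SPECNAMELEN was in sys/param.h on 9.x; please check your own tree. `SKETCH_SPECNAMELEN` and `alias_name_fits()` are hypothetical names, and the comparison is a simplification of whatever the kernel actually does.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed value of SPECNAMELEN on 9.x; verify against sys/param.h. */
#define SKETCH_SPECNAMELEN 63

/* True if a device alias name would fit within the assumed limit. */
static bool
alias_name_fits(const char *name)
{
	return (strlen(name) <= SKETCH_SPECNAMELEN);
}
```

The plain `gptid/...` name is 42 characters and fits, while the full `enc@.../gptid/...` physical path from the warning is 94 characters and does not.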
Re: Marvell 88SE91Ax simple patch
On 09.07.2013 11:24, Dmitry Morozovsky wrote:
> Alexander,
>
> trying to activate eSATA port on my home file server I found that the
> following simple patch seems to work -- could you please add it, hopefully
> before 9.2-R?
>
> marck@hamster:/sys svn diff dev/ahci
> Index: dev/ahci/ahci.c
> ===
> --- dev/ahci/ahci.c (revision 252889)
> +++ dev/ahci/ahci.c (working copy)
> @@ -234,6 +234,7 @@
>  {0x91301b4b, 0x00, "Marvell 88SE9130", AHCI_Q_NOBSYRES|AHCI_Q_ALTSIG},
>  {0x91721b4b, 0x00, "Marvell 88SE9172", AHCI_Q_NOBSYRES},
>  {0x91821b4b, 0x00, "Marvell 88SE9182", AHCI_Q_NOBSYRES},
> + {0x91a01b4b, 0x00, "Marvell 88SE91Ax", AHCI_Q_NOBSYRES},
>  {0x92201b4b, 0x00, "Marvell 88SE9220", AHCI_Q_NOBSYRES|AHCI_Q_ALTSIG},
>  {0x92301b4b, 0x00, "Marvell 88SE9230", AHCI_Q_NOBSYRES|AHCI_Q_ALTSIG},
>  {0x92351b4b, 0x00, "Marvell 88SE9235", AHCI_Q_NOBSYRES},

Committed to HEAD. Thanks.

--
Alexander Motin
Re: 9.1-stable: ATI IXP600 AHCI: CAM timeout
On 03.06.2013 23:22, Jeremy Chadwick wrote:
> On Mon, Jun 03, 2013 at 03:06:53PM +0100, Mike Pumford wrote:
>> Ian Lepore wrote:
>>> On Wed, 2013-05-29 at 16:21 +0200, Oliver Fromme wrote:
>>>> Steven Hartland wrote:
>>>>> Have you checked your sata cables and psu outputs?
>>>>>
>>>>> Both of these could be the underlying cause of poor signalling.
>>>>
>>>> I can't easily check that because it is a cheap rented
>>>> server in a remote location.
>>>>
>>>> But I don't believe it is bad cabling or PSU anyway, or
>>>> otherwise the problem would occur intermittently all the
>>>> time if the load on the disks is sufficiently high.
>>>> But it only occurs at tags=3 and above. At tags=2 it does
>>>> not occur at all, no matter how hard I hammer on the disks.
>>>>
>>>> At the moment I'm inclined to believe that it is either
>>>> a bug in the HDD firmware or in the controller. The disks
>>>> aren't exactly new, they're 400 GB Samsung ones that are
>>>> several years old. I think it's not uncommon to have bugs
>>>> in the NCQ implementation in such disks.
>>>>
>>>> The only thing that puzzles me is the fact that the problem
>>>> also disappears completely when I reduce the SATA rev from
>>>> II to I, even at tags=32.
>>>
>>> It seems to me that you dismiss signaling problems too quickly.
>>> Consider the possibilities... A bad cable leads to intermittent errors
>>> at higher speeds. When NCQ is disabled or limited the software handles
>>> these errors pretty much transparently. When NCQ is not limited and
>>> there are many outstanding requests, suddenly the error handling in the
>>> software breaks down somehow and a minor recoverable problem becomes an
>>> in-your-face error.
>>
>> It could also be a software bug in the way CAM handles the failure of
>> NCQ commands. When command queueing is used on a SCSI drive and a queued
>> command fails only that command fails. A queued command failure on a
>> SATA device fails ALL currently queued commands. I've not looked at the
>> code but do the SATA CAM drivers do the right thing here?
>
> Quoting T13/2015-D ATA8-ACS2 WD spec:
>
> If an error occurs while the device is processing an NCQ command, then
> the device shall return command aborted for all NCQ commands that are in
> the queue and shall return command aborted for any new commands, except
> a READ LOG EXT command requesting log address 10h, until the device
> completes a READ LOG EXT command requesting log address 10h (i.e.,
> reading the NCQ Command Error log) without error.
>
> While I can't easily provide an answer to your question, I can tell you
> that sys/dev/ahci/ahci.c does execute READ LOG EXT (command 0x2f) for
> certain scenarios (the code is in function ahci_issue_recovery()).

I am not aware of any flaws in the present CAM ATA error recovery logic. Sending READ LOG EXT is indeed implemented at the ahci(4) driver level (same as in siis(4) and mvs(4)), since it was complicated/impossible to do in shared code, because higher levels have no idea about the tag allocation done by lower-level drivers.

> The one person who can answer this question is mav@, who is now CC'd.
>
>> Less commands queued makes it less likely that multiple commands will be
>> in progress when a failure occurs. A lower link rate also makes you more
>> immune to signal failures.
>
> He isn't seeing SATA-level signal/link failure; the AHCI driver would
> complain about that, and those messages aren't there. Unless, of
> course, those messages are only visible when verbose booting is enabled
> (I hope not).

Just a curious history point: I had one old system on the NVIDIA MCP55 chipset where Linux had worked well before, but FreeBSD had problems with SATA: all disk transfers were really slow, but without any errors reported, and after some point the system started to hang. That series of chipsets had a long history of problems, so for some time I was looking for a way to handle it in software. But after many experiments I accidentally found out that disabling 6 small but very powerful fans worked around the problem. I checked the PSU voltages, and they were fine. Switching the fans to a separate PSU also helped. Finally I just replaced the system's main PSU with a different one and the problems were gone. My best guess was that the capacitors in that PSU, due to old age, were unable to filter the fans' electric noise, which started to interfere with SATA and later other signals. Now the same PSU works perfectly fine in the same case with a smaller Atom-based motherboard without any issues.

I am not saying that the ahci(4) driver is perfect, but hardware issues are always possible, even if the system worked fine before.

--
Alexander Motin
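The NCQ behaviour quoted from the spec above (one failure aborts everything queued, and nothing new is accepted until the NCQ Command Error log at address 10h has been read) can be modelled in a few lines. This is a toy model of the device-side rule only, not the ahci(4) recovery code; `struct ncq_dev` and the three functions are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a SATA device's NCQ state. */
struct ncq_dev {
	uint32_t queued;	/* bitmask of tags currently in the queue */
	bool	 error_halt;	/* set after an NCQ error, until log 10h is read */
};

/* Queue an NCQ command with the given tag; refused while halted. */
static bool
ncq_queue(struct ncq_dev *d, unsigned int tag)
{
	if (d->error_halt || tag >= 32)
		return (false);
	d->queued |= (uint32_t)1 << tag;
	return (true);
}

/*
 * An NCQ command failed: every queued command is aborted, not just the
 * failing one. Returns the mask of tags that were aborted.
 */
static uint32_t
ncq_error(struct ncq_dev *d)
{
	uint32_t aborted = d->queued;

	d->queued = 0;
	d->error_halt = true;
	return (aborted);
}

/* READ LOG EXT of log address 10h completed: normal operation resumes. */
static void
ncq_read_error_log(struct ncq_dev *d)
{
	d->error_halt = false;
}
```

Since the mask returned by `ncq_error()` is expressed in tags, the component that allocated those tags is the natural one to issue the log read and requeue the aborted commands, which matches the reason given above for doing it in the low-level driver.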
Re: ada(4) and ahci(4) quirk printing
On 22.04.2013 08:14, Jeremy Chadwick wrote:
> I've written the following patches and done the following testing (see
> the results.*.txt files):
>
> http://jdc.koitsu.org/freebsd/quirk_printing/
>
> Important: these are against stable/9 r249715.
>
> Folks are welcome to try these; I've tested about as best as I can.
>
> Questions/comments for Alexander and Kenneth:
>
> 1. I'm not sure if the location of where I added the printf() code is
> correct or not,

It seems fine to me.

> 2. Not sure if loader.conf(5) forced-quirks would show up here or not,

As far as I can see, they will.

> 3. It would be nice to have the same for SCSI da(4). I took a stab at
> this but the printing code I wrote never got called (or the quirks entry
> I added wasn't right, not sure which),
>
> 4. I strongly believe quirk printing should be shown *without* verbose
> booting. I say this because I noticed some of the CAPAB printf()s only
> get shown if bootverbose is true. In fact, it's what prompted me to
> open PR 178040 (My Intel 320 and 510-series SSDs don't show 4K quirks,
> yet advertise 512 logical and physical in IDENTIFY?! PR time!).

Let me disagree. bootverbose keeps dmesg readable for the average user, while quirks are specific driver workarounds and their names may confuse more than really help. If every driver printed its quirks, dmesg would be twice as big. That is what bootverbose is for.

--
Alexander Motin
Re: ada(4) and ahci(4) quirk printing
On 23.04.2013 12:26, Jeremy Chadwick wrote:
> On Tue, Apr 23, 2013 at 10:44:57AM +0300, Alexander Motin wrote:
>> On 22.04.2013 08:14, Jeremy Chadwick wrote:
>>> I've written the following patches and done the following testing (see
>>> the results.*.txt files):
>>>
>>> http://jdc.koitsu.org/freebsd/quirk_printing/
>>>
>>> Important: these are against stable/9 r249715.
>>>
>>> Folks are welcome to try these; I've tested about as best as I can.
>>>
>>> Questions/comments for Alexander and Kenneth:
>>>
>>> 1. I'm not sure if the location of where I added the printf() code is
>>> correct or not,
>>
>> It seems fine for me.
>>
>>> 2. Not sure if loader.conf(5) forced-quirks would show up here or not,
>>
>> As I see, they will.
>>
>>> 3. It would be nice to have the same for SCSI da(4). I took a stab at
>>> this but the printing code I wrote never got called (or the quirks entry
>>> I added wasn't right, not sure which),
>>>
>>> 4. I strongly believe quirk printing should be shown *without* verbose
>>> booting. I say this because I noticed some of the CAPAB printf()s only
>>> get shown if bootverbose is true. In fact, it's what prompted me to
>>> open PR 178040 (My Intel 320 and 510-series SSDs don't show 4K quirks,
>>> yet advertise 512 logical and physical in IDENTIFY?! PR time!).
>>
>> Let me disagree. bootverbose keeps dmesg readable for average user,
>> while quirks are specific driver workarounds and their names may
>> confuse more than really help. If every driver print its quirks,
>> dmesg would be two times bigger. There is bootverbose for it.
>
> I'm willing to bend on this assuming that userland has a way to display
> the quirks. I've already had one user contact me off-list stating that
> displaying of quirks is useful to them, but *without* bootverbose
> (because bootverbose shows too much information for them to have to sift
> through). And display of quirks (or in this case) was what prompted me
> to create PR 178040, since I had just *assumed* FreeBSD had 4K quirks in
> place for both models of SSDs.
>
> I think sysctl would be an ideal place for this. Is it possible to
> export active device quirks to sysctl (say kern.cam.ada.X.quirks),
> read-only, and preferably as a string (same printf() style used)? Or
> does that introduce complexities?
>
> If we can't reach an agreement, I'm happy to wrap the relevant bits with
> an if (bootverbose), but I really feel users should have some way to see
> this information outside of bootverbose.

Both the da and ada drivers already have sysctls. It should be trivial to add one more, especially if it is just numeric.

--
Alexander Motin
Re: ada(4) and ahci(4) quirk printing
On 23.04.2013 13:49, Jeremy Chadwick wrote: On Tue, Apr 23, 2013 at 12:29:10PM +0300, Alexander Motin wrote: On 23.04.2013 12:26, Jeremy Chadwick wrote: On Tue, Apr 23, 2013 at 10:44:57AM +0300, Alexander Motin wrote: On 22.04.2013 08:14, Jeremy Chadwick wrote: I've written the following patches and done the following testing (see the results.*.txt files): http://jdc.koitsu.org/freebsd/quirk_printing/ Important: these are against stable/9 r249715. Folks are welcome to try these; I've tested about as best as I can. Questions/comments for Alexander and Kenneth: 1. I'm not sure if the location of where I added the printf() code is correct or not, It seems fine for me. 2. Not sure if loader.conf(5) forced-quirks would show up here or not, As I see, they will. 3. It would be nice to have the same for SCSI da(4). I took a stab at this but the printing code I wrote never got called (or the quirks entry I added wasn't right, not sure which), 4. I strongly believe quirk printing should be shown *without* verbose booting. I say this because I noticed some of the CAPAB printf()s only get shown if bootverbose is true. In fact, it's what prompted me to open PR 178040 (My Intel 320 and 510-series SSDs don't show 4K quirks, yet advertise 512 logical and physical in IDENTIFY?! PR time!). Let me disagree. bootverbose keeps dmesg readable for average user, while quirks are specific driver workarounds and their names may confuse more then really help. If every driver print its quirks, dmesg would be two times bigger. There is bootverbose for it. I'm willing to bend on this assuming that userland has a way to display the quirks. I've already had one user contact me off-list stating that displaying of quirks is useful to them, but *without* bootverbose (because bootverbose shows too much information for them to have to sift through). 
And display of quirks (or in this case) was what prompted me to create PR 178040, since I had just *assumed* FreeBSD had 4K quirks in place for both models of SSDs. I think sysctl would be an ideal place for this. Is it possible to export active device quirks to sysctl (say kern.cam.ada.X.quirks), read-only, and preferably as a string (same printf() style used)? Or does that introduce complexities? If we can't reach an agreement, I'm happy to wrap the relevant bits with an if (bootverbose), but I really feel users should have some way to see this information outside of bootverbose. Both da and ada drivers already have sysctl's. It should be trivial to add one more, especially if just numeric. I was hoping for an ASCII string, specifically something like what's outputted in my patches, i.e.: kern.cam.ada.2.quirks: 0x1<4K> And ideally it'd be nice to have the same thing for ahci(4), which right now doesn't appear to have anything other than the dev.ahci.X.%xxx tree stuff (which I think is handled by the device registration stuff, not the ahci driver natively). I'll worry about that later. The problem with just leaving it as a numeric is that it doesn't provide the user with any idea of what the value represents. They're forced to go through the source code + decode the numeric into its bit values and figure out what's what. I haven't said that it is impossible. I would just prefer not to complicate the code too much with rarely used debugging features. I'm pretty sure I can work this into sys/cam/ata/ata_da.c (looking at read_ahead as an example, though using SYSCTL_PROC not SYSCTL_INT, and for how SYSCTL_PROC works with this type of thing, referring to machdep.c for an example), but it'd be my first time doing any of this. I'll give it a shot. I really need to get myself a SFF PC for FreeBSD just for testing these types of things, unless FreeBSD has some magical way to test a kernel on a live system without having to reboot.
(Sounds like black magic to me ;-) ) Virtual machine? -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
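To make the "string rather than bare number" request concrete, here is a minimal userland sketch of decoding a quirk bitmask into the bracketed form discussed above. It is not the ata_da.c code: the bit values, the second quirk name and the function name format_quirks() are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical quirk bits, named only for this illustration. */
#define Q_4K      0x01u
#define Q_EXAMPLE 0x02u

struct quirk_name {
    unsigned int bit;
    const char *name;
};

static const struct quirk_name quirk_names[] = {
    { Q_4K, "4K" },
    { Q_EXAMPLE, "EXAMPLE" },
};

/*
 * Format a quirk mask as "0x<hex><NAME,NAME>" into buf and return buf.
 * A mask with no known bits set is printed as the bare hex number.
 */
static char *
format_quirks(unsigned int quirks, char *buf, size_t len)
{
    size_t i, used;
    int any = 0;
    int n;

    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, "0x%x", quirks);
    used = (n < 0) ? 0 : (size_t)n;
    for (i = 0; i < sizeof(quirk_names) / sizeof(quirk_names[0]); i++) {
        if ((quirks & quirk_names[i].bit) == 0)
            continue;
        if (used < len) {
            /* '<' opens the list, ',' separates later names. */
            n = snprintf(buf + used, len - used, "%c%s",
                any ? ',' : '<', quirk_names[i].name);
            if (n > 0)
                used += (size_t)n;
        }
        any = 1;
    }
    if (any && used < len)
        snprintf(buf + used, len - used, ">");
    return buf;
}
```

A string sysctl handler could hand a buffer filled this way to userland, which would spare users from decoding the mask against the source by hand.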
Re: Any objections/comments on axing out old ATA stack?
On 21.04.2013 00:29, Jeremy Chadwick wrote: - The ATA commands which lead up to the error also vary. Many are for write requests, and from some entries I can see that the OS was doing NCQ writes (WRITE FPDMA QUEUED) and then suddenly decided to do a classic 28-bit LBA write (WRITE DMA). I'm not sure why an OS would do this (there's nothing optimal about it) unless there were conditions occurring where the OS/ATA driver said "this NCQ write isn't working (timeout, etc.), let me retry with a classic 28-bit LBA write". The ATA disk driver in CAM inserts a non-queued command every several seconds of continuous load to limit possible command starvation inside the disk. The SCSI driver does a similar thing, but instead of a different command it sets the ordered command flag, which does not exist in SATA. -- Alexander Motin
Re: Any objections/comments on axing out old ATA stack?
The ATA controller drivers delay conflicting commands, avoiding conflicts in the device. On 21.04.2013 14:32, Jeremy Chadwick j...@koitsu.org wrote: On Sun, Apr 21, 2013 at 02:11:04PM +0300, Alexander Motin wrote: On 21.04.2013 00:29, Jeremy Chadwick wrote: - The ATA commands which lead up to the error also vary. Many are for write requests, and from some entries I can see that the OS was doing NCQ writes (WRITE FPDMA QUEUED) and then suddenly decided to do a classic 28-bit LBA write (WRITE DMA). I'm not sure why an OS would do this (there's nothing optimal about it) unless there were conditions occurring where the OS/ATA driver said "this NCQ write isn't working (timeout, etc.), let me retry with a classic 28-bit LBA write". The ATA disk driver in CAM inserts a non-queued command every several seconds of continuous load to limit possible command starvation inside the disk. The SCSI driver does a similar thing, but instead of a different command it sets the ordered command flag, which does not exist in SATA. Thanks for the insights Alexander, greatly appreciated. I'm a little confused by your description, because if I'm reading it right, it sounds like it conflicts with what the ACS-2 spec states. Quoting T13/2015-D rev 3 (I'm aware it's a working draft), section 4.16.1: "If the device receives a command that is not an NCQ command while NCQ commands are in the queue, then the device shall return command aborted for the new command and for all of the NCQ commands that are in the queue." I assume this means ABRT status is returned to the host controller; if so (and by design of course), how do we differentiate between that condition and any other I/O condition that induces ABRT? Possibly the answer is in this admission: I should probably get around to reading ATA8-AST sometime. :-) -- | Jeremy Chadwick j...@koitsu.org | UNIX Systems Administrator http://jdc.koitsu.org/ | Mountain View, CA, US | Making life hard for others since 1977. PGP 4BD6C0CB |
Re: Lost CDROM on 9.1 with ATA_CAM on Promise controller
On 17.04.2013 12:47, Andre Albsmeier wrote: On Wed, 17-Apr-2013 at 10:53:54 +0200, Jeremy Chadwick wrote: On Wed, Apr 17, 2013 at 08:26:00AM +0200, Andre Albsmeier wrote: On Tue, 16-Apr-2013 at 21:38:22 +0200, Jeremy Chadwick wrote: On Tue, Apr 16, 2013 at 07:55:20PM +0200, Andre Albsmeier wrote: I have lost one of my CDROM drives (HL-DT-STDVD-RAM GH22LP20/2.00) after going from 7.4 to 9.1 when using ATA_CAM. It is attached to a Promise PDC20268 UDMA100 controller. A standard harddisk drive attached to this controller works well. Cables, controller and drive where replaced already. Kernel gives me: atapci1: Promise PDC20268 UDMA100 controller port 0xb000-0xb007,0xa800-0xa803,0xa400-0xa407,0xa000-0xa003,0x9800-0x980f mem 0xdf80-0xdf803fff irq 11 at device 12.0 on pci0 ata2: ATA channel at channel 0 on atapci1 ata3: ATA channel at channel 1 on atapci1 ... ada0 at ata2 bus 0 scbus2 target 0 lun 0 ada0: Maxtor 7B300R0 BAH41G10 ATA-7 device ada0: 100.000MB/s transfers (UDMA5, PIO 8192bytes) ada0: 286188MB (586114704 512 byte sectors: 16H 63S/T 16383C) ... (cd2:ata3:0:0:0): got CAM status 0x50 (cd2:ata3:0:0:0): fatal error, failed to attach to device (cd2:ata3:0:0:0): lost device, 4 refs (cd2:ata3:0:0:0): removing device entry ... Attaching the CDROM drive to the controller that is integrated on the mainboard (Intel PIIX4 UDMA33 controller) does not show this problem (but here I don't have UDMA66). It also works when not using ATA_CAM: ... acd0: DVDR HL-DT-STDVD-RAM GH22LP20/2.00 at ata3-master UDMA66 ... So this semes to be a problem with the Promise controller and ATA_CAM. Any ideas? Or should I file PR? The controller in question is a Promise Ultra100 TX2. Right. Tried with an Ultra133, same effect. The error message comes from sys/cam/scsi/scsi_cd.c, in function cddone(). The logic is a little hard for me to follow (I understand about 70% of it). Look at lines 1724 to 1877 for stable/9. 1. 
Can you provide full output from a verbose boot when the CD/DVD drive is attached to the Promise controller? Attached below. I have just filtered out some ahc cruft... Later I will try to boot a -current kernel -- just to see how this behaves... 2. What firmware version the card is using? The PDC20268 had many, many firmware problems relating to ATAPI devices. It is the latest BIOS: 2.20.0.15. 3. I wouldn't worry about ATA66 vs. ATA33; this drive can only support up to about 22MBytes/second so ATA66 isn't going to get you anything, so as a workaround, using the PIIX4 for it would not hurt you. Probably. But I already had cdrecord complain when it came to the funky DMA speed test it is doing. It went away when using the UDMA66 port. And on the other hand I sometimes use the PIIX4 port for other stuff and I do not want to attach the cdrom to the slave port. 4. ONLY if this turns out to be a controller thing: I'm not sure how much effort should be spent trying to make this work, as the PDC20268 is legacy/deprecated hardware (made/released 13 years ago). The whole box is more than 13 years old (good old Asus BX board) ;-) But since it worked in 7.4-STABLE I feel that this is some kind of regression. I do not want to waste anyone's resources in fixing it -- just if someone is curious and/or has an idea how to fix it... And here is the dmesg: {snipping for mail brevity} Thanks. CC'd ken@ and mav@ for advice on this. Here's the dmesg: http://lists.freebsd.org/pipermail/freebsd-stable/2013-April/073131.html Short details: The device under scrutiny here is cd2 on ata3, which is an ATAPI IDE-based optical drive. The drive works when either: a) Connected to a different IDE controller (atapci0), or, b) When ATA_CAM is removed (i.e. use ata(4) exclusively). And just as a note: The -current kernel from https://snapshots.glenbarber.us/Latest/FreeBSD-10.0-CURRENT-i386-20130316-r248381-bootonly.iso shows the same problem... 
Some Promise controllers are known to have problems with ATAPI DMA. Have you tried disabling DMA on that channel or device with a loader tunable like hint.ata.3.mode=PIO4? -- Alexander Motin
Re: Any objections/comments on axing out old ATA stack?
On 02.04.2013 21:39, Matthias Andree wrote: Am 31.03.2013 23:02, schrieb Scott Long: So what I hear you and Matthias saying, I believe, is that it should be easier to force disks to fall back to non-NCQ mode, and/or have a more responsive black-list for problematic controllers. Would this help the situation? It's hard to justify holding back overall forward progress because of some bad controllers; we do several Tbps off of AHCI controllers with NCQ enabled on FreeBSD 9.x, enough to make up a sizable percentage of the internet's traffic, and we see no problems. How can we move forward but also take care of you guys with problematic hardware? Well, I am running the driver fine off of my WD Caviar RE3 disk, and the problematic drive also works just fine with Windows and Linux, so it must be something between the problematic drive and the FreeBSD driver. I would like to see any of this, in decreasing order of precedence: - debugged driver - assistance/instructions on helping how to debug the driver/trace NCQ stuff/... (as in Jeremy Chadwick's followup in this same thread - this helps, I will attempt to procure the required information; back then, reducing the number of tags to 31 was ineffective, including an error message and getting a value of 32 when reading the setting back) Unfortunately, I don't know how to debug that. Command timeouts reported on the lists before are the kind of errors that are most difficult to diagnose since the controller gives no information to do that. We just see that sent commands are no longer completing. May be it is some incompatibility of specific drive and HBA firmwares, triggered by some innocent specifics of our ATA stack, GEOM or filesystems implementation. All I can propose is to try to identify such cases and add some quirks to workaround it, like disabling NCQ or limiting number of tags. I am not sure what else can we do about it without some controlled lab environment with affected hardware and SATA analyzer. 
- user-space contingency features, such as letting camcontrol limit the number of open NCQ tags, or disable NCQ, either on a per-drive basis I've merged support for that to 8/9-STABLE about 9 months ago: `camcontrol tags ada0 -v -N X` should change number of simultaneously used tags, `camcontrol negotiate ada0 -T (en|dis)able` should enable/disable use of NCQ. I just did some tests on HEAD and these commands seems like working. If you can reproduce the problem, it would be nice to collect information how these changes affect it. I am capable of debugging C - mostly with gdb command-line, and graphical Windows IDEs - but am unfamiliar with FreeBSD kernel debugging. If necessary, I can pull up a second console, but the PC that is affected is legacy-free, so serial port only works through a serial/USB converter. -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Any objections/comments on axing out old ATA stack?
On 31.03.2013 08:13, Ian Smith wrote: On Sat, 30 Mar 2013 21:00:24 -0700, Peter Wemm wrote: On Sat, Mar 30, 2013 at 4:29 PM, Matthias Andree mand...@freebsd.org wrote: Am 27.03.2013 22:22, schrieb Alexander Motin: Hi. Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA stack, using only some controller drivers of old ata(4) by having `options ATA_CAM` enabled in all kernels by default. I have a wish to drop non-ATA_CAM ata(4) code, unused since that time from the head branch to allow further ATA code cleanup. Does any one here still uses legacy ATA stack (kernel explicitly built without `options ATA_CAM`) for some reason, for example as workaround for some regression? Does anybody have good ideas why we should not drop it now? Alexander, The regression in http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/157397 where the SATA NCQ slots stall for some Samsung drives in the new stack, and consequently hang the computer for prolonged episodes where it is in the NCQ error handling, disallows removal of the old driver. (Last checked with 9.1-RELEASE at current patchlevel.) We're talking about 10.x, so if you want it fixed, you need update with 10.x information. Please put 10.x diagnostics in the PR. Given Alexander also posted this to -stable, just for clarity, are we _only_ talking about 10.x here, or might this change get MFC'd to 9? Yes, I am only going to drop it from 10.x, but bug reports from 9-STABLE users are welcome, as at some point they will become 10.x users. -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Any objections/comments on axing out old ATA stack?
On 28.03.2013 02:43, Adrian Chadd wrote: My main concern with the new stuff is that it requires CAM and that's reasonably big compared to the standalone ATA code. It'd be nice if we could slim down the CAM stack a bit first; it makes embedding it on the smaller devices really freaking painful. Are there many boards now with ATA, but without USB? But I agree, it should be checked. -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Any objections/comments on axing out old ATA stack?
Hi. Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA stack, using only some controller drivers of old ata(4) by having `options ATA_CAM` enabled in all kernels by default. I have a wish to drop non-ATA_CAM ata(4) code, unused since that time from the head branch to allow further ATA code cleanup. Does any one here still uses legacy ATA stack (kernel explicitly built without `options ATA_CAM`) for some reason, for example as workaround for some regression? Does anybody have good ideas why we should not drop it now? -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Any objections/comments on axing out old ATA stack?
On 27.03.2013 23:32, Steve Kargl wrote: On Wed, Mar 27, 2013 at 11:22:14PM +0200, Alexander Motin wrote: Hi. Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA stack, using only some controller drivers of old ata(4) by having `options ATA_CAM` enabled in all kernels by default. I have a wish to drop non-ATA_CAM ata(4) code, unused since that time from the head branch to allow further ATA code cleanup. Does any one here still uses legacy ATA stack (kernel explicitly built without `options ATA_CAM`) for some reason, for example as workaround for some regression? Yes, I use the legacy ATA stack. On 9.x or HEAD where new one is default? Does anybody have good ideas why we should not drop it now? Because it works? Any problems with new one? -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Any objections/comments on axing out old ATA stack?
On 28.03.2013 00:05, Steve Kargl wrote: On Wed, Mar 27, 2013 at 11:35:35PM +0200, Alexander Motin wrote: On 27.03.2013 23:32, Steve Kargl wrote: On Wed, Mar 27, 2013 at 11:22:14PM +0200, Alexander Motin wrote: Hi. Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA stack, using only some controller drivers of old ata(4) by having `options ATA_CAM` enabled in all kernels by default. I have a wish to drop non-ATA_CAM ata(4) code, unused since that time from the head branch to allow further ATA code cleanup. Does any one here still uses legacy ATA stack (kernel explicitly built without `options ATA_CAM`) for some reason, for example as workaround for some regression? Yes, I use the legacy ATA stack. On 9.x or HEAD where new one is default? Head. Does anybody have good ideas why we should not drop it now? Because it works? Any problems with new one? Last time I tested the new one, and this was several months ago, the system (a Dell Latitude D530 laptop) would not boot. Probably we should just fix that. Any more info? -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Old ICH7 SATA-2 question
        printf("%s%d: %d.%03dMB/s transfers",
               periph->periph_name, periph->unit_number,
               mb, speed % 1000);

The if() statement that is being used in Michael's case is the one for XPORT_SATA, not XPORT_PATA; that will be proven further below. I then had two questions:

1. Where does base_transfer_speed get set? For SATA devices, it gets set in sys/dev/ata/ata-all.c (I think). The default value chosen is 15:

        if (ch->flags & ATA_SATA)
                cpi->base_transfer_speed = 15;
        else
                cpi->base_transfer_speed = 3300;

Right. It is the lowest possible speed that is supported by this HBA. It is reported if we have no other information sources.

2. Where does CTS_SATA_VALID_REVISION get set, which can in effect override base_transfer_speed? The jury is still out on this one as you'll see.

Now on to the protocol revision printing code, i.e. SATA 2.x -- remember we're talking about the negotiated speed/protocol, not what's returned from ATA IDENTIFY (e.g. camcontrol identify) for the disk.

        if (cts.ccb_h.status == CAM_REQ_CMP && cts.transport == XPORT_SATA) {
                struct ccb_trans_settings_sata *sata =
                    &cts.xport_specific.sata;

                printf(" (");
                if (sata->valid & CTS_SATA_VALID_REVISION)
                        printf("SATA %d.x, ", sata->revision);
                else
                        printf("SATA, ");
                if (sata->valid & CTS_SATA_VALID_MODE)
                        printf("%s, ", ata_mode2string(sata->mode));
                if ((sata->valid & CTS_ATA_VALID_ATAPI) && sata->atapi != 0)
                        printf("ATAPI %dbytes, ", sata->atapi);
                if (sata->valid & CTS_SATA_VALID_BYTECOUNT)
                        printf("PIO %dbytes", sata->bytecount);
                printf(")");
        }
        printf("\n");

Here we can see that XPORT_SATA must be set, because Michael's kernel output clearly shows the above printf()s. But once again we're back to CTS_SATA_VALID_REVISION. Without CTS_SATA_VALID_REVISION being set, ata_xpt.c chooses to simply say SATA. That's all -- just SATA. And that is what Michael and others with this chip see.

The question is, simply, why does this model of ICH7 result in the bit CTS_SATA_VALID_REVISION, in the valid member of the appropriate ccb_trans_settings_sata struct, not being set correctly?

ICH7 SATA may be configured by the BIOS in three different ways:

1. PCI BAR(5) is pointing to the standard set of AHCI registers. In that case the controller will be able to work as AHCI, and real speeds will be reported by the ahci(4) driver and printed as SATA x.0.

2. PCI BAR(5) is pointing to a vendor-specific set of SATA registers. In that case the controller will work mostly as legacy ATA with the ata(4) driver, but the code in chipset/ata-intel.c will be able to use the vendor-specific registers to report speed, which again will be printed as SATA x.0.

3. PCI BAR(5) is not set at all (ctlr->r_res2 == NULL). In that case the controller will work as pure legacy ATA with the ata(4) driver. The code in chipset/ata-intel.c will still believe it is SATA, following the chip ID, but it will have no idea what is going on at the SATA level. In that case just SATA will be printed and cpi->base_transfer_speed is used by CAM to report speed.

--
Alexander Motin
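The difference between the first two cases and the third, as it shows up in the boot message, can be modelled with a small standalone sketch. This is not the ata_xpt.c code: the function names and the revision-to-speed table are assumptions for illustration, and the real announcement also prints mode and PIO details that are left out here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed link speeds in KB/s per SATA revision, for illustration only. */
static unsigned int
revision_to_speed(int revision)
{
    switch (revision) {
    case 2:
        return 300000;
    case 3:
        return 600000;
    default:
        return 150000;
    }
}

/*
 * Build a simplified announcement line. When the driver could not report
 * a negotiated revision (revision_valid == 0), fall back to the
 * controller's base speed and print a bare "SATA".
 */
static char *
announce(char *buf, size_t len, unsigned int base_speed,
    int revision_valid, int revision)
{
    unsigned int speed;
    char label[16];

    assert(buf != NULL && len > 0);
    if (revision_valid) {
        speed = revision_to_speed(revision);
        snprintf(label, sizeof(label), "SATA %d.x", revision);
    } else {
        speed = base_speed;
        snprintf(label, sizeof(label), "SATA");
    }
    snprintf(buf, len, "%u.%03uMB/s transfers (%s)",
        speed / 1000, speed % 1000, label);
    return buf;
}
```

With revision_valid cleared, the reported speed is whatever base value the controller driver supplied, regardless of what the link actually negotiated, which matches the third configuration.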
Re: WRITE_FPDMA_QUEUED CAM status: ATA Status Error
Hi. On 18.12.2012 00:07, Mike Tancsa wrote: Is there a way to tell / narrow down if an issue with errors like below are due to a bad cable or bad port multiplier? The disks in a particular cage are throwing errors like these below. (RELENG9 from today)

The controller, the port multiplier and the disks are all firmware-based devices. Any of them may have firmware problems that are not possible to diagnose from outside. When the controller is talking to a disk, the multiplier is transparent, so it may be impossible to say where exactly the problem happens. Speaking about cables and physical links, the only kind of information I can imagine for checking the physical link is the counters shown below:

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            1  Command failed due to ICRC error
0x0002  2            1  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            1  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x000a  2            0  Device-to-host register FISes sent due to a COMRESET
0x000b  2            1  CRC errors within host-to-device FIS
0x8000  4         7720  Vendor specific

They may be reported by disks. IIRC they may also be reported by the port multiplier, but I've never tried to access them and haven't seen existing tools for it, except by bit-banging with camcontrol. Whether the controller can report something similar, I don't remember.

--
Alexander Motin
Re: Samsung SSD 840 PRO fails to probe
Hi. On 26.11.2012 20:51, Adam McDougall wrote: My co-worker ordered a Samsung 840 PRO series SSD for his desktop but we found 9.0-rel would not probe it and 9.1-rc3 shows some errors. I got past the problem with a workaround of disabling AHCI mode in the BIOS which drops it to IDE mode and it detects fine, although runs a little slower. Is there something I can try to make it probe properly in AHCI mode? We also tried moving it to the SATA data and power cables from the working SATA HD so I don't think it is the port or controller driver. The same model motherboard from another computer did the same thing. Thanks. dmesg line when it is working: ada0: Samsung SSD 840 PRO Series DXM03B0Q ATA-9 SATA 3.x device dmesg lines when it is not working: (hand transcribed from a picture) (aprobe0:ahcich0:0:0): SETFEATURES ENABLE SATA FEATURE. ACB: ef 10 00 00 00 40 00 00 00 00 05 00 (aprobe0:ahcich0:0:0): CAM status: ATA Status Error (aprobe0:ahcich0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT ) (aprobe0:ahcich0:0:0): RES: 51 04 00 00 00 40 00 00 00 00 00 (aprobe0:ahcich0:0:0): Retrying command (aprobe0:ahcich0:0:0): SETFEATURES ENABLE SATA FEATURE. ACB: ef 10 00 00 00 40 00 00 00 00 05 00 (aprobe0:ahcich0:0:0): CAM status: ATA Status Error (aprobe0:ahcich0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT ) (aprobe0:ahcich0:0:0): RES: 51 04 00 00 00 40 00 00 00 00 00 (aprobe0:ahcich0:0:0): Error 5, Retries exhausted I believe that is SSD's firmware bug. Probably it declares support for SATA Asynchronous Notifications in its IDENTIFY data, but returns error on attempt to enable it. Switching controller to legacy mode disables that functionality and so works as workaround. 
The patch below should work around the problem from the OS side:

--- ata_xpt.c (revision 243561)
+++ ata_xpt.c (working copy)
@@ -745,6 +745,14 @@ probedone(struct cam_periph *periph, union ccb *do
                         goto noerror;
 
                 /*
+                 * Some Samsung SSDs report supported Asynchronous Notification,
+                 * but return ABORT on attempt to enable it.
+                 */
+                } else if (softc->action == PROBE_SETAN &&
+                    status == CAM_ATA_STATUS_ERROR) {
+                        goto noerror;
+
+                /*
                  * SES and SAF-TE SEPs have different IDENTIFY commands,
                  * but SATA specification doesn't tell how to identify them.
                  * Until better way found, just try another if first fail.

--
Alexander Motin
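The idea behind the patch, treating a rejected optional feature as non-fatal while still failing the probe on other errors, can be shown in isolation. The sketch below is a hypothetical model, not the probedone() code: the enum names and probe_may_continue() are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical probe steps and command results, loosely modelled on the patch. */
enum probe_step { STEP_IDENTIFY, STEP_SET_MODE, STEP_SET_AN };
enum cmd_status { STATUS_OK, STATUS_ATA_ERROR, STATUS_TIMEOUT };

/* Return true when probing may move on to the next step. */
static bool
probe_may_continue(enum probe_step step, enum cmd_status status)
{
    assert(step <= STEP_SET_AN);
    if (status == STATUS_OK)
        return true;
    /*
     * Asynchronous Notification is optional: if the device advertises it
     * but aborts the enable command, carry on without the feature.
     */
    if (step == STEP_SET_AN && status == STATUS_ATA_ERROR)
        return true;
    return false;
}
```

Only the explicit ATA error on the optional step is tolerated in this model; a timeout on the same step, or an error on a mandatory step, still ends the probe.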
Re: Increasing the DMESG buffer....
On 25.11.2012 01:43, Adrian Chadd wrote: I'm surprised it's not tunable via a kenv variable at boottime.. It is tunable. AFAIR that is it: kern.msgbufsize=65536 # Set size of kernel message buffer -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Increasing the DMESG buffer....
On 22.11.2012 12:53, Ian Smith wrote: On Wed, 21 Nov 2012 23:12:17 -0800, Adrian Chadd wrote: On 21 November 2012 20:16, Ian Smith smi...@nimnet.asn.au wrote: On Wed, 21 Nov 2012 12:08:42 -0800, Adrian Chadd wrote: [..] T61_dmesg.boot.10.works (file 1 of 2) lines 1813-1861/1861 byte 82415/82415 Cutting just the hdaa0, pcm0 and pcm1 stuff results in: hda_pcm.verbose (file 2 of 2) lines 712-760/760 byte 28531/28531 Is there a way to extract this topology information out of the driver without putting it in the verbose output? We should be asking Alexander, cc'd. I only have a snd_ich here, where hw.snd.verbose=3 is as rich as it gets, 105 lines incl. file versions. Neither ICH, nor any other driver I know have amount of information comparable to what HDA hardware provides. So the analogy is not good. Respecting that most CODECs have no published datasheets, that information is the only input for debugging. snd_hda also uses hw.snd.verbose=3. But it is used for even deeper driver debugging. It also enables a lot of debugging in sound(4), that can be too verbose for HDA debugging. I will recheck again how can it be reorganized, but I think that the real problem is not in HDA. We need some way to structure and filter the output. -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: stable/9 @r241776 panic: REDZONE: Buffer underflow detected...
On 21.10.2012 20:40, Konstantin Belousov wrote: On Sun, Oct 21, 2012 at 09:46:34AM -0700, David Wolfskill wrote: On Sun, Oct 21, 2012 at 09:33:22AM -0700, David Wolfskill wrote: ... So I tried reverting 241749 ... and I failed to reproduce the problem. Well, one boot out of one, at least. I'll try a few more reality checks, and report back if a correction is in order. But (for now, at least), it looks to me as if 241749 is presenting a problem on this laptop. ... 5 for 5. I'm convinced that 241749 causes problems on this laptop for attempts to boot without a stop in single-user mode first. (So that sounds like a timing issue, somehow.) And thanks again, Konstantin! I do not know/do not understand the CAM code, the question shall be addressed to Alexander. It still might be a false positive. I don't see how increasing the buffer size by a few bytes in the mentioned change could cause memory corruption in some other place. I guess the change may be just an innocent witness that affected some memory placement, moving some existing corruption from one area to another where it was noticed. I am curious how to interpret the phrase "4294966796 bytes allocated" in the log. Maybe it is just corrupted output, but the number still seems quite big, especially for an i386 system, making me think of some integer overflow. David, could you write down that part once more? Having a few more lines of "Allocation backtrace:" could also be useful. Could you show your kernel config? I can try to run it on my test system, hoping to reproduce the problem. -- Alexander Motin
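As a side note on the integer-overflow suspicion: if the damaged figure in the log is read as 4294966796 (an assumption, since the transcript is garbled), it equals 2^32 - 500, which is exactly what a 32-bit unsigned size shows when it holds a value that was computed as -500. The sketch below only demonstrates that arithmetic; the function names are invented and it says nothing about where such a value would come from in the kernel.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reinterpret a signed length as the 32-bit unsigned size seen on i386. */
static uint32_t
as_size32(int32_t nbytes)
{
    /* Conversion to unsigned is well defined: the value wraps modulo 2^32. */
    return (uint32_t)nbytes;
}

/* Format the size the way an allocation report line might show it. */
static char *
format_size32(int32_t nbytes, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%" PRIu32 " bytes allocated", as_size32(nbytes));
    return buf;
}
```

Under that reading the number is consistent with a small negative length or a corrupted size field rather than a genuine 4 GB allocation, but it does not show which of the two happened.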
Re: stable/9 @r241776 panic: REDZONE: Buffer underflow detected...
On 21.10.2012 23:23, David Wolfskill wrote:
> On Sun, Oct 21, 2012 at 09:28:06PM +0300, Alexander Motin wrote:
>> ...
>> I am curious how to interpret the phrase 42=94966796 bytes allocated
>> in the log. Maybe it is just corrupted output, but the number still
>> seems quite big, especially for an i386 system, which makes me think
>> about some integer overflow. David, could you write down that part
>> once more? Having a few more lines of the Allocation backtrace: could
>> also be useful.
>>
>> Could you show your kernel config? I can try to run it on my test
>> system, hoping to reproduce the problem.
> ...

I've used your kernel config and my test system was unable to boot from
NFS, while the GENERIC kernel boots fine. I haven't got a panic, but the
boot just stopped at root mounting. You have so many options specified
there that I can't predict which of them could cause this. Now I am
trying to binary search for the problematic one(s).

-- 
Alexander Motin
Re: stable/9 @r241776 panic: REDZONE: Buffer underflow detected...
On 22.10.2012 01:03, Alexander Motin wrote:
> On 21.10.2012 23:23, David Wolfskill wrote:
>> On Sun, Oct 21, 2012 at 09:28:06PM +0300, Alexander Motin wrote:
>>> ...
>>> I am curious how to interpret the phrase 42=94966796 bytes allocated
>>> in the log. Maybe it is just corrupted output, but the number still
>>> seems quite big, especially for an i386 system, which makes me think
>>> about some integer overflow. David, could you write down that part
>>> once more? Having a few more lines of the Allocation backtrace: could
>>> also be useful.
>>>
>>> Could you show your kernel config? I can try to run it on my test
>>> system, hoping to reproduce the problem.
>> ...
>
> I've used your kernel config and my test system was unable to boot from
> NFS, while the GENERIC kernel boots fine. I haven't got a panic, but the
> boot just stopped at root mounting. You have so many options specified
> there that I can't predict which of them could cause this. Now I am
> trying to binary search for the problematic one(s).

Sorry, false alarm. It was just the closed firewall in your kernel
config. Without it my test system boots your kernel without any problem.

-- 
Alexander Motin
Re: time keeps on slipping... slipping...
On 11.10.2012 09:30, John-Mark Gurney wrote:
> Alexander Motin wrote this message on Thu, Oct 11, 2012 at 01:43 +0300:
>> On 08.10.2012 07:02, John-Mark Gurney wrote:
>>> I recently put together a new machine w/ a SuperMicro H8SCM and an
>>> AMD Opteron 4228 HE...
>>>
>>> I'm having an issue where the clock on the machine skips around...
>>> The weird part is that it's very sudden when it happens... ntp
>>> sometimes brings it back, but it can't when the clock gets too far
>>> ahead (1000 seconds), ntp dies...
>>>
>>> In order to catch it happening, I ran a sleep 60 loop fetching time
>>> from another server that keeps time correctly via:
>>> while sleep 60; do echo -n h2:; nc h2 13; date; ntpdate h2.funkthat.com; done
>>>
>>> here are some snippets:
>>> h2:Sun Oct 7 17:12:54 2012^M
>>> Sun Oct 7 17:12:54 PDT 2012
>>> 7 Oct 17:12:54 ntpdate[31036]: the NTP socket is in use, exiting
>>> h2:Sun Oct 7 17:13:48 2012^M
>>> Sun Oct 7 17:20:21 PDT 2012
>>> 7 Oct 17:20:21 ntpdate[31045]: the NTP socket is in use, exiting
>>>
>>> but then ntp brings it back in sync:
>>> h2:Sun Oct 7 17:28:49 2012^M
>>> Sun Oct 7 17:35:21 PDT 2012
>>> 7 Oct 17:35:21 ntpdate[31164]: the NTP socket is in use, exiting
>>> h2:Sun Oct 7 17:29:49 2012^M
>>> Sun Oct 7 17:29:49 PDT 2012
>>> 7 Oct 17:29:49 ntpdate[31170]: the NTP socket is in use, exiting
>>>
>>> It happens pretty often:
>>> Oct 7 00:19:13 gold ntpd[3721]: time reset -785.347912 s
>>> Oct 7 00:46:37 gold ntpd[3721]: time reset -392.673256 s
>>> Oct 7 01:04:24 gold ntpd[3721]: time reset -785.346533 s
>>> Oct 7 15:00:59 gold ntpd[3721]: time reset -392.681720 s
>>> Oct 7 16:32:11 gold ntpd[3721]: time reset -392.671268 s
>>> Oct 7 17:29:29 gold ntpd[3721]: time reset -392.671752 s
>>> Oct 7 18:04:37 gold ntpd[3721]: time reset -785.346987 s
>>>
>>> but as you can see above, the time slip happens abruptly.. looks like
>>> a rounding error or something... I'm now reducing the sleep to 5
>>> seconds... but as you can see the sleep ends a few seconds early and
>>> local time suddenly jumped forward 6 minutes 33 seconds...
>>>
>>> $ sysctl kern.timecounter
>>> kern.timecounter.fast_gettime: 1
>>> kern.timecounter.tick: 1
>>> kern.timecounter.choice: TSC-low(1000) ACPI-safe(850) HPET(950) i8254(0) dummy(-100)
>>> kern.timecounter.hardware: TSC-low
>>> kern.timecounter.stepwarnings: 0
>>> kern.timecounter.tc.i8254.mask: 65535
>>> kern.timecounter.tc.i8254.counter: 11598
>>> kern.timecounter.tc.i8254.frequency: 1193182
>>> kern.timecounter.tc.i8254.quality: 0
>>> kern.timecounter.tc.HPET.mask: 4294967295
>>> kern.timecounter.tc.HPET.counter: 3257069245
>>> kern.timecounter.tc.HPET.frequency: 14318180
>>> kern.timecounter.tc.HPET.quality: 950
>>> kern.timecounter.tc.ACPI-safe.mask: 16777215
>>> kern.timecounter.tc.ACPI-safe.counter: 4219134510
>>> kern.timecounter.tc.ACPI-safe.frequency: 3579545
>>> kern.timecounter.tc.ACPI-safe.quality: 850
>>> kern.timecounter.tc.TSC-low.mask: 4294967295
>>> kern.timecounter.tc.TSC-low.counter: 2854866610
>>> kern.timecounter.tc.TSC-low.frequency: 10937740
>>> kern.timecounter.tc.TSC-low.quality: 1000
>>> kern.timecounter.smp_tsc: 1
>>> kern.timecounter.invariant_tsc: 1
>>> $ sysctl kern.eventtimer
>>> kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)
>>> kern.eventtimer.et.LAPIC.flags: 15
>>> kern.eventtimer.et.LAPIC.frequency: 12217
>>> kern.eventtimer.et.LAPIC.quality: 400
>>> kern.eventtimer.et.i8254.flags: 1
>>> kern.eventtimer.et.i8254.frequency: 1193182
>>> kern.eventtimer.et.i8254.quality: 100
>>> kern.eventtimer.et.RTC.flags: 17
>>> kern.eventtimer.et.RTC.frequency: 32768
>>> kern.eventtimer.et.RTC.quality: 0
>>> kern.eventtimer.periodic: 0
>>> kern.eventtimer.timer: LAPIC
>>> kern.eventtimer.activetick: 1
>>> kern.eventtimer.idletick: 0
>>> kern.eventtimer.singlemul: 2
>>>
>>> I have switched my timecounter to HPET to see if things are
>>> different...
>>>
>>> Any clues?
>>
>> The switch to HPET you mentioned could tell a lot about the problem.
>> Switching the event timer may also be interesting.
>
> Since I switch to HPET, it hasn't happened at all in the last 3 days..

That probably points to some problem with the TSC timecounter. What is
strange to me is the time jump size of 5 minutes.
The TSC timecounter should overflow every few seconds, so a single jump
should be only that big.

> Should I try switching back to TSC and switching event timer? do you
> need any other info, or want me to try anything else?

You may try that, to be sure eventtimers are not related to the case.

> Oh, forgot to include the specific processor info in my previous email:
> CPU: AMD Opteron(tm) Processor 4228 HE (2800.05-MHz K8-class CPU)
>   Origin = AuthenticAMD  Id = 0x600f12  Family = 0x15  Model = 0x1  Stepping = 2
>   Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>   Features2=0x1e98220b<SSE3,PCLMULQDQ,MON,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,AESNI,XSAVE,OSXSAVE,AVX>
>   AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
>   AMD Features2=0x1c9bfff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,XOP,SKINIT,WDT,LWP,FMA4,NodeId,Topology,b23,b24>
>   TSC: P-state invariant, performance statistics

Unfortunately, I don't know the specifics of AMD processors. Maybe jkim@ or avg
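As a side calculation on the jump size discussed above: the wrap period of a counter follows from the mask and frequency values in the quoted sysctl output. The sketch below assumes the counter wraps after mask + 1 ticks; the helper name `wrap_seconds` is made up for illustration.

```sh
#!/bin/sh
# wrap_seconds MASK FREQUENCY: seconds until a counter with the given
# mask wraps once when ticking at FREQUENCY Hz (assumes mask + 1 ticks).
wrap_seconds() {
    awk -v m="$1" -v f="$2" 'BEGIN { printf "%.3f\n", (m + 1) / f }'
}

wrap_seconds 4294967295 10937740   # TSC-low values from the sysctl output
wrap_seconds 16777215 3579545      # ACPI-safe values from the sysctl output
```

With the TSC-low numbers this gives about 392.674 seconds, which is close to the 392.67 s resets in the ntpd log, and twice that is close to the 785.35 s resets. The ACPI-safe counter would wrap after roughly 4.7 seconds. That match is suggestive rather than a diagnosis.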
Re: time keeps on slipping... slipping...
On 08.10.2012 07:02, John-Mark Gurney wrote:
> I recently put together a new machine w/ a SuperMicro H8SCM and an
> AMD Opteron 4228 HE...
>
> I'm having an issue where the clock on the machine skips around...
> The weird part is that it's very sudden when it happens... ntp
> sometimes brings it back, but it can't when the clock gets too far
> ahead (1000 seconds), ntp dies...
>
> In order to catch it happening, I ran a sleep 60 loop fetching time
> from another server that keeps time correctly via:
> while sleep 60; do echo -n h2:; nc h2 13; date; ntpdate h2.funkthat.com; done
>
> here are some snippets:
> h2:Sun Oct 7 17:12:54 2012^M
> Sun Oct 7 17:12:54 PDT 2012
> 7 Oct 17:12:54 ntpdate[31036]: the NTP socket is in use, exiting
> h2:Sun Oct 7 17:13:48 2012^M
> Sun Oct 7 17:20:21 PDT 2012
> 7 Oct 17:20:21 ntpdate[31045]: the NTP socket is in use, exiting
>
> but then ntp brings it back in sync:
> h2:Sun Oct 7 17:28:49 2012^M
> Sun Oct 7 17:35:21 PDT 2012
> 7 Oct 17:35:21 ntpdate[31164]: the NTP socket is in use, exiting
> h2:Sun Oct 7 17:29:49 2012^M
> Sun Oct 7 17:29:49 PDT 2012
> 7 Oct 17:29:49 ntpdate[31170]: the NTP socket is in use, exiting
>
> It happens pretty often:
> Oct 7 00:19:13 gold ntpd[3721]: time reset -785.347912 s
> Oct 7 00:46:37 gold ntpd[3721]: time reset -392.673256 s
> Oct 7 01:04:24 gold ntpd[3721]: time reset -785.346533 s
> Oct 7 15:00:59 gold ntpd[3721]: time reset -392.681720 s
> Oct 7 16:32:11 gold ntpd[3721]: time reset -392.671268 s
> Oct 7 17:29:29 gold ntpd[3721]: time reset -392.671752 s
> Oct 7 18:04:37 gold ntpd[3721]: time reset -785.346987 s
>
> but as you can see above, the time slip happens abruptly.. looks like
> a rounding error or something... I'm now reducing the sleep to 5
> seconds... but as you can see the sleep ends a few seconds early and
> local time suddenly jumped forward 6 minutes 33 seconds...
>
> $ sysctl kern.timecounter
> kern.timecounter.fast_gettime: 1
> kern.timecounter.tick: 1
> kern.timecounter.choice: TSC-low(1000) ACPI-safe(850) HPET(950) i8254(0) dummy(-100)
> kern.timecounter.hardware: TSC-low
> kern.timecounter.stepwarnings: 0
> kern.timecounter.tc.i8254.mask: 65535
> kern.timecounter.tc.i8254.counter: 11598
> kern.timecounter.tc.i8254.frequency: 1193182
> kern.timecounter.tc.i8254.quality: 0
> kern.timecounter.tc.HPET.mask: 4294967295
> kern.timecounter.tc.HPET.counter: 3257069245
> kern.timecounter.tc.HPET.frequency: 14318180
> kern.timecounter.tc.HPET.quality: 950
> kern.timecounter.tc.ACPI-safe.mask: 16777215
> kern.timecounter.tc.ACPI-safe.counter: 4219134510
> kern.timecounter.tc.ACPI-safe.frequency: 3579545
> kern.timecounter.tc.ACPI-safe.quality: 850
> kern.timecounter.tc.TSC-low.mask: 4294967295
> kern.timecounter.tc.TSC-low.counter: 2854866610
> kern.timecounter.tc.TSC-low.frequency: 10937740
> kern.timecounter.tc.TSC-low.quality: 1000
> kern.timecounter.smp_tsc: 1
> kern.timecounter.invariant_tsc: 1
> $ sysctl kern.eventtimer
> kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)
> kern.eventtimer.et.LAPIC.flags: 15
> kern.eventtimer.et.LAPIC.frequency: 12217
> kern.eventtimer.et.LAPIC.quality: 400
> kern.eventtimer.et.i8254.flags: 1
> kern.eventtimer.et.i8254.frequency: 1193182
> kern.eventtimer.et.i8254.quality: 100
> kern.eventtimer.et.RTC.flags: 17
> kern.eventtimer.et.RTC.frequency: 32768
> kern.eventtimer.et.RTC.quality: 0
> kern.eventtimer.periodic: 0
> kern.eventtimer.timer: LAPIC
> kern.eventtimer.activetick: 1
> kern.eventtimer.idletick: 0
> kern.eventtimer.singlemul: 2
>
> I have switched my timecounter to HPET to see if things are
> different...
>
> Any clues?

The switch to HPET you mentioned could tell a lot about the problem.
Switching the event timer may also be interesting.

-- 
Alexander Motin
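For readers following along, the kern.timecounter.choice line lists each candidate with its quality in parentheses, and the highest-quality one is what the system picked here (TSC-low). A small sketch that orders such a line by quality; the helper name `rank_by_quality` is made up, and the runtime switch shown in the comments is an assumption about how the change was made, since the thread does not show the exact command.

```sh
#!/bin/sh
# rank_by_quality "NAME(Q) NAME(Q) ...": print the names, best quality first.
rank_by_quality() {
    printf '%s\n' "$1" | tr ' ' '\n' |
        sed 's/\(.*\)(\(.*\))/\2 \1/' |   # "HPET(950)" -> "950 HPET"
        sort -rn | cut -d ' ' -f 2
}

rank_by_quality "TSC-low(1000) ACPI-safe(850) HPET(950) i8254(0) dummy(-100)"

# Switching at runtime on FreeBSD (run as root; assumed, not quoted from the thread):
#   sysctl kern.timecounter.hardware=HPET
#   sysctl kern.eventtimer.timer=i8254
```

On the quoted machine the next candidate after TSC-low is HPET, which is the one John-Mark switched to.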
Re: ahcich reset - cannot mount zfs root in 9.1-PRE
On 02.10.2012 16:51, Andriy Gapon wrote:
> on 02/10/2012 16:16 geoffroy desvernay said the following:
>> Hi all,
>>
>> Trying to upgrade a system from 9.0-RELEASE to 9.1-PRE from yesterday
>> on my machine (GEOM+ZFS mirror setup on ada[01]p3), the new kernel
>> becomes unable to mount root...
>>
>> The only way to recover is to boot from 9.0 kernel. The disks were
>> already named ada[01] in 9.0, so I suspect nothing there...
>>
>> I tried
>> - disabling AHCI in bios (no change seen)
>> - change cables, check PSU, test disks with smartctl
>>
>> Here are some bits (via serial console):
>> ahci0: <ATI IXP600 AHCI SATA controller> port 0xc000-0xc007,0xb000-0xb003,0xa000-0xa007,0x9000-0x9003,0x8000-0x800f mem 0xfe9ff800-0xfe9ffbff irq 22 at device 18.0 on pci0
>> ahci0: AHCI v1.10 with 4 3Gbps ports, Port Multiplier supported
>> ahci0: Caps: 64bit NCQ SNTF MPS AL CLO 3Gbps PM PMD SSC PSC 32cmd CCC 4ports
>> ahcich0: <AHCI channel> at channel 0 on ahci0
>> ahcich0: Caps: HPCP
>> ahcich1: <AHCI channel> at channel 1 on ahci0
>> ahcich1: Caps: HPCP
>> ahcich2: <AHCI channel> at channel 2 on ahci0
>> ahcich2: Caps: HPCP
>> ahcich3: <AHCI channel> at channel 3 on ahci0
>> ahcich3: Caps: HPCP
>> ahcich0: AHCI reset...
>> ahcich0: SATA connect time=100us status=0123
>> ahcich0: AHCI reset: device found
>> ahcich0: AHCI reset: device ready after 0ms
>>
>> The difference with 9.0 is after that: here is 9.0's next lines:
>> (same for ahcich1)
>> (aprobe0:ahcich0:0:15:0): Command timed out
>> (aprobe0:ahcich0:0:15:0): Error 5, Retries exhausted
>> (aprobe0:ahcich0:0:0:0): SIGNATURE:
>>
>> And 9.1-PRE's:
>> (aprobe0:ahcich0:0:15:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
>> (aprobe0:ahcich0:0:15:0): CAM status: Command timeout
>> (aprobe0:ahcich0:0:15:0): Error 5, Retries exhausted
>>
>> In both cases ada[01] are detected and available, but with 9.1-PRE I
>> see:
>> GEOM_RAID: Promise: Disk ada0 state changed from NONE to SPARE.
>> GEOM_RAID: Promise: Disk ada1 state changed from NONE to SPARE.
>> (I see the same when I # kldload geom_raid # from running 9.0, doesn't
>> break anything...)
>>
>> I attach the full boot log with 9.1-PRE (bios with NO-raid nor AHCI
>> enabled, but this changes nothing in the output)
>>
>> I could test patches or try any command required to debug this… But
>> for the moment I don't know where to search (and kernel code is far
>> away from my current skills in debugging…)
>
> You probably need to clear RAID metadata on the disks as I think that
> disabling geom_raid is not possible in 9.1-PRE.
> I think that Alexander can help you more here.

The right way is to clear the RAID metadata on the disks. If it is
possible to boot from any other source, you can just do `graid delete
Promise` and then reboot.

Alternatively, it is possible to disable the geom_raid module using the
recently added loader tunable kern.geom.raid.enable=0. After that your
system should boot and run fine. I would still recommend erasing the
metadata, but after setting that tunable it will be impossible to do it
via the graid tool, only with manual dd surgery. In the case of the
Promise format, the metadata uses up to the last 63 sectors of the disk.
You can identify the respective sectors to erase by the signature
Promise Technology, Inc. at the beginning of the sector.

-- 
Alexander Motin
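To make the "last 63 sectors" remark concrete, here is a read-only sketch for locating candidate sectors before any dd surgery. It only computes the starting sector; the commented commands assume 512-byte sectors, that the fourth field of diskinfo output is the media size in sectors, and an example device name. The helper name `tail_start` is made up.

```sh
#!/bin/sh
# tail_start TOTAL_SECTORS [COUNT]: first sector of the last COUNT
# sectors of a disk (COUNT defaults to 63, per the Promise remark above).
tail_start() {
    total=$1
    count=${2:-63}
    echo $((total - count))
}

# Read-only inspection on FreeBSD (ada0 is only an example device):
#   sectors=$(diskinfo ada0 | awk '{print $4}')
#   dd if=/dev/ada0 bs=512 skip="$(tail_start "$sectors")" count=63 2>/dev/null |
#       strings | grep 'Promise Technology'
```

Only after the signature has been confirmed in specific sectors would it make sense to overwrite them, and doing that from live media with `graid delete` remains the safer route described in the message.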
Re: Fatal trap 12: hda
On 23.09.2012 23:41, Andriy Gapon wrote:
> on 23/09/2012 23:10 Barbara said the following:
>> After updating src on RELENG_9 from r240236 to r240821 I have rebuilt
>> my world+kernel. On reboot I had a kernel panic, supervisor read, page
>> not present for process swapper.
>>
>> Trying to reboot in Single User Mode I accidentally disabled ACPI.
>> Luckily the machine booted successfully but there was nothing new in
>> /var/crash. Then I tried again with ACPI enabled: same kernel panic.
>>
>> So I ran nm on the instruction pointer of the panic and I noticed that
>> it was in hdaa_sense_init, in sys/dev/sound/pci/hda/hdaa.c.
>>
>> BTW, I have device sound and device snd_hda in my KERNCONF, and the
>> sound hw detection happens before HDs, is that the reason why I wasn't
>> able to get a dump or dumping using DDB and the panicking process is
>> swapper? Is there any trick I'm missing for that?
>>
>> Booting in verbose mode and comparing the output with ACPI enabled
>> (where the panic happens) and disabled, I guessed that the problem was
>> where No presence detection support at nid... is printed, as it was
>> missing in the former case for nid 27 - Headphone (Green Jack). With
>> ACPI disabled the value was looking quite weird: 36765696.
>>
>> So I made the following change:
>>
>> --- sys/dev/sound/pci/hda/hdaa.c.orig 2012-09-22 20:06:20.0 +0200
>> +++ sys/dev/sound/pci/hda/hdaa.c 2012-09-23 20:39:32.0 +0200
>> @@ -627,7 +627,7 @@
>>      (HDA_CONFIG_DEFAULTCONF_MISC(w->wclass.pin.config) & 1) != 0) {
>>   device_printf(devinfo->dev,
>>       "No presence detection support at nid %d\n",
>> -     as[i].pins[15]);
>> +     as->pins[15]);
>>  } else {
>>   if (w->unsol < 0)
>>    poll = 1;
>>
>> Maybe the fix is not correct, but at least the new kernel boots
>> successfully. Can someone review that?
>>
>> I tried looking in svn commits between the two builds, but I don't
>> know what exposed the problem. If anyone is interested in my verbose
>> log, or doing some tests, please ask.
>
> Your patch looks correct, looks like a bug could have been introduced
> via copy+paste. Good catch.

Thank you. Slightly modified patch committed at r240884.

-- 
Alexander Motin
Re: GEOM_RAID in GENERIC is harmful
On 13.09.2012 08:31, Eugene Grosbein wrote:
> 9-STABLE has got options GEOM_RAID in GENERIC. In real world, this
> change is pretty harmful and there are lots of cases when 9.0-RELEASE
> systems upgraded to 9-STABLE fail to mount root UFS filesystem or
> attach ZFS.
>
> It seems, there are lots of HDDs supplied with pseudo-RAID labels at
> the end: pre-installed Windows machines having motherboards with
> pseudo-RAID like Intel RapidStore and alike. One can not even be aware
> of these labels.
>
> 9.0-RELEASE can be installed on such HDDs and use them with GMIRROR or
> ZFS without a problem. Upgraded to 9-STABLE, such system fails to boot
> due to GRAID jumping out of box and grabbing HDDs for itself, so
> GMIRROR or ZFS got broken.
>
> That makes users very angry when production server fails to boot with
> GENERIC kernel after correctly performed upgrade. GEOM_RAID compiled in
> GENERIC should be deactivated and require activation with some loader
> knob. Also, we need distinct RELEASE NOTES warning about the issue.

The problem of on-disk metadata garbage is not limited to GEOM_RAID. For
example, I had a case where remainders of an old UFS file system were
found by GEOM_LABEL and ZFS incorrectly attached to it instead of the
proper GPT partition, making other partitions inaccessible. Does it mean
we should remove GEOM_LABEL also? I don't think so. All that GEOM_RAID
is guilty of is that it was not in place for the 9.0 release. If we
remove it now, it will just postpone the problem until later, or we will
never be able to add it again for the same reasons.

Unlike GEOM_LABEL, metadata of GEOM_RAID is quite easy to delete without
a complete disk erase: `graid status -ag`, `graid delete ...`. Yes, it
can be a problem if the system can't boot, but now we at least have live
mode on the installation images, which should allow doing it.

Adding some loader tunables could indeed simplify recovery in case of a
boot problem. I will probably add such ones now. It won't hurt.
But I disagree that they should be disabled by default, limiting users
who really want to use BIOS RAID. Disabling them will also make metadata
removal without a full wipe more difficult, because different RAIDs have
different on-disk metadata layouts, and you would have to know where
exactly to apply dd.

-- 
Alexander Motin
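The two graid commands mentioned above can be wrapped into one step for use from live media. This is only a sketch: the function name `clear_bios_raid` is made up, and the array name has to be taken from what `graid status -ag` actually reports on the machine.

```sh
#!/bin/sh
# clear_bios_raid NAME: list all graid geoms, including those without
# providers, then delete the named array so its metadata is erased.
clear_bios_raid() {
    graid status -ag || return 1
    graid delete "$1"
}

# Usage from the live environment (array name is an example):
#   clear_bios_raid Promise
```

Listing first makes it harder to delete the wrong array, since the name passed to the delete step can be checked against the status output.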
Re: GEOM_RAID in GENERIC is harmful
On 13.09.2012 13:01, Eugene Grosbein wrote:
> 13.09.2012 16:51, Alexander Motin wrote:
>>> That makes users very angry when production server fails to boot
>>> with GENERIC kernel after correctly performed upgrade. GEOM_RAID
>>> compiled in GENERIC should be deactivated and require activation
>>> with some loader knob. Also, we need distinct RELEASE NOTES warning
>>> about the issue.
>>
>> The problem of on-disk metadata garbage is not limited to GEOM_RAID.
>> For example, I had a case where remainders of an old UFS file system
>> were found by GEOM_LABEL and ZFS incorrectly attached to it instead
>> of the proper GPT partition, making other partitions inaccessible.
>> Does it mean we should remove GEOM_LABEL also? I don't think so. All
>> that GEOM_RAID is guilty of is that it was not in place for the 9.0
>> release. If we remove it now, it will just postpone the problem until
>> later, or we will never be able to add it again for the same reasons.
>
> We must be ready for lots of angry users of 9.1-RELEASE then and have
> BIG RED WARNING in RELEASE NOTES.

A warning is good, but I don't think there will be lots. It has been
enabled in 9-STABLE for some time now and I haven't seen many
complaints. If re@ permits the MFC of r240465 in a few days, the
solution for those who may need it will be simple:
kern.geom.raid.enable=0.

-- 
Alexander Motin
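For those who end up needing the tunable, a small sketch of setting it in a loader.conf-style file without leaving duplicate lines behind. The helper name `set_tunable` is made up, and the tunable only has an effect on kernels that include the change referred to above (r240465).

```sh
#!/bin/sh
# set_tunable FILE NAME VALUE: replace or append NAME="VALUE" in FILE.
set_tunable() {
    file=$1 name=$2 value=$3
    tmp=$(mktemp) || return 1
    if [ -f "$file" ]; then
        # Keep every line except an earlier setting of the same name.
        awk -v n="$name" 'index($0, n "=") != 1' "$file" > "$tmp"
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    # Write back through cat so the target keeps its owner and mode.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Usage (as root):
#   set_tunable /boot/loader.conf kern.geom.raid.enable 0
```

The same value can also be typed at the loader prompt for a one-off boot, which is handy when the installed system cannot be reached to edit the file.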
Re: Thinkpad X61s cannot boot 9.1-BETA1
On 13.09.2012 10:44, Lars Engels wrote:
> On Wed, Sep 12, 2012 at 11:08:25PM +0300, Alexander Motin wrote:
>> On 12.09.2012 22:58, Lars Engels wrote:
>>> On Wed, Sep 12, 2012 at 09:58:31PM +0300, Alexander Motin wrote:
>>>> On 12.09.2012 20:46, Lars Engels wrote:
>>>>> On Wed, Sep 12, 2012 at 08:30:36PM +0300, Andriy Gapon wrote:
>>>>>> on 12/09/2012 20:25 Lars Engels said the following:
>>>>>>> On Wed, Sep 12, 2012 at 03:54:30PM +0300, Andriy Gapon wrote:
>>>>>>>> Could you try to play with different eventtimer settings
>>>>>>>> (preferably in current)? You can use this thread / PR as a guide:
>>>>>>>> http://thread.gmane.org/gmane.os.freebsd.devel.amd64/14480/focus=14495
>>>>>>>> The place where boot stops looks suspiciously close to the place
>>>>>>>> where timer interrupts should start driving the system.
>>>>>>>
>>>>>>> Yes, that's it! Setting kern.eventtimer.timer=i8254 lets the
>>>>>>> Thinkpad boot on CURRENT with the AC cable inserted.
>>>>>>
>>>>>> Please share your sysctl kern.eventtimer output with Alexander.
>>>>>> He will probably ask for some additional information :-)
>>>>
>>>> Sorry if I've missed it, but it would be useful to see a verbose
>>>> dmesg in the situation where the system couldn't boot without
>>>> switching the eventtimer.
>>>
>>> No problem. See: http://bsd-geek.de/FreeBSD/IMAG0190.jpg
>>
>> No, I've seen that one and I don't mean it. I mean the full verbose
>> dmesg of a successful boot in conditions where the system was not
>> booting before without setting kern.eventtimer.timer=i8254.
>
> Ok, sorry. Here's a verbose dmesg booting CURRENT without AC power:
> http://bsd-geek.de/FreeBSD/T61_dmesg.boot.works

Hmm. I see nothing suspicious. The HPET driver output is typical for the
ICH8M chipset, many of which are working fine in different systems,
including several of mine.

There were no significant changes in HPET after 9.0-RELEASE except
r231161. It changed the device probe order, which increased the chance
of interrupt sharing. It should not be a problem, but who knows. You can
try to hint the HPET driver to a specific IRQ 23 (which looks unused) to
avoid sharing by setting hint.hpet.0.allowed_irqs=0x0080.

You've said that the problem is related to the AC power state.
Have you compared dmesg outputs with and without it?

-- 
Alexander Motin
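A note on the hint value above: assuming, as hpet(4) describes it, that hint.hpet.X.allowed_irqs is a 32-bit mask in which bit N allows IRQ N, the mask for a single IRQ can be computed as below. The helper name `irq_mask` is made up, and the value as printed in this archived message may have lost digits in transit.

```sh
#!/bin/sh
# irq_mask IRQ: 32-bit mask with only bit IRQ set, printed as 8 hex digits.
irq_mask() {
    printf '0x%08x\n' $((1 << $1))
}

irq_mask 23

# Under the bitmask assumption the hint would then be set from the
# loader or /boot/device.hints style configuration, e.g.:
#   hint.hpet.0.allowed_irqs="<output of irq_mask 23>"
```

Whether IRQ 23 really is unused on a given machine has to be checked against its own verbose dmesg.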
Re: Thinkpad X61s cannot boot 9.1-BETA1
On 12.09.2012 20:46, Lars Engels wrote:
> On Wed, Sep 12, 2012 at 08:30:36PM +0300, Andriy Gapon wrote:
>> on 12/09/2012 20:25 Lars Engels said the following:
>>> On Wed, Sep 12, 2012 at 03:54:30PM +0300, Andriy Gapon wrote:
>>>> Could you try to play with different eventtimer settings
>>>> (preferably in current)? You can use this thread / PR as a guide:
>>>> http://thread.gmane.org/gmane.os.freebsd.devel.amd64/14480/focus=14495
>>>> The place where boot stops looks suspiciously close to the place
>>>> where timer interrupts should start driving the system.
>>>
>>> Yes, that's it! Setting kern.eventtimer.timer=i8254 lets the
>>> Thinkpad boot on CURRENT with the AC cable inserted.
>>
>> Please share your sysctl kern.eventtimer output with Alexander.
>> He will probably ask for some additional information :-)

Sorry if I've missed it, but it would be useful to see a verbose dmesg
in the situation where the system couldn't boot without switching the
eventtimer.

-- 
Alexander Motin
Re: Thinkpad X61s cannot boot 9.1-BETA1
On 12.09.2012 22:58, Lars Engels wrote:
> On Wed, Sep 12, 2012 at 09:58:31PM +0300, Alexander Motin wrote:
>> On 12.09.2012 20:46, Lars Engels wrote:
>>> On Wed, Sep 12, 2012 at 08:30:36PM +0300, Andriy Gapon wrote:
>>>> on 12/09/2012 20:25 Lars Engels said the following:
>>>>> On Wed, Sep 12, 2012 at 03:54:30PM +0300, Andriy Gapon wrote:
>>>>>> Could you try to play with different eventtimer settings
>>>>>> (preferably in current)? You can use this thread / PR as a guide:
>>>>>> http://thread.gmane.org/gmane.os.freebsd.devel.amd64/14480/focus=14495
>>>>>> The place where boot stops looks suspiciously close to the place
>>>>>> where timer interrupts should start driving the system.
>>>>>
>>>>> Yes, that's it! Setting kern.eventtimer.timer=i8254 lets the
>>>>> Thinkpad boot on CURRENT with the AC cable inserted.
>>>>
>>>> Please share your sysctl kern.eventtimer output with Alexander.
>>>> He will probably ask for some additional information :-)
>>
>> Sorry if I've missed it, but it would be useful to see a verbose
>> dmesg in the situation where the system couldn't boot without
>> switching the eventtimer.
>
> No problem. See: http://bsd-geek.de/FreeBSD/IMAG0190.jpg

No, I've seen that one and I don't mean it. I mean the full verbose
dmesg of a successful boot in conditions where the system was not
booting before without setting kern.eventtimer.timer=i8254.

-- 
Alexander Motin
Re: FreeBSD 9.1 RC1 and CAM issues with old SCSI drive
On 09.09.2012 16:25, kirk russell wrote:
> On Sat, Sep 8, 2012 at 12:29 PM, Alexander Motin m...@freebsd.org wrote:
>> Hi.
>>
>> It seems like both of your problems have the same cause: the device
>> reports a wrong size of INQUIRY data, which causes a failure on the
>> attempt to fetch it. With FreeBSD 9.0 it caused domain validation
>> failures and so a reduced transfer rate; on 9.1 it also causes a
>> detection failure. I am not sure why detection worked on 9.0, it
>> needs some deeper code comparison, but I think it is mostly a device
>> problem.
>>
>> Could you send me the output of these commands from FreeBSD 9.0:
>> camcontrol cmd da0 -vEc "12 00 00 00 24 00" -i 36 - | hd
>> camcontrol cmd da0 -vEc "12 00 00 00 fe 00" -i 254 - | hd
>> camcontrol cmd da0 -vEc "12 00 00 01 00 00" -i 256 - | hd
>>
>> --
>> Alexander Motin
>
> This is running 9.0-RELEASE.
>
> # camcontrol cmd da0 -vEc "12 00 00 00 24 00" -i 36 - | hd
> 00 00 02 02 fa 00 00 3e 43 4f 4d 50 41 51 50 43 |...COMPAQPC|
> 0010 57 44 45 39 31 30 30 57 20 20 20 20 20 20 20 20 |WDE9100W|
> 0020 31 2e 30 31 |1.01|
> 0024
> # camcontrol cmd da0 -vEc "12 00 00 00 fe 00" -i 254 - | hd
> 00 00 02 02 fa 00 00 3e 43 4f 4d 50 41 51 50 43 |...COMPAQPC|
> 0010 57 44 45 39 31 30 30 57 20 20 20 20 20 20 20 20 |WDE9100W|
> 0020 31 2e 30 31 32 33 30 31 57 53 37 30 32 30 33 37 |1.012301WS702037|
> 0030 32 34 39 33 00 00 00 00 20 20 20 20 20 20 20 20 |2493|
> 0040 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 ||
> *
> 0060 57 44 45 39 31 30 30 2d 36 30 30 35 44 30 20 20 |WDE9100-6005D0 |
> 0070 34 30 36 31 30 30 31 31 39 31 30 30 32 43 30 20 |4061001191002C0 |
> 0080 32 34 30 38 00 00 00 00 00 00 00 00 00 00 00 00 |2408|
> 0090 00 00 00 00 4e 32 30 35 30 30 39 39 30 32 35 35 |N20500990255|
> 00a0 33 20 20 20 50 20 30 30 00 00 00 00 00 00 42 41 |3 P 00..BA|
> 00b0 43 43 42 45 4b 43 31 39 39 38 30 38 32 38 57 53 |CCBEKC19980828WS|
> 00c0 36 30 44 20 04 03 00 04 02 01 00 00 00 00 00 00 |60D |
> 00d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||
> *
> 00f0
> # camcontrol cmd da0 -vEc "12 00 00 01 00 00" -i 256 - | hd
> (pass1:ahc0:0:0:0): INQUIRY. CDB: 12 0 0 1 0 0
> (pass1:ahc0:0:0:0): CAM status: SCSI Status Error
> (pass1:ahc0:0:0:0): SCSI status: Check Condition
> (pass1:ahc0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:24,0 (Invalid field in CDB)
> (pass1:ahc0:0:0:0): Command Specific Info: 0x
> (pass1:ahc0:0:0:0): Command byte 3 is invalid
> camcontrol: error sending command
> (pass1:ahc0:0:0:0): INQUIRY. CDB: 12 0 0 1 0 0
> (pass1:ahc0:0:0:0): CAM status: SCSI Status Error
> (pass1:ahc0:0:0:0): SCSI status: Check Condition
> (pass1:ahc0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:24,0 (Invalid field in CDB)
> (pass1:ahc0:0:0:0): Command Specific Info: 0x
> (pass1:ahc0:0:0:0): Command byte 3 is invalid

It seems that the problem may be in our SCSI code, which rounds the
inquiry data size up to an even number. Please try to comment out the
line

    inquiry_len = roundup2(inquiry_len, 2);

in sys/cam/scsi/scsi_xpt.c and rebuild the kernel. It should probably
fix both device detection and transfer speed.

-- 
Alexander Motin
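The arithmetic behind that suggestion can be followed from the dump above. Byte 4 of the returned INQUIRY data is fa, the additional length, so the total data length is that value plus the 5 header bytes. The sketch below assumes the allocation length is carried in CDB bytes 3 and 4, as in the three test commands; the helper names are made up for illustration.

```sh
#!/bin/sh
# round_up_even N: round N up to the next even number, mirroring what
# roundup2(inquiry_len, 2) does in the kernel line quoted above.
round_up_even() {
    echo $(( ($1 + 1) & ~1 ))
}

# alloc_bytes LEN: CDB bytes 3 and 4 (hex) for an allocation length LEN.
alloc_bytes() {
    printf '%02x %02x\n' $(( $1 >> 8 )) $(( $1 & 0xff ))
}

additional=$((0xfa))                 # byte 4 of the INQUIRY data above
total=$((additional + 5))            # full INQUIRY data length: 255
alloc_bytes "$total"                 # fits in byte 4 alone
alloc_bytes "$(round_up_even "$total")"   # needs a non-zero byte 3
```

Rounding 255 up to 256 moves the length into byte 3 of the CDB, which is exactly the byte this drive rejects in the third test, while the 254-byte request with byte 3 left at zero succeeded.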
Re: FreeBSD 9.1 RC1 and CAM issues with old SCSI drive
Hi.

It seems like both of your problems have the same cause: the device
reports a wrong size of INQUIRY data, which causes a failure on the
attempt to fetch it. With FreeBSD 9.0 it caused domain validation
failures and so a reduced transfer rate; on 9.1 it also causes a
detection failure. I am not sure why detection worked on 9.0, it needs
some deeper code comparison, but I think it is mostly a device problem.

Could you send me the output of these commands from FreeBSD 9.0:
camcontrol cmd da0 -vEc "12 00 00 00 24 00" -i 36 - | hd
camcontrol cmd da0 -vEc "12 00 00 00 fe 00" -i 254 - | hd
camcontrol cmd da0 -vEc "12 00 00 01 00 00" -i 256 - | hd

-- 
Alexander Motin
Re: High load event idl.
On 14.08.2012 22:25, Adam McDougall wrote:
> On Sun, Apr 29, 2012 at 04:39:29PM +0300, Alexander Motin wrote:
>> On 04/29/12 16:30, Alex Kozlov wrote:
>>> On Sun, Apr 29, 2012 at 04:11:20PM +0300, Alexander Motin wrote:
>>>> On 04/29/12 15:27, Alex Kozlov wrote:
>>>>> On Sun, Apr 29, 2012 at 03:07:40PM +0300, Alexander Motin wrote:
>>>>>> On 04/29/12 15:04, Oliver Pinter wrote:
>>>>>>> Removing dummynet from kernel doesn't change anything that is
>>>>>>> related to load average. The loadavg holds at 0.70 +/- 0.2.
>>>>>>> (single user: sh + top)
>>>>>>
>>>>>> New ktr dump?
>>>>>
>>>>> I have similar issue on one of my laptops. Should I provide ktr
>>>>> dump?
>>>>> http://lists.freebsd.org/pipermail/freebsd-current/2011-September/027133.html
>>>>
>>>> In your case HPET also shares an interrupt with other devices. I
>>>> suspect that may be a reason. Every time the swi thread runs
>>>> loadavg, another CPU runs the shared interrupt handler, which is
>>>> accounted as a result. Please show your verbose dmesg.
>>>
>>> Attached.
>>
>> In your case HPET could solely use IRQ22, which seems free now. After
>> recent changes in the ACPI code it is detected before PCI devices and
>> so doesn't avoid sharing. You may try to hint it to a specific IRQ by
>> adding this line to loader.conf:
>> hint.hpet.0.allowed_irqs=0x0040
>>
>> --
>> Alexander Motin
>
> I think I am having the same issue on my Sun Fire x4150 servers. It
> goes away when I sysctl kern.eventtimer.timer=LAPIC but I'm hesitant
> to use local workarounds in case they become pessimistic in the
> future. I'm not sure all of my systems would have the same free irqs
> (including after potential addition of expansion cards) so it might be
> a pain to determine an appropriate allowed_irqs setting for each. I
> tried hint.hpet.0.allowed_irqs=0x for the sake of experiment and that
> just results in LAPIC being used since HPET is removed from
> kern.eventtimer.choice. I've attached a verbose dmesg (will probably
> be stripped from the list, hence the Cc:). Is there a limit to how
> high the irq can be set or could I perhaps set it high enough that it
> is unlikely to conflict with other hardware?
> Is there a chance we can find an automatic fix for this issue, or
> should I just stick with LAPIC at the expense of whatever the HPET
> event timer gets me? Or something else? I feel the partially random
> load average level makes it difficult to measure a low load and can be
> misleading during problem debugging. Thanks.

HPET theoretically can use any IRQ from 0 to 31. In practice there can
be different limitations. It is the BIOS's duty to tell us which IRQs
are allowed. In your case IRQs 20-23 are allowed. Unluckily, the system
now just gives the HPET driver the first one from the range.

The problem with the LAPIC timer is that it stops working when the CPU
goes to C3 or a deeper idle state. These states are not enabled by
default, so unless you enabled them explicitly, it is safe to use LAPIC.
In any case, a present 9-STABLE system should prevent you from using an
unsafe C-state if the LAPIC timer is used. From all other perspectives
LAPIC is preferable, as it is faster and easier to operate than HPET.
The latest CPUs fixed the LAPIC timer problem, so I don't think that
switching to it will be pessimistic in the foreseeable future.

-- 
Alexander Motin
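To relate the "IRQs 20-23 are allowed" remark to a hint value, a mask can be decoded back into the IRQ numbers it permits. This sketch assumes the same convention as hpet(4), where bit N of the 32-bit mask stands for IRQ N; the helper name `irqs_in_mask` and the example mask are illustrative and not taken from Adam's dmesg.

```sh
#!/bin/sh
# irqs_in_mask MASK: list the IRQ numbers whose bits are set in a
# 32-bit mask (MASK may be written in hex, e.g. 0x00f00000).
irqs_in_mask() {
    mask=$(($1))
    irq=0
    out=
    while [ "$irq" -lt 32 ]; do
        if [ $(( (mask >> irq) & 1 )) -eq 1 ]; then
            out="$out $irq"
        fi
        irq=$((irq + 1))
    done
    echo "${out# }"
}

irqs_in_mask 0x00f00000
```

A mask that keeps only one of those bits would steer the driver away from "the first one from the range", provided that IRQ is otherwise unused on the system in question.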
Re: GEOM_RAID in GENERIC 9.1
On 30.07.2012 08:33, Eugene M. Zheganin wrote:
> On 30.07.2012 11:04, Eugene M. Zheganin wrote:
>> I am aware of how this thing works and what it does. However, every time I upgrade a new server I get hit by it again and again, simply by forgetting to remove it from the kernel's config. I'm afraid this thing will hit lots of FreeBSD installations after the release; it may be easily removed, but still it will poison the life of many engineers, and I really think it's a bomb and should be removed from GENERIC.
>
> Okay, I feel like I need to clarify this, as some decent guys pointed out to me that I'm very unclear and even rude (sorry for that, it's unintentional).
>
> GEOM_RAID was inserted instead of ataraid, but ataraid wasn't messing with zpooled disks: with GEOM_RAID the kernel takes both providers (in the case of a mirrored pool), and mountroot just fails, as it sees no zfs pool.
>
> Plus, there's even more. This time I disabled the raid in its BIOS before installing FreeBSD. After mountroot failed, I booted 9.0-R from a usb flash drive, trying to avoid any surgery with kernel files, like a manual install from another machine. I was curious whether I would be able to resolve this issue using base utilities. So, I loaded geom_raid via 'graid load', and the kernel said something like 'Doh... I have ada0/ada1 spare disks'. Then I tried to remove the softraid label remains with 'graid remove', and it failed, because there's no array at all, only spares. So 'graid status' is empty, 'graid list' is empty, and it's obvious that some surgery is needed.
>
> And I'm not disappointed that it happened to me, no, because I know how to resolve this. But the thing that I'm really afraid of is that this default option will hit the less experienced engineers.

Thank you for your report. I will recheck deletion of spare disks. As for `geom status/list` in this case, there are special options -a and -g to handle geoms without providers.

-- Alexander Motin
Re: AHCI Timeout errors on Intel Patsburg
Hi.

> is 00000000 cs 00000000 ss 00000001 rs 00000001 tfd 40 serr 00880000

This line (the ss and rs fields) tells me that the device hasn't confirmed completion of one NCQ command. The bits set in the serr field mean 10b to 8b Decode Error and Link Sequence Error. I would suggest that something is wrong with the link quality. That may explain why reducing the speed helps.

-- Alexander Motin
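The bit reading in this reply can be reproduced mechanically. Below is a minimal Python sketch, assuming the serr value is 0x00880000 (bits 19 and 23 set, matching the two errors named above) and that ss is a bit mask with one bit per NCQ tag. The bit names follow the diagnostics half of the AHCI port SError register as I understand the specification; the function names are made up for illustration.

```python
# Diagnostics bits (16..26) of the AHCI port SError register.
SERR_DIAG_BITS = {
    16: "PhyRdy Change",
    17: "Phy Internal Error",
    18: "Comm Wake",
    19: "10B to 8B Decode Error",
    20: "Disparity Error",
    21: "CRC Error",
    22: "Handshake Error",
    23: "Link Sequence Error",
    24: "Transport State Transition Error",
    25: "Unknown FIS Type",
    26: "Exchanged",
}


def decode_serr(serr):
    """Return the names of all diagnostics bits set in serr, lowest bit first."""
    return [name for bit, name in sorted(SERR_DIAG_BITS.items())
            if serr & (1 << bit)]


def outstanding_ncq_tags(ss):
    """Return the NCQ tags (0..31) whose bit is still set in ss."""
    return [tag for tag in range(32) if ss & (1 << tag)]


if __name__ == "__main__":
    print(decode_serr(0x00880000))
    print(outstanding_ncq_tags(0x00000001))
```

Both decoded errors are link-layer events, which is why they point at cabling or signal quality rather than at the drive's command handling.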
Re: mfi(4) IO performance regression, post 8.1
On 19.07.2012 18:28, Adrian Chadd wrote:
> Hm! A timer related bug? I'll CC mav@ on this, as it was his commit (and work in his general area.)
>
> I wonder what's going on - is it something to do with the two ACPI calls inserted there, or is it something to do with the change in event timer values?
>
> mav? Any ideas?

I can only agree with the earlier guess that for some reason the ACPI timer on that system is very slow. Unless the user explicitly enabled deeper C-states, the values returned by the timer are not really used for anything, so there is simply no room for another bug.

When doing this change I expected that it might have a cost, but on most systems that cost matters only at high interrupt rates, where it is covered by the automatic fallback to the faster MWAIT as the idle method. Unfortunately, that code has still not been merged to 8-STABLE (only 9). I will check whether there is any problem merging it now.

Manually switching to MWAIT via sysctl is the correct workaround for this situation. It may give slightly higher power consumption, but for this workload with many interrupts it probably gives the best possible performance.

> On 17 July 2012 13:39, Steve McCoy <smc...@greatbaysoftware.com> wrote:
>> Alright, I've finally narrowed it down to r209897, which only affects acpi_cpu_idle():
>>
>> --- stable/8/sys/dev/acpica/acpi_cpu.c 2010/06/23 17:04:42 209471
>> +++ stable/8/sys/dev/acpica/acpi_cpu.c 2010/07/11 11:58:46 209897
>> @@ -930,12 +930,16 @@
>>      /*
>>       * Execute HLT (or equivalent) and wait for an interrupt. We can't
>> -     * calculate the time spent in C1 since the place we wake up is an
>> -     * ISR. Assume we slept half of quantum and return.
>> +     * precisely calculate the time spent in C1 since the place we wake up
>> +     * is an ISR. Assume we slept no more then half of quantum.
>>       */
>>      if (cx_next->type == ACPI_STATE_C1) {
>> -        sc->cpu_prev_sleep = (sc->cpu_prev_sleep * 3 + 500000 / hz) / 4;
>> +        AcpiHwRead(&start_time, &AcpiGbl_FADT.XPmTimerBlock);
>>          acpi_cpu_c1();
>> +        AcpiHwRead(&end_time, &AcpiGbl_FADT.XPmTimerBlock);
>> +        end_time = acpi_TimerDelta(end_time, start_time);
>> +        sc->cpu_prev_sleep = (sc->cpu_prev_sleep * 3 +
>> +            min(PM_USEC(end_time), 500000 / hz)) / 4;
>>          return;
>>      }
>>
>> My current guess is that AcpiHwRead() is a problem on our hardware. It's an isolated change and, to my desperate eyes, the commit message implies that it isn't critical. Do you think we could buy ourselves some time by pulling it out of our version of the kernel? Or is this essential for correctness?
>>
>> Any thoughts are appreciated, thanks!

-- Alexander Motin
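The quoted change replaces a fixed estimate of the C1 sleep time with a measured one, still folded into the same 3:1 weighted average. A minimal Python sketch of that averaging step follows, with the cap passed in as a parameter rather than derived from hz; the function name is made up for illustration, and this models only the arithmetic, not the ACPI timer reads that are suspected to be slow.

```python
def update_prev_sleep(prev_sleep_us, measured_us, cap_us):
    """Blend a new sleep-time sample into the running estimate.

    The old estimate gets weight 3 and the new sample weight 1; the sample
    is capped first, mirroring the min() in the quoted diff. Integer
    division is used to match C arithmetic on non-negative values.
    """
    return (prev_sleep_us * 3 + min(measured_us, cap_us)) // 4


if __name__ == "__main__":
    estimate = 0
    for _ in range(50):
        # Repeated short sleeps of 100 us with a 500 us cap (made-up numbers).
        estimate = update_prev_sleep(estimate, 100, 500)
    print(estimate)
```

Because each sample only moves the estimate a quarter of the way, a single long sleep cannot swing the estimate far, and the cap bounds how much any one sample can contribute.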
Re: mfi(4) IO performance regression, post 8.1
On 20.07.2012 16:38, Alexander Motin wrote:
> On 19.07.2012 18:28, Adrian Chadd wrote:
>> Hm! A timer related bug? I'll CC mav@ on this, as it was his commit (and work in his general area.)
>>
>> I wonder what's going on - is it something to do with the two ACPI calls inserted there, or is it something to do with the change in event timer values?
>>
>> mav? Any ideas?
>
> I can only agree with the earlier guess that for some reason the ACPI timer on that system is very slow. Unless the user explicitly enabled deeper C-states, the values returned by the timer are not really used for anything, so there is simply no room for another bug.
>
> When doing this change I expected that it might have a cost, but on most systems that cost matters only at high interrupt rates, where it is covered by the automatic fallback to the faster MWAIT as the idle method. Unfortunately, that code has still not been merged to 8-STABLE (only 9). I will check whether there is any problem merging it now.

I've just merged that to 8-STABLE at r238658. Testers are welcome.

> Manually switching to MWAIT via sysctl is the correct workaround for this situation. It may give slightly higher power consumption, but for this workload with many interrupts it probably gives the best possible performance.
>
>> On 17 July 2012 13:39, Steve McCoy <smc...@greatbaysoftware.com> wrote:
>>> Alright, I've finally narrowed it down to r209897, which only affects acpi_cpu_idle():
>>>
>>> --- stable/8/sys/dev/acpica/acpi_cpu.c 2010/06/23 17:04:42 209471
>>> +++ stable/8/sys/dev/acpica/acpi_cpu.c 2010/07/11 11:58:46 209897
>>> @@ -930,12 +930,16 @@
>>>      /*
>>>       * Execute HLT (or equivalent) and wait for an interrupt. We can't
>>> -     * calculate the time spent in C1 since the place we wake up is an
>>> -     * ISR. Assume we slept half of quantum and return.
>>> +     * precisely calculate the time spent in C1 since the place we wake up
>>> +     * is an ISR. Assume we slept no more then half of quantum.
>>>       */
>>>      if (cx_next->type == ACPI_STATE_C1) {
>>> -        sc->cpu_prev_sleep = (sc->cpu_prev_sleep * 3 + 500000 / hz) / 4;
>>> +        AcpiHwRead(&start_time, &AcpiGbl_FADT.XPmTimerBlock);
>>>          acpi_cpu_c1();
>>> +        AcpiHwRead(&end_time, &AcpiGbl_FADT.XPmTimerBlock);
>>> +        end_time = acpi_TimerDelta(end_time, start_time);
>>> +        sc->cpu_prev_sleep = (sc->cpu_prev_sleep * 3 +
>>> +            min(PM_USEC(end_time), 500000 / hz)) / 4;
>>>          return;
>>>      }
>>>
>>> My current guess is that AcpiHwRead() is a problem on our hardware. It's an isolated change and, to my desperate eyes, the commit message implies that it isn't critical. Do you think we could buy ourselves some time by pulling it out of our version of the kernel? Or is this essential for correctness?
>>>
>>> Any thoughts are appreciated, thanks!

-- Alexander Motin
Re: mfi(4) IO performance regression, post 8.1
Hi.

On 20.07.2012 22:38, Adrian Chadd wrote:
> I'm worried that this won't be the only source of "freebsd is slower than linux" issues.
>
> What can we add to the timer path to make identifying and root causing this issue easy? I'd just like to be absolutely sure that we're not only doing the best job possible, but we can provide some tools and statistics to the user/administrator so as to make debugging much easier.

The only instrument I could propose for diagnosing this problem without provided input is hwpmc profiling. It should be able to show that we are spending much time in those timer routines. If we had somehow guessed that the reason is a slow ACPI timer, it would be easy to write a respective benchmark, but we can't write tests for everything, and even if we could, users wouldn't be able to run them or analyze their output without some level of knowledge.

I've spent much time profiling this on the hardware I have, but the only way I see to be sure in the general case is more testing and feedback.

For this specific area I am using a very simple test that effectively depends on interrupt latency and CPU wakeup times: `dd if=/dev/ada0 of=/dev/null bs=512`. Depending on the device, controller and other factors, it gives me about 20-30K IOPS. If you have ideas about what we could test automatically and how, they are welcome.

-- Alexander Motin
Re: svn commit: r237318 - in stable/8: share/man/man4 sys/cam sys/cam/scsi sys/conf
On 06/22/12 21:41, Mike Tancsa wrote: On 6/20/2012 10:39 AM, Alexander Motin wrote: Author: mav Date: Wed Jun 20 14:39:35 2012 New Revision: 237318 URL: http://svn.freebsd.org/changeset/base/237318 Log: MFC r236712: To make CAM debugging easier, compile in some debug flags (CAM_DEBUG_INFO, CAM_DEBUG_CDB, CAM_DEBUG_PERIPH and CAM_DEBUG_PROBE) by default. List of these flags can be modified with CAM_DEBUG_COMPILE kernel option. CAMDEBUG kernel option still enables all possible debug, if not overriden. Additional 50KB of kernel size is a good price for the ability to debug problems without rebuilding the kernel. In case where size is important, debugging can be compiled out by setting CAM_DEBUG_COMPILE option to 0. Hi, Not sure if this is the commit or not, but a kernel from the 18th seems to function normally, and a kernel from today has a great deal of messages like the ones below. I also dont know if this is just exposing an existing bug in the driver that was upto now hidden ? That's not. That's a bit later. Boot time, I see the following (probe1:twa0:0:1:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe1:twa0:0:1:0): CAM status: Invalid Target ID (probe1:twa0:0:1:0): Error 22, Unretryable error (probe2:twa0:0:2:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe2:twa0:0:2:0): CAM status: Invalid Target ID (probe2:twa0:0:2:0): Error 22, Unretryable error (probe3:twa0:0:3:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe3:twa0:0:3:0): CAM status: Invalid Target ID (probe3:twa0:0:3:0): Error 22, Unretryable error (probe4:twa0:0:4:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe4:twa0:0:4:0): CAM status: Invalid Target ID (probe4:twa0:0:4:0): Error 22, Unretryable error (probe15:twa0:0:15:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe15:twa0:0:15:0): CAM status: Invalid Target ID (probe15:twa0:0:15:0): Error 22, Unretryable error (probe16:twa0:0:16:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe16:twa0:0:16:0): CAM status: Invalid Target ID (probe16:twa0:0:16:0): Error 22, Unretryable error (probe17:twa0:0:17:0): INQUIRY. 
CDB: 12 0 0 0 24 0 (probe17:twa0:0:17:0): CAM status: Invalid Target ID (probe17:twa0:0:17:0): Error 22, Unretryable error (probe18:twa0:0:18:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe18:twa0:0:18:0): CAM status: Invalid Target ID (probe18:twa0:0:18:0): Error 22, Unretryable error (probe19:twa0:0:19:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe19:twa0:0:19:0): CAM status: Invalid Target ID (probe19:twa0:0:19:0): Error 22, Unretryable error (probe20:twa0:0:20:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe20:twa0:0:20:0): CAM status: Invalid Target ID (probe20:twa0:0:20:0): Error 22, Unretryable error (probe21:twa0:0:21:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe21:twa0:0:21:0): CAM status: Invalid Target ID (probe21:twa0:0:21:0): Error 22, Unretryable error (probe22:twa0:0:22:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe22:twa0:0:22:0): CAM status: Invalid Target ID (probe22:twa0:0:22:0): Error 22, Unretryable error (probe23:twa0:0:23:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe23:twa0:0:23:0): CAM status: Invalid Target ID (probe23:twa0:0:23:0): Error 22, Unretryable error (probe24:twa0:0:24:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe24:twa0:0:24:0): CAM status: Invalid Target ID (probe24:twa0:0:24:0): Error 22, Unretryable error (probe25:twa0:0:25:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe25:twa0:0:25:0): CAM status: Invalid Target ID (probe25:twa0:0:25:0): Error 22, Unretryable error (probe26:twa0:0:26:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe26:twa0:0:26:0): CAM status: Invalid Target ID (probe26:twa0:0:26:0): Error 22, Unretryable error (probe5:twa0:0:5:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe5:twa0:0:5:0): CAM status: Invalid Target ID (probe5:twa0:0:5:0): Error 22, Unretryable error (probe6:twa0:0:6:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe6:twa0:0:6:0): CAM status: Invalid Target ID (probe6:twa0:0:6:0): Error 22, Unretryable error (probe7:twa0:0:7:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe7:twa0:0:7:0): CAM status: Invalid Target ID (probe7:twa0:0:7:0): Error 22, Unretryable error (probe8:twa0:0:8:0): INQUIRY. 
CDB: 12 0 0 0 24 0 (probe8:twa0:0:8:0): CAM status: Invalid Target ID (probe8:twa0:0:8:0): Error 22, Unretryable error (probe9:twa0:0:9:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe9:twa0:0:9:0): CAM status: Invalid Target ID (probe9:twa0:0:9:0): Error 22, Unretryable error (probe10:twa0:0:10:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe10:twa0:0:10:0): CAM status: Invalid Target ID (probe10:twa0:0:10:0): Error 22, Unretryable error (probe11:twa0:0:11:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe11:twa0:0:11:0): CAM status: Invalid Target ID (probe11:twa0:0:11:0): Error 22, Unretryable error (probe12:twa0:0:12:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe12:twa0:0:12:0): CAM status: Invalid Target ID (probe12:twa0:0:12:0): Error 22, Unretryable error (probe13:twa0:0:13:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe13:twa0:0:13:0): CAM status: Invalid Target ID (probe13:twa0:0:13:0): Error 22, Unretryable error (probe14:twa0:0:14:0): INQUIRY. CDB: 12 0 0 0 24 0 (probe14:twa0:0:14:0
Re: [stable 9] broken hwpstate calls
On 06/07/12 11:10, Andriy Gapon wrote:
> on 07/06/2012 02:02 Jung-uk Kim said the following:
>> Any way, hwpstate still isn't quite right even without your patch.
>>
>> sys/kern/kern_cpu.c
>>   cpufreq_curr_sysctl() ->
>>     CPUFREQ_SET() ->
>>       /* for all CPU devices */
>>       cf_set_method() ->
>>         /* thread_lock(), sched_bind(), ... */
>>         CPUFREQ_DRV_SET() ->
>> sys/x86/cpufreq/hwpstate.c
>>   hwpstate_set() ->
>>     hwpstate_goto_pstate()
>>       /* for each CPU unit */
>>       /* thread_lock(), sched_bind(), ... */
>
> Oh, I didn't realize that there was the cpufreq-level loop over all CPUs! That really sucks. Maybe some day we should accept that different CPUs could legitimately be in different P-states and provide support for that throughout the stack (from powerd to drivers).

Support for different P-states on different CPUs can be useful if the CPUs have different capabilities. I believe that is very rare, but possible. At this moment cpufreq should set, for each CPU, the frequency closest to the one that was set on the BSP. It should be possible to make powerd read the sets of frequencies from all CPUs and do the same, just more intelligently.

At the same time, using very different frequencies for different CPUs can IMHO be very problematic even in theory. For SMP systems it is quite difficult (because of thread migration and possible inter-operation of multiple threads) to identify cases where even the global frequency can be reduced without a proportional performance penalty. Making it per-CPU multiplies the number of options and requires awareness from the scheduler.

-- Alexander Motin
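The "closest frequency to the one set on the BSP" rule mentioned in this reply can be sketched in a few lines. This is a Python illustration only, not the cpufreq code: the function names, the dictionary layout, and the tie-break toward the lower frequency are all assumptions made for the example.

```python
def closest_frequency(available_mhz, target_mhz):
    """Pick the available frequency nearest to the target.

    Ties are broken toward the lower frequency (an arbitrary choice for
    this sketch, not something stated in the thread).
    """
    return min(available_mhz, key=lambda f: (abs(f - target_mhz), f))


def plan_frequencies(per_cpu_levels, bsp_target_mhz):
    """Map each CPU id to the level closest to what was set on the BSP."""
    return {cpu: closest_frequency(levels, bsp_target_mhz)
            for cpu, levels in per_cpu_levels.items()}


if __name__ == "__main__":
    # Two CPUs with different (made-up) sets of supported frequencies.
    levels = {0: [800, 1600, 2400], 1: [1000, 2000]}
    print(plan_frequencies(levels, 1600))
```

With identical CPUs every entry is the same and the per-CPU plan collapses to one global setting, which is the common case described above.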
Re: High load event idl.
On 04/29/12 09:09, Ian Smith wrote:
> On Sun, 29 Apr 2012 08:17:38 +0300, Alexander Motin wrote:
>> On 04/29/12 01:53, Oliver Pinter wrote:
>>> Attached the ktr file. This is on a core2duo P9400 cpu ( smbios.system.product=HP ProBook 5310m (WD792EA#ABU) ). The workload is only a single user boot: sh + top running, but the load average is near 0.5.
>>
>> ktr shows no real load there. But it shows that you are using dummynet, which schedules its runs on every hardclock tick. I believe the load you see is the result of synchronization between dummynet calls and loadavg sampling, both of which are called from hardclock. I think removing dummynet from the equation should hide this problem and also reduce your laptop's power consumption.
>>
>> As for fixing this, it is the loadavg sampling algorithm that should be changed. Fixing dummynet to not run on every hardclock tick would also be great.
>
> Wading in out of my depth, and copying Luigi in case he misses it .. but even back in the olden days when HZ defaulted to 100, one was advised to use HZ=1000 for smooth dummynet traffic shaping dispatch scheduling.
>
> I wonder, with the newer clocks and timers, whether there is another clock that could be used for dummynet scheduling, that would not have this effect (even if largely cosmetic?) on load average calculation?

First of all, the easiest solution would be to make dummynet schedule its callout not automatically, but on the first queued packet. I believe that on a laptop the queue should be empty most of the time, and the callout calls are completely useless there. Luigi promised to look at this once.

As for better precision and removing the synchronization: there is a GSoC project starting now (by davide@) to rewrite the callout(9) subsystem to use the better precision allowed by the new timer drivers. While it is now possible to get raw access to additional timer hardware available on some systems, I don't think it is a good idea.

-- Alexander Motin
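The synchronization effect described in this thread, where a task that always runs at the moment of sampling looks like constant load, can be shown with a toy model. The Python sketch below is not the kernel's loadavg algorithm; the model (a task busy for the first small fraction of every tick, and a sampler that checks once per tick) and all names are assumptions made for illustration.

```python
import random


def sampled_load(busy_fraction, sample_offsets):
    """Fraction of samples that catch the periodic task running.

    The task is assumed busy during the first `busy_fraction` of every
    tick; each offset is the sampler's position within a tick, in [0, 1).
    """
    hits = sum(1 for off in sample_offsets if off < busy_fraction)
    return hits / len(sample_offsets)


def synchronized_offsets(n):
    """Sampler fires at the same point in the tick where the task starts."""
    return [0.0] * n


def random_offsets(n, seed=0):
    """Sampler fires at a random point in each tick."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


if __name__ == "__main__":
    busy = 0.01  # the task really uses 1% of each tick
    print(sampled_load(busy, synchronized_offsets(1000)))
    print(sampled_load(busy, random_offsets(100000)))
```

In this model the synchronized sampler reports the task as always running even though it uses one percent of the CPU, while the randomized sampler lands near the true figure, which is the motivation for changing the sampling rather than only removing dummynet.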
Re: High load event idl.
On 04/29/12 15:04, Oliver Pinter wrote: Removing dummynet from kernel don't chanage anything, that is releated to load average. The loadavg hold to 0.70 +/- 0.2. (single user : sh + top) New ktr dump? On 4/29/12, Alexander Motinm...@freebsd.org wrote: On 04/29/12 09:09, Ian Smith wrote: On Sun, 29 Apr 2012 08:17:38 +0300, Alexander Motin wrote: On 04/29/12 01:53, Oliver Pinter wrote: Attached the ktr file. This is on core2duo P9400 cpu ( smbios.system.product=HP ProBook 5310m (WD792EA#ABU) ). The workload is only a single user boost: sh + top running, but the load average is near 0.5. ktr shows no real load there. But it shows that you are using dummynet, that schedules its runs on every hardclock tick. I believe that load you see is the result or synchronization between dummynet calls and loadvg sampling, both of which called from hardclock. I think removing dummynet from equation, should hide this problem and also reduce you laptops power consumption. What's about fixing this, it is loadavg sampling algorithm that should be changed. Fixing dummynet to not run on every hardclock tick would also be great. Wading in out of my depth, and copying Luigi in case he misses it .. but even back in the olden days when HZ defaulted to 100, one was advised to use HZ= 1000 for smooth dummynet traffic shaping dispatch scheduling. I wonder, with the newer clocks and timers, whether there is another clock that could be used for dummynet scheduling, that would not have this effect (even if largely cosmetic?) on load average calculation? First of all, the easiest solution would be to make dummynet to schedule callout not automatically, but on first queued packet. I believe that in case of laptop the queue should be empty most of time and the callout calls are completely useless there. Luigi promised to look on this once. 
What's about better precision/removing synchronization -- there is starting GSoC project now (by davide@) to rewrite callout(9) subsystem to use better precision allowed by new timer drivers. While now it is possible to get raw access to additional timer hardware available on some systems, I don't think it is a good idea. -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: High load event idl.
On 04/29/12 15:27, Oliver Pinter wrote:
> http://oliverp.teteny.bme.hu/freebsd/ktr/

OK. Now there is no dummynet, but I've found two more things there:

1. For some reason some acpi_thermal thread seems to consume about 0.37s of time every 10s. I have no idea what this is. It's not 0.7 load, but still strange at least.

2. I suspect another possible synchronization between the ehci driver and loadavg as a result of interrupt sharing between the HPET timer used for time events and the EHCI USB hardware. Not sure what to do about this. Please send a _verbose_ dmesg to check whether this interrupt sharing is unavoidable.

> On 4/29/12, Alexander Motin <m...@freebsd.org> wrote:
>> On 04/29/12 15:04, Oliver Pinter wrote:
>>> Removing dummynet from the kernel doesn't change anything related to the load average. The loadavg holds at 0.70 +/- 0.2. (single user: sh + top)
>>
>> New ktr dump?
>>
>>> On 4/29/12, Alexander Motin <m...@freebsd.org> wrote:
>>>> On 04/29/12 09:09, Ian Smith wrote:
>>>>> On Sun, 29 Apr 2012 08:17:38 +0300, Alexander Motin wrote:
>>>>>> On 04/29/12 01:53, Oliver Pinter wrote:
>>>>>>> Attached the ktr file. This is on a core2duo P9400 cpu ( smbios.system.product=HP ProBook 5310m (WD792EA#ABU) ). The workload is only a single user boot: sh + top running, but the load average is near 0.5.
>>>>>>
>>>>>> ktr shows no real load there. But it shows that you are using dummynet, which schedules its runs on every hardclock tick. I believe the load you see is the result of synchronization between dummynet calls and loadavg sampling, both of which are called from hardclock. I think removing dummynet from the equation should hide this problem and also reduce your laptop's power consumption.
>>>>>>
>>>>>> As for fixing this, it is the loadavg sampling algorithm that should be changed. Fixing dummynet to not run on every hardclock tick would also be great.
>>>>>
>>>>> Wading in out of my depth, and copying Luigi in case he misses it .. but even back in the olden days when HZ defaulted to 100, one was advised to use HZ=1000 for smooth dummynet traffic shaping dispatch scheduling.
>>>>>
>>>>> I wonder, with the newer clocks and timers, whether there is another clock that could be used for dummynet scheduling, that would not have this effect (even if largely cosmetic?) on load average calculation?
>>>>
>>>> First of all, the easiest solution would be to make dummynet schedule its callout not automatically, but on the first queued packet. I believe that on a laptop the queue should be empty most of the time, and the callout calls are completely useless there. Luigi promised to look at this once.
>>>>
>>>> As for better precision and removing the synchronization: there is a GSoC project starting now (by davide@) to rewrite the callout(9) subsystem to use the better precision allowed by the new timer drivers. While it is now possible to get raw access to additional timer hardware available on some systems, I don't think it is a good idea.
>>>>
>>>> -- Alexander Motin

-- Alexander Motin
Re: High load event idl.
On 04/29/12 15:27, Alex Kozlov wrote:
> On Sun, Apr 29, 2012 at 03:07:40PM +0300, Alexander Motin wrote:
>> On 04/29/12 15:04, Oliver Pinter wrote:
>>> Removing dummynet from the kernel doesn't change anything related to the load average. The loadavg holds at 0.70 +/- 0.2. (single user: sh + top)
>>
>> New ktr dump?
>
> I have a similar issue on one of my laptops. Should I provide a ktr dump?
> http://lists.freebsd.org/pipermail/freebsd-current/2011-September/027133.html

In your case HPET also shares its interrupt with other devices. I suspect that may be the reason. Every time the swi thread runs loadavg, another CPU runs the shared interrupt handler, which gets accounted as a result. Please show your verbose dmesg.

-- Alexander Motin
Re: High load event idl.
On 04/29/12 16:30, Alex Kozlov wrote:
> On Sun, Apr 29, 2012 at 04:11:20PM +0300, Alexander Motin wrote:
>> On 04/29/12 15:27, Alex Kozlov wrote:
>>> On Sun, Apr 29, 2012 at 03:07:40PM +0300, Alexander Motin wrote:
>>>> On 04/29/12 15:04, Oliver Pinter wrote:
>>>>> Removing dummynet from the kernel doesn't change anything related to the load average. The loadavg holds at 0.70 +/- 0.2. (single user: sh + top)
>>>>
>>>> New ktr dump?
>>>
>>> I have a similar issue on one of my laptops. Should I provide a ktr dump?
>>> http://lists.freebsd.org/pipermail/freebsd-current/2011-September/027133.html
>>
>> In your case HPET also shares its interrupt with other devices. I suspect that may be the reason. Every time the swi thread runs loadavg, another CPU runs the shared interrupt handler, which gets accounted as a result. Please show your verbose dmesg.
>
> Attached.

In your case HPET could solely use IRQ22, which seems free now. After recent changes in the ACPI code it is detected before PCI devices and so doesn't avoid sharing. You may try to hint it a specific IRQ by adding this line to loader.conf:

hint.hpet.0.allowed_irqs=0x00400000

-- Alexander Motin
Re: High load event idl.
On 04/28/12 00:34, Albert Shih wrote:
> On 27/04/2012 at 22:45:40+0200, Oliver Pinter wrote:
>>> I'm running 9-stable on all my computers (csup yesterday).
>>>
>>> On my desktop everything is fine. But I have two laptops (both are Dell). On both laptops I have a problem with the load: even when I do nothing I get a load between 0.5 and 1.
>>>
>>> Here is the result of «top» on the laptop:
>>>
>>> last pid: 2434; load averages: 0.63, 0.67, 0.59 up 0+00:23:59 22:25:29
>>> 57 processes: 3 running, 54 sleeping
>>> CPU: 2.7% user, 0.0% nice, 3.7% system, 1.4% interrupt, 92.2% idle
>>> Mem: 89M Active, 92M Inact, 198M Wired, 13M Cache, 100M Buf, 3529M Free
>>> Swap: 4096M Total, 4096M Free
>>>
>>> Here on the desktop:
>>>
>>> last pid: 61010; load averages: 0.00, 0.00, 0.00 up 2+11:02:42 22:29:08
>>> 126 processes: 1 running, 125 sleeping
>>> CPU: % user, % nice, % system, % interrupt, % idle
>>> Mem: 803M Active, 2874M Inact, 1901M Wired, 112M Cache, 620M Buf, 202M Free
>>> Swap: 6144M Total, 36M Used, 6107M Free
>>
>> http://lists.freebsd.org/pipermail/freebsd-bugs/2012-April/048213.html
>
> What I understand from your message (I'm definitely not a dev) is that it's only a small accounting problem.
>
> I'm not absolutely sure of that, because my laptop fan never stops...
>
> If you want any more information...

Definitely, because here I don't see much.

Generally, all CPU loads and load averages are now calculated via sampling, so theoretically, with a spiky load, the numbers may vary for many reasons.

I would start by collecting information about running processes. To find fast-switching processes that could hide from accounting, try `top -SH -m io -o vcsw`. To get more information about the scheduler's work, use /usr/src/tools/sched/schedgraph.py (instructions are inside it).

-- Alexander Motin
Re: High load event idl.
On 04/29/12 01:53, Oliver Pinter wrote: Attached the ktr file. This is on core2duo P9400 cpu ( smbios.system.product=HP ProBook 5310m (WD792EA#ABU) ). The workload is only a single user boost: sh + top running, but the load average is near 0.5. ktr shows no real load there. But it shows that you are using dummynet, that schedules its runs on every hardclock tick. I believe that load you see is the result or synchronization between dummynet calls and loadvg sampling, both of which called from hardclock. I think removing dummynet from equation, should hide this problem and also reduce you laptops power consumption. What's about fixing this, it is loadavg sampling algorithm that should be changed. Fixing dummynet to not run on every hardclock tick would also be great. On 4/28/12, Alexander Motinm...@freebsd.org wrote: On 04/28/12 00:34, Albert Shih wrote: Le 27/04/2012 ? 22:45:40+0200, Oliver Pinter a écrit I'm running 9-stable on all my computer. (csup yesterday). On my desktop everything is fine. But I've two laptop, (both are Dell). On both latptop I've problem about the load, event when I do nothing I got a load between 0.5-1. Here the result of a «top» on the laptop : last pid: 2434; load averages: 0.63, 0.67, 0.59 up 0+00:23:59 22:25:29 57 processes: 3 running, 54 sleeping CPU: 2.7% user, 0.0% nice, 3.7% system, 1.4% interrupt, 92.2% idle Mem: 89M Active, 92M Inact, 198M Wired, 13M Cache, 100M Buf, 3529M Free Swap: 4096M Total, 4096M Free Here on the desktop : last pid: 61010; load averages: 0.00, 0.00, 0.00 up 2+11:02:42 22:29:08 126 processes: 1 running, 125 sleeping CPU: % user, % nice, % system, % interrupt, % idle Mem: 803M Active, 2874M Inact, 1901M Wired, 112M Cache, 620M Buf, 202M Free Swap: 6144M Total, 36M Used, 6107M Free http://lists.freebsd.org/pipermail/freebsd-bugs/2012-April/048213.html What I understand of your message (I'm definitvly not a dev) is that's only a little problem of accounting. 
I'm not absolute sure of that because my laptop fan never stop... If you want any more information... Definitely, because here I don't see much. Generally, all CPU loads and load averages now calculated via sampling, so theoretically with spiky load numbers may vary for many reasons. I would start from collecting information about running processes. To find fast switching processes that could hide from accounting try `top -SH -m io -o vcsw`. To get more information about scheduler work, use /usr/src/tools/sched/schedgraph.py (instruction inside it). -- Alexander Motin -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: [stable-ish 9] Dell R815 ipmi(4) attach failure
| |ipmi1:IPMI System Interface on isa0 | |device_attach: ipmi1 attach returned 16 | |ipmi1:IPMI System Interface on isa0 | |device_attach: ipmi1 attach returned 16 | |ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 2 | |ipmi0: DEBUG ipmi_complete_request 527 before wakeup 6201 | |ipmi0: DEBUG ipmi_complete_request 529 after wakeup 6263 | |ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 6323 | | | | Actually, can you compile with: | | | | optionsKTR | | optionsKTR_COMPILE=KTR_SCHED | | optionsKTR_MASK=KTR_SCHED | | | | and then add a temporary hack to ipmi.c to set ktr_mask to 0 after | | ipmi_submit_driver_request() returns in ipmi_startup()? You can | | then use 'ktrdump -ct' after boot to capture a log of what the scheduler | | did including if it timed out the sleep, etc. I think this would be | | useful for figuring out what went wrong. It does seem that it timed | | out after 3 seconds. | | Assuming I didn't mess up, the log should be at: | http://people.freebsd.org/~ambrisko/ipmi_ktr_dump.txt | again, I using ipmi(4) as module loaded via the loader. | | If you use -ct then you get a file you can feed into schedgraph. | However, just reading the log, it seems that IRQ 20 keeps preempting | the KCS worker thread preventing it from getting anything done. Also, | there seem to be a lot of threads on CPU 0's runqueue waiting for a | chance to run (load average of 12 or 13 the entire time). You can try | just bumping up the max timeout from 3 seconds to higher perhaps. Not | sure why IRQ 20 keeps firing though. It might be related to USB, so | you could try fiddling with USB options in the BIOS perhaps, or disabling | the USB drivers to see if that fixes IPMI. Tried without USB in kernel: http://people.freebsd.org/~ambrisko/ipmi_ktr_dump_no_usb.txt Hmm, it's still just running constantly (note that the idle thread is _never_ scheduled). The lion's share of the time seems to be spent in xpt_thrd. 
Note that there are several places where nothing happens except that xpt_thrd runs constantly (spinning) during 10's of statclock ticks. I would maybe start debugging that to see what in the world it is doing. Maybe it is polling some hardware down in xpt_action() (i.e., xpt_action() for a single bus called down into a driver and it is just spinning using polling instead of sleeping and waiting for an interrupt).

xpt_thrd is a bus scanner thread. It is scheduled by CAM for every bus on attach and by the controller driver on hot-plug events. For some controllers it may be quite CPU-hungry, for example for legacy ATA controllers, where a bus reset may take many seconds of hardware polling while the devices are just spinning up. For ahci(4) it was improved about a year ago to not use polling when possible, but it still may loop for some time if the controller is not responding on reset. What mfi(4), mentioned in the log, does during scanning, I am not sure.

--
Alexander Motin
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: [stable-ish 9] Dell R815 ipmi(4) attach failure
On 04/06/12 20:12, Doug Ambrisko wrote: Alexander Motin writes: [ Charset ISO-8859-1 unsupported, converting... ] | On 04/04/12 21:47, John Baldwin wrote: | On Wednesday, April 04, 2012 12:24:33 pm Doug Ambrisko wrote: | John Baldwin writes: | | On Tuesday, April 03, 2012 12:37:50 pm Doug Ambrisko wrote: | | John Baldwin writes: | | | On Monday, April 02, 2012 7:27:13 pm Doug Ambrisko wrote: | | | Doug Ambrisko writes: | | | | John Baldwin writes: | | | | | On Saturday, March 31, 2012 3:25:48 pm Doug Ambrisko wrote: | | | | | Sean Bruno writes: | | | | | | Noting a failure to attach to the onboard IPMI controller | with | | this | | | dell | | | | | | R815. Not sure what to start poking at and thought I'd | though | | this | | | over | | | | | | here for comment. | | | | | | | | | | | | -bash-4.2$ dmesg |grep ipmi | | | | | | ipmi0: KCS mode found at io 0xca8 on acpi | | | | | | ipmi1:IPMI System Interface on isa0 | | | | | | device_attach: ipmi1 attach returned 16 | | | | | | ipmi1:IPMI System Interface on isa0 | | | | | | device_attach: ipmi1 attach returned 16 | | | | | | ipmi0: Timed out waiting for GET_DEVICE_ID | | | | | | | | | | I've run into this recently. A quick hack to fix it is: | | | | | | | | | | Index: ipmi.c | | | | | [snip] | | If you use -ct then you get a file you can feed into schedgraph. | | However, just reading the log, it seems that IRQ 20 keeps preempting | | the KCS worker thread preventing it from getting anything done. Also, | | there seem to be a lot of threads on CPU 0's runqueue waiting for a | | chance to run (load average of 12 or 13 the entire time). You can try | | just bumping up the max timeout from 3 seconds to higher perhaps. Not | | sure why IRQ 20 keeps firing though. It might be related to USB, so | | you could try fiddling with USB options in the BIOS perhaps, or disabling | | the USB drivers to see if that fixes IPMI. 
| | Tried without USB in kernel: | http://people.freebsd.org/~ambrisko/ipmi_ktr_dump_no_usb.txt | | Hmm, it's still just running constantly (note that the idle thread is | _never_ scheduled). The lion's share of the time seems to be spent in | xpt_thrd. Note that there are several places where nothing happens except | that xpt_thrd runs constantly (spinning) during 10's of statclock ticks. I | would maybe start debugging that to see what in the world it is doing. Maybe | it is polling some hardware down in xpt_action() (i.e., xpt_action() for a | single bus called down into a driver and it is just spinning using polling | instead of sleeping and waiting for an interrupt). | | xpt_thrd is a bus scanner thread. It is scheduled by CAM for every bus | on attach and by controller driver on hot-plug events. For some | controllers it may be quite CPU-hungry. For example, for legacy ATA | controllers, where bus reset may take many seconds of hardware polling, | while devices just spinning up. For ahci(4) it was improved about year | ago to not use polling when possible, but it still may loop for some | time if controller is not responding on reset. What mfi(4), mentioned in | log, does during scanning, I am not sure. I thought that mfi(4) could be an issue. There are some ata controllers with nothing attached. I built a GENERIC with USB and mfi commented out and then the timeout issue went away: ipmi0: KCS mode found at io 0xca8 on acpi ipmi1:IPMI System Interface on isa0 device_attach: ipmi1 attach returned 16 ipmi1:IPMI System Interface on isa0 device_attach: ipmi1 attach returned 16 ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 1 ipmi0: DEBUG ipmi_complete_request 527 before wakeup 2211 ipmi0: DEBUG ipmi_complete_request 529 after wakeup 2272 ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 2332 ipmi0: IPMI device rev. 0, firmware rev. 
1.61, version 2.0 Without mfi and with USB and it had issues: ipmi0: KCS mode found at io 0xca8 on acpi ipmi1:IPMI System Interface on isa0 device_attach: ipmi1 attach returned 16 ipmi1:IPMI System Interface on isa0 device_attach: ipmi1 attach returned 16 ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 2 ipmi0: DEBUG ipmi_complete_request 527 before wakeup 3137 ipmi0: DEBUG ipmi_complete_request 529 after wakeup 3199 ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 3259 ipmi0: Timed out waiting for GET_DEVICE_ID ipmi0: IPMI device rev. 0, firmware rev. 1.61, version 2.0 I can post more ktrdump traces if needed. A 1U Dell machine without mfi also has this problem. As John mentioned it might be good to bump up the timeout from 3s to 6s. I did that with the USB no mfi kernel and that passed: % dmesg | grep ipmi ipmi0: KCS mode found at io 0xca8 on acpi ipmi1:IPMI System Interface on isa0 device_attach
Serverworks HT-1000 HPET event timer
Hi.

Does anybody have a success story of using the HPET event timer (not the time counter!) on the Serverworks HT-1000 chipset under FreeBSD 9/10? Problems with it were reported to me on an HP BL465c G6 blade system, and I am now wondering whether it is a global problem or specific to this system.

--
Alexander Motin
Re: missing disk device under 9-STABLE
Hi.

On 03.03.2012 21:08, Jeff Blank wrote:
> I attempted an upgrade last night from an old 8-STABLE (25 Apr 2011) to 9-STABLE and ran into a problem where a disk apparently wasn't detected. I'm of course aware of the ATA/CAM changes, but I haven't found anything that quite explains what's happening here. I've attached dmesg output from the 8-STABLE and 9-STABLE kernels as well as the results of 'ls -l /dev/ad*' and 'zpool status' under both kernels. ZFS seems to have figured out what to do about its ad4p3 member (switching to a gptid device), but since only ada0 is detected during boot, it can't complete the pool.
>
> The weird thing is, though, that the other disk was actually detected on one reboot to the 9.0 kernel, ZFS was happy, etc. I haven't been able to reproduce it, though.
>
> Due to these problems, I haven't upgraded userland yet and am of course sticking with the 8-STABLE kernel, but I can boot into the 9-STABLE kernel at will if anyone needs more information.

This looks like the cause of the missing disk:

ahcich1: Timeout on slot 0 port 0
ahcich1: is 0002 cs ss rs 0001 tfd 50 serr cmd 6017
ahcich1: Timeout on slot 0 port 0
ahcich1: is 0002 cs ss rs 0001 tfd 50 serr cmd 6017

It says that the controller signals an interrupt, but the driver has not received it. That is all the more strange given that the disk on the first SATA port is working fine. You may try adding this line to your /boot/loader.conf:

hint.ahci.0.msi=0

or just set it from the loader prompt.

--
Alexander Motin
Re: missing disk device under 9-STABLE
On 03.03.2012 22:21, Jeff Blank wrote:
> On Sat, Mar 03, 2012 at 09:51:53PM +0200, Alexander Motin wrote:
>> This looks like cause of the missing disk:
>>
>> ahcich1: Timeout on slot 0 port 0
>> ahcich1: is 0002 cs ss rs 0001 tfd 50 serr cmd 6017
>> ahcich1: Timeout on slot 0 port 0
>> ahcich1: is 0002 cs ss rs 0001 tfd 50 serr cmd 6017
>>
>> It tells that controller signals interrupt, but driver haven't got it. That is even more strange after the disk on first SATA port is working fine. You may try to add to your /boot/loader.conf line: hint.ahci.0.msi=0 , or just set it via loader prompt.
>
> Alexander,
>
> Thanks, that seemed to clear the problem up, no troubles through half a dozen or more reboots. Is disabling MSI likely to have any side effects on, for example, performance or reliability? Is there any point to pursuing this as a FreeBSD problem, since I didn't have any issues under the old ATA system? I'm happy to help troubleshoot this if anyone thinks it's worth looking into.

MSI interrupts could give slightly better performance, but with regular HDDs I think it is unlikely that you would notice any difference. As for the old driver, it never used MSI by default, while the new one does.

What board and chipset do you use? Have you tried updating the BIOS? Please show the `pciconf -lvcb` output for the controller.

--
Alexander Motin
Re: Lost ata_xpt.c fix for -stable: svn commit: r217444
Hi.

On 02/15/12 12:02, Harald Schmalzbauer wrote:
> I just applied my local patches against RELENG_8_2 src tree and found that http://svn.freebsd.org/changeset/base/217444 was still missing, and if I read svnweb right (sorry, lack of svn knowledge here), it's also missing in -stable. Any plans to commit?

As far as I can see, it was merged to 8-STABLE a year ago as r218340: http://svnweb.freebsd.org/base?view=revision&revision=218340

--
Alexander Motin
Re: problems with AHCI on FreeBSD 8.2
On 02/14/12 11:19, Victor Balada Diaz wrote:
> We're having some troubles with AHCI under FreeBSD 8.2 and 8-STABLE. The error is:
>
> ahcich0: Timeout on slot 8
> ahcich0: is cs 0100 ss rs 0100 tfd c0 serr
> ahcich0: AHCI reset...
> ahcich0: SATA connect time=0ms status=0123
> ahcich0: ready wait time=18ms
> ahcich0: AHCI reset done: device found
> (ada0:ahcich0:0:0:0): Request requeued
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): Command timed out
> (ada0:ahcich0:0:0:0): Retrying command
> ahcich0: Timeout on slot 8
> ahcich0: is cs 007ff000 ss 007fff00 rs 007fff00 tfd c0 serr
> ahcich0: AHCI reset...
> ahcich0: SATA connect time=0ms status=0123
> ahcich0: ready wait time=84ms
> ahcich0: AHCI reset done: device found
> (ada0:ahcich0:0:0:0): Request requeued
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): Command timed out
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): Request requeued
> [...]
>
> If we use old ATA driver we have no problems. If we just use the first disk (ada0) with ahci, no problems either. If we use both disks (ada0 and ada1) in gmirror setup with ahci, we got the above error. If we use both disks in gmirror with old ata driver, no problems.

In both cases the controller reports the command status as 0xc0, which means the device is busy with the command. For NCQ commands it means that the device is at the stage of processing the command itself, not head positioning or data transfer.

Enabling AHCI enables NCQ for the devices. That increases the load on both the devices and the controller, and it is difficult to say whose fault it is here. SAMSUNG HD154UI disks, AFAIR, have 4k sectors, which may cause big performance penalties when accessing small or misaligned data. I am not sure how big that penalty can be in the worst case, especially since disks cache writes by default, hiding the real load level.

The relation to gmirror is harder to explain. Depending on how you created it and the partitions, it could cause more misaligned I/Os during rebuild. Using gmirror also doubles the concurrent load on the controller, but at this point I have nothing to blame it for.

--
Alexander Motin
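The status values in these logs can be read bit by bit. The following is a minimal sketch, assuming that the low byte of the `tfd` field printed by the driver is the ATA status register and using the standard ATA status bit names; the function name `decode_ata_status` is made up for illustration and is not part of any FreeBSD tool.

```python
# Standard ATA status register bits. Bit 4 is DSC on older devices and
# SERV on devices that implement queued commands, so both names are shown.
ATA_STATUS_BITS = {
    0x80: "BSY",
    0x40: "DRDY",
    0x20: "DF",
    0x10: "DSC/SERV",
    0x08: "DRQ",
    0x04: "CORR",
    0x02: "IDX",
    0x01: "ERR",
}


def decode_ata_status(status):
    """Return the names of the bits set in an ATA status byte, high bit first."""
    return [
        name
        for bit, name in sorted(ATA_STATUS_BITS.items(), reverse=True)
        if status & bit
    ]


if __name__ == "__main__":
    # 0xc0 is the value from the timeout messages quoted above.
    print(hex(0xC0), decode_ata_status(0xC0))
```

With this mapping, 0xc0 has BSY and DRDY set, which matches the "device is busy with the command" reading given in the reply.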
Re: Disable DMA.
On 02/11/12 20:15, Peter Ankerstål wrote:
> In FreeBSD 8 i used the loader-variable hw.ata.ata_dma=0 to get my computer boot on a CF card. But in FreeBSD 9.0 it doesn't seem to work. Could it be another variable or is it something else that doesn't work in 9? The machine boots up the installer when the CF-card is not present but when it is present it stops right after the Timecounter stuff.

On 9.0 you can do it with hint.ata.X.mode=PIO4, where X is a bus number. In recent 8/9-STABLE I've also resurrected hw.ata.ata_dma=0.

--
Alexander Motin
Re: siisch1: Error while READ LOG EXT
>> : siisch1: Error while READ LOG EXT
>
> This indicates the underlying device was handed a READ LOG EXT ATA command (command 0x2f) and the device did not respond promptly (resulting in the timeout messages you see).

There are hours between the timeouts and the READ LOG EXT errors. They are not directly related, but may have the same cause.

>> smartctl doesnt show any issues on the drives other than one that has some historical errors from a while ago. What are these errors and do I need to worry about them ? The READ LOG EXT ones are new.
>
> {snipping SMART stats}
>
> You're focused heavily on the READ LOG EXT command. READ LOG EXT is intended for accessing the GP Log section of a drive. EXT stands for Extended. GP Log means General Purpose Log, and is where all sorts of logging information regarding drive performance is stored. It's usually stored within a reserved section of the platters, or in the HPA area. It's not within a standard user-accessible LBA/sector region. This is a completely separate log from that of SMART logs.

READ LOG EXT commands are used here to fetch the status of some failed NCQ commands. It is the normal (and the only) way to get detailed error status in that case. An error from the READ LOG EXT command itself may mean that this is not a regular media error, but a problem with communication, firmware or something else.

> You can review the different types of logs on a device by reviewing the ATA8-ACS specification here. See Annex A, section A.1, page 362:
> http://www.t13.org/documents/UploadedDocuments/docs2007/D1699r4a-ATA8-ACS.pdf
>
> This is almost certainly a lower level problem with the disk that cannot be addressed/solved via normal means. Thus, my recommendation is to replace the disk.
>
> If you would rather not replace the disk, I can try to step you through looking at the GPLog sections of the disk to see if you can trigger the problem -- and I have a feeling you'll be able to, but I won't necessarily be able to tell you where the actual problem lies hardware-wise, nor will I be able to solve the problem.
>
> Regarding the repeated errors at semi-regular (but not entirely) intervals: are you using smartd? Do you have a cronjob that issues smartctl -a or smartctl -x commands at intervals? I imagine any of these could be tickling something lower level.
>
> Also, please upgrade your smartmontools to 5.42. It does provide some further enhancements that are useful.

--
Alexander Motin
Re: siisch1: Error while READ LOG EXT
On 09.02.2012 00:38, Jeremy Chadwick wrote: On Thu, Feb 09, 2012 at 12:22:40AM +0200, Alexander Motin wrote: On 08.02.2012 23:27, Jeremy Chadwick wrote: On Wed, Feb 08, 2012 at 04:00:57PM -0500, Mike Tancsa wrote: I have a 4 port eSata PCIe card with 3 external port multipliers attached on an AMD64 box (8G of RAM), RELENG8 from Feb1st. siis0@pci0:5:0:0: class=0x010400 card=0x71241095 chip=0x31241095 rev=0x02 hdr=0x00 vendor = 'Silicon Image Inc (Was: CMD Technology Inc)' device = 'PCI-X to Serial ATA Controller (SiI 3124)' class = mass storage subclass = RAID bar [10] = type Memory, range 64, base 0xb4408000, size 128, enabled bar [18] = type Memory, range 64, base 0xb440, size 32768, enabled bar [20] = type I/O Port, range 32, base 0x3000, size 16, enabled cap 01[64] = powerspec 2 supports D0 D1 D2 D3 current D0 cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 12 split transactions cap 05[54] = MSI supports 1 message, 64 bit enabled with 1 message siis0:SiI3124 SATA controller port 0x3000-0x300f mem 0xb4408000-0xb440807f,0xb440-0xb4407fff irq 19 at device 0.0 on pci5 siis0: [ITHREAD] siisch0:SIIS channel at channel 0 on siis0 siisch0: [ITHREAD] siisch1:SIIS channel at channel 1 on siis0 siisch1: [ITHREAD] siisch2:SIIS channel at channel 2 on siis0 siisch2: [ITHREAD] siisch3:SIIS channel at channel 3 on siis0 siisch3: [ITHREAD] # camcontrol devlist WDC WD2001FASS-00U0B0 01.00101 at scbus0 target 0 lun 0 (pass0,ada0) WDC WD2001FASS-00U0B0 01.00101 at scbus0 target 1 lun 0 (pass1,ada1) WDC WD2001FASS-00U0B0 01.00101 at scbus0 target 2 lun 0 (pass2,ada2) WDC WD2001FASS-00U0B0 01.00101 at scbus0 target 3 lun 0 (pass3,ada3) Port Multiplier 47261095 1f06 at scbus0 target 15 lun 0 (pass4,pmp1) WDC WD2002FAEX-007BA0 05.01D05 at scbus1 target 0 lun 0 (pass5,ada4) WDC WD2002FAEX-007BA0 05.01D05 at scbus1 target 1 lun 0 (pass6,ada5) WDC WD2002FAEX-007BA0 05.01D05 at scbus1 target 2 lun 0 (pass7,ada6) WDC WD2002FAEX-007BA0 05.01D05 at scbus1 target 3 lun 0 
(pass8,ada7) WDC WD2002FAEX-007BA0 05.01D05 at scbus1 target 4 lun 0 (pass9,ada8) Port Multiplier 37261095 1706 at scbus1 target 15 lun 0 (pass10,pmp0) Areca usrvar R001 at scbus4 target 0 lun 0 (pass11,da0) Areca backup1 R001 at scbus4 target 0 lun 1 (pass12,da1) Areca RAID controller R001 at scbus4 target 16 lun 0 (pass13) AMCC 9650SE-2LP DISK 4.10 at scbus5 target 0 lun 0 (pass14,da2) ST31000333AS SD35 at scbus6 target 0 lun 0 (pass15,ada9) ST31000528AS CC35 at scbus7 target 0 lun 0 (pass16,ada10) ST31000340AS SD1A at scbus8 target 0 lun 0 (pass17,ada11) WDC WD1002FAEX-00Z3A0 05.01D05 at scbus11 target 0 lun 0 (pass18,ada12) Ever since I added a new PM, I have been seeing a new error (READ LOG EXT) along with a the odd slot timeout error. Feb 7 23:49:32 backup3 kernel: siisch1: ... waiting for slots 4700 Feb 7 23:49:32 backup3 kernel: siisch1: Timeout on slot 26 Feb 7 23:49:32 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es sts 801d2000 serr 0068 Feb 7 23:49:32 backup3 kernel: siisch1: ... waiting for slots 4300 Feb 7 23:49:34 backup3 kernel: siisch1: Timeout on slot 30 Feb 7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es sts 801d2000 serr 0068 Feb 7 23:49:34 backup3 kernel: siisch1: ... waiting for slots 0300 Feb 7 23:49:34 backup3 kernel: siisch1: Timeout on slot 25 Feb 7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es sts 801d2000 serr 0068 Feb 7 23:49:34 backup3 kernel: siisch1: ... waiting for slots 0100 Feb 7 23:49:34 backup3 kernel: siisch1: Timeout on slot 24 Feb 7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es sts 801d2000 serr 0068 This indicates the controller on channel 1 (siisch1) is stalled waiting for underlying communication with the device attached to it. 
Feb 7 23:57:59 backup3 kernel: siisch1: Error while READ LOG EXT Feb 8 00:13:36 backup3 kernel: siisch1: Error while READ LOG EXT Feb 8 00:21:53 backup3 kernel: siisch1: Error while READ LOG EXT Feb 8 00:22:16 backup3 kernel: siisch1: Error while READ LOG EXT Feb 8 00:39:13 backup3 kernel: siisch1: Error while READ LOG EXT Feb 8 01:24:25 backup3 kernel: siisch1: Error while READ LOG EXT Feb 8 01:33:52 backup3 last message repeated 2 times Feb 8 01:43:45 backup3 kernel: siisch1: Error while READ LOG EXT Feb 8 01:50:31 backup3 last message repeated 2 times Feb 8 01:55:20 backup3 kernel: siisch1: Error while READ LOG EXT Feb 8 02:26:26 backup3 kernel: siisch1: Error while READ LOG EXT Feb 8 02:27:24 backup3 kernel: siisch1: Error while READ LOG EXT Feb 8 03:16:28 backup3 kernel
Re: Kernel panics under 8.2 due to ATA timeouts
Hi.

On 01/30/12 22:46, Andrew Boyer wrote:
> I have a system that appears to have a flaky SATA controller (one of the Intel ESB2 variants) and it seems to be exposing a weakness in the ATA driver (not using ATA_CAM).
>
> If a command with ATA_R_DIRECT set times out, the channel gets reinitialized, but from the soft interrupt context. It panics when it tries to sleep in ata_queue_request(). Timeouts work if ATA_R_DIRECT isn't set because in that case it uses a taskqueue to complete the request.
>
> Here is the backtrace:
>
> #0 kdb_enter (why=0x80962cfa panic, msg=0xaAddress 0xa out of bounds) at ../../../kern/subr_kdb.c:349
> #1 0x805d6d0b in panic (fmt=Variable fmt is not available. ) at ../../../kern/kern_shutdown.c:689
> #2 0x8061bc53 in sleepq_add (wchan=0xff00052c3e58, lock=0xff00052c3e38, wmesg=0x808fa213 ATA request done, flags=1, queue=0) at ../../../kern/subr_sleepqueue.c:320
> #3 0x80590c95 in _cv_timedwait (cvp=0xff00052c3e58, lock=0xff00052c3e38, timo=4) at ../../../kern/kern_condvar.c:313
> #4 0x805d61af in _sema_timedwait (sema=0xff00052c3e38, timo=4, file=0x808fa1f6 ../../../dev/ata/ata-queue.c, line=118) at ../../../kern/kern_sema.c:123
> #5 0x8028559f in ata_queue_request (request=0xff00052c3dc0) at ../../../dev/ata/ata-queue.c:117
> #6 0x80286628 in ata_controlcmd (dev=0xff0002e83d00, command=239 '?', feature=Variable feature is not available. ) at ../../../dev/ata/ata-queue.c:153
> #7 0x8027ffd3 in ata_setmode (dev=0xff0002e83d00) at ../../../dev/ata/ata-all.c:637
> #8 0x802a0af9 in ad_init (dev=0xff0002e83d00) at ../../../dev/ata/ata-disk.c:405
> #9 0x802a0c29 in ad_reinit (dev=0xff0002e83d00) at ../../../dev/ata/ata-disk.c:221
> #10 0x80280cad in ata_reinit (dev=0xff0002902800) at ata_if.h:79
> #11 0x802856c4 in ata_completed (context=Variable context is not available. ) at ../../../dev/ata/ata-queue.c:313
> #12 0x80285ffb in ata_finish (request=0xff00054ec8c0) at ../../../dev/ata/ata-queue.c:265
> #13 0x805ed419 in softclock (arg=Variable arg is not available. ) at ../../../kern/kern_timeout.c:430
>
> This is very repeatable. I'm not sure what's the best fix - always use a taskqueue on timeouts? Don't reinit if direct commands fail?

This is one of the messiest parts of the old ata(4). The problem is that reinit is implemented to work synchronously. It means that if some command timed out and started a reinit, that reinit runs from the taskqueue, blocking it. As a result, we can't use the taskqueue for completion there and can't do a reinit when one of the reinit commands times out. That is handled using the ATA_STALL_QUEUE flag. I remember that I intentionally blocked new device detection on reinit to avoid problems with the taskqueue there.

As for ATA_R_DIRECT, sorry, I don't remember why it is used there or why it is needed at all. It was done before me. The only place where I see it set, apart from ataraid, is ata_getparam(), which should be called only on the initial bus probe.

--
Alexander Motin
Re: Timekeeping in stable/9
Hi.

On 01/21/12 11:18, Martin Sugioarto wrote:
> On Wed, 18 Jan 2012 07:50:49 +0100, Martin Sugioarto <mar...@sugioarto.com> wrote:
>> I can confirm this on VirtualBox. I've been running WinXP inside VirtualBox and measured network I/O during downloads. It showed me very high download rates (around 800kB/s) while it's physically possible to download 200kB/s through DSL here (Germany sucks with DSL, even in largest cities, btw!).
>>
>> I correlated this behavior with high disk I/O on the host. That means that the timer issues on the virtual host appear when I start a larger cp job on the host. I also immediately thought that this has something to do with timers.
>
> I just want to add some information on this. I tested a few things with VirtualBox yesterday. I switched off ntpd on the host and tested if there are differences, but the clock is working correctly on the host. I tested it a few times, it is stable, as I expect it to be.
>
> It seems to be rather a software problem with VirtualBox. I can see that when the host is under heavy load (CPU!) the guest does not get enough runtime to adjust the clock correctly. After a few minutes there has been a difference of 50 seconds between the host and guest clock. And furthermore, I don't quite understand how the real time clock works in VirtualBox but it seems to slide in the different directions causing weird results with progress bars on MS-Windows XP.
>
> I just want to explain why I thought that I/O influences this. I have got my hard disk encrypted, so it puts some load on the CPU, too. If you want to test VirtualBox behavior, you can simple dd from /dev/random and look at the weird results in VirtualBox.

I am not using VirtualBox right now, so I'll need to set it up to test this. Meanwhile, you could try experimenting with different timecounters and eventtimers. Maybe some change in 9.0 changed the default timecounter for you, causing the problem.

Timecounter wrap should be the main cause of time drift (if the timecounter hardware is emulated correctly at all). Different timecounters have different wrap periods, which can be calculated by dividing kern.timecounter.tc.X.mask by kern.timecounter.tc.X.frequency. In my case they are: 300s for HPET, 5s for ACPI-fast, 2s for TSC and 55ms for i8254. If the system does not get timer interrupts within half of that time, time will drift. Start by looking at what you are using and how good it is in your case.

--
Alexander Motin
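The wrap-period arithmetic described above can be sketched as follows. This is only an illustration: the helper name `wrap_period_seconds` is made up, the two (mask, frequency) pairs are sample values for a 32-bit HPET and the 16-bit i8254, and the real numbers should be read from `sysctl kern.timecounter.tc` on the machine in question. The sketch divides mask + 1 by the frequency, since a counter with mask N has N + 1 states; the difference from dividing the mask itself is negligible.

```python
def wrap_period_seconds(mask, frequency_hz):
    """Seconds until a counter with the given mask overflows at the given rate."""
    return (mask + 1) / frequency_hz


# Sample (mask, frequency) pairs, assumed for illustration only.
SAMPLES = {
    "HPET": (0xFFFFFFFF, 14318180),
    "i8254": (0xFFFF, 1193182),
}

if __name__ == "__main__":
    for name, (mask, freq) in SAMPLES.items():
        period = wrap_period_seconds(mask, freq)
        # Time drifts if no timer interrupt arrives within half the wrap period.
        print(f"{name}: wraps every {period:.3f} s, "
              f"needs an interrupt at least every {period / 2:.3f} s")
```

These sample values reproduce the roughly 300 s (HPET) and 55 ms (i8254) figures quoted in the message.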
Re: Timekeeping in stable/9
On 01/21/12 15:20, Martin Sugioarto wrote:
> On Sat, 21 Jan 2012 14:30:53 +0200, Alexander Motin <m...@freebsd.org> wrote:
>> I am not using VirtualBox right now, so I'll need to setup it to test this. Meanwhile you could try to experiment with switching to different timecounters and eventtimers. May be some change in 9.0 changed default timecounter for you, causing the problem.
>
> I think we have a misunderstanding here. The host (FreeBSD 9.0R) works fine. The time is being updated under heavy load without problems. I already said that this seems to be an application problem and this email(s) should be rather seen by the VBox maintainer.
>
> The problem is that VBox seems to stop working properly when you put heavy CPU load on the host. It even does not keep the clock up-to-date. I can desync the guest clock to -1 minute in a few seconds, just by running openssl speed -multi 20.

Ah. I'm sorry. I was sure we were debugging FreeBSD inside VirtualBox. If we are speaking about FreeBSD as the host, then neither the timecounter nor the eventtimer choice should affect the guest if the host is working fine. It is more a question for VirtualBox and maybe the host system scheduler.

--
Alexander Motin
Re: FreeBSD hangs on boot after kernel upgrade to 9.0-R
Hi.

On 01/21/12 21:34, mato wrote:
> I've used freebsd-update to upgrade from 8.2-R to 9.0-R and all looked nice until the first reboot. Now my FreeBSD always hangs midway through the boot process and the last message output is:
>
> uhub3:Intel EHCI root HUB...
>
> I've tried safe boot option but that does not help at all. When I disable USB support in BIOS the last message before hang is:
>
> ata1: reset tp1 mask=03 ostat0=00 ostat1=00
> (aprobe0:ata0:0:1:0): SIGNATURE: eb14
>
> Any idea what might be wrong and how to fix it please ?

The last line is the ATAPI device detection. What ATA controller do you have there? On one Core2Duo-class Supermicro system a similar hang was caused by an ITE PATA controller. In that case it was worked around by adding hint.ata.0.mode=PIO4 to /boot/loader.conf. You may also try just setting it from the loader prompt with the `set` command.

--
Alexander Motin
Re: Marvel 88SE9480
Hi. On 01/21/12 00:19, Mike Tancsa wrote: I tried this new controller http://www.addonics.com/products/ad2ms6gpx8.php which is based on the 88SE9480 chipset. Does anyone have it working ? I tried adding the PCI ID, but it does not attach unfortunately. {0x94801b4b, 0x00, Addonics, AHCI_Q_NOBSYRES}, ahci0:Addonics AHCI SATA controller mem 0x4814-0x4815,0x4810-0x4813 irq 16 at device 0.0 on pci1 device_attach: ahci0 attach returned 6 pciconf shows ahci0@pci0:1:0:0: class=0x010400 card=0x94801b4b chip=0x94801b4b rev=0x03 hdr=0x00 vendor = 'Marvell Technology Group Ltd.' class = mass storage subclass = RAID bar [10] = type Memory, range 64, base 0x4814, size 131072, enabled bar [18] = type Memory, range 64, base 0x4810, size 262144, enabled cap 01[40] = powerspec 3 supports D0 D1 D3 current D0 cap 05[50] = MSI supports 1 message, 64 bit cap 10[70] = PCI-Express 2 endpoint max data 128(4096) link x8(x8) ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected ecap 0002[140] = VC 1 max VC0 I haven't seen SAS controllers compatible with AHCI yet. -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Strange 'hangs' with RELENG_9
On 19.01.2012 18:51, Oliver Pinter wrote:
> CC: Alexander Motin
>
> On 1/19/12, László KÁROLYI <las...@karolyi.hu> wrote:
>> László KÁROLYI wrote:
>>> Ok, couldn't get it through... So here is it, uploaded: http://www.freeimagehosting.net/s836i
>>
>> Another screenshot here: http://www.freeimagehosting.net/xv26d

I am not sure how freezes that can be fixed with a key press could be related to panics around storage. I would try going two different ways:

- for the panics, if dumping is not possible, I would try to resolve the address of the instruction pointer from both messages with `addr2line -e /path/to/kernel address`.
- for the freezes, I would look at the eventtimers(4) subsystem: check which timer is used, try switching to a different one, try switching to periodic mode.

Since the cause of the siis timeouts in SATA2 mode is also unclear, I can't exclude that it may be somehow related as well.

--
Alexander Motin
Re: Strange 'hangs' with RELENG_9
On 01/19/12 21:05, Ian Lepore wrote: On Thu, 2012-01-19 at 19:14 +0200, Alexander Motin wrote: On 19.01.2012 18:51, Oliver Pinter wrote: CC: Alexander Motin On 1/19/12, László KÁROLYIlas...@karolyi.hu wrote: László KÁROLYI wrote: Ok, couldn't get it through... So here is it, uploaded: http://www.freeimagehosting.net/s836i Another screenshot here: http://www.freeimagehosting.net/xv26d I am not sure how freezes that could be fixed with key press could be related to panics around storage. I would try to go two different ways: - for panics, if dumping is not possible, I would try to resolve address of the instruction pointer from both messages with `addr2line -e /path/to/kernel address`. - for freezes I would try to look on eventtimers(4) subsystem: check what timer is used, try to switch to different one, try to switch into periodic mode. Since cause of siis timeouts in SATA2 mode is also unclear, I can't also exclude that it may be somehow related. The new eventtimers was also the first thing that came to my mind, but I couldn't quickly find the right way to boot with a different timer. I saw in the eventtimers(7) manpage that there's a sysctl to change the timer, but when I used it the system timing went completely wonky (ntpd reported it was off by many seconds, a few seconds after I changed it). When I just tried it again the system locked up and had to be power cycled. (I'm trying this on old hardware where my only choices are i8254 and RTC, and changing to RTC apparently doesn't work well.) So I didn't want to recommend it to someone else. :) That's strange. On all systems I have, I can safely set any event timer in any way. Though for better precision it is better to set them using loader tunable. For both eventtimers and timecounters, I think it'd be nice if a tunable or hint could let the user override the quality number. But maybe there's already some better way of influencing the choices the kernel makes? 
kern.eventtimer.timer is both a sysctl and a loader tunable, so you can set it in either place. Also, for most event timers there are documented tunables to disable them. And, as I've already said, you may try switching to the old periodic mode by setting kern.eventtimer.periodic. On your old system with just i8254 and RTC it is always enabled automatically.

-- Alexander Motin
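As a sketch of the loader-tunable route described in this message: the two variable names are the ones named above, while the value `LAPIC` is only an assumption for illustration and should be replaced with a name taken from your own `kern.eventtimer.choice` list.

```
# /boot/loader.conf (sketch, not a recommendation)
# Timer name is an assumption; use one listed in kern.eventtimer.choice.
kern.eventtimer.timer="LAPIC"
# Fall back to the old periodic mode instead of one-shot.
kern.eventtimer.periodic="1"
```

The same names can also be changed at run time with sysctl(8), but setting them from the loader avoids switching timers on a live system.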
Re: Strange 'hangs' with RELENG_9
On 01/19/12 22:03, Andriy Gapon wrote:
> on 19/01/2012 21:24 László Károlyi said the following:
>> On 2012.01.19., at 18:18, Andriy Gapon wrote:
>>> Please provide output of the following sysctls:
>>> sysctl kern.eventtimer
>>> sysctl kern.timecounter
>>
>> [root@sys ~]# sysctl kern.eventtimer
>> kern.eventtimer.choice: HPET(450) HPET1(450) HPET2(450) LAPIC(400) i8254(100) RTC(0)
>> kern.eventtimer.et.LAPIC.flags: 15
>> kern.eventtimer.et.LAPIC.frequency: 0
>> kern.eventtimer.et.LAPIC.quality: 400
>> kern.eventtimer.et.i8254.flags: 1
>> kern.eventtimer.et.i8254.frequency: 1193182
>> kern.eventtimer.et.i8254.quality: 100
>> kern.eventtimer.et.HPET.flags: 3
>> kern.eventtimer.et.HPET.frequency: 14318180
>> kern.eventtimer.et.HPET.quality: 450
>> kern.eventtimer.et.HPET1.flags: 3
>> kern.eventtimer.et.HPET1.frequency: 14318180
>> kern.eventtimer.et.HPET1.quality: 450
>> kern.eventtimer.et.HPET2.flags: 3
>> kern.eventtimer.et.HPET2.frequency: 14318180
>> kern.eventtimer.et.HPET2.quality: 450
>> kern.eventtimer.et.RTC.flags: 17
>> kern.eventtimer.et.RTC.frequency: 32768
>> kern.eventtimer.et.RTC.quality: 0
>> kern.eventtimer.periodic: 0
>> kern.eventtimer.timer: HPET
>> kern.eventtimer.idletick: 0
>> kern.eventtimer.singlemul: 2
>> [root@sys ~]# sysctl kern.timecounter
>> kern.timecounter.tick: 1
>> kern.timecounter.choice: TSC-low(800) HPET(950) i8254(0) ACPI-fast(900) dummy(-100)
>> kern.timecounter.hardware: HPET
>> kern.timecounter.stepwarnings: 0
>> kern.timecounter.tc.ACPI-fast.mask: 4294967295
>> kern.timecounter.tc.ACPI-fast.counter: 3649705857
>> kern.timecounter.tc.ACPI-fast.frequency: 3579545
>> kern.timecounter.tc.ACPI-fast.quality: 900
>> kern.timecounter.tc.i8254.mask: 65535
>> kern.timecounter.tc.i8254.counter: 27536
>> kern.timecounter.tc.i8254.frequency: 1193182
>> kern.timecounter.tc.i8254.quality: 0
>> kern.timecounter.tc.HPET.mask: 4294967295
>> kern.timecounter.tc.HPET.counter: 1224089625
>> kern.timecounter.tc.HPET.frequency: 14318180
>> kern.timecounter.tc.HPET.quality: 950
>> kern.timecounter.tc.TSC-low.mask: 4294967295
>> kern.timecounter.tc.TSC-low.counter: 1655163352
>> kern.timecounter.tc.TSC-low.frequency: 11772185
>> kern.timecounter.tc.TSC-low.quality: 800
>> kern.timecounter.smp_tsc: 1
>> kern.timecounter.invariant_tsc: 1
>
> I wonder whether there could be an interference between HPET being used as timecounter and HPET being used as an event timer. Alexander, what do you think?

I don't expect interference between them. The HPET timecounter just reads the same hardware counter that is also read by the comparators to generate eventtimer interrupts. Theoretically they could interfere if that counter were stopped while the comparators are being programmed, but it is not.

László, can you please try changing kern.timecounter.hardware to TSC-low or ACPI-fast?

-- Alexander Motin
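A minimal sketch of how the alternative counter could be picked, assuming a POSIX shell. The helper name `pick_alternative` is hypothetical; it only parses a `kern.timecounter.choice` string like the one quoted in this thread and prints the best-rated counter other than the one currently in use.

```sh
#!/bin/sh
# pick_alternative CURRENT CHOICES
# CHOICES uses the "name(quality) name(quality) ..." format printed by
# `sysctl -n kern.timecounter.choice`. Prints the highest-quality name
# that is not CURRENT.
pick_alternative() {
    current=$1
    printf '%s\n' "$2" | tr ' ' '\n' |
        sed -n 's/^\(.*\)(\(-\{0,1\}[0-9][0-9]*\))$/\2 \1/p' |
        grep -v " ${current}\$" |
        sort -rn | head -n 1 | cut -d' ' -f2
}

# Values copied from the sysctl output quoted in this thread.
# Prints: ACPI-fast
pick_alternative HPET "TSC-low(800) HPET(950) i8254(0) ACPI-fast(900) dummy(-100)"
```

On the affected machine the result would then be applied as root with `sysctl kern.timecounter.hardware=ACPI-fast`; the setting does not survive a reboot unless it is also added to /etc/sysctl.conf.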
Re: FreeBSD 9.0 and Intel MatrixRAID RAID5
On 17.01.2012 12:53, Alexander Pyhalov wrote:
> On my desktop I use Intel MatrixRAID RAID5 soft raid controller. RAID5 is configured over 3 disks. FreeBSD 8.2 sees this as:
>
> ar0: 953874MB Intel MatrixRAID RAID5 (stripe 64 KB) status: READY
> ar0: disk0 READY using ad4 at ata2-master
> ar0: disk1 READY using ad6 at ata3-master
> ar0: disk2 READY using ad12 at ata6-master
>
> Root filesystem is on /dev/ar0s1. Today I've tried to upgrade to 9.0. It doesn't see this disk array. Here is dmesg. When I load geom_raid, it finds something, but doesn't want to work with RAID:
>
> GEOM_RAID: Intel-e922b201: Array Intel-e922b201 created.
> GEOM_RAID: Intel-e922b201: No transformation module found for Volume0.
> GEOM_RAID: Intel-e922b201: Volume Volume0 state changed from STARTING to UNSUPPORTED.
> GEOM_RAID: Intel-e922b201: Disk ada2 state changed from NONE to ACTIVE.
> GEOM_RAID: Intel-e922b201: Subdisk Volume0:2-ada2 state changed from NONE to ACTIVE.
> GEOM_RAID: Intel-e922b201: Disk ada1 state changed from NONE to ACTIVE.
> GEOM_RAID: Intel-e922b201: Subdisk Volume0:1-ada1 state changed from NONE to ACTIVE.
> GEOM_RAID: Intel-e922b201: Disk ada0 state changed from NONE to ACTIVE.
> GEOM_RAID: Intel-e922b201: Subdisk Volume0:0-ada0 state changed from NONE to ACTIVE.
> GEOM_RAID: Intel-e922b201: Array started.
>
> No new devices appear in /dev. How could I solve this issue?

ataraid(4) had mostly read-only support for RAID5, because it doesn't update parity data. I didn't think anybody was really using it in such a condition. That's why geom_raid doesn't support RAID5 at all for now.

-- Alexander Motin
Re: FreeBSD 9.0 and Intel MatrixRAID RAID5
On 17.01.2012 19:03, Vinny Abello wrote:
> I had something similar on a software based RAID controller on my Intel S5000PSL motherboard when I just went from 8.2-RELEASE to 9.0-RELEASE. After adding geom_raid_load=YES to my /boot/loader.conf, it still didn't create the device on bootup. I had to manually create the label with graid. After that it created /dev/raid/ar0 for me and I could mount the volume. Only thing which I'm trying to understand is the last message below about the integrity check failed. I've found other posts on this but when I dig into my setup, I don't see the same problems that are illustrated in the post and am at a loss for why that is being stated. Also, on other posts I think it was (raid/r0, MBR) that people were getting and trying to fix. Mine is (raid/r0, BSD) which I cannot find reference to. I have a feeling it has to do with the geometry of the disk or something. Everything else seems fine... I admittedly only use this volume for scratch space and didn't have anything important stored on it so I wasn't worried about experimenting or losing data.
>
> ada0 at ahcich0 bus 0 scbus2 target 0 lun 0
> ada0: WDC WD4000YR-01PLB0 01.06A01 ATA-7 SATA 1.x device
> ada0: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
> ada0: Command Queueing enabled
> ada0: 381554MB (781422768 512 byte sectors: 16H 63S/T 16383C)
> ada0: Previously was known as ad4
> ada1 at ahcich1 bus 0 scbus3 target 0 lun 0
> ada1: WDC WD4000YR-01PLB0 01.06A01 ATA-7 SATA 1.x device
> ada1: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
> ada1: Command Queueing enabled
> ada1: 381554MB (781422768 512 byte sectors: 16H 63S/T 16383C)
> ada1: Previously was known as ad6
> GEOM_RAID: Intel-8c840681: Array Intel-8c840681 created.
> GEOM_RAID: Intel-8c840681: Disk ada0s1 state changed from NONE to ACTIVE.
> GEOM_RAID: Intel-8c840681: Subdisk ar0:0-ada0s1 state changed from NONE to ACTIVE.
> GEOM_RAID: Intel-8c840681: Disk ada1s1 state changed from NONE to ACTIVE.
> GEOM_RAID: Intel-8c840681: Subdisk ar0:1-ada1s1 state changed from NONE to ACTIVE.
> GEOM_RAID: Intel-8c840681: Array started.
> GEOM_RAID: Intel-8c840681: Volume ar0 state changed from STARTING to OPTIMAL.
> GEOM_RAID: Intel-8c840681: Provider raid/r0 for volume ar0 created.
> GEOM_PART: integrity check failed (raid/r0, BSD)
>
> Any ideas on the integrity check anyone?

It is not related to geom_raid, but to geom_part. There is something wrong with your label. You may set the kern.geom.part.check_integrity sysctl to zero to disable these checks. AFAIR it was mentioned in the 9.0 release notes.

> On 1/17/2012 6:57 AM, Matthias Gamsjager wrote:
>> Not sure if geom_raid is implemented with cam. I remember a post a while back about this issue to happen with defaulting cam in 9. Did not follow it so not sure if something has been done about it.
>>
>> On Tue, Jan 17, 2012 at 11:53 AM, Alexander Pyhalov a...@rsu.ru wrote:
>>> Hello. On my desktop I use Intel MatrixRAID RAID5 soft raid controller. RAID5 is configured over 3 disks. FreeBSD 8.2 sees this as:
>>>
>>> ar0: 953874MB Intel MatrixRAID RAID5 (stripe 64 KB) status: READY
>>> ar0: disk0 READY using ad4 at ata2-master
>>> ar0: disk1 READY using ad6 at ata3-master
>>> ar0: disk2 READY using ad12 at ata6-master
>>>
>>> Root filesystem is on /dev/ar0s1. Today I've tried to upgrade to 9.0. It doesn't see this disk array. Here is dmesg. When I load geom_raid, it finds something, but doesn't want to work with RAID:
>>>
>>> GEOM_RAID: Intel-e922b201: Array Intel-e922b201 created.
>>> GEOM_RAID: Intel-e922b201: No transformation module found for Volume0.
>>> GEOM_RAID: Intel-e922b201: Volume Volume0 state changed from STARTING to UNSUPPORTED.
>>> GEOM_RAID: Intel-e922b201: Disk ada2 state changed from NONE to ACTIVE.
>>> GEOM_RAID: Intel-e922b201: Subdisk Volume0:2-ada2 state changed from NONE to ACTIVE.
>>> GEOM_RAID: Intel-e922b201: Disk ada1 state changed from NONE to ACTIVE.
>>> GEOM_RAID: Intel-e922b201: Subdisk Volume0:1-ada1 state changed from NONE to ACTIVE.
>>> GEOM_RAID: Intel-e922b201: Disk ada0 state changed from NONE to ACTIVE.
>>> GEOM_RAID: Intel-e922b201: Subdisk Volume0:0-ada0 state changed from NONE to ACTIVE.
>>> GEOM_RAID: Intel-e922b201: Array started.
>>>
>>> No new devices appear in /dev. How could I solve this issue?

-- Alexander Motin
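For reference, the workaround named in this reply could be applied as below. This is only a sketch: it silences the check rather than repairing the label, and putting it in loader.conf is an assumption made so that the setting is already in effect when the volume is first tasted at boot.

```
# /boot/loader.conf (sketch)
# Disable geom_part metadata integrity checks; the default is 1.
kern.geom.part.check_integrity="0"
```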
Re: FreeBSD 9.0 and Intel MatrixRAID RAID5
On 17.01.2012 23:35, Vinny Abello wrote: On 1/17/2012 4:04 PM, Alexander Motin wrote: On 17.01.2012 19:03, Vinny Abello wrote: I had something similar on a software based RAID controller on my Intel S5000PSL motherboard when I just went from 8.2-RELEASE to 9.0-RELEASE. After adding geom_raid_load=YES to my /boot/loader.conf, it still didn't create the device on bootup. I had to manually create the label with graid. After that it created /dev/raid/ar0 for me and I could mount the volume. Only thing which I've trying to understand is the last message below about the integrity check failed. I've found other posts on this but when I dig into my setup, I don't see the same problems that are illustrated in the post and am at a loss for why that is being stated. Also, on other posts I think it was (raid/r0, MBR) that people were getting and trying to fix. Mine is (raid/r0, BSD) which I cannot find reference to. I have a feeling it has to do with the geometry of the disk or something. Everything else seems fine... I admittedly only use this volume for scratch space and didn't have anything important st or ed on it so I wasn't worried about experimenting or losing data. ada0 at ahcich0 bus 0 scbus2 target 0 lun 0 ada0:WDC WD4000YR-01PLB0 01.06A01 ATA-7 SATA 1.x device ada0: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes) ada0: Command Queueing enabled ada0: 381554MB (781422768 512 byte sectors: 16H 63S/T 16383C) ada0: Previously was known as ad4 ada1 at ahcich1 bus 0 scbus3 target 0 lun 0 ada1:WDC WD4000YR-01PLB0 01.06A01 ATA-7 SATA 1.x device ada1: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes) ada1: Command Queueing enabled ada1: 381554MB (781422768 512 byte sectors: 16H 63S/T 16383C) ada1: Previously was known as ad6 GEOM_RAID: Intel-8c840681: Array Intel-8c840681 created. GEOM_RAID: Intel-8c840681: Disk ada0s1 state changed from NONE to ACTIVE. GEOM_RAID: Intel-8c840681: Subdisk ar0:0-ada0s1 state changed from NONE to ACTIVE. 
>>> GEOM_RAID: Intel-8c840681: Disk ada1s1 state changed from NONE to ACTIVE.
>>> GEOM_RAID: Intel-8c840681: Subdisk ar0:1-ada1s1 state changed from NONE to ACTIVE.
>>> GEOM_RAID: Intel-8c840681: Array started.
>>> GEOM_RAID: Intel-8c840681: Volume ar0 state changed from STARTING to OPTIMAL.
>>> GEOM_RAID: Intel-8c840681: Provider raid/r0 for volume ar0 created.
>>> GEOM_PART: integrity check failed (raid/r0, BSD)
>>>
>>> Any ideas on the integrity check anyone?
>>
>> It is not related to geom_raid, but to geom_part. There is something wrong with your label. You may set kern.geom.part.check_integrity sysctl to zero to disable these checks. AFAIR it was mentioned in 9.0 release notes.
>
> Thanks for responding, Alexander. I also found that information about that sysctl variable, however I was trying to determine if something is actually wrong, how to determine what it is and ultimately how to fix it so it passes the check. I'd rather not ignore errors/warnings unless it's a bug. Again, I have no data of value on this partition, so I can do anything to fix it. Just not sure what to do or look at specifically.

The first thing I would check is that the partition is not bigger than the RAID volume. If the label was created before the RAID volume, that could be the reason, because RAID cuts several sectors off the end of the disk to store its metadata.

-- Alexander Motin
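The check suggested above comes down to one comparison in sectors. A minimal sketch, assuming the numbers have been read off by hand (for example from `bsdlabel raid/r0` for the partition and `diskinfo -v /dev/raid/r0` for the volume); the helper name `fits_in_volume` and the volume size used below are hypothetical.

```sh
#!/bin/sh
# fits_in_volume VOLUME_SECTORS PART_OFFSET PART_SIZE
# Succeeds when the partition ends at or before the end of the volume.
# All three arguments are sector counts.
fits_in_volume() {
    [ $(( $2 + $3 )) -le "$1" ]
}

# Hypothetical numbers: a label sized for a 781422768-sector provider,
# checked against a volume that lost a few sectors to RAID metadata.
if fits_in_volume 781420544 0 781422768; then
    echo "partition fits"
else
    echo "partition extends past the end of the volume"
fi
```

If the partition does extend past the end, recreating the label on top of the finished RAID volume is one way to make the sizes agree; since the volume holds no valuable data, that is a cheap experiment.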
Re: ataraid and 9.0 RC-2
Hi.

On 27.11.2011 01:41, Adam Stylinski wrote:
> I just ran freebsd-update to get up to 9.0-RC2 and discovered that ataraid does not work. I realize I'm an edge case and my scenario is not ideal (I use an ITE controller and performance is actually impressively slow), but I cannot boot 9.0 from my stripe, even after manually loading ataraid from the loader prompt (after running an unload command). I mention it mostly because other people using the fakeraid setup by their motherboards for whatever reason (perhaps to share a partition table with windows on the same mirror or stripe) may have a similar problem. It seems like the ar0 device disappeared for me completely (even though it finds ada0 and ada1). I'm using the following device:
>
> atapci0@pci0:2:11:0: class=0x010400 card=0x chip=0x82121283 rev=0x13 hdr=0x00
>     vendor = 'Integrated Technology Express (ITE) Inc'
>     device = 'ATA 133 IDE RAID Controller (IT8212F)'
>     class = mass storage
>     subclass = RAID
> rl0@pci0:2:13:0: class=0x02 card=0x80ea104d chip=0x813910ec rev=0x10 hdr=0x00
>
> At first I figured because it may be loading AHCI (as per the device naming schemes ada0 and ada1). I haven't looked too much into it (these devices are actually PATA not SATA, so AHCI doesn't even exist for these), but maybe there's an ATA/AHCI driver that's built into the default kernel that is interfering with ataraid.ko? Maybe this interferes with my stupidly slow and unpopular configuration. Thanks for any help, I'll also have a gander at the new DEFAULTS for the generic kernel in the 9.0 source tree.

FreeBSD 9.x uses the new CAM-based ATA subsystem. The ataraid driver depends on the old ATA infrastructure and does not work with the new one. Instead, a new GEOM RAID class was implemented. Unfortunately, since ITE produced only PATA controllers, the geom_raid module has no support for their metadata format at present. So, at the moment, the only way to access that RAID volume is to build a custom kernel with the old ATA stack and use ataraid. The respective kernel options are listed in the /usr/src/UPDATING entry dated 20110424.

-- Alexander Motin
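The following sketch shows what such a custom kernel configuration might look like. The option and device list is an assumption reconstructed from the 20110424 entry and should be checked against /usr/src/UPDATING in your own tree; the ident `OLDATA` and the file name are hypothetical.

```
# /usr/src/sys/<arch>/conf/OLDATA (hypothetical file name)
include         GENERIC
ident           OLDATA

# Drop the CAM-based ATA stack and its native drivers.
nooptions       ATA_CAM
nodevice        ahci
nodevice        mvs
nodevice        siis

# Bring back the legacy ata(4) peripheral drivers.
device          atadisk         # ATA disk drives
device          ataraid         # ATA RAID drives
device          atapicd         # ATAPI CDROM drives
device          atapifd         # ATAPI floppy drives
device          atapist         # ATAPI tape drives
options         ATA_STATIC_ID   # Static device numbering
```

With a kernel built this way the disks are named adX again and the array should reappear as ar0, so /etc/fstab entries written for 8.x would apply unchanged.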
Re: ataraid and 9.0 RC-2
On 30.11.2011 03:03, Adam Stylinski wrote: On Tue, Nov 29, 2011 at 08:38:47PM +0200, Alexander Motin wrote: On 27.11.2011 01:41, Adam Stylinski wrote: I just ran freebsd-update to get up to 9.0-RC2 and discovered that ataraid does not work. I realize I'm an edge case and my scenario is not ideal (I use an ITE controller and performance is actually impressively slow), but I cannot boot 9.0 from my stripe, even after manually loading ataraid from the loader prompt (after running an unload command). I mention it mostly because other people using the fakeraid setup by their motherboards for whatever reason (perhaps to share a partition table with windows on the same mirror or stripe) may have a similar problem. It seems like the ar0 device disappeared for me completely (even though it finds ada0 and ada1). I'm using the following device: atapci0@pci0:2:11:0:class=0x010400 card=0x chip=0x82121283 rev=0x13 hdr=0x00 vendor = 'Integrated Technology Express (ITE) Inc' device = 'ATA 133 IDE RAID Controller (IT8212F)' class = mass storage subclass = RAID rl0@pci0:2:13:0:class=0x02 card=0x80ea104d chip=0x813910ec rev=0x10 hdr=0x00 At first I figured because it may be loading AHCI (as per the device naming schemes ada0 and ada1). I haven't looked too much into it (these devices are actually PATA not SATA, so AHCI doesn't even exist for these), but maybe there's an ATA/AHCI driver that's built into the default kernelthat is interfering with ataraid.ko? Maybe this interferes with my stupidly slow and unpopular configuration. Thanks for any help, I'll also have a gander at the new DEFAULTS for the generic kernel in the 9.0 source tree. FreeBSD 9.x uses new CAM-bases ATA subsystem. ataraid driver depends on old ATA infrastructure and does not work with new. Instead, new GEOM RAID class was implemented. Unluckily, as soon as ITE produced only PATA controllers, there is no support for their metadata format in geom_raid module now. 
>> So, at the moment, the only option to access that RAID volume is to build custom kernel with old ATA and use ataraid. Respective kernel options listed in /usr/src/UPDATING item from 20110424.
>
> Hmm, I may just as well dump the UFS and restore it to a totally geom based solution. If anything it will likely help rather than hurt my performance.

Sure. You can't boot from GEOM STRIPE (you may want MIRROR or CONCAT), but if your motherboard has at least one SATA port, a single modern hard drive may give you even higher speeds than a stripe of old PATA drives on a PCI controller.

-- Alexander Motin
Re: ATA/Cdrom(?) panic
Hi.

On 11/16/11 08:43, Bjoern A. Zeeb wrote:
> we have seen this or a very similar panic for about 1 year now once in a while and I think I reported it before; this is FreeBSD as guest on vmware. Seems it was a double panic this time. Could someone please see what's going on there? It was on 8.x-STABLE in the past and this is 8.2-RELEASE-p4.

The part of the code that reports request completion directly is IMHO broken by design. It reports a request as completed before the lower levels have actually completed it, without any knowledge of what is going on there. There is some protection against double request completion, but it looks like it does not always work. Maybe that is because this part of the code is not locked, so nothing prevents the semaphore timeout and the normal request timeout/completion from happening simultaneously. It is surprising to see two traps at the same time; I am not sure what synchronized them so precisely.

Simply removing that semaphore timeout is not an option, because it would cause a deadlock when the wait happens within the taskqueue thread that is used to handle request completion and to abort that wait. Avoiding waits inside the taskqueue is also impossible without a major rewrite. That's why ATA_CAM drops that code completely.

-- Alexander Motin
Re: ATA/Cdrom(?) panic
On 11/16/11 16:14, Bjoern A. Zeeb wrote: On Wed, 16 Nov 2011, Alexander Motin wrote: Hi. On 11/16/11 08:43, Bjoern A. Zeeb wrote: we have seen this or a very similar panic for about 1 year now once in a while and I think I reported it before; this is FreeBSD as guest on vmware. Seems it was a double panic this time. Could someone please see what's going on there?It was on 8.x-STABLE in the past and this is 8.2-RELEASE-p4. The part of code reporting completing request directly is IMHO broken by design. It returns request completion before request will actually be completed by lower levels without any knowledge of what's going on there. There is kind of protection against double request completion, but it looks like not always working. May be because that part of code is not locked and nothing prevents that semaphore timeout and normal request timeout/completion to happen simultaneously. It is surprising to see even two traps same time, not sure what synchronized them so precisely. Simple removing that semaphore timeout is not an option, because it will cause deadlock when this wait happen within taskqueue thread that is used to handle requests completion and abort that wait. Avoid waiting inside taskqueue is also impossible without major rewrite. That's why ATA_CAM drops that code completely. So the bottom line of what you are saying is: 1) it's hard to fix right in 8 2) it's not an issue in 9 anymore at all? Right. -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Trouble with SSD on SATA
Hi.

On 16.11.2011 18:12, Willem Jan Withagen wrote:
> I'm getting these:
>
> Nov 16 16:40:49 zfs kernel: ata6: port is not ready (timeout 15000ms) tfd = 0080
> Nov 16 16:40:49 zfs kernel: ata6: hardware reset timeout
> Nov 16 16:41:50 zfs kernel: ata6: port is not ready (timeout 15000ms) tfd = 0080
> Nov 16 16:41:50 zfs kernel: ata6: hardware reset timeout
>
> When inserting the tray with a SSD disk connected to that controller. Which is probably due to a BIOS upgrade. At least it started after upgrading the BIOS. So I'm asking SuperMicro for an older version.
>
> When this happens, the system sometimes panics, haven't written the details yet down right now. somewhere in get_devices... After the panic I really need to powerdown the machine, otherwise it boots but stalls at finding any disks. It does not just find no disks, it freezes at the point it should report the found disks in the bios-boot. So apparently the ata controller are left in a very confused state.
>
> Why is the controller found at boot, and works as it should. And why later it just starts generating these hardware resets??

Looking at the messages, I would say that you are using an AHCI controller with the old ata(4) driver. I would recommend trying the new ahci(4) driver. It has better hot-plug support and also supports NCQ and some other features. Note that disks connected to it will be reported as adaX instead of adY.

-- Alexander Motin
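If ahci(4) is not already compiled into the running kernel, it can be loaded from the loader. A minimal sketch, assuming a stock kernel where the module is available:

```
# /boot/loader.conf (sketch)
ahci_load="YES"
```

Because the disks then show up as adaX, any /etc/fstab entries that name adY devices directly would need to be updated (or replaced with labels) before rebooting.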
Re: SATA 6g 4-port non-RAID controller ?
On 25.07.2011 12:34, Kurt Jaeger wrote:
> What kind of SATA 6g 4-port non-RAID controller is currently suggested for use in 8/9 setups with large RAM (64G) setups with ZFS ?

If you need exactly a SATA and not a SAS 6g controller, the choice is not that big: either something integrated into the latest Intel (only two ports) or AMD (6 ports) chipsets, or something based on Marvell 88SE91xx chips. The latter also have only 2 ports each, but you may install two cards, or use the Highpoint RocketRAID 640, which is just two of those chips connected through a PCIe bridge.

-- Alexander Motin
Re: MFC: graid(8) (RAID GEOM) support
Jeremy Chadwick wrote: On Fri, Jun 17, 2011 at 05:51:24PM -0700, Jeremy Chadwick wrote: Sorry for the cross-post, but I thought both lists would want to know about this. Looks like mav@ just committed this ~17 hours ago: http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/geom/raid/g_raid.c Those who have historically wanted to use Intel MatrixRAID (now called Intel RST (Rapid Storage Technology)), but haven't due to the severe issues/risks with ataraid(4), will probably be very interested in this commit. I know I am! I plan on stress-testing the Intel support on a 2-disk system with RAID-1 enabled, and will document my experiences, procedures, etc... Thanks, mav@ and imp@ ! I'll be sending another mail momentarily asking about USB memory stick image building, since to accomplish the above, I want to do a bare-bones install on our test system (e.g. enable Intel RAID, set up 2 disks in a RAID-1 mirror, boot a USB memory stick that contains this latest RELENG_8 build, and do sysinstall, etc.. the normal way). = MFC r219974, r220209, r220210, r220790: Add new RAID GEOM class, that is going to replace ataraid(4) in supporting various BIOS-based software RAIDs. Unlike ataraid(4) this implementation does not depend on legacy ata(4) subsystem and can be used with any disk drivers, including new CAM-based ones (ahci(4), siis(4), mvs(4), ata(4) with `options ATA_CAM`). To make code more readable and extensible, this implementation follows modular design, including core part and two sets of modules, implementing support for different metadata formats and RAID levels. Support for such popular metadata formats is now implemented: Intel, JMicron, NVIDIA, Promise (also used by AMD/ATI) and SiliconImage. Such RAID levels are now supported: RAID0, RAID1, RAID1E, RAID10, SINGLE, CONCAT. 
For all of these RAID levels and metadata formats this class supports full cycle of volume operations: reading, writing, creation, deletion, disk removal and insertion, rebuilding, dirty shutdown detection and resynchronization, bad sector recovery, faulty disks tracking, hot-spare disks. For Intel and Promise formats there is support multiple volumes per disk set. Look graid(8) manual page for additional details. Co-authored by: imp Sponsored by: Cisco Systems, Inc. and iXsystems, Inc. = By the way, it doesn't look like the graid(8) man page is being brought in to the base system on either of the two RELENG_8 systems I've rebuilt in the past few days. I'm thinking /usr/src/sbin/geom/class/raid/graid.8 isn't being noticed as a man page. /usr/src/sbin/geom/class/raid/Makefile doesn't have MAN8=graid.8 in it, is that the problem? I've just rebuilt my test 8-STABLE system and it installed graid(8). -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: PCIe SATA HBA for ZFS on -STABLE
On 07.06.2011 05:33, Matthew Dillon wrote:
> The absolute cheapest solution is to buy a Sil-3132 PCIe card (providing 2 E-SATA ports), and then connect an external port multiplier to each port. External port multiplier enclosures typically support 5 drives each so that would give you your 10 drives. Even the 3132 is a piss-ant little card it does support FIS-Based switching so performance will be very good... it will just be limited to SATA-II speeds is all.

SiI3132 is indeed good for its price and it is quite good for random I/O. But at burst speeds it is limited to less than SATA-II, and even less than the PCIe 1.0 x1 link it uses. IIRC I've seen about 150MB/s from one port and about 170MB/s from two. If burst rate is important, the SiI3124 chip is much better -- up to about 900MB/s measured from 4 ports. The only issue is its PCI-X interface: you need either a motherboard with PCI-X, or a card with a PCIe x8 bridge (like these http://www.addonics.com/products/host_controller/adsa3gpx8-4e.asp), but the latter is too expensive. There are also much cheaper (~$50) SiI3124 cards with a PCIe x1 bridge (http://www.sybausa.com/productInfo.php?iid=537). They are not as fast -- about 200MB/s, but that is still more than SiI3132. And they still have 4 SATA ports.

> For SSDs you want to directly connect the SSD to a mobo SATA port and then either mount the SSD in the case or mount it in a hot-swap gadget that you can screw into a PCI slot (it doesn't actually use the PCI connector, just the slot). A SATA-III port with a SATA-III SSD really shines here and 400-500 MBytes/sec random read performance from a single SSD is possible, but it isn't an absolute requirement. A SATA-II port will still work fine as long as you don't mind maxing out the bandwidth at 250 MBytes/sec.

Agree. Intel on-board ports rock! Recently I've built a new system with two OCZ Vertex 3 SSDs connected to 6Gbps SATA ports on an Intel Sandy Bridge class motherboard. UFS on top of a graid RAID0 volume gives me about 950MB/s on both read and write!
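For context on the interface limits mentioned in this message, the nominal payload ceilings follow from the line rate and 8b/10b encoding (10 bits on the wire per data byte). A small sketch of that arithmetic; the helper name `payload_mb_s` is hypothetical, and the results are theoretical ceilings, not measurements.

```sh
#!/bin/sh
# payload_mb_s LINE_RATE_MBIT
# With 8b/10b encoding every data byte costs 10 bits on the wire,
# so payload in MB/s is the line rate in Mbit/s divided by 10.
payload_mb_s() {
    echo $(( $1 / 10 ))
}

echo "SATA-II  (3000 Mbit/s): $(payload_mb_s 3000) MB/s"
echo "SATA-III (6000 Mbit/s): $(payload_mb_s 6000) MB/s"
echo "PCIe 1.0 x1 (2500 MT/s): $(payload_mb_s 2500) MB/s"
```

Against these ceilings of 300, 600 and 250 MB/s, the roughly 150MB/s per port reported above for the SiI3132 is well below both the SATA-II and the PCIe 1.0 x1 limit.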
> To get robust hot-swap enclosures you either need to go with SAS or you need to go with discrete SATA ports (no port multiplication), and the ports have to support hot-swap. The best hot-swap support for an AHCI port is if the AHCI chipset supports cold-presence-detect (CPD), and again Mobo AHCI chipsets usually don't. Hot-swap is a bit hit or miss without CPD because power savings modes can effectively prevent hot-swap detect from working properly. Drive disconnects will always be detected but drive connects might not be.

I would say it depends. In some cases it is easier to detect hot-plug than hot-unplug, as the device sends COMINIT, which should wake up the port even from a power-save state. With ICH10, for example, I've managed to make both hot-plug and unplug work even with power management enabled: hot-plug by tracking COMINIT, unplug via its CPD capability. Without PM it just works. :)

-- Alexander Motin
Re: 8-STABLE won't boot with ZFSv28
Hi. Holger Kipp wrote: as yesterday was a bank holiday in Germany I wasn't in the office to try the patch linked in the email. Is it consent that I should try the patch located here: http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/chipsets/ata-intel.c.diff?r1=1.25;r2=1.26 and report the result? Or do you need some additional discussion on this topic? I really don't know much about ata-intel chipset programming interface things, that's why I'm asking :-) Yes, I want you to try it and report the result. -- Alexander Motin ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: 8-STABLE won't boot with ZFSv28
Hi.

Holger Kipp wrote:
> got the same messages over and over again - panic took some time:
>
> unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
> ata0: reinit done ..
> ata0: reiniting channel ..
> ata0: DISCONNECT requested
> short delay here
> ata0: p0: SATA connect time=0ms status=0113
> ata0: p1: SATA connect timeout status=
> ata0: reset tp1 mask=03 ostat0=00 ostat1=00
> ata0: stat0=0x00 err=0x01 lsb=0x14 msb=0xeb
> ata0: stat1=0x00 err=0x01 lsb=0x14 msb=0xeb
> ata0: reset tp2 stat0=00 stat1=00 devices=0x3
> unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
> ata0: reinit done ..
> ata0: reiniting channel ..
> ata0: DISCONNECT requested

I see two problems here:

1. devices=0x3 means that two ATAPI devices were detected instead of one. I can reproduce it with other Intel chipsets as well. It looks like a hardware bug to me. It can be worked around by reconnecting the ATAPI device to an even (2 or 4) SATA port, or by connecting any other device there.

2. DISCONNECT requested means that the controller reported a PHY status change for some device on the channel, triggering an infinite retry. Unfortunately I have no ICH9 board, and I can't reproduce it with ICH10 or above.

This patch should work around the first problem in software: http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/chipsets/ata-intel.c.diff?r1=1.25;r2=1.26 Please try it, and let's see if with some luck it does something about the second problem.

-- Alexander Motin
Re: 8-STABLE won't boot with ZFSv28
Jeremy Chadwick wrote:
> On Thu, Jun 02, 2011 at 09:53:58AM +0300, Alexander Motin wrote:
>> Holger Kipp wrote:
>>> got the same messages over and over again - panic took some time:
>>>
>>> unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
>>> ata0: reinit done ..
>>> ata0: reiniting channel ..
>>> ata0: DISCONNECT requested
>>> short delay here
>>> ata0: p0: SATA connect time=0ms status=0113
>>> ata0: p1: SATA connect timeout status=
>>> ata0: reset tp1 mask=03 ostat0=00 ostat1=00
>>> ata0: stat0=0x00 err=0x01 lsb=0x14 msb=0xeb
>>> ata0: stat1=0x00 err=0x01 lsb=0x14 msb=0xeb
>>> ata0: reset tp2 stat0=00 stat1=00 devices=0x3
>>> unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
>>> ata0: reinit done ..
>>> ata0: reiniting channel ..
>>> ata0: DISCONNECT requested
>>
>> I see two problems here:
>>
>> 1. devices=0x3 means that two ATAPI devices were detected instead of one. I can also reproduce it with other Intel chipsets. It looks like a hardware bug to me. It can be worked around by reconnecting the ATAPI device to an even (2 or 4) SATA port, or by connecting any other device there.
>>
>> 2. DISCONNECT requested means that the controller reported a PHY status change for some device on the channel, triggering an infinite retry. Unfortunately I have no ICH9 board, and I can't reproduce it with ICH10 or above.
>>
>> This patch should work around the first problem in software:
>> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/chipsets/ata-intel.c.diff?r1=1.25;r2=1.26
>>
>> Please try it, and let's see whether, with some luck, it does something about the second problem.
>
> With regards to item #1: I don't see anything in the ICH9 errata that indicates a silicon bug if the only device attached to the controller is an ATAPI device and connected to SATA port 0 (presumably), or an odd-numbered port? If this problem exists on other ICHxx and/or ESBxx chips, I sure would hope it'd be documented.
>
> I haven't tried confirming it myself, but if need be I can set up a test box with a SATA-based DVD drive hooked up to it + provide remote serial console/etc. if it'd be of any help. I don't think it would be (sounds like you have lots of hardware :-) ), but I'm willing to help in any way I can.

Intel probably doesn't see an issue there, as the same behavior can be found even on the latest chipsets. But according to my understanding of the ATA specs and my analysis of real PATA device behavior, this behavior is not correct. When an ATAPI device is connected to the first of two SATA ports routed to the same legacy/PATA-emulated ATA channel (that is, as the master device), the soft-reset sequence returns a false-positive slave ATAPI device presence. The problem doesn't show up with ATA disk devices, or if some other device is really attached to the slave port. It looks like the problem was always there, but before ATA_CAM it usually went unnoticed because of the very small IDENTIFY command timeouts in ata(4).

If somebody can give a better explanation or propose a better workaround, you are welcome, as I don't like this solution very much.

> With regards to item #2: could this be at all related to OOB (bit 15) somehow being set in PCS (SATA register offset 0x92)? I'm doubting it but I thought I'd ask.
>
> My thought process, which is probably wrong (consider it an educational discussion :-) ):
>
> The ICH9 specification states that the default value for this register is 0x, and b15=0 means SATA controller will not retry after an OOB failure, while b15=1 causes the controller to indefinitely retry after OOB failure. I imagine system BIOSes and other things can change this default value, but we don't seem to print it anywhere in ata_intel_chipinit() during a verbose boot.
>
> Looking at chipsets/ata-intel.c, it looks like we only touch PCS in ata_intel_chipinit() and ata_intel_reset(). In the former, we avoid touching bits 4 through 15, and in the latter we mask out only what we want to adjust (e.g. the SATA port per ch variable).

As far as I can see, ata-intel.c should not change that bit if it was set for some reason. Theoretically, OOB (Out-of-Band signaling) is a function of the same state machine that sets the PHY status change flag. But frankly speaking, I have no idea what the result of setting this bit would be. In this legacy/PATA emulation mode there are too many undocumented things to be sure of anything.

--
Alexander Motin
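The PCS handling discussed above can be sketched in C as follows. This is a minimal illustration, assuming only what the thread states: PCS is a 16-bit register at offset 0x92, bit 15 selects indefinite retry after an OOB failure, and the driver is supposed to leave bits 4 through 15 alone. The macro and function names are hypothetical and are not the ones used in ata-intel.c.

```c
#include <stdint.h>

#define PCS_OFFSET    0x92u      /* register offset, as quoted in the thread */
#define PCS_OOB_RETRY (1u << 15) /* bit 15: retry indefinitely after OOB failure */
#define PCS_LOW_BITS  0x000fu    /* bits 0-3, the only bits this sketch modifies */

/* Return 1 if bit 15 is set in a PCS value, 0 otherwise. */
int
pcs_oob_retry_enabled(uint16_t pcs)
{
	return (pcs & PCS_OOB_RETRY) != 0;
}

/* Replace bits 0-3 with new_low and keep bits 4-15 exactly as they were. */
uint16_t
pcs_update_low_bits(uint16_t pcs, uint16_t new_low)
{
	return (uint16_t)((pcs & ~PCS_LOW_BITS) | (new_low & PCS_LOW_BITS));
}
```

With a read-modify-write of this shape, a bit 15 that was set by the BIOS stays set after the update, which matches the observation that the driver should not change it.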
Re: ICH9 panic/instability on recent kernel
On 29.05.2011 07:56, Jeremy Chadwick wrote:
> On Sat, May 28, 2011 at 09:10:11PM -0700, Michael Sinatra wrote:
>> I have a core-2 system with a 3ware SATA RAID controller for the main disks and the built-in Intel ICH9 4-port SATA controller that is only used for the DVDR. An 8-STABLE kernel csup'd and compiled on April 25 works fine on this system. Kernels from source csup'd this week are extremely unstable and usually panic or hang just minutes after booting. The following warning messages appear after the kernel probes the SATA controller and/or ICH9 USB controller and continue about once per 1-2 seconds until the system crashes:
>>
>> May 13 14:21:05 sonicyouth kernel: unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
>>
>> Disabling the ICH9 SATA controller in the BIOS allows the system to boot and run normally.
>>
>> Changes were made on April 28 to allow better support for 6-port ICH9 controllers (SVN rev 221156) and I am wondering if my controller is now being incorrectly recognized. Here's the relevant kernel messages:
>>
>> May 13 13:52:53 sonicyouth kernel: atapci1: <Intel ICH9 SATA300 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1c40-0x1c4f,0x1c30-0x1c3f at device 31.2 on pci0
>> May 13 13:52:53 sonicyouth kernel: ata0: <ATA channel 0> on atapci1
>> May 13 13:52:53 sonicyouth kernel: ata0: [ITHREAD]
>> May 13 13:52:53 sonicyouth kernel: ata1: <ATA channel 1> on atapci1
>> May 13 13:52:53 sonicyouth kernel: ata1: [ITHREAD]
>> May 13 13:52:53 sonicyouth kernel: atapci2: <Intel ICH9 SATA300 controller> port 0x1cb8-0x1cbf,0x1cac-0x1caf,0x1cb0-0x1cb7,0x1ca8-0x1cab,0x1c60-0x1c6f,0x1c50-0x1c5f irq 18 at device 31.5 on pci0
>> May 13 13:52:53 sonicyouth kernel: atapci2: [ITHREAD]
>> May 13 13:52:53 sonicyouth kernel: ata3: <ATA channel 0> on atapci2
>> May 13 13:52:53 sonicyouth kernel: ata3: [ITHREAD]
>> May 13 13:52:53 sonicyouth kernel: ata4: <ATA channel 1> on atapci2
>> May 13 13:52:53 sonicyouth kernel: ata4: [ITHREAD]
>>
>> If I csup the most recent kernel sources, I get the same problem. However, if, after csuping the latest kernel sources, I then fetch the version of sys/dev/ata/ata-all.c as of April 27, everything works fine. Here's the output of pciconf -l:

The only change in 8-STABLE ata-all.c since April 27 was SVN rev 221155, but I don't see how it can cause problems. I would really like to see the full _verbose_ dmesg output to better understand what is going on there. If it even panics, I need to see exactly how.

--
Alexander Motin
Re: MPS driver: force bus rescan after remove SAS cable
Rumen Telbizov wrote:
>>> Also identify function doesn't work from the OS (no problem via the card BIOS). Don't remember having any luck with sg3_util package either but worth trying again.
>>
>> I don't use SAS myself, but wouldn't the command be inquiry and not identify? identify is for ATA (specifically SATA via CAM), while inquiry is for SCSI. Where SAS fits into this is unknown to me.
>
> Well, I have SATA disks visible as /dev/da*. From camcontrol(8):
>
>     inquiry    Send a SCSI inquiry command (0x12) to a device. By default, camcontrol will print out the standard inquiry data, device serial number, and transfer rate information. The user can specify that only certain types of inquiry data be printed:
>
> Example:
>
> # camcontrol inquiry /dev/da47
> pass48: <ATA WDC WD2003FYYS-0 0D02> Fixed Direct Access SCSI-5 device
> pass48: Serial Number WD-WMAUR0408496
> pass48: 300.000MB/s transfers, Command Queueing Enabled
>
> It's a SATA disk in this case, attached to a SAS/SATA backplane and a SAS2008 HBA chip (9211-8i).
>
> What I need is a way to light up the fault LED on the disk that I want to identify (point to). This is usually what I need when I send a DC technician to replace a disk. For which I thought I should be using:
>
>     identify    Send a ATA identify command (0xec) to a device.
>
> From my experience, SAS or SATA disks, I always get those as /dev/da* disks. It's a combo controller and backplane.
>
> So which is the correct way of identifying a disk?

`camcontrol identify` means sending the ATA IDENTIFY DEVICE command to an ATA device. That command is roughly the analogue of the SCSI INQUIRY command. It has nothing to do with LEDs. LEDs are most likely controlled via a ses device or some similar management mechanism.

The fact that you see an ATA device as daX just means that your SAS controller does protocol translation on the fly. It allows you to communicate with the disk using SCSI commands _instead_ of ATA.

--
Alexander Motin
Re: MPS driver: force bus rescan after remove SAS cable
On 27.04.2011 16:39, Denny Schierz wrote:
> On Wednesday, 27.04.2011, at 05:57 -0700, Jeremy Chadwick wrote:
>> camcontrol reset 0
>
> 0:22:0 is available, 0:46:0 not:
>
> root@iscsihead-m:~# camcontrol reset 0:22:0
> Reset of 0:22:0 was successful
> root@iscsihead-m:~# camcontrol reset 0:46:0
> camcontrol: cam_open_btl: no passthrough device found at 0:46:0

You should reset the whole bus, not a specific LUN. A full reset doesn't need that passthrough device. IIRC it works via xpt0.

> We bought the LSI SAS6160 switch:
> http://www.lsi.com/storage_home/products_home/sas_switch/sas6100/index.html
> use the LSI 9200-8e host bus adapter and LSI 630j JBODs. We had a lot of e-mail conversation with LSI and they say that we need the switch for a clean failover setup. Another reason for the switch: increasing storage with more JBODs. Every JBOD has its own cable to the (later redundant) switch. Otherwise we would have to build a bus from JBOD to JBOD to JBOD to JBOD to host ... bad idea ;-)
>
> The question is, what does the driver do while FreeBSD starts?

CAM does exactly a full bus reset and, after a few seconds, a full rescan. What the controller driver may do beyond that depends on the driver alone.

--
Alexander Motin
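The difference between the two reset forms above, a bare bus number versus a full bus:target:lun address, can be sketched in C as follows. This is only an illustration of the addressing, assuming the argument has the form bus[:target[:lun]]; it does not talk to CAM, and the struct and function names are hypothetical.

```c
#include <stdio.h>

/* Parsed address; fields that were not given are left at -1. */
struct btl {
	int bus;
	int target;
	int lun;
	int fields; /* how many of bus, target, lun were present */
};

/* Parse "bus[:target[:lun]]"; return the number of fields read, 0 on failure. */
int
parse_btl(const char *s, struct btl *out)
{
	out->bus = out->target = out->lun = -1;
	out->fields = sscanf(s, "%d:%d:%d", &out->bus, &out->target, &out->lun);
	if (out->fields < 1)
		out->fields = 0;
	return out->fields;
}

/* Return 1 if only a bus number was given, i.e. the request covers the whole bus. */
int
is_bus_wide(const struct btl *a)
{
	return a->fields == 1;
}
```

In this sketch "0" parses as a bus-wide request, while "0:46:0" names one specific device, which is the case that failed above because no passthrough device exists at that address.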
Re: Promise SATA controller issues...
George Kontostanos wrote:
> I have a Promise PDC40718 SATA300 controller running in a box since 8.0-Release, now on 8.2-Stable:
>
> 8.2-STABLE FreeBSD 8.2-STABLE #3: Thu Apr 21 15:23:08 EEST 2011
>
> The controller is in JBOD mode with 3 WD drives in Raidz1.
>
> ad6: 610480MB <WDC WD6401AALS-00J7B1 05.00K05> at ata3-master UDMA100 SATA 3Gb/s
> ad8: 610480MB <WDC WD6402AAEX-00Y9A0 05.01D05> at ata4-master UDMA100 SATA 3Gb/s
> ad10: 610480MB <WDC WD6401AALS-00J7B1 05.00K05> at ata5-master UDMA100 SATA 3Gb/s
>
> Today the box became unresponsive so I had to do a hard reset. From /var/log/messages:
>
> It appears from the logs that the problem lasted for a full day! However, after the reboot the drive did not perform any resilver and no data loss occurred. I have scrubbed my pool successfully and run smartmon tests also.
>
> It doesn't appear to be a drive issue so I was wondering if the recent changes in controllers that appeared a few days ago might be related.

There have been no changes specific to the Promise controllers for a long time, mostly because I have no documentation for them. For the same reason I can hardly say what could be wrong there. Some additional information is definitely required.

--
Alexander Motin
Re: Promise SATA controller issues...
George Kontostanos wrote:
> Please let me know what kind of information might also be useful.

I don't know. What were the first messages before the problem? Was there any specific activity? It would be most useful if you could reproduce the problem in a controlled environment.

>> There have been no changes specific to the Promise controllers for a long time, mostly because I have no documentation for them. For the same reason I can hardly say what could be wrong there. Some additional information is definitely required.

--
Alexander Motin
Re: Promise SATA controller issues...
George Kontostanos wrote:
> The system was up since April 21 when I upgraded to the latest world and kernel. There are 2 pools, a mirror with root on ZFS for the OS and a Raidz1 just for the data.
>
> There was nothing out of the ordinary before this except some repeated power failures that were handled by the UPS, as you can see in the logs:
>
> Apr 25 13:05:21 hp apcupsd[870]: Power is back. UPS running on mains.
> Apr 25 13:32:48 hp apcupsd[870]: Power failure.
> Apr 25 13:32:50 hp apcupsd[870]: Power is back. UPS running on mains.
> Apr 25 13:35:06 hp apcupsd[870]: Power failure.
> Apr 25 22:08:35 hp kernel: ata4: SIGNATURE:
> Apr 25 22:08:35 hp kernel: ata4: timeout waiting to issue command
> Apr 25 22:08:35 hp kernel: ata4: error issuing SETFEATURES SET TRANSFER MODE command .

I would enable verbose kernel messages in case it repeats. Maybe that will give some more understanding, but that is not certain.

> On Tue, Apr 26, 2011 at 10:39 PM, Alexander Motin <m...@freebsd.org> wrote:
>> George Kontostanos wrote:
>>> Please let me know what kind of information might also be useful.
>>
>> I don't know. What were the first messages before the problem? Was there any specific activity? It would be most useful if you could reproduce the problem in a controlled environment.
>>
>>>> There have been no changes specific to the Promise controllers for a long time, mostly because I have no documentation for them. For the same reason I can hardly say what could be wrong there. Some additional information is definitely required.

--
Alexander Motin
Re: Sense fetching [Was: cdrtools /devel ...]
Buganini wrote:
> Do r22056{3,5,6,9} supersede these patches?

Yes. They solve the problem from a different side.

> My DVD burning with ahci seems to be fixed by those commits, without these patches. I've just burned a DVD successfully, and it's readable.

Yes, I've also burned a few DVDs with cdrecord-devel for testing.

--
Alexander Motin