Re: ZFS root mount regression

2019-07-21 Thread Alexander Motin
Hi,

I am not sure how the original description leads to the conclusion that
the problem is related to parallel mounting.  From my point of view it
sounds like the root pool is being mounted based on its name rather than
on the pool GUID, which would need to be passed from the loader.  We have
seen problems like that ourselves when boot pool names collide.  So I
doubt it is a new problem; just nobody has got around to fixing it yet.
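
To illustrate why selecting the root pool by name is fragile, here is a minimal Python sketch. It is not the kernel or loader logic; the `Pool` record, the GUID values, and both helper functions are hypothetical. With two pools labeled "tank" visible, a name lookup is ambiguous, while a GUID lookup is not:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    name: str
    guid: int   # hypothetical values; real pool GUIDs are 64-bit numbers
    state: str  # e.g. "ONLINE" or "FAULTED"


def find_by_name(pools, name):
    """Return every visible pool with the given name (may be more than one)."""
    return [p for p in pools if p.name == name]


def find_by_guid(pools, guid):
    """Return the single pool with the given GUID, or None if not unique."""
    matches = [p for p in pools if p.guid == guid]
    return matches[0] if len(matches) == 1 else None


# The real boot pool plus a stale, vendor-labeled disk claiming the same name.
visible = [
    Pool("tank", 0x1111, "ONLINE"),
    Pool("tank", 0x2222, "FAULTED"),
]

print(len(find_by_name(visible, "tank")))   # more than one candidate
print(find_by_guid(visible, 0x1111).state)  # unambiguous
```

Which of the two same-named pools a name lookup ends up using depends on discovery order, which is consistent with the behavior changing between releases without any change to the disks.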

On 20.07.2019 06:41, Eugene Grosbein wrote:
> CC'ing Alexander Motin who committed the change.
> 
> 20.07.2019 1:21, Garrett Wollman wrote:
> 
>> I recently upgraded several file servers from 11.2 to 11.3.  All of
>> them boot from a ZFS pool called "tank" (the data is in a different
>> pool).  In a couple of instances (which caused me to have to take a
>> late-evening 140-mile drive to the remote data center where they are
>> located), the servers crashed at the root mount phase.  In one case,
>> it bailed out with error 5 (I believe that's [EIO]) to the usual
>> mountroot prompt.  In the second case, the kernel panicked instead.
>>
>> The root cause (no pun intended) on both servers was a disk which was
>> supplied by the vendor with a label on it that claimed to be part of
>> the "tank" pool, and for some reason the 11.3 kernel was trying to
>> mount that (faulted) pool rather than the real one.  The disks and
>> pool configuration were unchanged from 11.2 (and probably 11.1 as
>> well) so I am puzzled.
>>
>> Other than laboriously running "zpool labelclear -f /dev/somedisk" for
>> every piece of media that comes into my hands, is there anything else
>> I could have done to avoid this?
> 
> Both 11.3-RELEASE announcement and Release Notes mention this:
> 
>> The ZFS filesystem has been updated to implement parallel mounting.
> 
> I strongly suggest reading Release documentation in case of troubles
> after upgrade, at least. Or better, read *before* updating.
> 
> I guess this parallelism created some race for your case.
> 
> Unfortunately, a way to fall back to sequential mounting seems to be
> undocumented. libzfs checks whether the ZFS_SERIAL_MOUNT environment
> variable exists, with any value.
> I'm not sure how you would set it for mounting root; maybe it will use kenv,
> so try adding this to /boot/loader.conf:
> 
> ZFS_SERIAL_MOUNT=1
> 
> Alexander should have more knowledge on this.
> 
> And of course, attaching unrelated device having label conflicting
> with root pool is asking for trouble. Re-label it ASAP.
> 

-- 
Alexander Motin
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: about zfs and ashift and changing ashift on existing zpool

2019-04-08 Thread Alexander Motin
On 08.04.2019 20:21, Eugene Grosbein wrote:
> 09.04.2019 7:00, Kevin P. Neal wrote:
> 
>>> My guess (given that only ada1 is reporting a blocksize mismatch) is that
>>> your disks reported a 512B native blocksize.  In the absence of any
>>> override, ZFS will then build an ashift=9 pool.
> 
> [skip]
> 
>> smartctl 7.0 2018-12-30 r4883 [FreeBSD 11.2-RELEASE-p4 amd64] (local build)
>> Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org
>>
>> === START OF INFORMATION SECTION ===
>> Vendor:   SEAGATE
>> Product:  ST2400MM0129
>> Revision: C003
>> Compliance:   SPC-4
>> User Capacity:2,400,476,553,216 bytes [2.40 TB]
>> Logical block size:   512 bytes
>> Physical block size:  4096 bytes
> 
> Maybe it's time to prefer "Physical block size" over "Logical block size"
> in relevant GEOMs like GEOM_DISK, so upper levels such as ZFS would do the
> right thing automatically.

No.  It is a bad idea.  Changing the logical block size of existing disks
will most likely break compatibility and make previously written data
unreadable.  ZFS already uses the physical block size when possible, that
is, on pool creation or new vdev addition.  When that is not possible
(the pool was already created wrong) it just complains about it, so that
the user knows the configuration is imperfect and should not expect
full performance.
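
The relation between block size and ashift can be sketched as follows. ashift is the base-2 logarithm of the block size, so 512 bytes gives 9 and 4096 bytes gives 12. The helper names below are hypothetical and only illustrate the arithmetic, not ZFS code:

```python
def ashift_for(block_size: int) -> int:
    """Return log2(block_size) for a power-of-two block size."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a positive power of two")
    return block_size.bit_length() - 1


def pool_is_suboptimal(pool_ashift: int, physical_block_size: int) -> bool:
    """True when the pool allocates in units smaller than the physical block."""
    return pool_ashift < ashift_for(physical_block_size)


print(ashift_for(512))              # logical size from the smartctl output
print(ashift_for(4096))             # physical size from the smartctl output
print(pool_is_suboptimal(9, 4096))  # ashift=9 pool on a 4096-byte disk
```

For the disk quoted above (512-byte logical, 4096-byte physical), a pool built with ashift=9 is the "imperfect" configuration ZFS complains about.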

-- 
Alexander Motin


Re: TSC timekeeping and cpu states

2017-08-14 Thread Alexander Motin
On 14.08.2017 18:38, Ian Smith wrote:
> On Mon, 14 Aug 2017 17:16:22 +1000, Aristedes Maniatis wrote:
>  > On 14/8/17 3:08PM, Kevin Oberman wrote:
>  > > Again, the documentation lags reality. The default was changed for 
>  > > 11.0. It is still conservative. In ALMOST all cases, Cmax will yield 
>  > > the best results. However, on large systems with many cores, Cmax
>  > > will trigger very poor results, so the default is C2, just to be 
>  > > safe.
> 
> Given it's a server, anything beyond C2 is likely not worth trying. 
> OTOH, C2 is perhaps not worth avoiding; it's probably low latency and 
> should result in lower power consumption, so heat, and unlikely to hurt.
> 
> Or at least, I suspect that's the case .. cc'ing Alexander, as the wiki 
> article you referenced was his doing, so he's among those best placed.

The C-states controlled here are ACPI C-states, which have only a limited
relation to real CPU C-states.  There are systems where they map exactly,
but there are also systems where ACPI C1/C2/C3 states map to CPU C1/C3/C6,
so it is difficult to make general recommendations.  The mapping can be
roughly guessed by looking at the latency value (the last of the three)
reported in sysctl dev.cpu.0.cx_supported:  1 is usually CPU C1, 2+ is
likely CPU C2, 100+ can be C3, 500+ can be C6, but all of that is very
approximate and, I guess, depends on the BIOS writer's mood.

As for my recommendations, I'd say that the CPU C2 state should not hurt
in most cases, unless something is broken, but the benefit is rather small
(often already covered by C1E enabled in the BIOS);  the CPU C3 state gives
significant power savings, but can either hurt performance due to higher
enter/exit latency or slightly improve it due to TurboBoost activation
(which requires the CPU frequency to be set to the maximum value);  CPU C6
is probably useful only for laptops, since it does not save that much
power, while the exit latency can be in the milliseconds range.

>  > > As far as possible TSC impact, I think older processors had TSC
>  > > issues when not all cores ran with the same clock speed. That said,
>  > > I am not remotely expert on such issues, so don't take this too 
>  > > seriously.
> 
> I wasn't aware that FreeBSD could yet do different freqs on different 
> cores?  But I'm less expert than Kevin, and certainly behind the times.

On old CPUs the TSC frequency was tied to the CPU frequency and so could
fluctuate with frequency changes.  On modern CPUs it is always constant,
equal to the base CPU frequency.  As for different frequencies for
different cores, IIRC ACPI allows that, but until recently neither
FreeBSD nor the hardware could do it.  I have a feeling I heard that some
very new CPUs may allow it, but to be efficient it would require very
tight interoperation between the power manager and the CPU scheduler,
otherwise performance may suffer.

-- 
Alexander Motin


Re: Mega ZFS MFCs

2017-07-27 Thread Alexander Motin
Hi Mike,

On 27.07.2017 16:21, Mike Tancsa wrote:
>   I noticed quite a few MFCs to RELENG_11 around zfs yesterday and today.
> First off, thank you for all these fixes/enhancements! Of the some 60
> MFCs, are there any particular ones to be more aware of when updating
> servers ? 

The most complicated and invasive one, to my eye, is r321610 "8021 ARC buf
data scatter-ization".  It took 5 fix commits to make it behave in head, but
Andriy told me it should be good now, and I run it on my systems too.

> Are there any more to come, or is now a good time to test things out ?

I've merged all we had in head (except a couple of gptzfsboot commits that
significantly increase its size and could break POLA).  The next round
will go to head first anyway, so stable/11 should probably be idle for
a month at least and should be good for testing now.

-- 
Alexander Motin


Re: stable/11 debugging kernel unable to produce crashdump again

2017-07-24 Thread Alexander Motin
I guess the problem with g_raid_shutdown_post_sync in case of a panic can
be explained by the fact that it tries to write clean metadata in the
regular (not dumping) way while the system is already in panic mode and
there is no proper scheduling.  Maybe it could just be bypassed in case of
dumping (should be trivial and probably OK), or it could use
g_raid_subdisk_kerneldump() in that case instead of normal GEOM I/O.

On 24.07.2017 20:03, Eugene Grosbein wrote:
> CCing mav@ as graid expert.
> 
> On 24.07.2017 08:44, Mark Johnston wrote:
> 
>>> Sadly, this time 11.1-STABLE r321371 SMP hangs instead of doing crashdump:
>>>
>>> - "call doadump" from DDB prompt works just fine;
>>> - "shutdown -r now" reboots the system without problems;
>>> - "sysctl debug.kdb.panic=1" triggers a panic just fine but system hangs
>>> just after showing uptime instead of continuing with crashdump
>>> generation; same if "real" panic occurs.
>>>
>>> Same for debug.minidump set to 1 or 0. How do I debug this?
>>
>> I'm not able to reproduce the problem in bhyve using r321401. Looking
>> at the code, the culprits might be cngrab(), or one of the
>> shutdown_post_sync eventhandlers. Since you're apparently able to see
>> the console output at the time of the panic, I guess it's probably the
>> latter. Could you try your test with the patch below applied? It'll
>> print a bunch of "entering post_sync"/"leaving post_sync" messages with
>> addresses that can be resolved using kgdb. That'll help determine where
>> we're getting stuck.
>>
>> Index: sys/sys/eventhandler.h
>> ===
>> --- sys/sys/eventhandler.h   (revision 321401)
>> +++ sys/sys/eventhandler.h   (working copy)
>> @@ -85,7 +85,11 @@
>>  _t = (struct eventhandler_entry_ ## name *)_ep; \
>>  CTR1(KTR_EVH, "eventhandler_invoke: executing %p", \
>>  (void *)_t->eh_func);   \
>> +if (strcmp(__STRING(name), "shutdown_post_sync") == 0) \
>> +printf("entering post_sync %p\n", (void *)_t->eh_func); \
>>  _t->eh_func(_ep->ee_arg , ## __VA_ARGS__);  \
>> +if (strcmp(__STRING(name), "shutdown_post_sync") == 0) \
>> +printf("leaving post_sync %p\n", (void *)_t->eh_func); \
>>  EHL_LOCK((list));   \
>>  }   \
>>  }   \
>>
> 
> Thanks, this helped:
> 
> $ addr2line -f -e kernel.debug 0x80919c00
> g_raid_shutdown_post_sync
> /home/src/sys/geom/raid/g_raid.c:2458
> 
> That is GEOM_RAID's g_raid_shutdown_post_sync() that hangs if called just
> before crashdump generation but works just fine during normal system
> shutdown.
> 
> I should note my graid's RAID1 is running in degraded state currently
> due to dead SSD module that does not respond. Here is part of boot log:
> 
> ahcich5: AHCI reset: device not ready after 31000ms (tfd = 0080)
> ahcich5: Poll timeout on slot 2 port 0
> ahcich5: is  cs 0004 ss  rs 0004 tfd 80 serr  cmd c217
> (aprobe2:ahcich5:0:0:0): NOP FLUSHQUEUE. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
> (aprobe2:ahcich5:0:0:0): CAM status: Command timeout
> (aprobe2:ahcich5:0:0:0): Error 5, Retries exhausted
> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
> ahcich5: Poll timeout on slot 3 port 0
> ahcich5: is  cs 0008 ss  rs 0008 tfd 80 serr  cmd c317
> (aprobe2:ahcich5:0:0:0): SOFT_RESET. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
> (aprobe2:ahcich5:0:0:0): CAM status: Command timeout
> (aprobe2:ahcich5:0:0:0): Error 5, Retries exhausted
> [skip]
> Trying to mount root from ufs:/dev/raid/r0s4a [rw,noatime]...
> Root mount waiting for: GRAID-Intel
> Root mount waiting for: GRAID-Intel
> Root mount waiting for: GRAID-Intel
> Root mount waiting for: GRAID-Intel
> Root mount waiting for: GRAID-Intel
> GEOM_RAID: Intel-c291fe96: Force array start due to timeout.
> GEOM_RAID: Intel-c291fe96: Disk ada0 state changed from NONE to ACTIVE.
> GEOM_RAID: Intel-c291fe96: Subdisk r0:0-ada0 state changed from NONE to STALE.
> GEOM_RAID: Intel-c291fe96: Array started.
> GEOM_RAID: Intel-c291fe96: Subdisk r0:0-ada0 state changed from STALE to ACTIVE.
> GEOM_RAID: Intel-c291fe96: Volume r0 state changed from STARTING to DEGRADED.
> GEOM_RAID: Intel-c291fe96: Provider raid/r0 for volume r0 created.
> 
>  
> 

-- 
Alexander Motin


Re: ASM1062 AHCI timeouts, ppt(4) BAR aligning [Was: Re: svn commit: r309251 - head/sys/dev/ahci]

2016-12-29 Thread Alexander Motin
On 29.12.2016 10:35, Harry Schmalzbauer wrote:
> I'd like to report that this doesn't fix timeouts for me (applied to
> 11-stable).
> 
> For example my REV120 works without problems on Intel-AHCI but not on
> ASM1062-AHCI.
> Even attaching gives different output. Both look fine at first:
> #cd0 at ahcich0 bus 0 scbus5 target 0 lun 0
> #cd0:  Removable CD-ROM SCSI device
> #cd0: Serial Number 0C1E4D046E5DFF18
> #cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO
> 8192bytes)
> 
> When attached to the Intel-AHCI, it's followed by
> +cd0: Attempt to query device size failed: NOT READY, Medium not present
> while attaching to ASM1062 it reads (!?)
> -cd0: 0MB (1 0 byte sectors)
> 
> Then these timeouts occur:
> ahcich7: Timeout on slot 11 port 0
> ahcich7: is  cs 0c00 ss  rs 0c00 tfd 6051 serr
>  cmd 0004cb17
> ahcich7: Timeout on slot 24 port 0
> ahcich7: is  cs 0180 ss  rs 0180 tfd 2051 serr
>  cmd 0004d817
> ahcich7: Timeout on slot 6 port 0
> ahcich7: is  cs 0060 ss  rs 0060 tfd 2051 serr
>  cmd 0004c617
> ahcich7: Timeout on slot 20 port 0
> ahcich7: is  cs 0018 ss  rs 0018 tfd 2051 serr
>  cmd 0004d417
> 
> Also IDENT (via camcontrol) "hangs" for 20 seconds, but finally succeeds.

I think the problem may be different in your case.  The HBA still reports
that the command is not completed by the device.  Unfortunately I don't have
those fancy drives to try, but I'll try to reproduce it with a regular CD
drive when I get back home after the short New Year holidays.

> Btw: I already found out that extending ppt(4) to support unaligned base
> address register wouldn't be too easy.
> Initially I added that ASM1062 card to use it for bhyve(8) passthrough.
> Unfortunately that doesn't work:
> bhyve: passthru device 6/0/0 BAR 5: base 0xc3e1 or size 0x200 not
> page aligned
> That's the ASM1062:
> ppt0@pci0:6:0:0:class=0x010601 card=0x10601b21 chip=0x06121b21
> rev=0x01 hdr=0x00
> bar   [10] = type I/O Port, range 32, base 0x5050, size 8, enabled
> bar   [14] = type I/O Port, range 32, base 0x5040, size 4, enabled
> bar   [18] = type I/O Port, range 32, base 0x5030, size 8, enabled
> bar   [1c] = type I/O Port, range 32, base 0x5020, size 4, enabled
> bar   [20] = type I/O Port, range 32, base 0x5000, size 32, enabled
> bar   [24] = type Memory, range 32, base 0xc3e1, size 512, enabled

I believe it is a bhyve bug, since these values are just what the hardware
reports.  A BAR size of 512 bytes indeed does not align to 4K, but that is
not our problem. :)
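
The alignment condition that bhyve complains about can be sketched like this. It assumes a 4 KiB page size, and the check in bhyve itself may differ in detail. The base address used below is a made-up, page-aligned value, since the point is that the 512-byte size alone already fails the test:

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def is_page_aligned(value: int) -> bool:
    """True when value is a multiple of the page size."""
    return value % PAGE_SIZE == 0


def bar_can_be_mapped(base: int, size: int) -> bool:
    """Both the BAR base and its size must be whole pages."""
    return is_page_aligned(base) and is_page_aligned(size)


HYPOTHETICAL_BASE = 0xC0000000  # page-aligned placeholder, not the real BAR

print(bar_can_be_mapped(HYPOTHETICAL_BASE, 0x200))   # 512-byte BAR 5
print(bar_can_be_mapped(HYPOTHETICAL_BASE, 0x1000))  # a full page
```

A memory BAR smaller than a page cannot be mapped into a guest on its own, because mappings are made in page-sized units.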

> Are there any recommendations for AHCI (SATA-PCIe) controller
> cards/chips that do work (both, for bhyve passthrough and also as plain
> AHCI provider)?

Please don't mix multiple unrelated questions in one email.

There are very few reasonable external AHCI controllers on the market
now.  I am not sure anything other than Marvell and ASMedia has been
released at all in the last years since 6Gbps SATA came out.  Marvell and
ASMedia are probably about equal, while later Marvell chips may be slightly
better on functionality (number of ports and FBS PMP support), but they
are both desktop products.  If you need this in a server environment,
think about a SAS adapter like LSI.  Or just use the on-board Intel
AHCI, since it is probably the best reliability you can get out of
SATA.

-- 
Alexander Motin


Re: stable/10: high load average when box is idle

2015-12-26 Thread Alexander Motin
On 26.12.2015 17:09, Ian Smith wrote:
> Current hypothesis: some variable/s are getting improperly initialised 
> at boot, but are (somehow?) getting properly re-initialised on changing 
> cpuset to 1 then back to 2 cpus - though I've no idea how, or by what.

While this is an interesting hypothesis, I see no real ground for it in the
code.  My own explanation here, same as before, is in the area of event
aliasing.  HPET, due to its hardware limitations, is more prone to
various synchronization effects than LAPIC.  And those limitations are
specific to the hardware configuration.  On modern hardware HPET may provide
(up to 8) per-CPU MSI interrupts.  This is the best case for everything,
with minimal chances of aliasing (unless you have more than 8 logical
cores).  On older hardware it is typical to have HPET sharing a single
interrupt line with some other device(s) and generating events for all
CPUs from it.  Interrupt line sharing tends to create a load of 1.0 due to
counting its own interrupt thread.  I've partially worked around that at
some point, but aliasing possibilities are still there.  Driving
multiple CPUs from the same interrupt also creates aliasing, since
different CPUs wake up close to each other and may count each other's
load.  Different CPU wakeup times from different sleep states and other
sources of jitter may generate quite complicated but not really useful
behavior patterns.
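
The aliasing effect can be shown with a toy model. This is not FreeBSD's load average code; the function name, the time units, and all numbers are hypothetical. A task that is busy only 1% of the time looks 100% busy when the sampler runs in lockstep with it:

```python
def sampled_load(sample_period, n_samples, task_period=1000, busy_time=10):
    """Fraction of samples that find a periodic task running.

    The task is busy during the first busy_time units of every
    task_period; the sampler looks at times 0, sample_period, ...
    """
    busy = 0
    for k in range(n_samples):
        t = k * sample_period
        if t % task_period < busy_time:
            busy += 1
    return busy / n_samples


# Sampler driven by the same event source as the task: always sees it busy.
print(sampled_load(sample_period=1000, n_samples=1000))

# Sampler with an unrelated period: sees the true 1% utilization.
print(sampled_load(sample_period=1009, n_samples=1000))
```

The second case works because 1009 shares no common factor with 1000, so the sample points sweep evenly across the task's period instead of always landing on the busy part.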

Happy holidays!

-- 
Alexander Motin


Re: Bug 204641 - 10.2 UNMAP/TRIM not available on a zfs zpool that uses iSCSI disks, backed on a zpool file target

2015-11-18 Thread Alexander Motin
On 18.11.2015 02:28, Steven Hartland wrote:
> On 17/11/2015 22:08, Christopher Forgeron wrote:
>> I just submitted this as a bug:
>>
>> ( https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204641 )
>>
>> ..but I thought I should bring it to the list's attention for more
>> exposure
>> - If that's a no-no, let me know, as I have a few others that are related
>> to this that I'd like to discuss.

> Having a quick flick through the code it looks like UNMAP is now only
> supported on dev backed and not file backed.
> 
> I believe the following commit is the cause:
> https://svnweb.freebsd.org/base?view=revision&revision=279005
> 
> This was an MFC of:
> https://svnweb.freebsd.org/base?view=revision&revision=278672
> 
> I'm guessing this was an unintentional side effect mav?

As I have replied on the ticket: CTL never supported UNMAP on
file-backed LUNs due to the lack of a corresponding API for hole punching
on FreeBSD.  At this time UNMAP works for ZVOLs in both device and file
modes, and for raw devices.

-- 
Alexander Motin


Re: recent ZFS / CAM updates in RELENG_10?

2015-10-05 Thread Alexander Motin
Hi.

On 05.10.2015 16:17, Mike Tancsa wrote:
>   I noticed a whole whack of MFCs to RELENG_10 for zfs and cam (thanks
> for all that!) Just wondering if there is more to come, or is this
> perhaps a good time to start testing with all these changes on a few non
> critical boxes ?

At this point I've merged all I planned.  There are a few more recent ZFS
commits in HEAD that are not merged, but they are not mine, so I leave
them to their authors.  So yes, I think now is a good time to start testing.

-- 
Alexander Motin


Re: VIMAGE kernel broken after 255541

2013-09-14 Thread Alexander Motin

Hi.

The change is reverted. Sorry.

On 14.09.2013 16:05, goran.lowkra...@ismobile.com wrote:

Hi,

After 255541 I can't compile a VIMAGE kernel:
cc1: warnings being treated as errors
/usr/src/sys/kern/sched_ule.c: In function 'cpu_search':
/usr/src/sys/kern/sched_ule.c:638: warning: implicit declaration of
function 'CPU_FFS'
/usr/src/sys/kern/sched_ule.c:638: warning: nested extern declaration of
'CPU_FFS' [-Wnested-externs]
*** [sched_ule.o] Error code 1

Kernconf:
VSERVER:
#
# VSERVER --A VIMAGE kernel configuration file for FreeBSD/amd64
#
# $FreeBSD: stable/9/sys/amd64/conf/XENHVM 239412 2012-08-20 11:34:49Z
cperciva $
#
include SERVER
ident   VSERVER

# VIMAGE config
option  VIMAGE

SERVER:
#
# SERVER -- General server
#

include GENERIC

ident   SERVER

# Update resources for PostgreSQL
options SHMMAXPGS=65536
options SEMMNI=40
options SEMMNS=240
options SEMUME=40
options SEMMNU=120

#
# Compile with kernel debugger related code.
#
options KDB
options KDB_TRACE
options KDB_UNATTENDED
options DDB

#options INVARIANTS
#options INVARIANT_SUPPORT
#options WITNESS

#options DEBUG_LOCKS
#options DEBUG_VFS_LOCKS

# Include Apple Talk support
options   NETATALK


/glz



--
Alexander Motin


Re: GEOM RAID devd events

2013-08-01 Thread Alexander Motin

On 01.08.2013 12:36, Daniel O'Connor wrote:

Hi,
Does anyone know if graid generates devd events for 'interesting' RAID events
(e.g. array becoming degraded, rebuild progress & completion, etc.)? I had a
look and I couldn't find any devctl_notify* calls but perhaps they are hidden
behind some GEOM calls.

If there aren't, are there any plans to add some? I am happy to test, or even 
write if I can find some time.


GEOM RAID does not do anything special about devd now. I had no such
plans, but it is probably not a bad idea if done well.


--
Alexander Motin


Re: GEOM RAID devd events

2013-08-01 Thread Alexander Motin

On 01.08.2013 13:27, Daniel O'Connor wrote:


On 01/08/2013, at 19:56, Daniel O'Connor docon...@gsoft.com.au wrote:

GEOM RAID does not do anything special about devd now. I had no such plans, but
it is probably not a bad idea if done well.


Do you have a recommendation for where I should start looking? (ie a hint about 
where such a thing would go)


After doing the reading I should have done before I sent my last message I see
that g_raid_update_* look like good candidates.


It would be nice to do it in a possibly more generic way, so that it is
usable for other GEOM classes, such as MIRROR, MULTIPATH, etc. At least
make the message formatting unified.


--
Alexander Motin


Re: Supermicro and FreeBSD 9.2 PRERELEASE make_dev_physpath_alias: WARNING

2013-07-15 Thread Alexander Motin

On 15.07.2013 14:10, Sergey Kandaurov wrote:

On 15 July 2013 14:02, Johan Hendriks joh.hendr...@gmail.com wrote:

We use basic supermicro cases for our storage servers in combination with a
LSI 9211-8i controller in IT mode.

Since 9.1 or shortly thereafter we get, for every disk we attach to the SAS
backplane, the following error.

make_dev_physpath_alias: WARNING - Unable to alias
gptid/abb586f5-da8d-11e2-aaaf-00259061b51a to
enc@n500304800122877d/type@0/slot@f/elmdesc@Slot_15/gptid/abb586f5-da8d-11e2-aaaf-00259061b51a
- path too long.

I know it does not harm the operation, but every time the server boots or
when we add a disk I get a little scared when I see WARNINGS passing by.

Is there a way to suppress these WARNINGS, or is there something I can do
about it?


This is because the name is longer than SPECNAMELEN.
There is barely anything you can do about it. The warning is hidden under
bootverbose in 10-CURRENT, and I think it should be merged to stable/9 before
the 9.2 release. In the meantime you can manually apply this change:

http://svnweb.freebsd.org/changeset/base/235899


Thank you for the reminder, sent MFC request.
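
The "path too long" condition from the warning can be reproduced as a simple length check. The limit of 63 below is an assumption about SPECNAMELEN in stable/9 (sys/param.h) and should be checked against the tree in use; the helper name is hypothetical:

```python
SPECNAMELEN = 63  # assumption: value defined in stable/9 sys/param.h


def alias_fits(name: str) -> bool:
    """True when a device name fits within SPECNAMELEN characters."""
    return len(name) <= SPECNAMELEN


gptid = "gptid/abb586f5-da8d-11e2-aaaf-00259061b51a"
alias = "enc@n500304800122877d/type@0/slot@f/elmdesc@Slot_15/" + gptid

print(alias_fits(gptid))  # the plain gptid name
print(alias_fits(alias))  # the physical-path alias from the warning
```

The gptid name alone fits, but prefixing it with the enclosure path pushes it well past the limit, which is why the alias cannot be created while the device itself keeps working.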

--
Alexander Motin
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Marvell 88SE91Ax simple patch

2013-07-09 Thread Alexander Motin

On 09.07.2013 11:24, Dmitry Morozovsky wrote:

Alexander,

trying to activate eSATA port on my home file server I found that the following
simple patch seems to work -- could you please add it, hopefully before 9.2-R?

marck@hamster:/sys svn diff dev/ahci
Index: dev/ahci/ahci.c
===
--- dev/ahci/ahci.c (revision 252889)
+++ dev/ahci/ahci.c (working copy)
@@ -234,6 +234,7 @@
 {0x91301b4b, 0x00, "Marvell 88SE9130",  AHCI_Q_NOBSYRES|AHCI_Q_ALTSIG},
 {0x91721b4b, 0x00, "Marvell 88SE9172",  AHCI_Q_NOBSYRES},
 {0x91821b4b, 0x00, "Marvell 88SE9182",  AHCI_Q_NOBSYRES},
+   {0x91a01b4b, 0x00, "Marvell 88SE91Ax",  AHCI_Q_NOBSYRES},
 {0x92201b4b, 0x00, "Marvell 88SE9220",  AHCI_Q_NOBSYRES|AHCI_Q_ALTSIG},
 {0x92301b4b, 0x00, "Marvell 88SE9230",  AHCI_Q_NOBSYRES|AHCI_Q_ALTSIG},
 {0x92351b4b, 0x00, "Marvell 88SE9235",  AHCI_Q_NOBSYRES},


Committed to HEAD. Thanks.
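
For readers unfamiliar with the table format: the first field packs the PCI device ID into the upper 16 bits and the vendor ID into the lower 16 bits. A minimal sketch of that encoding and of a lookup follows. The dictionary and function names are hypothetical, and the real driver matches on the revision field as well:

```python
def split_pci_id(devid: int):
    """Split a combined 32-bit ID into (vendor, device)."""
    vendor = devid & 0xFFFF
    device = (devid >> 16) & 0xFFFF
    return vendor, device


# A few entries from the patch above, keyed by the combined ID.
QUIRKS = {
    0x91721B4B: ("Marvell 88SE9172", {"AHCI_Q_NOBSYRES"}),
    0x91821B4B: ("Marvell 88SE9182", {"AHCI_Q_NOBSYRES"}),
    0x91A01B4B: ("Marvell 88SE91Ax", {"AHCI_Q_NOBSYRES"}),
}


def lookup(devid: int):
    """Return (name, quirks) for a known controller, or None."""
    return QUIRKS.get(devid)


vendor, device = split_pci_id(0x91A01B4B)
print(hex(vendor), hex(device))
print(lookup(0x91A01B4B))
```

Without a matching entry the controller gets no quirks applied, which is why adding the single line was enough to make the port work.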

--
Alexander Motin


Re: 9.1-stable: ATI IXP600 AHCI: CAM timeout

2013-06-04 Thread Alexander Motin

On 03.06.2013 23:22, Jeremy Chadwick wrote:

On Mon, Jun 03, 2013 at 03:06:53PM +0100, Mike Pumford wrote:

Ian Lepore wrote:

On Wed, 2013-05-29 at 16:21 +0200, Oliver Fromme wrote:

Steven Hartland wrote:
   Have you checked your sata cables and psu outputs?
  
   Both of these could be the underlying cause of poor signalling.

I can't easily check that because it is a cheap rented
server in a remote location.

But I don't believe it is bad cabling or PSU anyway, or
otherwise the problem would occur intermittently all the
time if the load on the disks is sufficiently high.
But it only occurs at tags=3 and above.  At tags=2 it does
not occur at all, no matter how hard I hammer on the disks.

At the moment I'm inclined to believe that it is either
a bug in the HDD firmware or in the controller.  The disks
aren't exactly new, they're 400 GB Samsung ones that are
several years old.  I think it's not uncommon to have bugs
in the NCQ implementation in such disks.

The only thing that puzzles me is the fact that the problem
also disappears completely when I reduce the SATA rev from
II to I, even at tags=32.



It seems to me that you dismiss signaling problems too quickly.
Consider the possibilities... A bad cable leads to intermittent errors
at higher speeds.  When NCQ is disabled or limited the software handles
these errors pretty much transparently.  When NCQ is not limited and
there are many outstanding requests, suddenly the error handling in the
software breaks down somehow and a minor recoverable problem becomes an
in-your-face error.


It could also be a software bug in the way CAM handles the failure
of NCQ commands. When command queueing is used on a SCSI drive and a
queued command fails only that command fails. A queued command
failure on a SATA device fails ALL currently queued commands. I've
not looked at the code but do the SATA CAM drivers do the right
thing here?


Quoting T13/2015-D ATA8-ACS2 WD spec:

If an error occurs while the device is processing an NCQ command, then
the device shall return command aborted for all NCQ commands that are in
the queue and shall return command aborted for any new commands, except
a READ LOG EXT command requesting log address 10h, until the device
completes a READ LOG EXT command requesting log address 10h (i.e.,
reading the NCQ Command Error log) without error.

While I can't easily provide an answer to your question, I can tell you
that sys/dev/ahci/ahci.c does execute READ LOG EXT (command 0x2f) for
certain scenarios (the code is in function ahci_issue_recovery()).


I am not aware of any flaws in the present CAM ATA error recovery logic.
Sending READ LOG EXT is indeed implemented at the ahci(4) driver level
(same as in siis(4) and mvs(4)), since it was complicated or impossible to
do in shared code: the higher levels have no idea about the tag allocation
done by the lower-level drivers.
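
The device behavior described in the spec paragraph quoted above can be modeled as a tiny state machine. This is only a toy model of the device side, written for illustration; it is not ahci(4) or CAM code, and the class and method names are hypothetical:

```python
READ_LOG_EXT = "READ LOG EXT"
NCQ_ERROR_LOG = 0x10


class NcqDevice:
    """Toy model of a SATA device's NCQ error handling."""

    def __init__(self):
        self.queued = {}       # tag -> command name
        self.in_error = False

    def submit(self, command, tag=None, log_address=None):
        if self.in_error:
            # Only reading the NCQ Command Error log clears the error state.
            if command == READ_LOG_EXT and log_address == NCQ_ERROR_LOG:
                self.in_error = False
                return "completed"
            return "aborted"
        if command == READ_LOG_EXT:
            return "completed"
        self.queued[tag] = command
        return "queued"

    def fail(self, tag):
        """One queued command fails; every queued command is aborted."""
        if tag not in self.queued:
            raise KeyError(tag)
        aborted = sorted(self.queued)
        self.queued.clear()
        self.in_error = True
        return aborted


dev = NcqDevice()
for tag in range(3):
    dev.submit("READ FPDMA QUEUED", tag=tag)
print(dev.fail(1))                                          # all three tags
print(dev.submit("READ FPDMA QUEUED", tag=3))               # refused
print(dev.submit(READ_LOG_EXT, log_address=NCQ_ERROR_LOG))  # recovery
```

The model shows why recovery has to happen in the driver that owns the tags: after one failure it must know which tags were outstanding in order to requeue the commands that were aborted through no fault of their own.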



The one person who can answer this question is mav@, who is now CC'd.


Less commands queued makes it less likely that multiple commands
will be in progress when a failure occurs.  A lower link rate also
makes you more immune to signal failures.


He isn't seeing SATA-level signal/link failure; the AHCI driver would
complain about that, and those messages aren't there.  Unless, of
course, those messages are only visible when verbose booting is enabled
(I hope not).


Just a curious history point: I had one old system on the NVIDIA MCP55
chipset where Linux had worked well before, but FreeBSD had problems with
SATA -- all disk transfers were really slow, but without any errors being
reported, and after some point the system started to hang. That series of
chipsets had a long history of problems, so for some time I was looking
for some way to handle it in software. But after many experiments I
accidentally found out that disabling 6 small but very powerful fans
worked around the problem. I checked the PSU voltages, and they were fine.
Switching the fans to a separate PSU also helped. Finally I just replaced
the system's main PSU with a different one and the problems were gone. My
best guess was that the capacitors in that PSU, due to old age, were unable
to filter the fans' electric noise, which started to interfere with SATA
and later other signals. Now the same PSU works perfectly fine in the same
case with a smaller Atom-based motherboard without any issues.


I am not saying that the ahci(4) driver is perfect, but hardware issues are
always possible even if the system worked fine before.


--
Alexander Motin


Re: ada(4) and ahci(4) quirk printing

2013-04-23 Thread Alexander Motin

On 22.04.2013 08:14, Jeremy Chadwick wrote:

I've written the following patches and done the following testing (see
the results.*.txt files):

http://jdc.koitsu.org/freebsd/quirk_printing/

Important: these are against stable/9 r249715.

Folks are welcome to try these; I've tested about as best as I can.

Questions/comments for Alexander and Kenneth:

1. I'm not sure if the location of where I added the printf() code is
correct or not,


It seems fine for me.


2. Not sure if loader.conf(5) forced-quirks would show up here or not,


As I see, they will.


3. It would be nice to have the same for SCSI da(4).  I took a stab at
this but the printing code I wrote never got called (or the quirks entry
I added wasn't right, not sure which),

4. I strongly believe quirk printing should be shown *without* verbose
booting.  I say this because I noticed some of the CAPAB printf()s only
get shown if bootverbose is true.  In fact, it's what prompted me to
open PR 178040 (My Intel 320 and 510-series SSDs don't show 4K quirks,
yet advertise 512 logical and physical in IDENTIFY?!  PR time!).


Let me disagree. bootverbose keeps dmesg readable for the average user,
while quirks are specific driver workarounds and their names may confuse
more than really help. If every driver printed its quirks, dmesg would be
twice as big. That is what bootverbose is for.


--
Alexander Motin


Re: ada(4) and ahci(4) quirk printing

2013-04-23 Thread Alexander Motin

On 23.04.2013 12:26, Jeremy Chadwick wrote:

On Tue, Apr 23, 2013 at 10:44:57AM +0300, Alexander Motin wrote:

On 22.04.2013 08:14, Jeremy Chadwick wrote:

I've written the following patches and done the following testing (see
the results.*.txt files):

http://jdc.koitsu.org/freebsd/quirk_printing/

Important: these are against stable/9 r249715.

Folks are welcome to try these; I've tested about as best as I can.

Questions/comments for Alexander and Kenneth:

1. I'm not sure if the location of where I added the printf() code is
correct or not,


It seems fine to me.


2. Not sure if loader.conf(5) forced-quirks would show up here or not,


As far as I can see, they will.


3. It would be nice to have the same for SCSI da(4).  I took a stab at
this but the printing code I wrote never got called (or the quirks entry
I added wasn't right, not sure which),

4. I strongly believe quirk printing should be shown *without* verbose
booting.  I say this because I noticed some of the CAPAB printf()s only
get shown if bootverbose is true.  In fact, it's what prompted me to
open PR 178040 (My Intel 320 and 510-series SSDs don't show 4K quirks,
yet advertise 512 logical and physical in IDENTIFY?!  PR time!).


Let me disagree. bootverbose keeps dmesg readable for the average user,
while quirks are specific driver workarounds and their names may
confuse more than really help. If every driver printed its quirks,
dmesg would be twice as big. That is what bootverbose is for.


I'm willing to bend on this assuming that userland has a way to display
the quirks.  I've already had one user contact me off-list stating that
displaying of quirks is useful to them, but *without* bootverbose
(because bootverbose shows too much information for them to have to sift
through).  And display of quirks (or in this case) was what prompted me
to create PR 178040, since I had just *assumed* FreeBSD had 4K quirks in
place for both models of SSDs.

I think sysctl would be an ideal place for this.  Is it possible to
export active device quirks to sysctl (say kern.cam.ada.X.quirks),
read-only, and preferably as a string (same printf() style used)?  Or
does that introduce complexities?

If we can't reach an agreement, I'm happy to wrap the relevant bits with
an if (bootverbose), but I really feel users should have some way to
see this information outside of bootverbose.


Both the da and ada drivers already have sysctls. It should be trivial to
add one more, especially if it is just numeric.


--
Alexander Motin


Re: ada(4) and ahci(4) quirk printing

2013-04-23 Thread Alexander Motin

On 23.04.2013 13:49, Jeremy Chadwick wrote:

On Tue, Apr 23, 2013 at 12:29:10PM +0300, Alexander Motin wrote:

On 23.04.2013 12:26, Jeremy Chadwick wrote:

On Tue, Apr 23, 2013 at 10:44:57AM +0300, Alexander Motin wrote:

On 22.04.2013 08:14, Jeremy Chadwick wrote:

I've written the following patches and done the following testing (see
the results.*.txt files):

http://jdc.koitsu.org/freebsd/quirk_printing/

Important: these are against stable/9 r249715.

Folks are welcome to try these; I've tested about as best as I can.

Questions/comments for Alexander and Kenneth:

1. I'm not sure if the location of where I added the printf() code is
correct or not,


It seems fine to me.


2. Not sure if loader.conf(5) forced-quirks would show up here or not,


As far as I can see, they will.


3. It would be nice to have the same for SCSI da(4).  I took a stab at
this but the printing code I wrote never got called (or the quirks entry
I added wasn't right, not sure which),

4. I strongly believe quirk printing should be shown *without* verbose
booting.  I say this because I noticed some of the CAPAB printf()s only
get shown if bootverbose is true.  In fact, it's what prompted me to
open PR 178040 (My Intel 320 and 510-series SSDs don't show 4K quirks,
yet advertise 512 logical and physical in IDENTIFY?!  PR time!).


Let me disagree. bootverbose keeps dmesg readable for the average user,
while quirks are specific driver workarounds and their names may
confuse more than really help. If every driver printed its quirks,
dmesg would be twice as big. That is what bootverbose is for.


I'm willing to bend on this assuming that userland has a way to display
the quirks.  I've already had one user contact me off-list stating that
displaying of quirks is useful to them, but *without* bootverbose
(because bootverbose shows too much information for them to have to sift
through).  And display of quirks (or in this case) was what prompted me
to create PR 178040, since I had just *assumed* FreeBSD had 4K quirks in
place for both models of SSDs.

I think sysctl would be an ideal place for this.  Is it possible to
export active device quirks to sysctl (say kern.cam.ada.X.quirks),
read-only, and preferably as a string (same printf() style used)?  Or
does that introduce complexities?

If we can't reach an agreement, I'm happy to wrap the relevant bits with
an if (bootverbose), but I really feel users should have some way to
see this information outside of bootverbose.


Both the da and ada drivers already have sysctls. It should be trivial
to add one more, especially if it is just numeric.


I was hoping for an ASCII string, specifically something like what's
outputted in my patches, i.e.:

kern.cam.ada.2.quirks: 0x1<4K>

And ideally it'd be nice to have the same thing for ahci(4), which right
now doesn't appear to have anything other than the dev.ahci.X.%xxx tree
stuff (which I think is handled by the device registration stuff, not
the ahci driver natively).  I'll worry about that later.

The problem with just leaving it as a numeric is that it doesn't provide
the user with any idea of what the value represents.  They're forced to
go through the source code + decode the numeric into its bit values and
figure out what's what.


I didn't say that it is impossible. I would just prefer not to
complicate the code too much with rarely used debugging features.



I'm pretty sure I can work this into sys/cam/ata/ata_da.c (looking at
read_ahead as an example, though using SYSCTL_PROC not SYSCTL_INT, and
for how SYSCTL_PROC works with this type of thing, referring to
machdep.c for an example), but it'd be my first time doing any of this.

I'll give it a shot.  I really need to get myself a SFF PC for FreeBSD
just for testing these types of things, unless FreeBSD has some magical
way to test a kernel on a live system without having to reboot.
(Sounds like black magic to me ;-) )


Virtual machine?

--
Alexander Motin


Re: Any objections/comments on axing out old ATA stack?

2013-04-21 Thread Alexander Motin

On 21.04.2013 00:29, Jeremy Chadwick wrote:

- The ATA commands which lead up to the error also vary.  Many are for
   write requests, and from some entries I can see that the OS was doing
   NCQ writes (WRITE FPDMA QUEUED) and then suddenly decided to do a
   classic 28-bit LBA write (WRITE DMA).  I'm not sure why an OS would do
   this (there's nothing optimal about it) unless there were conditions
   occurring where the OS/ATA driver said this NCQ write isn't working
   (timeout, etc.), let me retry with a classic 28-bit LBA write.


The ATA disk driver in CAM inserts a non-queued command every several seconds
of continuous load to limit possible command starvation inside the disk.
The SCSI driver does a similar thing, but instead of a different command it
sets an ordered command flag, which does not exist in SATA.


--
Alexander Motin


Re: Any objections/comments on axing out old ATA stack?

2013-04-21 Thread Alexander Motin
ATA controller drivers delay conflicting commands, avoiding conflicts in
the device.
On 21.04.2013 14:32, Jeremy Chadwick j...@koitsu.org wrote:

 On Sun, Apr 21, 2013 at 02:11:04PM +0300, Alexander Motin wrote:
  On 21.04.2013 00:29, Jeremy Chadwick wrote:
  - The ATA commands which lead up to the error also vary.  Many are for
 write requests, and from some entries I can see that the OS was doing
 NCQ writes (WRITE FPDMA QUEUED) and then suddenly decided to do a
 classic 28-bit LBA write (WRITE DMA).  I'm not sure why an OS would do
 this (there's nothing optimal about it) unless there were conditions
 occurring where the OS/ATA driver said this NCQ write isn't working
 (timeout, etc.), let me retry with a classic 28-bit LBA write.
 
  The ATA disk driver in CAM inserts a non-queued command every several
  seconds of continuous load to limit possible command starvation
  inside the disk. The SCSI driver does a similar thing, but instead of
  a different command it sets an ordered command flag, which does not
  exist in SATA.

 Thanks for the insights Alexander, greatly appreciated.

 I'm a little confused by your description, because if I'm reading it
 right, it sounds like it conflicts with what the ACS-2 spec states.
 Quoting T13/2015-D rev 3 (I'm aware it's a working draft), section
 4.16.1:

 If the device receives a command that is not an NCQ command while NCQ
 commands are in the queue, then the device shall return command aborted
 for the new command and for all of the NCQ commands that are in the
 queue.

 I assume this means ABRT status is returned to the host controller; if
 so (and by design of course), how do we differentiate between that
 condition and any other I/O condition that induces ABRT?

 Possibly the answer is in this admission: I should probably get
 around to reading ATA8-AST sometime.  :-)

 --
 | Jeremy Chadwick                                   j...@koitsu.org |
 | UNIX Systems Administrator                http://jdc.koitsu.org/ |
 | Mountain View, CA, US                                            |
 | Making life hard for others since 1977.             PGP 4BD6C0CB |


Re: Lost CDROM on 9.1 with ATA_CAM on Promise controller

2013-04-18 Thread Alexander Motin

On 17.04.2013 12:47, Andre Albsmeier wrote:

On Wed, 17-Apr-2013 at 10:53:54 +0200, Jeremy Chadwick wrote:

On Wed, Apr 17, 2013 at 08:26:00AM +0200, Andre Albsmeier wrote:

On Tue, 16-Apr-2013 at 21:38:22 +0200, Jeremy Chadwick wrote:

On Tue, Apr 16, 2013 at 07:55:20PM +0200, Andre Albsmeier wrote:

I have lost one of my CDROM drives (HL-DT-STDVD-RAM GH22LP20/2.00)
after going from 7.4 to 9.1 when using ATA_CAM. It is attached to
a Promise PDC20268 UDMA100 controller. A standard harddisk drive
attached to this controller works well. Cables, controller and drive
were replaced already.

Kernel gives me:

atapci1: Promise PDC20268 UDMA100 controller port 
0xb000-0xb007,0xa800-0xa803,0xa400-0xa407,0xa000-0xa003,0x9800-0x980f mem 
0xdf80-0xdf803fff irq 11 at device 12.0 on pci0
ata2: ATA channel at channel 0 on atapci1
ata3: ATA channel at channel 1 on atapci1
...
ada0 at ata2 bus 0 scbus2 target 0 lun 0
ada0: Maxtor 7B300R0 BAH41G10 ATA-7 device
ada0: 100.000MB/s transfers (UDMA5, PIO 8192bytes)
ada0: 286188MB (586114704 512 byte sectors: 16H 63S/T 16383C)
...
(cd2:ata3:0:0:0): got CAM status 0x50
(cd2:ata3:0:0:0): fatal error, failed to attach to device
(cd2:ata3:0:0:0): lost device, 4 refs
(cd2:ata3:0:0:0): removing device entry
...

Attaching the CDROM drive to the controller that is integrated on
the mainboard (Intel PIIX4 UDMA33 controller) does not show this
problem (but here I don't have UDMA66).

It also works when not using ATA_CAM:

...
acd0: DVDR HL-DT-STDVD-RAM GH22LP20/2.00 at ata3-master UDMA66
...

So this seems to be a problem with the Promise controller and ATA_CAM.

Any ideas? Or should I file a PR?


The controller in question is a Promise Ultra100 TX2.


Right. Tried with an Ultra133, same effect.



The error message comes from sys/cam/scsi/scsi_cd.c, in function
cddone().  The logic is a little hard for me to follow (I understand
about 70% of it).  Look at lines 1724 to 1877 for stable/9.

1. Can you provide full output from a verbose boot when the CD/DVD drive
is attached to the Promise controller?


Attached below. I have just filtered out some ahc cruft...

Later I will try to boot a -current kernel -- just to see
how this behaves...



2. What firmware version is the card using?  The PDC20268 had many, many
firmware problems relating to ATAPI devices.


It is the latest BIOS: 2.20.0.15.



3. I wouldn't worry about ATA66 vs. ATA33; this drive can only support
up to about 22MBytes/second so ATA66 isn't going to get you anything,
so as a workaround, using the PIIX4 for it would not hurt you.


Probably. But I already had cdrecord complain when it
came to the funky DMA speed test it is doing. It went
away when using the UDMA66 port. And on the other hand
I sometimes use the PIIX4 port for other stuff and I
do not want to attach the cdrom to the slave port.



4. ONLY if this turns out to be a controller thing: I'm not sure how
much effort should be spent trying to make this work, as the PDC20268 is
legacy/deprecated hardware (made/released 13 years ago).


The whole box is more than 13 years old (good old Asus BX board) ;-)

But since it worked in 7.4-STABLE I feel that this is some kind
of regression. I do not want to waste anyone's resources in fixing
it -- just if someone is curious and/or has an idea how to fix
it...

And here is the dmesg:

{snipping for mail brevity}


Thanks.  CC'd ken@ and mav@ for advice on this.  Here's the dmesg:

http://lists.freebsd.org/pipermail/freebsd-stable/2013-April/073131.html

Short details:

The device under scrutiny here is cd2 on ata3, which is an ATAPI
IDE-based optical drive.  The drive works when either:

a) Connected to a different IDE controller (atapci0), or,
b) When ATA_CAM is removed (i.e. use ata(4) exclusively).


And just as a note: The -current kernel from

https://snapshots.glenbarber.us/Latest/FreeBSD-10.0-CURRENT-i386-20130316-r248381-bootonly.iso

shows the same problem...


Some Promise controllers are known to have problems with ATAPI DMA.
Have you tried disabling DMA on that channel or device with a loader
tunable like hint.ata.3.mode=PIO4?


--
Alexander Motin


Re: Any objections/comments on axing out old ATA stack?

2013-04-03 Thread Alexander Motin

On 02.04.2013 21:39, Matthias Andree wrote:

Am 31.03.2013 23:02, schrieb Scott Long:


So what I hear you and Matthias saying, I believe, is that it should be easier
to force disks to fall back to non-NCQ mode, and/or have a more responsive
black-list for problematic controllers.  Would this help the situation?  It's
hard to justify holding back overall forward progress because of some bad
controllers; we do several Tbps off of AHCI controllers with NCQ enabled on
FreeBSD 9.x, enough to make up a sizable percentage of the internet's traffic,
and we see no problems.  How can we move forward but also take care of you guys
with problematic hardware?


Well, I am running the driver fine off of my WD Caviar RE3 disk, and the
problematic drive also works just fine with Windows and Linux, so it
must be something between the problematic drive and the FreeBSD driver.

I would like to see any of this, in decreasing order of precedence:

- debugged driver

- assistance/instructions on helping how to debug the driver/trace NCQ
stuff/...  (as in Jeremy Chadwick's followup in this same thread - this
helps, I will attempt to procure the required information; back then,
reducing the number of tags to 31 was ineffective, including an error
message and getting a value of 32 when reading the setting back)


Unfortunately, I don't know how to debug that. The command timeouts reported
on the lists before are the kind of errors that are most difficult to
diagnose, since the controller gives no information to go on. We just
see that the commands we sent are no longer completing. Maybe it is some
incompatibility between specific drive and HBA firmware, triggered by some
innocent detail of our ATA stack, GEOM or filesystem implementation.
All I can propose is to try to identify such cases and add some quirks
to work around them, like disabling NCQ or limiting the number of tags. I am
not sure what else we can do about it without a controlled lab
environment with the affected hardware and a SATA analyzer.



- user-space contingency features, such as letting camcontrol limit
the number of open NCQ tags, or disable NCQ, either on a per-drive basis


I merged support for that into 8/9-STABLE about 9 months ago:
`camcontrol tags ada0 -v -N X` should change the number of simultaneously
used tags,
`camcontrol negotiate ada0 -T (en|dis)able` should enable/disable use of
NCQ.
I just did some tests on HEAD and these commands seem to work. If
you can reproduce the problem, it would be nice to collect information
on how these changes affect it.



I am capable of debugging C - mostly with gdb command-line, and
graphical Windows IDEs - but am unfamiliar with FreeBSD kernel
debugging. If necessary, I can pull up a second console, but the PC that
is affected is legacy-free, so serial port only works through a
serial/USB converter.



--
Alexander Motin


Re: Any objections/comments on axing out old ATA stack?

2013-03-31 Thread Alexander Motin

On 31.03.2013 08:13, Ian Smith wrote:

On Sat, 30 Mar 2013 21:00:24 -0700, Peter Wemm wrote:
   On Sat, Mar 30, 2013 at 4:29 PM, Matthias Andree mand...@freebsd.org wrote:
Am 27.03.2013 22:22, schrieb Alexander Motin:
Hi.
   
Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA
stack, using only some controller drivers of old ata(4) by having
`options ATA_CAM` enabled in all kernels by default. I have a wish to
drop non-ATA_CAM ata(4) code, unused since that time from the head
branch to allow further ATA code cleanup.
   
Does anyone here still use the legacy ATA stack (kernel explicitly built
without `options ATA_CAM`) for some reason, for example as a workaround
for some regression? Does anybody have good ideas why we should not drop
it now?
   
Alexander,
   
The regression in http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/157397
where the SATA NCQ slots stall for some Samsung drives in the new stack,
and consequently hang the computer for prolonged episodes where it is in
the NCQ error handling, disallows removal of the old driver. (Last
checked with 9.1-RELEASE at current patchlevel.)
  
   We're talking about 10.x, so if you want it fixed, you need update
   with 10.x information.
  
   Please put 10.x diagnostics in the PR.

Given Alexander also posted this to -stable, just for clarity, are we
_only_ talking about 10.x here, or might this change get MFC'd to 9?


Yes, I am only going to drop it from 10.x, but bug reports from 9-STABLE 
users are welcome, as at some point they will become 10.x users.


--
Alexander Motin


Re: Any objections/comments on axing out old ATA stack?

2013-03-28 Thread Alexander Motin

On 28.03.2013 02:43, Adrian Chadd wrote:

My main concern with the new stuff is that it requires CAM and that's
reasonably big compared to the standalone ATA code.

It'd be nice if we could slim down the CAM stack a bit first; it makes
embedding it on the smaller devices really freaking painful.


Are there many boards now with ATA, but without USB? But I agree, it 
should be checked.


--
Alexander Motin


Any objections/comments on axing out old ATA stack?

2013-03-27 Thread Alexander Motin

Hi.

Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA 
stack, using only some controller drivers of old ata(4) by having 
`options ATA_CAM` enabled in all kernels by default. I have a wish to 
drop non-ATA_CAM ata(4) code, unused since that time from the head 
branch to allow further ATA code cleanup.


Does anyone here still use the legacy ATA stack (kernel explicitly built
without `options ATA_CAM`) for some reason, for example as a workaround
for some regression? Does anybody have good ideas why we should not drop
it now?


--
Alexander Motin


Re: Any objections/comments on axing out old ATA stack?

2013-03-27 Thread Alexander Motin

On 27.03.2013 23:32, Steve Kargl wrote:

On Wed, Mar 27, 2013 at 11:22:14PM +0200, Alexander Motin wrote:

Hi.

Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA
stack, using only some controller drivers of old ata(4) by having
`options ATA_CAM` enabled in all kernels by default. I have a wish to
drop non-ATA_CAM ata(4) code, unused since that time from the head
branch to allow further ATA code cleanup.

Does anyone here still use the legacy ATA stack (kernel explicitly built
without `options ATA_CAM`) for some reason, for example as a workaround
for some regression?


Yes, I use the legacy ATA stack.


On 9.x or HEAD where new one is default?


Does anybody have good ideas why we should not drop
it now?


Because it works?


Any problems with new one?

--
Alexander Motin


Re: Any objections/comments on axing out old ATA stack?

2013-03-27 Thread Alexander Motin

On 28.03.2013 00:05, Steve Kargl wrote:

On Wed, Mar 27, 2013 at 11:35:35PM +0200, Alexander Motin wrote:

On 27.03.2013 23:32, Steve Kargl wrote:

On Wed, Mar 27, 2013 at 11:22:14PM +0200, Alexander Motin wrote:

Hi.

Since FreeBSD 9.0 we are successfully running on the new CAM-based ATA
stack, using only some controller drivers of old ata(4) by having
`options ATA_CAM` enabled in all kernels by default. I have a wish to
drop non-ATA_CAM ata(4) code, unused since that time from the head
branch to allow further ATA code cleanup.

Does anyone here still use the legacy ATA stack (kernel explicitly built
without `options ATA_CAM`) for some reason, for example as a workaround
for some regression?


Yes, I use the legacy ATA stack.


On 9.x or HEAD where new one is default?


Head.


Does anybody have good ideas why we should not drop
it now?


Because it works?


Any problems with new one?



Last time I tested the new one, and this was several months
ago, the system (a Dell Latitude D530 laptop) would not boot.


Probably we should just fix that. Any more info?

--
Alexander Motin


Re: Old ICH7 SATA-2 question

2013-02-24 Thread Alexander Motin
 printf("%s%d: %d.%03dMB/s transfers",
     periph->periph_name, periph->unit_number,
     mb, speed % 1000);
 
 The if() statement that is being used in Michael's case is the one for
 XPORT_SATA, not XPORT_PATA; that will be proven further below.  I then
 had two questions:
 
 1. Where does base_transfer_speed get set?
 
 For SATA devices, it gets set in sys/dev/ata/ata-all.c (I think).  The
 default value chosen is 15:
 
 if (ch->flags & ATA_SATA)
         cpi->base_transfer_speed = 15;
 else
         cpi->base_transfer_speed = 3300;

Right. It is the lowest possible speed supported by this HBA.
It is reported if we have no other sources of information.

 2. Where does CTS_SATA_VALID_REVISION get set, which can in effect
 override base_transfer_speed?
 
 The jury is still out on this one as you'll see.
 
 Now on to the protocol revision printing code, i.e. SATA 2.x --
 remember we're talking about the negotiated speed/protocol, not what's
 returned from ATA IDENTIFY (e.g. camcontrol identify) for the disk.
 
 if (cts.ccb_h.status == CAM_REQ_CMP && cts.transport == XPORT_SATA) {
         struct ccb_trans_settings_sata *sata =
             &cts.xport_specific.sata;

         printf(" (");
         if (sata->valid & CTS_SATA_VALID_REVISION)
                 printf("SATA %d.x, ", sata->revision);
         else
                 printf("SATA, ");
         if (sata->valid & CTS_SATA_VALID_MODE)
                 printf("%s, ", ata_mode2string(sata->mode));
         if ((sata->valid & CTS_ATA_VALID_ATAPI) && sata->atapi != 0)
                 printf("ATAPI %dbytes, ", sata->atapi);
         if (sata->valid & CTS_SATA_VALID_BYTECOUNT)
                 printf("PIO %dbytes", sata->bytecount);
         printf(")");
 }
 printf("\n");
 
 Here we can see that XPORT_SATA must be set, because Michael's kernel
 output clearly shows the above printf()s.
 
 But once again we're back to CTS_SATA_VALID_REVISION.  Without
 CTS_SATA_VALID_REVISION being set, ata_xpt.c chooses to simply say
 SATA.  That's all -- just SATA.  And that is what Michael and others
 with this chip see.
 
 The question is, simply, why does this model of ICH7 result in the
 bit CTS_SATA_VALID_REVISION, in the valid member of the appropriate
 ccb_trans_settings_sata struct, not being set correctly.

ICH7 SATA may be configured by the BIOS in three different ways:
 1. PCI BAR(5) points to the standard set of AHCI registers. In that
case the controller will be able to work as AHCI, and real speeds will be
reported by the ahci(4) driver and printed as SATA x.0.
 2. PCI BAR(5) points to a vendor-specific set of SATA registers. In
that case the controller will work mostly as legacy ATA with the ata(4) driver,
but the code in chipset/ata-intel.c will be able to use the vendor-specific
registers to report speed, which again will be printed as SATA x.0.
 3. PCI BAR(5) is not set at all (ctlr->r_res2 == NULL). In that case the
controller will work as pure legacy ATA with the ata(4) driver. The code in
chipset/ata-intel.c will still believe it is SATA, based on the chip
ID, but it will have no idea what is going on at the SATA level.
In that case just "SATA" will be printed and cpi->base_transfer_speed is
used by CAM to report speed.

-- 
Alexander Motin


Re: WRITE_FPDMA_QUEUED CAM status: ATA Status Error

2012-12-17 Thread Alexander Motin

Hi.

On 18.12.2012 00:07, Mike Tancsa wrote:

Is there a way to tell / narrow down if an issue with errors like below
are due to a bad cable or bad port multiplier ?  The disks in a
particular cage are throwing errors like these below.  (RELENG9 from today)


The controller, the port multiplier and the disks are all firmware-based
devices. Any of them may have firmware problems that are not possible
to diagnose from outside. When the controller is talking to a disk, the
multiplier is transparent, so it may be impossible to say where exactly
the problem happens. Speaking about cables and physical links, the only
kind of information I can think of for checking the physical link is the
counters shown below:



SATA Phy Event Counters (GP Log 0x11)
ID      Size  Value  Description
0x0001  2         1  Command failed due to ICRC error
0x0002  2         1  R_ERR response for data FIS
0x0003  2         0  R_ERR response for device-to-host data FIS
0x0004  2         1  R_ERR response for host-to-device data FIS
0x0005  2         0  R_ERR response for non-data FIS
0x0006  2         0  R_ERR response for device-to-host non-data FIS
0x0007  2         0  R_ERR response for host-to-device non-data FIS
0x000a  2         0  Device-to-host register FISes sent due to a COMRESET
0x000b  2         1  CRC errors within host-to-device FIS
0x8000  4      7720  Vendor specific


They may be reported by disks. IIRC they may also be reported by the port
multiplier, but I've never tried to access them and haven't seen any
existing tools for it, other than bit-banging with camcontrol.
Whether the controller can report something similar, I don't remember.


--
Alexander Motin


Re: Samsung SSD 840 PRO fails to probe

2012-11-26 Thread Alexander Motin

Hi.

On 26.11.2012 20:51, Adam McDougall wrote:

My co-worker ordered a Samsung 840 PRO series SSD for his desktop but we
found 9.0-rel would not probe it and 9.1-rc3 shows some errors.  I got
past the problem with a workaround of disabling AHCI mode in the BIOS
which drops it to IDE mode and it detects fine, although runs a little
slower.  Is there something I can try to make it probe properly in AHCI
mode?  We also tried moving it to the SATA data and power cables from
the working SATA HD so I don't think it is the port or controller
driver.  The same model motherboard from another computer did the same
thing.  Thanks.

dmesg line when it is working:
ada0: Samsung SSD 840 PRO Series DXM03B0Q ATA-9 SATA 3.x device

dmesg lines when it is not working: (hand transcribed from a picture)
(aprobe0:ahcich0:0:0): SETFEATURES ENABLE SATA FEATURE. ACB: ef 10 00 00
00 40 00 00 00 00 05 00
(aprobe0:ahcich0:0:0): CAM status: ATA Status Error
(aprobe0:ahcich0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT )
(aprobe0:ahcich0:0:0): RES: 51 04 00 00 00 40 00 00 00 00 00
(aprobe0:ahcich0:0:0): Retrying command
(aprobe0:ahcich0:0:0): SETFEATURES ENABLE SATA FEATURE. ACB: ef 10 00 00
00 40 00 00 00 00 05 00
(aprobe0:ahcich0:0:0): CAM status: ATA Status Error
(aprobe0:ahcich0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT )
(aprobe0:ahcich0:0:0): RES: 51 04 00 00 00 40 00 00 00 00 00
(aprobe0:ahcich0:0:0): Error 5, Retries exhausted


I believe that is a firmware bug in the SSD. It probably declares support for
SATA Asynchronous Notifications in its IDENTIFY data, but returns an error
on an attempt to enable it. Switching the controller to legacy mode disables
that functionality and so works as a workaround. The patch below should
work around the problem from the OS side:


--- ata_xpt.c   (revision 243561)
+++ ata_xpt.c   (working copy)
@@ -745,6 +745,14 @@ probedone(struct cam_periph *periph, union ccb *do
 			goto noerror;
 
 		/*
+		 * Some Samsung SSDs report supported Asynchronous Notification,
+		 * but return ABORT on attempt to enable it.
+		 */
+		} else if (softc->action == PROBE_SETAN &&
+		    status == CAM_ATA_STATUS_ERROR) {
+			goto noerror;
+
+		/*
 		 * SES and SAF-TE SEPs have different IDENTIFY commands,
 		 * but SATA specification doesn't tell how to identify them.
 		 * Until better way found, just try another if first fail.

--
Alexander Motin


Re: Increasing the DMESG buffer....

2012-11-24 Thread Alexander Motin

On 25.11.2012 01:43, Adrian Chadd wrote:

I'm surprised it's not tunable via a kenv variable at boottime..


It is tunable. AFAIR this is it:
kern.msgbufsize=65536   # Set size of kernel message buffer

--
Alexander Motin




Re: Increasing the DMESG buffer....

2012-11-22 Thread Alexander Motin

On 22.11.2012 12:53, Ian Smith wrote:

On Wed, 21 Nov 2012 23:12:17 -0800, Adrian Chadd wrote:
   On 21 November 2012 20:16, Ian Smith smi...@nimnet.asn.au wrote:
On Wed, 21 Nov 2012 12:08:42 -0800, Adrian Chadd wrote:
[..]
T61_dmesg.boot.10.works (file 1 of 2) lines 1813-1861/1861 byte 82415/82415
   
Cutting just the hdaa0, pcm0 and pcm1 stuff results in:
   
hda_pcm.verbose (file 2 of 2) lines 712-760/760 byte 28531/28531
  
   Is there a way to extract this topology information out of the driver
   without putting it in the verbose output?

We should be asking Alexander, cc'd.  I only have a snd_ich here, where
hw.snd.verbose=3 is as rich as it gets, 105 lines incl. file versions.


Neither ICH nor any other driver I know has an amount of information
comparable to what HDA hardware provides, so the analogy is not a good one.
Given that most CODECs have no published datasheets, that
information is the only input for debugging.


snd_hda also uses hw.snd.verbose=3, but there it is used for even deeper
driver debugging. It also enables a lot of debugging in sound(4), which
can be too verbose for HDA debugging.


I will check again how it can be reorganized, but I think that the
real problem is not in HDA. We need some way to structure and filter the
output.


--
Alexander Motin


Re: stable/9 @r241776 panic: REDZONE: Buffer underflow detected...

2012-10-21 Thread Alexander Motin

On 21.10.2012 20:40, Konstantin Belousov wrote:

On Sun, Oct 21, 2012 at 09:46:34AM -0700, David Wolfskill wrote:

On Sun, Oct 21, 2012 at 09:33:22AM -0700, David Wolfskill wrote:

...
So I tried reverting 241749 ... and I failed to reproduce the problem.

Well, one boot out of one, at least.  I'll try a few more reality
checks, and report back if a correction is in order.  But (for now, at
least), it looks to me as if 241749 is presenting a problem on this
laptop.
...


5 for 5.  I'm convinced that 241749 causes problems on this laptop for
attempts to boot without a stop in single-user mode first.

(So that sounds like a timing issue, somehow.)

And thanks again, Konstantin!


I do not know/do not understand the CAM code, the question shall
be addressed to Alexander. It still might be a false positive.


I don't see how increasing a buffer size by a few bytes in the mentioned change
could cause memory corruption in some other place. I guess the change may be
just an innocent witness that affected some memory placement, moving some
existing corruption from one area to another where it was noticed.


I am curious how to interpret the phrase 42=94966796 bytes allocated in the
log. Maybe it is just corrupted output, but the number still seems
quite big, especially for an i386 system, making me think about some
integer overflow. David, could you write down that part once more?


Having a few more lines of the Allocation backtrace: could also be useful.

Could you show your kernel config? I can try to run it on my test
system, hoping to reproduce the problem.


--
Alexander Motin


Re: stable/9 @r241776 panic: REDZONE: Buffer underflow detected...

2012-10-21 Thread Alexander Motin

On 21.10.2012 23:23, David Wolfskill wrote:

On Sun, Oct 21, 2012 at 09:28:06PM +0300, Alexander Motin wrote:

...
I am curious how to interpret the phrase 42=94966796 bytes allocated in the
log. Maybe it is just corrupted output, but the number still seems
quite big, especially for an i386 system, making me think about some
integer overflow. David, could you write down that part once more?

Having a few more lines of the Allocation backtrace: could also be useful.

Could you show your kernel config? I can try to run it on my test
system, hoping to reproduce the problem.
...


I've used your kernel config, and my test system was unable to boot from
NFS, while the GENERIC kernel boots fine. I didn't get a panic, but the boot just
stopped at root mounting. You have so many options specified there that I
can't predict which of them could cause this. Now I am trying to binary
search for the problematic one(s).


--
Alexander Motin


Re: stable/9 @r241776 panic: REDZONE: Buffer underflow detected...

2012-10-21 Thread Alexander Motin

On 22.10.2012 01:03, Alexander Motin wrote:

On 21.10.2012 23:23, David Wolfskill wrote:

On Sun, Oct 21, 2012 at 09:28:06PM +0300, Alexander Motin wrote:

...
I am curious how to interpret the phrase 42=94966796 bytes allocated in the
log. Maybe it is just corrupted output, but the number still seems
quite big, especially for an i386 system, making me think about some
integer overflow. David, could you write down that part once more?

Having a few more lines of the Allocation backtrace: could also be useful.

Could you show your kernel config? I can try to run it on my test
system, hoping to reproduce the problem.
...


I've used your kernel config, and my test system was unable to boot from
NFS, while the GENERIC kernel boots fine. I didn't get a panic, but the boot just
stopped at root mounting. You have so many options specified there that I
can't predict which of them could cause this. Now I am trying to binary
search for the problematic one(s).


Sorry, false alarm. It was just the closed firewall in your kernel config.
Without it my test system boots your kernel without any problem.


--
Alexander Motin


Re: time keeps on slipping... slipping...

2012-10-11 Thread Alexander Motin

On 11.10.2012 09:30, John-Mark Gurney wrote:

Alexander Motin wrote this message on Thu, Oct 11, 2012 at 01:43 +0300:

On 08.10.2012 07:02, John-Mark Gurney wrote:

I recently put together a new machine w/ a SuperMicro H8SCM and an
AMD Opteron 4228 HE...  I'm having an issue where the clock on the
machine skips around...  The weird part is that it's very sudden when
it happens...  ntp sometimes brings it back, but it can't when the clock
gets too far ahead (1000 seconds), ntp dies...

In order to catch it happening, I ran a sleep 60 loop fetching time
from another server that keeps time correctly via:
while sleep 60; do echo -n h2:; nc h2 13; date; ntpdate h2.funkthat.com; done

here are some snippets:
h2:Sun Oct  7 17:12:54 2012^M
Sun Oct  7 17:12:54 PDT 2012
  7 Oct 17:12:54 ntpdate[31036]: the NTP socket is in use, exiting
h2:Sun Oct  7 17:13:48 2012^M
Sun Oct  7 17:20:21 PDT 2012
  7 Oct 17:20:21 ntpdate[31045]: the NTP socket is in use, exiting

but then ntp brings it back in sync:
h2:Sun Oct  7 17:28:49 2012^M
Sun Oct  7 17:35:21 PDT 2012
  7 Oct 17:35:21 ntpdate[31164]: the NTP socket is in use, exiting
h2:Sun Oct  7 17:29:49 2012^M
Sun Oct  7 17:29:49 PDT 2012
  7 Oct 17:29:49 ntpdate[31170]: the NTP socket is in use, exiting

It happens pretty often:
Oct  7 00:19:13 gold ntpd[3721]: time reset -785.347912 s
Oct  7 00:46:37 gold ntpd[3721]: time reset -392.673256 s
Oct  7 01:04:24 gold ntpd[3721]: time reset -785.346533 s
Oct  7 15:00:59 gold ntpd[3721]: time reset -392.681720 s
Oct  7 16:32:11 gold ntpd[3721]: time reset -392.671268 s
Oct  7 17:29:29 gold ntpd[3721]: time reset -392.671752 s
Oct  7 18:04:37 gold ntpd[3721]: time reset -785.346987 s

but as you can see above, the time slip happens abruptly.. looks like
a rounding error or something...

I'm now reducing the sleep to 5 seconds... but as you can see the sleep
ends a few seconds early and local time suddenly jumped forward 6
minutes 33 seconds...

$ sysctl kern.timecounter
kern.timecounter.fast_gettime: 1
kern.timecounter.tick: 1
kern.timecounter.choice: TSC-low(1000) ACPI-safe(850) HPET(950) i8254(0) dummy(-100)
kern.timecounter.hardware: TSC-low
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 11598
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.HPET.counter: 3257069245
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.ACPI-safe.mask: 16777215
kern.timecounter.tc.ACPI-safe.counter: 4219134510
kern.timecounter.tc.ACPI-safe.frequency: 3579545
kern.timecounter.tc.ACPI-safe.quality: 850
kern.timecounter.tc.TSC-low.mask: 4294967295
kern.timecounter.tc.TSC-low.counter: 2854866610
kern.timecounter.tc.TSC-low.frequency: 10937740
kern.timecounter.tc.TSC-low.quality: 1000
kern.timecounter.smp_tsc: 1
kern.timecounter.invariant_tsc: 1
$ sysctl kern.eventtimer
kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)
kern.eventtimer.et.LAPIC.flags: 15
kern.eventtimer.et.LAPIC.frequency: 12217
kern.eventtimer.et.LAPIC.quality: 400
kern.eventtimer.et.i8254.flags: 1
kern.eventtimer.et.i8254.frequency: 1193182
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.et.RTC.flags: 17
kern.eventtimer.et.RTC.frequency: 32768
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.periodic: 0
kern.eventtimer.timer: LAPIC
kern.eventtimer.activetick: 1
kern.eventtimer.idletick: 0
kern.eventtimer.singlemul: 2

I have switched my timecounter to HPET to see if things are different...

Any clues?


The mentioned switch to HPET could tell a lot about the problem.
Switching the event timer may also be interesting.


Since I switched to HPET, it hasn't happened at all in the last 3 days..


That probably points to some problem with the TSC timecounter. What is
strange to me is the time jump size of 5 minutes. The TSC timecounter should
overflow every few seconds, so a single jump should be just that big.
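
As a rough cross-check of the jump size against the numbers quoted above (this is an addition, not part of the original exchange): assuming the timecounter is consumed as a 32-bit value, as the kern.timecounter.tc.TSC-low.mask of 4294967295 suggests, its wrap period would be (mask + 1) / frequency. The sketch below plugs in the TSC-low frequency from the sysctl output and compares the result with the ntpd time reset values; the function and variable names are illustrative only.

```python
# Values copied from the quoted "sysctl kern.timecounter" output.
TSC_LOW_FREQUENCY_HZ = 10937740
TSC_LOW_MASK = 4294967295  # 32-bit counter mask

# Magnitudes of the "time reset" steps reported by ntpd, in seconds.
NTPD_RESETS_S = [785.347912, 392.673256, 785.346533, 392.681720,
                 392.671268, 392.671752, 785.346987]


def wrap_period_seconds(mask, frequency_hz):
    """Time it takes a counter with the given mask to wrap around once."""
    return (mask + 1) / frequency_hz


def wraps_per_reset(resets, period):
    """Express each reset as a multiple of the wrap period."""
    return [reset / period for reset in resets]


period = wrap_period_seconds(TSC_LOW_MASK, TSC_LOW_FREQUENCY_HZ)
print(f"TSC-low wrap period: {period:.3f} s")
for reset, ratio in zip(NTPD_RESETS_S, wraps_per_reset(NTPD_RESETS_S, period)):
    print(f"reset {reset:10.6f} s = {ratio:.4f} wrap periods")
```

With these inputs the period comes out at roughly 392.67 s, about 6 minutes 33 seconds, and every reset is close to one or two such periods. That is only a numerical consistency check under the stated assumption; it does not by itself identify the cause.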



Should I try switching back to TSC and switching event timer? do you
need any other info, or want me to try anything else?


You may try that, to be sure the eventtimers are not related to this case.


Oh, forgot to include the specific processor info in my previous
email:
CPU: AMD Opteron(tm) Processor 4228 HE   (2800.05-MHz K8-class CPU)
   Origin = "AuthenticAMD"  Id = 0x600f12  Family = 0x15  Model = 0x1  Stepping = 2
   Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
   Features2=0x1e98220b<SSE3,PCLMULQDQ,MON,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,AESNI,XSAVE,OSXSAVE,AVX>
   AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
   AMD Features2=0x1c9bfff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,XOP,SKINIT,WDT,LWP,FMA4,NodeId,Topology,b23,b24>
   TSC: P-state invariant, performance statistics


Unfortunately, I don't know the specifics of AMD processors. Maybe jkim@ or
avg

Re: time keeps on slipping... slipping...

2012-10-10 Thread Alexander Motin

On 08.10.2012 07:02, John-Mark Gurney wrote:

I recently put together a new machine w/ a SuperMicro H8SCM and an
AMD Opteron 4228 HE...  I'm having an issue where the clock on the
machine skips around...  The weird part is that it's very sudden when
it happens...  ntp sometimes brings it back, but it can't when the clock
gets too far ahead (1000 seconds), ntp dies...

In order to catch it happening, I ran a sleep 60 loop fetching time
from another server that keeps time correctly via:
while sleep 60; do echo -n h2:; nc h2 13; date; ntpdate h2.funkthat.com; done

here are some snippets:
h2:Sun Oct  7 17:12:54 2012^M
Sun Oct  7 17:12:54 PDT 2012
  7 Oct 17:12:54 ntpdate[31036]: the NTP socket is in use, exiting
h2:Sun Oct  7 17:13:48 2012^M
Sun Oct  7 17:20:21 PDT 2012
  7 Oct 17:20:21 ntpdate[31045]: the NTP socket is in use, exiting

but then ntp brings it back in sync:
h2:Sun Oct  7 17:28:49 2012^M
Sun Oct  7 17:35:21 PDT 2012
  7 Oct 17:35:21 ntpdate[31164]: the NTP socket is in use, exiting
h2:Sun Oct  7 17:29:49 2012^M
Sun Oct  7 17:29:49 PDT 2012
  7 Oct 17:29:49 ntpdate[31170]: the NTP socket is in use, exiting

It happens pretty often:
Oct  7 00:19:13 gold ntpd[3721]: time reset -785.347912 s
Oct  7 00:46:37 gold ntpd[3721]: time reset -392.673256 s
Oct  7 01:04:24 gold ntpd[3721]: time reset -785.346533 s
Oct  7 15:00:59 gold ntpd[3721]: time reset -392.681720 s
Oct  7 16:32:11 gold ntpd[3721]: time reset -392.671268 s
Oct  7 17:29:29 gold ntpd[3721]: time reset -392.671752 s
Oct  7 18:04:37 gold ntpd[3721]: time reset -785.346987 s

but as you can see above, the time slip happens abruptly.. looks like
a rounding error or something...

I'm now reducing the sleep to 5 seconds... but as you can see the sleep
ends a few seconds early and local time suddenly jumped forward 6
minutes 33 seconds...

$ sysctl kern.timecounter
kern.timecounter.fast_gettime: 1
kern.timecounter.tick: 1
kern.timecounter.choice: TSC-low(1000) ACPI-safe(850) HPET(950) i8254(0) dummy(-100)
kern.timecounter.hardware: TSC-low
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 11598
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.HPET.counter: 3257069245
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.ACPI-safe.mask: 16777215
kern.timecounter.tc.ACPI-safe.counter: 4219134510
kern.timecounter.tc.ACPI-safe.frequency: 3579545
kern.timecounter.tc.ACPI-safe.quality: 850
kern.timecounter.tc.TSC-low.mask: 4294967295
kern.timecounter.tc.TSC-low.counter: 2854866610
kern.timecounter.tc.TSC-low.frequency: 10937740
kern.timecounter.tc.TSC-low.quality: 1000
kern.timecounter.smp_tsc: 1
kern.timecounter.invariant_tsc: 1
$ sysctl kern.eventtimer
kern.eventtimer.choice: LAPIC(400) i8254(100) RTC(0)
kern.eventtimer.et.LAPIC.flags: 15
kern.eventtimer.et.LAPIC.frequency: 12217
kern.eventtimer.et.LAPIC.quality: 400
kern.eventtimer.et.i8254.flags: 1
kern.eventtimer.et.i8254.frequency: 1193182
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.et.RTC.flags: 17
kern.eventtimer.et.RTC.frequency: 32768
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.periodic: 0
kern.eventtimer.timer: LAPIC
kern.eventtimer.activetick: 1
kern.eventtimer.idletick: 0
kern.eventtimer.singlemul: 2

I have switched my timecounter to HPET to see if things are different...

Any clues?


The mentioned switch to HPET could tell a lot about the problem.
Switching the event timer may also be interesting.


--
Alexander Motin


Re: ahcich reset - cannot mount zfs root in 9.1-PRE

2012-10-02 Thread Alexander Motin

On 02.10.2012 16:51, Andriy Gapon wrote:

on 02/10/2012 16:16 geoffroy desvernay said the following:

Hi all,

Trying to upgrade a system from 9.0-RELEASE to 9.1-PRE from yesterday on
my machine (GEOM+ZFS mirror setup on ada[01]p3), the new kernel becomes
unable to mount root... The only way to recover is to boot from 9.0 kernel.
The disks were already named ada[01] in 9.0, so I suspect nothing there...

I tried
  - disabling AHCI in bios (no change seen)
  - change cables, check PSU, test disks with smartctl

Here are some bits (via serial console):
ahci0: <ATI IXP600 AHCI SATA controller> port 0xc000-0xc007,0xb000-0xb003,0xa000-0xa007,0x9000-0x9003,0x8000-0x800f mem 0xfe9ff800-0xfe9ffbff irq 22 at device 18.0 on pci0
ahci0: AHCI v1.10 with 4 3Gbps ports, Port Multiplier supported
ahci0: Caps: 64bit NCQ SNTF MPS AL CLO 3Gbps PM PMD SSC PSC 32cmd CCC 4ports
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich0: Caps: HPCP
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich1: Caps: HPCP
ahcich2: <AHCI channel> at channel 2 on ahci0
ahcich2: Caps: HPCP
ahcich3: <AHCI channel> at channel 3 on ahci0
ahcich3: Caps: HPCP
ahcich0: AHCI reset...
ahcich0: SATA connect time=100us status=0123
ahcich0: AHCI reset: device found
ahcich0: AHCI reset: device ready after 0ms

The difference with 9.0 is after that: here is 9.0's next lines: (same
for ahcich1)
(aprobe0:ahcich0:0:15:0): Command timed out
(aprobe0:ahcich0:0:15:0): Error 5, Retries exhausted
(aprobe0:ahcich0:0:0:0): SIGNATURE: 

And 9.1-PRE's:
(aprobe0:ahcich0:0:15:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich0:0:15:0): CAM status: Command timeout
(aprobe0:ahcich0:0:15:0): Error 5, Retries exhausted

In both cases ada[01] are detected and available, but with 9.1-PRE I see:
GEOM_RAID: Promise: Disk ada0 state changed from NONE to SPARE.
GEOM_RAID: Promise: Disk ada1 state changed from NONE to SPARE.

(I see the same when I # kldload geom_raid # from running 9.0, doesn't
break anything...)

I attach the full boot log with 9.1-PRE (bios with NO-raid nor AHCI
enabled, but this changes nothing in the output)

I could test patches or try any command required to debug this… But for
the moment I don't know where to search (and kernel code is far away
from my current skills in debugging…)


You probably need to clear RAID metadata on the disks as I think that disabling
geom_raid is not possible in 9.1-PRE.
I think that Alexander can help you more here.


The right way is to clear the RAID metadata on the disks. If it is possible to
boot from any other source, you can just do `graid delete Promise` and
then reboot.


Alternatively, it is possible to disable the geom_raid module using the recently
added loader tunable kern.geom.raid.enable=0. After that your system
should boot and run fine. I would still recommend erasing the metadata,
but after setting that tunable it will be impossible to do it via the graid
tool, only with manual dd surgery. In the Promise format, the metadata
uses up to the last 63 sectors of the disk. You can identify the respective
sectors to erase by the signature Promise Technology, Inc. at the
beginning of the sector.
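
A read-only way to locate candidate sectors before any dd surgery could be sketched as follows. This is an illustrative sketch, not a tested recovery tool: it assumes 512-byte sectors, that the signature is exactly the ASCII text quoted above, and that the disk (or a copy of its tail) can be read as a regular file. The function name and its defaults are made up for this example. It only reports sector numbers and writes nothing.

```python
import os

# Assumed on-disk form of the signature mentioned above.
SIGNATURE = b"Promise Technology, Inc."


def find_signature_sectors(path, sector_size=512, tail_sectors=63):
    """Return the numbers of the tail sectors that start with SIGNATURE."""
    hits = []
    with open(path, "rb") as disk:
        size = disk.seek(0, os.SEEK_END)
        total_sectors = size // sector_size
        # Only the last `tail_sectors` sectors are examined.
        first = max(0, total_sectors - tail_sectors)
        for sector in range(first, total_sectors):
            disk.seek(sector * sector_size)
            if disk.read(len(SIGNATURE)) == SIGNATURE:
                hits.append(sector)
    return hits
```

Calling find_signature_sectors() on an image of the disk returns a list of sector numbers, which would then have to be checked by hand before anything is overwritten.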


--
Alexander Motin


Re: Fatal trap 12: hda

2012-09-24 Thread Alexander Motin

On 23.09.2012 23:41, Andriy Gapon wrote:

on 23/09/2012 23:10 Barbara said the following:

After updating src on RELENG_9 from r240236 to r240821 I have rebuilt
my world+kernel.
On reboot I had a kernel panic, supervisor read, page not present
for process swapper.
Trying to reboot in Single User Mode I accidentally disabled ACPI.
Luckily the machine booted successfully but there was nothing new in
/var/crash.
Then I tried again with ACPI enabled: same kernel panic.
So I run nm on the instruction pointer of the panic and I noticed that
it was in hdaa_sense_init, in sys/dev/sound/pci/hda/hdaa.c.
BTW, I have device sound and device snd_hda in my KERNCONF, and
the sound hw detection happens before HDs, is that the reason why I
wasn't able to get a dump or dumping using DDB and the panicking
process is swapper? Is there any trick I'm missing for that?

Booting in verbose mode and comparing the output with ACPI enabled
(where the panic happens) and disabled, I guessed that the problem was
where No presence detection support at nid... is printed, as it was
missing in the former case for nid 27 - Headphone (Green Jack). With
ACPI disabled the value was looking quite weird: 36765696.
So I made the following change:


--- sys/dev/sound/pci/hda/hdaa.c.orig   2012-09-22 20:06:20.0 +0200
+++ sys/dev/sound/pci/hda/hdaa.c        2012-09-23 20:39:32.0 +0200
@@ -627,7 +627,7 @@
                    (HDA_CONFIG_DEFAULTCONF_MISC(w->wclass.pin.config) & 1) != 0) {
                        device_printf(devinfo->dev,
                            "No presence detection support at nid %d\n",
-                           as[i].pins[15]);
+                           as->pins[15]);
                } else {
                        if (w->unsol < 0)
                                poll = 1;


Maybe the fix is not correct, but at least the new kernel boots successfully.
Can someone review that?
I tried looking in svn commits between the two builds, but I don't
know what exposed the problem.
If anyone is interested in my verbose log, or doing some tests, please ask.


Your patch looks correct, looks like a bug could have been introduced via
copy+paste.


Good catch. Thank you. A slightly modified patch was committed as r240884.

--
Alexander Motin


Re: GEOM_RAID in GENERIC is harmful

2012-09-13 Thread Alexander Motin

On 13.09.2012 08:31, Eugene Grosbein wrote:

9-STABLE has got options GEOM_RAID in GENERIC.
In real world, this change is pretty harmful and there are lots of cases
when 9.0-RELEASE systems upgraded to 9-STABLE fail to mount root UFS filesystem
or attach ZFS.

It seems there are lots of HDDs supplied with pseudo-RAID labels at the end:
pre-installed Windows machines having motherboards with pseudo-RAID
like Intel RapidStore and alike. One may not even be aware of these labels.

9.0-RELEASE can be installed on such HDDs and use them with GMIRROR or ZFS
without a problem. Upgraded to 9-STABLE, such system fails to build due
to GRAID jumping out of box and grabbing HDDs for itself,
so GMIRROR or ZFS got broken.

That makes users very angry when a production server fails to boot
with GENERIC kernel after a correctly performed upgrade.

GEOM_RAID compiled in GENERIC should be deactivated and require activation
with some loader knob. Also, we need distinct RELEASE NOTES warning about the 
issue.


The problem of on-disk metadata garbage is not limited to GEOM_RAID. For
example, I had a case where remainders of an old UFS file system were found
by GEOM_LABEL, and ZFS incorrectly attached to it instead of the proper GPT
partition, making other partitions inaccessible. Does it mean we should
remove GEOM_LABEL also? I don't think so. All that GEOM_RAID is guilty
of is that it was not in place for the 9.0 release. If we remove it now, it
will just postpone the problem to a later time, or we will never be able to
add it again for the same reasons.


Unlike GEOM_LABEL, the metadata of GEOM_RAID is quite easy to delete without
a complete disk erase: `graid status -ag`, `graid delete ...`. Yes, it can
be a problem if the system can't boot, but now we at least have live mode on
the installation images, which should allow doing it.


Adding some loader tunables could indeed simplify recovery in case of a
boot problem. I will probably add such ones now. It won't hurt. But I
disagree that it should be disabled by default, limiting users who really
want to use BIOS RAID. Disabling it will also make metadata removal
without a full wipe more difficult, because different RAIDs have different
on-disk metadata layouts, and you would need to know exactly where to apply dd.


--
Alexander Motin


Re: GEOM_RAID in GENERIC is harmful

2012-09-13 Thread Alexander Motin

On 13.09.2012 13:01, Eugene Grosbein wrote:

13.09.2012 16:51, Alexander Motin wrote:


That makes users very angry when a production server fails to boot
with GENERIC kernel after a correctly performed upgrade.

GEOM_RAID compiled in GENERIC should be deactivated and require activation
with some loader knob. Also, we need distinct RELEASE NOTES warning about the 
issue.


The problem of on-disk metadata garbage is not limited to GEOM_RAID. For
example, I had a case where remainders of an old UFS file system were found
by GEOM_LABEL, and ZFS incorrectly attached to it instead of the proper GPT
partition, making other partitions inaccessible. Does it mean we should
remove GEOM_LABEL also? I don't think so. All that GEOM_RAID is guilty
of is that it was not in place for the 9.0 release. If we remove it now, it
will just postpone the problem to a later time, or we will never be able to
add it again for the same reasons.


We must be ready for lots of angry users of 9.1-RELEASE then
and have BIG RED WARNING in RELEASE NOTES.


A warning is good, but I don't think there will be lots. It has been enabled in
9-STABLE for some time now and I haven't seen many complaints. If re@
permits an MFC of r240465 in a few days, the solution for those who may need it
will be simple: kern.geom.raid.enable=0.


--
Alexander Motin


Re: Thinkpad X61s cannot boot 9.1-BETA1

2012-09-13 Thread Alexander Motin

On 13.09.2012 10:44, Lars Engels wrote:

On Wed, Sep 12, 2012 at 11:08:25PM +0300, Alexander Motin wrote:

On 12.09.2012 22:58, Lars Engels wrote:

On Wed, Sep 12, 2012 at 09:58:31PM +0300, Alexander Motin wrote:

On 12.09.2012 20:46, Lars Engels wrote:

On Wed, Sep 12, 2012 at 08:30:36PM +0300, Andriy Gapon wrote:

on 12/09/2012 20:25 Lars Engels said the following:

On Wed, Sep 12, 2012 at 03:54:30PM +0300, Andriy Gapon wrote:

Could you try to play with different eventtimer settings (preferably in 
current) ?
You can use this thread / PR as a guide:
http://thread.gmane.org/gmane.os.freebsd.devel.amd64/14480/focus=14495

The place where boot stop looks suspiciously close to the place where timer
interrupts should start driving the system.


Yes, that's it!
Setting  kern.eventtimer.timer=i8254 lets the Thinkpad boot on
CURRENT with the AC cable inserted.



Please share your sysctl kern.eventtimer output with Alexander.
He will probably ask for some additional information :-)


Sorry if I've missed it, but it would be useful to see a verbose dmesg for the
situation where the system couldn't boot without switching the eventtimer.


No problem. See: http://bsd-geek.de/FreeBSD/IMAG0190.jpg


No, I've seen that one, and it is not what I meant. I mean a full verbose dmesg of a
successful boot in the conditions where the system was not booting before
without setting kern.eventtimer.timer=i8254.


Ok, sorry.
Here's a verbose dmesg booting CURRENT without AC power:
http://bsd-geek.de/FreeBSD/T61_dmesg.boot.works


Hmm. I see nothing suspicious. The HPET driver output is typical for the ICH8M
chipset, many of which are working fine in different systems, including
several of mine. There were no significant changes in HPET after 9.0-RELEASE
except r231161. It changed the device probe order, which increased the chance of
interrupt sharing. It should not be a problem, but who knows. You can
try to hint the HPET driver a specific IRQ 23 (which looks unused) to avoid
sharing by setting hint.hpet.0.allowed_irqs=0x00800000.
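
The value for such a hint can be derived mechanically. Assuming, as hpet(4) describes, that allowed_irqs is a 32-bit mask in which bit N permits IRQ N, a small sketch is shown below; the helper names are illustrative and not part of any FreeBSD tool.

```python
def allowed_irqs_mask(*irqs):
    """Build a 32-bit mask with one bit set per permitted IRQ number."""
    mask = 0
    for irq in irqs:
        if not 0 <= irq < 32:
            raise ValueError(f"IRQ {irq} does not fit in a 32-bit mask")
        mask |= 1 << irq
    return mask


def format_hint(unit, mask):
    """Format the mask as a device.hints / loader.conf style line."""
    return f"hint.hpet.{unit}.allowed_irqs=0x{mask:08x}"


# IRQ 23 only, as suggested in the message above.
print(format_hint(0, allowed_irqs_mask(23)))
```

Under that assumption, IRQ 23 alone corresponds to bit 23, that is 0x00800000.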


You said the problem is related to the AC power state. Have you compared
dmesg outputs with and without it?


--
Alexander Motin


Re: Thinkpad X61s cannot boot 9.1-BETA1

2012-09-12 Thread Alexander Motin

On 12.09.2012 20:46, Lars Engels wrote:

On Wed, Sep 12, 2012 at 08:30:36PM +0300, Andriy Gapon wrote:

on 12/09/2012 20:25 Lars Engels said the following:

On Wed, Sep 12, 2012 at 03:54:30PM +0300, Andriy Gapon wrote:

Could you try to play with different eventtimer settings (preferably in 
current) ?
You can use this thread / PR as a guide:
http://thread.gmane.org/gmane.os.freebsd.devel.amd64/14480/focus=14495

The place where boot stop looks suspiciously close to the place where timer
interrupts should start driving the system.


Yes, that's it!
Setting  kern.eventtimer.timer=i8254 lets the Thinkpad boot on
CURRENT with the AC cable inserted.



Please share your sysctl kern.eventtimer output with Alexander.
He will probably ask for some additional information :-)


Sorry if I've missed it, but it would be useful to see a verbose dmesg for the
situation where the system couldn't boot without switching the eventtimer.


--
Alexander Motin


Re: Thinkpad X61s cannot boot 9.1-BETA1

2012-09-12 Thread Alexander Motin

On 12.09.2012 22:58, Lars Engels wrote:

On Wed, Sep 12, 2012 at 09:58:31PM +0300, Alexander Motin wrote:

On 12.09.2012 20:46, Lars Engels wrote:

On Wed, Sep 12, 2012 at 08:30:36PM +0300, Andriy Gapon wrote:

on 12/09/2012 20:25 Lars Engels said the following:

On Wed, Sep 12, 2012 at 03:54:30PM +0300, Andriy Gapon wrote:

Could you try to play with different eventtimer settings (preferably in 
current) ?
You can use this thread / PR as a guide:
http://thread.gmane.org/gmane.os.freebsd.devel.amd64/14480/focus=14495

The place where boot stop looks suspiciously close to the place where timer
interrupts should start driving the system.


Yes, that's it!
Setting  kern.eventtimer.timer=i8254 lets the Thinkpad boot on
CURRENT with the AC cable inserted.



Please share your sysctl kern.eventtimer output with Alexander.
He will probably ask for some additional information :-)


Sorry if I've missed it, but it would be useful to see a verbose dmesg for the
situation where the system couldn't boot without switching the eventtimer.


No problem. See: http://bsd-geek.de/FreeBSD/IMAG0190.jpg


No, I've seen that one, and it is not what I meant. I mean a full verbose dmesg of a
successful boot in the conditions where the system was not booting before
without setting kern.eventtimer.timer=i8254.


--
Alexander Motin


Re: FreeBSD 9.1 RC1 and CAM issues with old SCSI drive

2012-09-09 Thread Alexander Motin

On 09.09.2012 16:25, kirk russell wrote:

On Sat, Sep 8, 2012 at 12:29 PM, Alexander Motin m...@freebsd.org wrote:

Hi.

It seems like both of your problems have the same cause: the device reports the
wrong size of INQUIRY data, which causes a failure on the attempt to fetch it.
With FreeBSD 9.0 it caused domain validation failures and so a reduced
transfer rate; on 9.1 it also causes a detection failure. I am not sure
why detection worked on 9.0, it needs some deeper code comparison, but I
think it is mostly a device problem.

Could you send me the output of these commands from FreeBSD 9.0:
camcontrol cmd da0 -vEc "12 00 00 00 24 00" -i 36 - | hd
camcontrol cmd da0 -vEc "12 00 00 00 fe 00" -i 254 - | hd
camcontrol cmd da0 -vEc "12 00 00 01 00 00" -i 256 - | hd

--
Alexander Motin


This is running 9.0-RELEASE.

# camcontrol cmd da0 -vEc "12 00 00 00 24 00" -i 36 - | hd
00000000  00 00 02 02 fa 00 00 3e  43 4f 4d 50 41 51 50 43  |.......>COMPAQPC|
00000010  57 44 45 39 31 30 30 57  20 20 20 20 20 20 20 20  |WDE9100W        |
00000020  31 2e 30 31                                       |1.01|
00000024
# camcontrol cmd da0 -vEc "12 00 00 00 fe 00" -i 254 - | hd
00000000  00 00 02 02 fa 00 00 3e  43 4f 4d 50 41 51 50 43  |.......>COMPAQPC|
00000010  57 44 45 39 31 30 30 57  20 20 20 20 20 20 20 20  |WDE9100W        |
00000020  31 2e 30 31 32 33 30 31  57 53 37 30 32 30 33 37  |1.012301WS702037|
00000030  32 34 39 33 00 00 00 00  20 20 20 20 20 20 20 20  |2493....        |
00000040  20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20  |                |
*
00000060  57 44 45 39 31 30 30 2d  36 30 30 35 44 30 20 20  |WDE9100-6005D0  |
00000070  34 30 36 31 30 30 31 31  39 31 30 30 32 43 30 20  |4061001191002C0 |
00000080  32 34 30 38 00 00 00 00  00 00 00 00 00 00 00 00  |2408............|
00000090  00 00 00 00 4e 32 30 35  30 30 39 39 30 32 35 35  |....N20500990255|
000000a0  33 20 20 20 50 20 30 30  00 00 00 00 00 00 42 41  |3   P 00......BA|
000000b0  43 43 42 45 4b 43 31 39  39 38 30 38 32 38 57 53  |CCBEKC19980828WS|
000000c0  36 30 44 20 04 03 00 04  02 01 00 00 00 00 00 00  |60D ............|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000000f0
# camcontrol cmd da0 -vEc "12 00 00 01 00 00" -i 256 - | hd
(pass1:ahc0:0:0:0): INQUIRY. CDB: 12 0 0 1 0 0
(pass1:ahc0:0:0:0): CAM status: SCSI Status Error
(pass1:ahc0:0:0:0): SCSI status: Check Condition
(pass1:ahc0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:24,0 (Invalid field in CDB)
(pass1:ahc0:0:0:0): Command Specific Info: 0x
(pass1:ahc0:0:0:0): Command byte 3 is invalid
camcontrol: error sending command
(pass1:ahc0:0:0:0): INQUIRY. CDB: 12 0 0 1 0 0
(pass1:ahc0:0:0:0): CAM status: SCSI Status Error
(pass1:ahc0:0:0:0): SCSI status: Check Condition
(pass1:ahc0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:24,0 (Invalid field in CDB)
(pass1:ahc0:0:0:0): Command Specific Info: 0x
(pass1:ahc0:0:0:0): Command byte 3 is invalid


It seems that the problem may be in our SCSI code, which rounds the INQUIRY
data size up to an even value. Please try commenting out the line

inquiry_len = roundup2(inquiry_len, 2);

in sys/cam/scsi/scsi_xpt.c and rebuilding the kernel. That should probably
fix both device detection and transfer speed.
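
For illustration, here is a minimal Python sketch (not the kernel code) of
why that rounding can break this particular drive. It assumes the standard
INQUIRY layout, where byte 4 of the returned data is the additional length
counted from byte 5. `roundup2` is re-implemented here to mirror the usual
power-of-two rounding macro, and the header bytes are copied from the hd
output quoted above.

```python
def roundup2(x, y):
    """Round x up to a multiple of y, where y is a power of two."""
    return (x + (y - 1)) & ~(y - 1)


# First 8 bytes of the INQUIRY data, taken from the dump in this thread.
header = bytes.fromhex("00 00 02 02 fa 00 00 3e")

additional_length = header[4]           # 0xfa
reported_len = additional_length + 5    # bytes 0..4 precede the counted area
rounded_len = roundup2(reported_len, 2)

print("device reports", reported_len, "bytes")
print("after roundup2:", rounded_len, "bytes")
print("fits in one allocation-length byte:", rounded_len <= 0xFF)
```

If those assumptions hold, the rounded length no longer fits in a single
allocation-length byte, which would be consistent with the failing CDB
12 00 00 01 00 00 and the "Command byte 3 is invalid" sense data above.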


--
Alexander Motin
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: FreeBSD 9.1 RC1 and CAM issues with old SCSI drive

2012-09-08 Thread Alexander Motin

Hi.

It seems like both of your problems have the same cause: the device reports
the wrong size of INQUIRY data, which causes a failure when attempting to
fetch it. With FreeBSD 9.0 this caused domain validation failures and so a
reduced transfer rate; on 9.1 it also causes a detection failure. I am not
sure why detection worked on 9.0; that needs some deeper code comparison,
but I think it is mostly a device problem.


Could you send me the output of these commands from FreeBSD 9.0:
camcontrol cmd da0 -vEc "12 00 00 00 24 00" -i 36 - | hd
camcontrol cmd da0 -vEc "12 00 00 00 fe 00" -i 254 - | hd
camcontrol cmd da0 -vEc "12 00 00 01 00 00" -i 256 - | hd
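
As a reading aid, the three CDBs above differ only in the allocation length.
A small Python sketch, assuming the 6-byte INQUIRY CDB layout with the
allocation length in bytes 3-4 (older devices may only honor byte 4).
`inquiry_cdb` is a hypothetical helper, not part of camcontrol:

```python
def inquiry_cdb(alloc_len):
    """Build a 6-byte INQUIRY CDB for the given allocation length."""
    if not 0 <= alloc_len <= 0xFFFF:
        raise ValueError("allocation length must fit in 16 bits")
    # opcode 0x12, two zero bytes, length high byte, length low byte, control
    return bytes([0x12, 0x00, 0x00, alloc_len >> 8, alloc_len & 0xFF, 0x00])


for length in (36, 254, 256):
    print(length, inquiry_cdb(length).hex(" "))
```

Only the 256-byte request sets byte 3, so it is the one that tells us
whether the device accepts a two-byte allocation length at all.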

--
Alexander Motin


Re: High load event idl.

2012-08-14 Thread Alexander Motin

On 14.08.2012 22:25, Adam McDougall wrote:
> On Sun, Apr 29, 2012 at 04:39:29PM +0300, Alexander Motin wrote:
>> On 04/29/12 16:30, Alex Kozlov wrote:
>>> On Sun, Apr 29, 2012 at 04:11:20PM +0300, Alexander Motin wrote:
>>>> On 04/29/12 15:27, Alex Kozlov wrote:
>>>>> On Sun, Apr 29, 2012 at 03:07:40PM +0300, Alexander Motin wrote:
>>>>>> On 04/29/12 15:04, Oliver Pinter wrote:
>>>>>>> Removing dummynet from the kernel doesn't change anything related
>>>>>>> to the load average. The loadavg holds at 0.70 +/- 0.2. (single
>>>>>>> user: sh + top)
>>>>>>
>>>>>> New ktr dump?
>>>>>
>>>>> I have a similar issue on one of my laptops. Should I provide a ktr dump?
>>>>> http://lists.freebsd.org/pipermail/freebsd-current/2011-September/027133.html
>>>>
>>>> In your case HPET also shares an interrupt with other devices. I suspect
>>>> that may be the reason. Every time the swi thread runs loadavg, another
>>>> CPU runs the shared interrupt handler, which gets accounted as a result.
>>>> Please show your verbose dmesg.
>>>
>>> Attached.
>>
>> In your case HPET could solely use IRQ22, which seems free now. After
>> recent changes in the ACPI code it is detected before PCI devices and so
>> doesn't avoid sharing. You may try to hint it to a specific IRQ by adding
>> this line to loader.conf:
>> hint.hpet.0.allowed_irqs=0x00400000
>>
>> --
>> Alexander Motin
>
> I think I am having the same issue on my Sun Fire x4150 servers.  It
> goes away when I sysctl kern.eventtimer.timer=LAPIC but I'm hesitant to
> use local workarounds in case they become pessimistic in the future.
> I'm not sure all of my systems would have the same free irqs (including
> after potential addition of expansion cards) so it might be a pain to
> determine an appropriate allowed_irqs setting for each.  I tried
> hint.hpet.0.allowed_irqs=0x for the sake of experiment and
> that just results in LAPIC being used since HPET is removed from
> kern.eventtimer.choice.  I've attached a verbose dmesg (will probably be
> stripped from the list, hence the Cc:).
>
> Is there a limit to how high the irq can be set or could I perhaps set
> it high enough that it is unlikely to conflict with other hardware?  Is
> there a chance we can find an automatic fix for this issue, or should I
> just stick with LAPIC at the expense of whatever the HPET event timer
> gets me?  Or something else?  I feel the partially random load average
> level makes it difficult to measure a low load and can be misleading
> during problem debugging.  Thanks.


HPET can theoretically use any IRQ from 0 to 31. In practice there can
be various limitations. It is the BIOS's duty to tell us which IRQs are
allowed. In your case IRQs 20-23 are allowed. Unfortunately, the system
currently just gives the HPET driver the first one from the range.
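
To make the mask arithmetic concrete, here is a small Python sketch. It
assumes allowed_irqs is a 32-bit mask with bit N standing for IRQ N. The set
of IRQs already in use is made up for the example, and `first_unshared` is
only one possible policy, not what the driver does today:

```python
def mask_from_irqs(irqs):
    """Build a 32-bit mask with bit N set for every IRQ N in irqs."""
    mask = 0
    for irq in irqs:
        mask |= 1 << irq
    return mask


def irqs_from_mask(mask):
    """Return the IRQ numbers (0..31) whose bits are set, lowest first."""
    return [irq for irq in range(32) if mask & (1 << irq)]


def first_unshared(mask, in_use):
    """Pick the lowest allowed IRQ that no other device uses, else None."""
    for irq in irqs_from_mask(mask):
        if irq not in in_use:
            return irq
    return None


allowed = mask_from_irqs(range(20, 24))   # IRQs 20-23, as reported here
in_use = {20, 21, 23}                     # hypothetical: taken by other devices

print("0x%08x" % allowed)                 # mask covering IRQs 20-23
print(irqs_from_mask(allowed)[0])         # what "first from the range" picks
print(first_unshared(allowed, in_use))    # what an unshared-first policy picks
```

With a policy like that, a per-machine allowed_irqs hint would only be
needed when every allowed IRQ is already shared.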


The problem with the LAPIC timer is that it stops working when the CPU goes
to C3 or a deeper idle state. These states are not enabled by default, so
unless you enabled them explicitly, it is safe to use LAPIC. In any case, a
current 9-STABLE system should prevent you from using an unsafe C-state if
the LAPIC timer is used. From all other perspectives LAPIC is preferable, as
it is faster and easier to operate than HPET. The latest CPUs have fixed the
LAPIC timer problem, so I don't think that switching to it will be a
pessimization in the foreseeable future.


--
Alexander Motin


Re: GEOM_RAID in GENERIC 9.1

2012-07-30 Thread Alexander Motin

On 30.07.2012 08:33, Eugene M. Zheganin wrote:
> On 30.07.2012 11:04, Eugene M. Zheganin wrote:
>> I am aware of how this thing works and what it does. However, every
>> time I upgrade a new server I get hit by it again and again, simply by
>> forgetting to remove it from the kernel config.
>>
>> I'm afraid this thing will hit lots of FreeBSD installations after the
>> release; it may be easily removed, but it will still poison the life of
>> many engineers, and I really think it's a bomb that should be removed
>> from GENERIC.
>
> Okay, I feel like I need to clarify this, as some decent guys pointed
> out to me that I'm very unclear and even rude (sorry for that, it's
> unintentional).
>
> GEOM_RAID was inserted in place of ataraid, but ataraid wasn't messing
> with zpooled disks: with GEOM_RAID the kernel takes both providers (in
> the case of a mirrored pool), and mountroot just fails, as it sees no
> zfs pool.
>
> Plus, there's even more. This time I disabled the RAID in its
> BIOS before installing FreeBSD. After mountroot failed, I booted 9.0-R
> from a USB flash drive, trying to avoid any surgery with kernel files,
> like a manual install from another machine. I was curious whether I would
> be able to resolve this issue using base utilities. So I loaded geom_raid
> via 'graid load', and the kernel said something like 'Doh... I have
> ada0/ada1 spare disks'; then I tried to remove the softraid label remains
> with 'graid remove', and it failed, because there's no array at all, only
> spares. So 'graid status' is empty, 'graid list' is empty, and it's
> obvious that some surgery is needed.
>
> And I'm not disappointed that it happened to me, no, because I know
> how to resolve this.
> But the thing that I'm really afraid of is that this default option will
> hit the less experienced engineers.


Thank you for your report. I will recheck the deletion of spare disks. As
for `geom status/list` in this case, there are the special options
-a and -g to handle geoms without providers.


--
Alexander Motin


Re: AHCI Timeout errors on Intel Patsburg

2012-07-29 Thread Alexander Motin

Hi.

 is  cs  ss 0001 rs 0001 tfd 40 serr 0088

This line (the ss and rs fields) tells me that the device hasn't confirmed
completion of one NCQ command. The bits set in the serr field mean 10b to 8b
Decode Error and Link Sequence Error. I would guess that something is
wrong with the link quality. That may explain why reducing the speed helps.
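
A small Python sketch of the serr decoding, for illustration only. It
assumes the serr value in the line above was originally 0x00880000 (the
archive appears to have dropped runs of zeros from the register dump) and
uses the diagnostics bit assignments of the SATA SError register as given in
the AHCI specification. The table and `decode_serr` are not taken from the
driver:

```python
# Diagnostics half (bits 16..26) of the SATA SError register.
SERR_DIAG_BITS = {
    16: "PhyRdy Change",
    17: "Phy Internal Error",
    18: "Comm Wake",
    19: "10B to 8B Decode Error",
    20: "Disparity Error",
    21: "CRC Error",
    22: "Handshake Error",
    23: "Link Sequence Error",
    24: "Transport State Transition Error",
    25: "Unknown FIS Type",
    26: "Exchanged",
}


def decode_serr(serr):
    """Return the names of the diagnostics bits set in a SError value."""
    return [name for bit, name in sorted(SERR_DIAG_BITS.items())
            if serr & (1 << bit)]


serr = 0x00880000   # assumed original value, see the note above
for name in decode_serr(serr):
    print(name)
```

Under that assumption the two bits set are exactly the decode error and the
link sequence error, both of which point at the physical link rather than
at the device firmware.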


--
Alexander Motin



Re: mfi(4) IO performance regression, post 8.1

2012-07-20 Thread Alexander Motin

On 19.07.2012 18:28, Adrian Chadd wrote:
> Hm! A timer related bug?
>
> I'll CC mav@ on this, as it was his commit (and work in his general area.)
>
> I wonder what's going on - is it something to do with the two ACPI
> calls inserted there, or is it something to do with the change in
> event timer values?
>
> mav? Any ideas?


I can only agree with the earlier guess that for some reason the ACPI timer
on that system is very slow. Unless the user explicitly enabled deeper
C-states, the values returned by the timer are not really used for anything,
so there is simply no place for any other bug.


When making this change I expected that it might have a cost, but on
most systems that cost only shows up at high interrupt rates,
where it is covered by the automatic fallback to the faster MWAIT idle
method. Unfortunately, that code has still not been merged to 8-STABLE
(only 9). I will check whether there is any problem merging it now.


Manually switching to MWAIT via sysctl is the correct workaround for this
situation. It may give slightly higher power consumption, but for this
workload with many interrupts it probably gives the best possible
performance.



> On 17 July 2012 13:39, Steve McCoy smc...@greatbaysoftware.com wrote:
>> Alright, I've finally narrowed it down to r209897, which only affects
>> acpi_cpu_idle():
>>
>> --- stable/8/sys/dev/acpica/acpi_cpu.c  2010/06/23 17:04:42 209471
>> +++ stable/8/sys/dev/acpica/acpi_cpu.c  2010/07/11 11:58:46 209897
>> @@ -930,12 +930,16 @@
>>
>>       /*
>>        * Execute HLT (or equivalent) and wait for an interrupt.  We can't
>> -      * calculate the time spent in C1 since the place we wake up is an
>> -      * ISR.  Assume we slept half of quantum and return.
>> +      * precisely calculate the time spent in C1 since the place we wake up
>> +      * is an ISR.  Assume we slept no more then half of quantum.
>>        */
>>       if (cx_next->type == ACPI_STATE_C1) {
>> -         sc->cpu_prev_sleep = (sc->cpu_prev_sleep * 3 + 500000 / hz) / 4;
>> +         AcpiHwRead(&start_time, &AcpiGbl_FADT.XPmTimerBlock);
>>           acpi_cpu_c1();
>> +         AcpiHwRead(&end_time, &AcpiGbl_FADT.XPmTimerBlock);
>> +         end_time = acpi_TimerDelta(end_time, start_time);
>> +         sc->cpu_prev_sleep = (sc->cpu_prev_sleep * 3 +
>> +             min(PM_USEC(end_time), 500000 / hz)) / 4;
>>           return;
>>       }
>>
>> My current guess is that AcpiHwRead() is a problem on our hardware. It's an
>> isolated change and, to my desperate eyes, the commit message implies that
>> it isn't critical. Do you think we could buy ourselves some time by pulling
>> it out of our version of the kernel? Or is this essential for correctness?
>> Any thoughts are appreciated, thanks!
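
For readers following the diff, the new estimate is a simple weighted
average. Below is a minimal Python sketch of one update step, assuming the
constant in the diff is 500000 / hz (half a quantum in microseconds) and
that the measured time has already been converted to microseconds.
`update_prev_sleep` is a hypothetical name and hz = 1000 is only an example
value:

```python
def update_prev_sleep(prev_us, measured_us, hz):
    """Return the new C1 sleep estimate in microseconds.

    Three parts old estimate, one part new sample, with the sample
    capped at half a quantum (500000 / hz microseconds).
    """
    half_quantum_us = 500000 // hz
    return (prev_us * 3 + min(measured_us, half_quantum_us)) // 4


hz = 1000            # example tick rate; half a quantum is then 500 us
estimate = 0
for sample_us in (100, 100, 2000, 2000):
    estimate = update_prev_sleep(estimate, sample_us, hz)
    print(sample_us, estimate)
```

The arithmetic itself is trivial. Any extra cost of the change comes from
the two timer reads around acpi_cpu_c1(), which is why a slow ACPI timer
would hurt under high interrupt rates.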



--
Alexander Motin


Re: mfi(4) IO performance regression, post 8.1

2012-07-20 Thread Alexander Motin

On 20.07.2012 16:38, Alexander Motin wrote:
> On 19.07.2012 18:28, Adrian Chadd wrote:
>> Hm! A timer related bug?
>>
>> I'll CC mav@ on this, as it was his commit (and work in his general
>> area.)
>>
>> I wonder what's going on - is it something to do with the two ACPI
>> calls inserted there, or is it something to do with the change in
>> event timer values?
>>
>> mav? Any ideas?
>
> I can only agree with the earlier guess that for some reason the ACPI timer
> on that system is very slow. Unless the user explicitly enabled deeper
> C-states, the values returned by the timer are not really used for anything,
> so there is simply no place for any other bug.
>
> When making this change I expected that it might have a cost, but on
> most systems that cost only shows up at high interrupt rates,
> where it is covered by the automatic fallback to the faster MWAIT idle
> method. Unfortunately, that code has still not been merged to 8-STABLE
> (only 9). I will check whether there is any problem merging it now.

I've just merged that to 8-STABLE at r238658. Testers are welcome.

> Manually switching to MWAIT via sysctl is the correct workaround for this
> situation. It may give slightly higher power consumption, but for this
> workload with many interrupts it probably gives the best possible
> performance.


--
Alexander Motin




Re: mfi(4) IO performance regression, post 8.1

2012-07-20 Thread Alexander Motin

Hi.

On 20.07.2012 22:38, Adrian Chadd wrote:
> I'm worried that this won't be the only source of freebsd is slower
> than linux issues.
>
> What can we add to the timer path to make identifying and root causing
> this issue easy? I'd just like to be absolutely sure that we're not
> only doing the best job possible, but we can provide some tools and
> statistics to the user/administrator so as to make debugging much
> easier.


The only instrument I could propose for diagnosing this problem without
additional input is hwpmc profiling. It should be able to show that we are
spending a lot of time in those timer routines. If we had somehow guessed
that the reason is a slow ACPI timer, it would be easy to write a
corresponding benchmark, but we can't write tests for everything, and even
if we could, users wouldn't be able to run them or analyze their output
without some level of knowledge.


I've spent a lot of time profiling that on the hardware I have, but the only
way I see to be sure in the general case is more testing and feedback. For
this specific area I am using a very simple test that effectively depends on
interrupt latency and CPU wakeup times:
`dd if=/dev/ada0 of=/dev/null bs=512`. Depending on the device, controller
and other factors, it gives me about 20-30K IOPS.
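
To put those numbers in perspective, here is a back-of-the-envelope Python
sketch. It assumes dd keeps only one 512-byte request outstanding at a
time, so the time per request is simply the inverse of the IOPS figure. The
function names are made up for the example:

```python
def per_request_us(iops):
    """Average time per request in microseconds at one outstanding request."""
    return 1_000_000 / iops


def throughput_mib_s(iops, block_size=512):
    """Data rate in MiB/s for the given IOPS and block size."""
    return iops * block_size / 2**20


for iops in (20_000, 30_000):
    print(iops, "IOPS:",
          round(per_request_us(iops), 1), "us per request,",
          round(throughput_mib_s(iops), 2), "MiB/s")
```

At a few tens of microseconds per request, an extra slow timer read on every
idle entry and exit becomes a visible share of each round trip, which is why
this test is sensitive to it.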


If you have any ideas about what we could test automatically and how, they are welcome.

--
Alexander Motin


Re: svn commit: r237318 - in stable/8: share/man/man4 sys/cam sys/cam/scsi sys/conf

2012-06-22 Thread Alexander Motin

On 06/22/12 21:41, Mike Tancsa wrote:
> On 6/20/2012 10:39 AM, Alexander Motin wrote:
>> Author: mav
>> Date: Wed Jun 20 14:39:35 2012
>> New Revision: 237318
>> URL: http://svn.freebsd.org/changeset/base/237318
>>
>> Log:
>>    MFC r236712:
>>    To make CAM debugging easier, compile in some debug flags (CAM_DEBUG_INFO,
>>    CAM_DEBUG_CDB, CAM_DEBUG_PERIPH and CAM_DEBUG_PROBE) by default.
>>    List of these flags can be modified with CAM_DEBUG_COMPILE kernel option.
>>    CAMDEBUG kernel option still enables all possible debug, if not overriden.
>>
>>    Additional 50KB of kernel size is a good price for the ability to debug
>>    problems without rebuilding the kernel. In case where size is important,
>>    debugging can be compiled out by setting CAM_DEBUG_COMPILE option to 0.
>
> Hi,
> Not sure if this is the commit or not, but a kernel from the 18th seems
> to function normally, and a kernel from today prints a great many messages
> like the ones below. I also don't know if this is just exposing an existing
> bug in the driver that was hidden up to now?

That's not it. It's a slightly later commit.

> At boot time, I see the following:

(probe1:twa0:0:1:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe1:twa0:0:1:0): CAM status: Invalid Target ID
(probe1:twa0:0:1:0): Error 22, Unretryable error
(probe2:twa0:0:2:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe2:twa0:0:2:0): CAM status: Invalid Target ID
(probe2:twa0:0:2:0): Error 22, Unretryable error
(probe3:twa0:0:3:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe3:twa0:0:3:0): CAM status: Invalid Target ID
(probe3:twa0:0:3:0): Error 22, Unretryable error
(probe4:twa0:0:4:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe4:twa0:0:4:0): CAM status: Invalid Target ID
(probe4:twa0:0:4:0): Error 22, Unretryable error
(probe15:twa0:0:15:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe15:twa0:0:15:0): CAM status: Invalid Target ID
(probe15:twa0:0:15:0): Error 22, Unretryable error
(probe16:twa0:0:16:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe16:twa0:0:16:0): CAM status: Invalid Target ID
(probe16:twa0:0:16:0): Error 22, Unretryable error
(probe17:twa0:0:17:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe17:twa0:0:17:0): CAM status: Invalid Target ID
(probe17:twa0:0:17:0): Error 22, Unretryable error
(probe18:twa0:0:18:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe18:twa0:0:18:0): CAM status: Invalid Target ID
(probe18:twa0:0:18:0): Error 22, Unretryable error
(probe19:twa0:0:19:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe19:twa0:0:19:0): CAM status: Invalid Target ID
(probe19:twa0:0:19:0): Error 22, Unretryable error
(probe20:twa0:0:20:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe20:twa0:0:20:0): CAM status: Invalid Target ID
(probe20:twa0:0:20:0): Error 22, Unretryable error
(probe21:twa0:0:21:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe21:twa0:0:21:0): CAM status: Invalid Target ID
(probe21:twa0:0:21:0): Error 22, Unretryable error
(probe22:twa0:0:22:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe22:twa0:0:22:0): CAM status: Invalid Target ID
(probe22:twa0:0:22:0): Error 22, Unretryable error
(probe23:twa0:0:23:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe23:twa0:0:23:0): CAM status: Invalid Target ID
(probe23:twa0:0:23:0): Error 22, Unretryable error
(probe24:twa0:0:24:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe24:twa0:0:24:0): CAM status: Invalid Target ID
(probe24:twa0:0:24:0): Error 22, Unretryable error
(probe25:twa0:0:25:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe25:twa0:0:25:0): CAM status: Invalid Target ID
(probe25:twa0:0:25:0): Error 22, Unretryable error
(probe26:twa0:0:26:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe26:twa0:0:26:0): CAM status: Invalid Target ID
(probe26:twa0:0:26:0): Error 22, Unretryable error
(probe5:twa0:0:5:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe5:twa0:0:5:0): CAM status: Invalid Target ID
(probe5:twa0:0:5:0): Error 22, Unretryable error
(probe6:twa0:0:6:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe6:twa0:0:6:0): CAM status: Invalid Target ID
(probe6:twa0:0:6:0): Error 22, Unretryable error
(probe7:twa0:0:7:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe7:twa0:0:7:0): CAM status: Invalid Target ID
(probe7:twa0:0:7:0): Error 22, Unretryable error
(probe8:twa0:0:8:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe8:twa0:0:8:0): CAM status: Invalid Target ID
(probe8:twa0:0:8:0): Error 22, Unretryable error
(probe9:twa0:0:9:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe9:twa0:0:9:0): CAM status: Invalid Target ID
(probe9:twa0:0:9:0): Error 22, Unretryable error
(probe10:twa0:0:10:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe10:twa0:0:10:0): CAM status: Invalid Target ID
(probe10:twa0:0:10:0): Error 22, Unretryable error
(probe11:twa0:0:11:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe11:twa0:0:11:0): CAM status: Invalid Target ID
(probe11:twa0:0:11:0): Error 22, Unretryable error
(probe12:twa0:0:12:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe12:twa0:0:12:0): CAM status: Invalid Target ID
(probe12:twa0:0:12:0): Error 22, Unretryable error
(probe13:twa0:0:13:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe13:twa0:0:13:0): CAM status: Invalid Target ID
(probe13:twa0:0:13:0): Error 22, Unretryable error
(probe14:twa0:0:14:0): INQUIRY. CDB: 12 0 0 0 24 0
(probe14:twa0:0:14:0

Re: [stable 9] broken hwpstate calls

2012-06-07 Thread Alexander Motin

On 06/07/12 11:10, Andriy Gapon wrote:
> on 07/06/2012 02:02 Jung-uk Kim said the following:
>> Any way, hwpstate still isn't quite right even without your patch.
>>
>> sys/kern/kern_cpu.c
>>   cpufreq_curr_sysctl() ->
>>   CPUFREQ_SET() ->            /* for all CPU devices */
>>   cf_set_method() ->          /* thread_lock(), sched_bind(), ... */
>>   CPUFREQ_DRV_SET() ->
>> sys/x86/cpufreq/hwpstate.c
>>   hwpstate_set() ->
>>   hwpstate_goto_pstate()      /* for each CPU unit */
>>                               /* thread_lock(), sched_bind(), ... */
>
> Oh, I didn't realize that there was the cpufreq-level loop over all CPUs!
> That really sucks.
>
> Maybe some day we should accept that different CPUs could legitimately be in
> different P-states and provide support for that throughout the stack (from
> powerd to drivers).


Support for different P-states on different CPUs can be useful if the CPUs
have different capabilities. I believe that is very rare, but possible. At
the moment cpufreq should set, for each CPU, the frequency closest to the
one that was set on the BSP. It should be possible to make powerd read the
sets of frequencies from all CPUs and do the same, just more intelligently.
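
The "closest frequency" rule can be sketched in a few lines of Python. This
is only an illustration: the per-CPU frequency lists are invented,
`closest_freq` is a hypothetical helper, and the tie-break towards the lower
frequency is an arbitrary choice, not something cpufreq is known to do:

```python
def closest_freq(available_mhz, target_mhz):
    """Pick the supported frequency nearest to the target.

    Ties are broken in favor of the lower frequency.
    """
    return min(available_mhz, key=lambda f: (abs(f - target_mhz), f))


bsp_setting = 1600   # MHz requested on the boot processor (example value)
per_cpu_levels = {   # hypothetical CPUs with different capabilities
    "cpu0": [2400, 2000, 1600, 1200, 800],
    "cpu1": [2400, 1800, 1200, 800],
}

chosen = {cpu: closest_freq(levels, bsp_setting)
          for cpu, levels in per_cpu_levels.items()}
print(chosen)
```

A smarter powerd would need more than this, since the nearest frequency on
each CPU says nothing about where the scheduler will place the threads.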


At the same time, using very different frequencies for different CPUs can
IMHO be very problematic even in theory. For SMP systems it is quite
difficult (because of thread migration and possible interactions between
multiple threads) to identify cases where even the global frequency can be
reduced without a proportional performance penalty. Making it per-CPU
multiplies the number of options and requires awareness from the scheduler.


--
Alexander Motin


Re: High load event idl.

2012-04-29 Thread Alexander Motin

On 04/29/12 09:09, Ian Smith wrote:
> On Sun, 29 Apr 2012 08:17:38 +0300, Alexander Motin wrote:
>> On 04/29/12 01:53, Oliver Pinter wrote:
>>> Attached the ktr file. This is on core2duo P9400 cpu (
>>> smbios.system.product=HP ProBook 5310m (WD792EA#ABU) ). The workload
>>> is only a single user boot: sh + top running, but the load average is
>>> near 0.5.
>>
>> ktr shows no real load there. But it shows that you are using dummynet,
>> which schedules its runs on every hardclock tick. I believe that the load
>> you see is the result of synchronization between dummynet calls and
>> loadavg sampling, both of which are called from hardclock. I think
>> removing dummynet from the equation should hide this problem and also
>> reduce your laptop's power consumption.
>>
>> As for fixing this, it is the loadavg sampling algorithm that should be
>> changed. Fixing dummynet to not run on every hardclock tick would also be
>> great.
>
> Wading in out of my depth, and copying Luigi in case he misses it .. but
> even back in the olden days when HZ defaulted to 100, one was advised to
> use HZ= 1000 for smooth dummynet traffic shaping dispatch scheduling.
>
> I wonder, with the newer clocks and timers, whether there is another
> clock that could be used for dummynet scheduling, that would not have
> this effect (even if largely cosmetic?) on load average calculation?


First of all, the easiest solution would be to make dummynet schedule its
callout not automatically, but on the first queued packet. I believe that on
a laptop the queue should be empty most of the time, and the callout
calls are completely useless there. Luigi once promised to look into this.


As for better precision and removing the synchronization: there is a GSoC
project starting now (by davide@) to rewrite the callout(9) subsystem
to use the better precision allowed by the new timer drivers. While it is
now possible to get raw access to additional timer hardware available on
some systems, I don't think that is a good idea.
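
The synchronization effect discussed in this thread can be shown with a toy
model in Python. This is not the kernel's loadavg algorithm: it only assumes
a task that runs briefly at the start of every tick and a sampler that looks
once per tick at a fixed offset. All names and numbers are made up for the
illustration:

```python
STEPS = 100   # sub-steps per hardclock tick (arbitrary resolution)


def is_running(t, busy_steps):
    """The periodic task runs during the first busy_steps of every tick."""
    return (t % STEPS) < busy_steps


def sampled_load(busy_steps, sample_offset, ticks=1000):
    """Fraction of samples that catch the task running, one sample per tick."""
    hits = sum(is_running(tick * STEPS + sample_offset, busy_steps)
               for tick in range(ticks))
    return hits / ticks


busy = 2                                  # task uses 2% of every tick
true_utilization = busy / STEPS
synced = sampled_load(busy, 0)            # sampler fires together with the task
unsynced = sampled_load(busy, 50)         # sampler fires mid-tick
averaged = sum(sampled_load(busy, off, ticks=10)
               for off in range(STEPS)) / STEPS

print(true_utilization, synced, unsynced, averaged)
```

A sampler that is locked to the same tick as the periodic work sees either
all of it or none of it. Only when the sampling offset is spread across the
tick does the sampled figure approach the real utilization.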


--
Alexander Motin


Re: High load event idl.

2012-04-29 Thread Alexander Motin

On 04/29/12 15:04, Oliver Pinter wrote:
> Removing dummynet from the kernel doesn't change anything related
> to the load average. The loadavg holds at 0.70 +/- 0.2. (single user: sh +
> top)

New ktr dump?


--
Alexander Motin


Re: High load event idl.

2012-04-29 Thread Alexander Motin

On 04/29/12 15:27, Oliver Pinter wrote:
> http://oliverp.teteny.bme.hu/freebsd/ktr/

OK. Now there is no dummynet, but I've found two more things there:
 1. For some reason some acpi_thermal thread seems to consume about
0.37s of time every 10s. I have no idea what that is. It's not a 0.7 load,
but it is still strange at least.
 2. I suspect another possible synchronization between the ehci driver and
loadavg as a result of interrupt sharing between the HPET timer used for
time events and the EHCI USB hardware. Not sure what to do about this.
Please send a _verbose_ dmesg to check whether this interrupt sharing is
unavoidable.



--
Alexander Motin


Re: High load event idl.

2012-04-29 Thread Alexander Motin

On 04/29/12 15:27, Alex Kozlov wrote:
> On Sun, Apr 29, 2012 at 03:07:40PM +0300, Alexander Motin wrote:
>> On 04/29/12 15:04, Oliver Pinter wrote:
>>> Removing dummynet from the kernel doesn't change anything related
>>> to the load average. The loadavg holds at 0.70 +/- 0.2. (single user:
>>> sh + top)
>>
>> New ktr dump?
>
> I have a similar issue on one of my laptops. Should I provide a ktr dump?
> http://lists.freebsd.org/pipermail/freebsd-current/2011-September/027133.html

In your case HPET also shares an interrupt with other devices. I suspect
that may be the reason. Every time the swi thread runs loadavg, another CPU
runs the shared interrupt handler, which gets accounted as a result. Please
show your verbose dmesg.


--
Alexander Motin


Re: High load event idl.

2012-04-29 Thread Alexander Motin

On 04/29/12 16:30, Alex Kozlov wrote:

On Sun, Apr 29, 2012 at 04:11:20PM +0300, Alexander Motin wrote:

On 04/29/12 15:27, Alex Kozlov wrote:

On Sun, Apr 29, 2012 at 03:07:40PM +0300, Alexander Motin wrote:

On 04/29/12 15:04, Oliver Pinter wrote:

Removing dummynet from kernel don't chanage anything, that is releated
to load average. The loadavg hold to 0.70 +/- 0.2. (single user : sh +
top)


New ktr dump?

I have similar issue on one of my laptops. Should I provide ktr dump?
http://lists.freebsd.org/pipermail/freebsd-current/2011-September/027133.html

In your case the HPET also shares an interrupt with other devices. I suspect
that may be the reason: every time the swi thread runs the loadavg calculation,
another CPU is running the shared interrupt handler, and that gets counted as
load. Please show your verbose dmesg.

Attached.


In your case the HPET could have IRQ22, which seems free now, all to itself.
After recent changes in the ACPI code the HPET is detected before PCI devices,
so it no longer avoids sharing. You may try to hint it to a specific IRQ by
adding this line to loader.conf:

hint.hpet.0.allowed_irqs=0x0040
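
For reference, the hpet(4) allowed_irqs hint is a 32-bit mask in which bit N permits IRQ N, so the value for a given IRQ can be derived instead of copied. The sketch below is illustrative only; the helper names `irq_mask` and `loader_hint` are made up here and are not part of any FreeBSD tool. For IRQ 22 it yields 0x00400000, which suggests the value shown above may have lost digits in the archive.

```python
def irq_mask(*irqs: int) -> int:
    """Build a 32-bit mask with one bit set per allowed IRQ number."""
    mask = 0
    for irq in irqs:
        if not 0 <= irq < 32:
            raise ValueError(f"IRQ {irq} does not fit in a 32-bit mask")
        mask |= 1 << irq
    return mask


def loader_hint(unit: int, *irqs: int) -> str:
    """Format a loader.conf line for hint.hpet.<unit>.allowed_irqs."""
    return f"hint.hpet.{unit}.allowed_irqs=0x{irq_mask(*irqs):08x}"


if __name__ == "__main__":
    # Hypothetical usage: allow only IRQ 22 for HPET unit 0.
    print(loader_hint(0, 22))
```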

--
Alexander Motin


Re: High load event idl.

2012-04-28 Thread Alexander Motin

On 04/28/12 00:34, Albert Shih wrote:

  On 27/04/2012 at 22:45:40+0200, Oliver Pinter wrote:

I'm running 9-stable on all my computers (csup yesterday).

On my desktop everything is fine. But I have two laptops (both are Dell). On
both laptops I have a problem with the load: even when I do nothing I get a
load between 0.5 and 1.

Here the result of a «top» on the laptop :

last pid:  2434;  load averages:  0.63,  0.67,  0.59  up 0+00:23:59  22:25:29
57 processes:  3 running, 54 sleeping
CPU:  2.7% user,  0.0% nice,  3.7% system,  1.4% interrupt, 92.2% idle
Mem: 89M Active, 92M Inact, 198M Wired, 13M Cache, 100M Buf, 3529M Free
Swap: 4096M Total, 4096M Free

Here on the desktop :

last pid: 61010;  load averages:  0.00,  0.00,  0.00  up 2+11:02:42  22:29:08
126 processes: 1 running, 125 sleeping
CPU: % user, % nice, % system, % interrupt, % idle
Mem: 803M Active, 2874M Inact, 1901M Wired, 112M Cache, 620M Buf, 202M Free
Swap: 6144M Total, 36M Used, 6107M Free



http://lists.freebsd.org/pipermail/freebsd-bugs/2012-April/048213.html


What I understand from your message (I'm definitely not a dev) is that it's only
a small accounting problem.

I'm not absolutely sure of that, because my laptop fan never stops...

If you want any more information...


Definitely, because here I don't see much.

Generally, all CPU loads and load averages are now calculated via sampling,
so in theory, with a spiky load, the numbers may vary for many reasons. I
would start by collecting information about running processes. To find
fast-switching processes that could hide from accounting, try `top -SH -m
io -o vcsw`. To get more information about what the scheduler is doing, use
/usr/src/tools/sched/schedgraph.py (instructions are inside it).


--
Alexander Motin


Re: High load event idl.

2012-04-28 Thread Alexander Motin

On 04/29/12 01:53, Oliver Pinter wrote:

Attached is the ktr file. This is on a Core 2 Duo P9400 CPU (
smbios.system.product=HP ProBook 5310m (WD792EA#ABU) ). The workload
is only a single-user boot: sh + top running, but the load average is
near 0.5.


ktr shows no real load there. But it does show that you are using dummynet,
which schedules its runs on every hardclock tick. I believe the load you
see is the result of synchronization between dummynet calls and loadavg
sampling, both of which are called from hardclock. I think taking dummynet
out of the equation should hide this problem and also reduce your laptop's
power consumption.


As for fixing this, it is the loadavg sampling algorithm that should
be changed. Fixing dummynet so that it does not run on every hardclock tick
would also be great.
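
To illustrate why sampling from the same clock tick that starts the work can overstate load, here is a toy model. It is not the kernel's actual loadavg algorithm: a hypothetical task wakes on every tick and runs for a small fraction of it, one sampler looks exactly at the tick boundary, and another looks at a random offset inside each tick. All names and numbers are invented for illustration.

```python
import random

TICK = 1.0    # length of one clock tick (arbitrary units)
BUSY = 0.05   # the periodic task runs for 5% of every tick


def is_running(t: float) -> bool:
    """True if the periodic task is on the CPU at time t."""
    return (t % TICK) < BUSY


def sampled_load(offsets) -> float:
    """Fraction of samples that found the task running.

    offsets[i] is where inside tick i the sample is taken
    (0 <= offset < TICK).
    """
    hits = sum(is_running(i * TICK + off) for i, off in enumerate(offsets))
    return hits / len(offsets)


if __name__ == "__main__":
    n = 10_000
    rng = random.Random(0)
    # Sampler synchronized with the tick that also starts the task.
    aligned = sampled_load([0.0] * n)
    # Sampler that looks at a random moment inside each tick.
    jittered = sampled_load([rng.uniform(0, TICK) for _ in range(n)])
    print(f"true utilization: {BUSY:.2f}")
    print(f"aligned sampling: {aligned:.2f}")
    print(f"random sampling:  {jittered:.2f}")
```

In this model the aligned sampler always catches the task at the start of its run, while the randomized sampler converges on the real utilization.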



On 4/28/12, Alexander Motin <m...@freebsd.org> wrote:

On 04/28/12 00:34, Albert Shih wrote:

   On 27/04/2012 at 22:45:40+0200, Oliver Pinter wrote:

I'm running 9-stable on all my computers (csup yesterday).

On my desktop everything is fine. But I have two laptops (both are Dell). On
both laptops I have a problem with the load: even when I do nothing I get a
load between 0.5 and 1.

Here the result of a «top» on the laptop :

last pid:  2434;  load averages:  0.63,  0.67,  0.59  up 0+00:23:59  22:25:29
57 processes:  3 running, 54 sleeping
CPU:  2.7% user,  0.0% nice,  3.7% system,  1.4% interrupt, 92.2% idle
Mem: 89M Active, 92M Inact, 198M Wired, 13M Cache, 100M Buf, 3529M Free
Swap: 4096M Total, 4096M Free

Here on the desktop :

last pid: 61010;  load averages:  0.00,  0.00,  0.00  up 2+11:02:42  22:29:08
126 processes: 1 running, 125 sleeping
CPU: % user, % nice, % system, % interrupt, % idle
Mem: 803M Active, 2874M Inact, 1901M Wired, 112M Cache, 620M Buf, 202M Free
Swap: 6144M Total, 36M Used, 6107M Free



http://lists.freebsd.org/pipermail/freebsd-bugs/2012-April/048213.html


What I understand from your message (I'm definitely not a dev) is that it's
only a small accounting problem.

I'm not absolutely sure of that, because my laptop fan never stops...

If you want any more information...


Definitely, because here I don't see much.

Generally, all CPU loads and load averages are now calculated via sampling,
so in theory, with a spiky load, the numbers may vary for many reasons. I
would start by collecting information about running processes. To find
fast-switching processes that could hide from accounting, try `top -SH -m
io -o vcsw`. To get more information about what the scheduler is doing, use
/usr/src/tools/sched/schedgraph.py (instructions are inside it).

--
Alexander Motin




--
Alexander Motin


Re: [stable-ish 9] Dell R815 ipmi(4) attach failure

2012-04-06 Thread Alexander Motin
|  |ipmi1:IPMI System Interface  on isa0
|  |device_attach: ipmi1 attach returned 16
|  |ipmi1:IPMI System Interface  on isa0
|  |device_attach: ipmi1 attach returned 16
|  |ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 2
|  |ipmi0: DEBUG ipmi_complete_request 527 before wakeup 6201
|  |ipmi0: DEBUG ipmi_complete_request 529 after wakeup 6263
|  |ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 6323
|  |
|  | Actually, can you compile with:
|  |
|  | options KTR
|  | options KTR_COMPILE=KTR_SCHED
|  | options KTR_MASK=KTR_SCHED
|  |
|  | and then add a temporary hack to ipmi.c to set ktr_mask to 0 after
|  | ipmi_submit_driver_request() returns in ipmi_startup()?  You can
|  | then use 'ktrdump -ct' after boot to capture a log of what the scheduler
|  | did including if it timed out the sleep, etc.  I think this would be
|  | useful for figuring out what went wrong.  It does seem that it timed
|  | out after 3 seconds.
|
|  Assuming I didn't mess up, the log should be at:
|   http://people.freebsd.org/~ambrisko/ipmi_ktr_dump.txt
|  again, I using ipmi(4) as module loaded via the loader.
|
| If you use -ct then you get a file you can feed into schedgraph.
| However, just reading the log, it seems that IRQ 20 keeps preempting
| the KCS worker thread preventing it from getting anything done.  Also,
| there seem to be a lot of threads on CPU 0's runqueue waiting for a
| chance to run (load average of 12 or 13 the entire time).  You can try
| just bumping up the max timeout from 3 seconds to higher perhaps.  Not
| sure why IRQ 20 keeps firing though.  It might be related to USB, so
| you could try fiddling with USB options in the BIOS perhaps, or disabling
| the USB drivers to see if that fixes IPMI.

Tried without USB in kernel:
http://people.freebsd.org/~ambrisko/ipmi_ktr_dump_no_usb.txt


Hmm, it's still just running constantly (note that the idle thread is
_never_ scheduled).  The lion's share of the time seems to be spent in
xpt_thrd.  Note that there are several places where nothing happens except
that xpt_thrd runs constantly (spinning) during 10's of statclock ticks.  I
would maybe start debugging that to see what in the world it is doing.  Maybe
it is polling some hardware down in xpt_action() (i.e., xpt_action() for a
single bus called down into a driver and it is just spinning using polling
instead of sleeping and waiting for an interrupt).


xpt_thrd is the bus scanner thread. It is scheduled by CAM for every bus
on attach, and by the controller driver on hot-plug events. For some
controllers it may be quite CPU-hungry: for example, for legacy ATA
controllers, where a bus reset may take many seconds of hardware polling
while devices are just spinning up. ahci(4) was improved about a year
ago to avoid polling when possible, but it may still loop for some
time if the controller does not respond to a reset. What mfi(4), mentioned in
the log, does during scanning, I am not sure.


--
Alexander Motin


Re: [stable-ish 9] Dell R815 ipmi(4) attach failure

2012-04-06 Thread Alexander Motin

On 04/06/12 20:12, Doug Ambrisko wrote:

Alexander Motin writes:
[ Charset ISO-8859-1 unsupported, converting... ]
| On 04/04/12 21:47, John Baldwin wrote:
|  On Wednesday, April 04, 2012 12:24:33 pm Doug Ambrisko wrote:
|  John Baldwin writes:
|  | On Tuesday, April 03, 2012 12:37:50 pm Doug Ambrisko wrote:
|  |   John Baldwin writes:
|  |   | On Monday, April 02, 2012 7:27:13 pm Doug Ambrisko wrote:
|  |   |   Doug Ambrisko writes:
|  |   |   | John Baldwin writes:
|  |   |   | | On Saturday, March 31, 2012 3:25:48 pm Doug Ambrisko wrote:
|  |   |   | |   Sean Bruno writes:
|  |   |   | |   | Noting a failure to attach to the onboard IPMI controller with this dell
|  |   |   | |   | R815.  Not sure what to start poking at and thought I'd throw this over
|  |   |   | |   | here for comment.
|  |   |   | |   |
|  |   |   | |   | -bash-4.2$ dmesg |grep ipmi
|  |   |   | |   | ipmi0: KCS mode found at io 0xca8 on acpi
|  |   |   | |   | ipmi1:IPMI System Interface   on isa0
|  |   |   | |   | device_attach: ipmi1 attach returned 16
|  |   |   | |   | ipmi1:IPMI System Interface   on isa0
|  |   |   | |   | device_attach: ipmi1 attach returned 16
|  |   |   | |   | ipmi0: Timed out waiting for GET_DEVICE_ID
|  |   |   | |
|  |   |   | |   I've run into this recently.  A quick hack to fix it is:
|  |   |   | |
|  |   |   | |   Index: ipmi.c
|  |   |   | |
[snip]
|  | If you use -ct then you get a file you can feed into schedgraph.
|  | However, just reading the log, it seems that IRQ 20 keeps preempting
|  | the KCS worker thread preventing it from getting anything done.  Also,
|  | there seem to be a lot of threads on CPU 0's runqueue waiting for a
|  | chance to run (load average of 12 or 13 the entire time).  You can try
|  | just bumping up the max timeout from 3 seconds to higher perhaps.  Not
|  | sure why IRQ 20 keeps firing though.  It might be related to USB, so
|  | you could try fiddling with USB options in the BIOS perhaps, or disabling
|  | the USB drivers to see if that fixes IPMI.
|
|  Tried without USB in kernel:
|   http://people.freebsd.org/~ambrisko/ipmi_ktr_dump_no_usb.txt
|
|  Hmm, it's still just running constantly (note that the idle thread is
|  _never_ scheduled).  The lion's share of the time seems to be spent in
|  xpt_thrd.  Note that there are several places where nothing happens except
|  that xpt_thrd runs constantly (spinning) during 10's of statclock ticks.  I
|  would maybe start debugging that to see what in the world it is doing.  Maybe
|  it is polling some hardware down in xpt_action() (i.e., xpt_action() for a
|  single bus called down into a driver and it is just spinning using polling
|  instead of sleeping and waiting for an interrupt).
|
| xpt_thrd is the bus scanner thread. It is scheduled by CAM for every bus
| on attach, and by the controller driver on hot-plug events. For some
| controllers it may be quite CPU-hungry: for example, for legacy ATA
| controllers, where a bus reset may take many seconds of hardware polling
| while devices are just spinning up. ahci(4) was improved about a year
| ago to avoid polling when possible, but it may still loop for some
| time if the controller does not respond to a reset. What mfi(4), mentioned in
| the log, does during scanning, I am not sure.

I thought that mfi(4) could be an issue.  There are some ata controllers
with nothing attached.  I built a GENERIC with USB and mfi commented out
and then the timeout issue went away:
   ipmi0: KCS mode found at io 0xca8 on acpi
   ipmi1:IPMI System Interface  on isa0
   device_attach: ipmi1 attach returned 16
   ipmi1:IPMI System Interface  on isa0
   device_attach: ipmi1 attach returned 16
   ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 1
   ipmi0: DEBUG ipmi_complete_request 527 before wakeup 2211
   ipmi0: DEBUG ipmi_complete_request 529 after wakeup 2272
   ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 2332
   ipmi0: IPMI device rev. 0, firmware rev. 1.61, version 2.0

Without mfi and with USB and it had issues:
   ipmi0: KCS mode found at io 0xca8 on acpi
   ipmi1:IPMI System Interface  on isa0
   device_attach: ipmi1 attach returned 16
   ipmi1:IPMI System Interface  on isa0
   device_attach: ipmi1 attach returned 16
   ipmi0: DEBUG ipmi_submit_driver_request 551 before msleep 2
   ipmi0: DEBUG ipmi_complete_request 527 before wakeup 3137
   ipmi0: DEBUG ipmi_complete_request 529 after wakeup 3199
   ipmi0: DEBUG ipmi_submit_driver_request 553 after msleep 3259
   ipmi0: Timed out waiting for GET_DEVICE_ID
   ipmi0: IPMI device rev. 0, firmware rev. 1.61, version 2.0

I can post more ktrdump traces if needed.  A 1U Dell machine without
mfi also has this problem.  As John mentioned it might be good to
bump up the timeout from 3s to 6s.  I did that with the USB no mfi
kernel and that passed:

   % dmesg | grep ipmi
   ipmi0: KCS mode found at io 0xca8 on acpi
   ipmi1:IPMI System Interface  on isa0
   device_attach

Serverworks HT-1000 HPET event timer

2012-03-09 Thread Alexander Motin

Hi.

Does anybody have a success story of using the HPET event timer (not the time
counter!) on the Serverworks HT-1000 chipset under FreeBSD 9/10?


I received a report about problems with it on an HP BL465c G6 blade system and
am now wondering whether it is a global problem or specific to this system.


--
Alexander Motin


Re: missing disk device under 9-STABLE

2012-03-03 Thread Alexander Motin

Hi.

On 03.03.2012 21:08, Jeff Blank wrote:

I attempted an upgrade last night from an old 8-STABLE (25 Apr 2011)
to 9-STABLE and ran into a problem where a disk apparently wasn't
detected.  I'm of course aware of the ATA/CAM changes, but I haven't
found anything that quite explains what's happening here.  I've
attached dmesg output from the 8-STABLE and 9-STABLE kernels as well
as the results of 'ls -l /dev/ad*' and 'zpool status' under both
kernels.

ZFS seems to have figured out what to do about its ad4p3 member
(switching to a gptid device), but since only ada0 is detected during
boot, it can't complete the pool.  The weird thing is, though, that
the other disk was actually detected on one reboot to the 9.0 kernel,
ZFS was happy, etc.  I haven't been able to reproduce it, though.

Due to these problems, I haven't upgraded userland yet and am of
course sticking with the 8-STABLE kernel, but I can boot into the
9-STABLE kernel at will if anyone needs more information.


This looks like the cause of the missing disk:

ahcich1: Timeout on slot 0 port 0
ahcich1: is 0002 cs  ss  rs 0001 tfd 50 serr 
 cmd 6017

ahcich1: Timeout on slot 0 port 0
ahcich1: is 0002 cs  ss  rs 0001 tfd 50 serr 
 cmd 6017


This says that the controller is signalling an interrupt, but the driver has not
received it. That is even more strange given that the disk on the first SATA
port is working fine. You may try adding this line to your /boot/loader.conf:

hint.ahci.0.msi=0

or just set it at the loader prompt.

--
Alexander Motin


Re: missing disk device under 9-STABLE

2012-03-03 Thread Alexander Motin

On 03.03.2012 22:21, Jeff Blank wrote:

On Sat, Mar 03, 2012 at 09:51:53PM +0200, Alexander Motin wrote:

This looks like the cause of the missing disk:

ahcich1: Timeout on slot 0 port 0
ahcich1: is 0002 cs  ss  rs 0001 tfd 50 serr
 cmd 6017
ahcich1: Timeout on slot 0 port 0
ahcich1: is 0002 cs  ss  rs 0001 tfd 50 serr
 cmd 6017

This says that the controller is signalling an interrupt, but the driver has not
received it. That is even more strange given that the disk on the first SATA
port is working fine. You may try adding this line to your /boot/loader.conf:
hint.ahci.0.msi=0
or just set it at the loader prompt.


Alexander,

Thanks, that seemed to clear the problem up, no troubles through half
a dozen or more reboots.

Is disabling MSI likely to have any side effects on, for example,
performance or reliability?  Is there any point to pursuing this as a
FreeBSD problem, since I didn't have any issues under the old ATA
system?  I'm happy to help troubleshoot this if anyone thinks it's
worth looking into.


MSI interrupts could give slightly better performance, but with regular
HDDs I think it is unlikely that you would notice any difference. As for
the old driver, it never used MSI by default, while the new one does.


What board and chipset do you use? Have you tried updating the BIOS? Please
show the `pciconf -lvcb` output for the controller.


--
Alexander Motin


Re: Lost ata_xpt.c fix for -stable: svn commit: r217444

2012-02-15 Thread Alexander Motin

Hi.

On 02/15/12 12:02, Harald Schmalzbauer wrote:

I just applied my local patches against RELENG_8_2 src tree and found
that http://svn.freebsd.org/changeset/base/217444 was still missing, and
if I read svnweb right (sorry, lack of svn knowledge here), it's also
missing in -stable.
Any plans to commit?


As far as I can see, it was merged to 8-STABLE a year ago at r218340:
http://svnweb.freebsd.org/base?view=revision&revision=218340


--
Alexander Motin


Re: problems with AHCI on FreeBSD 8.2

2012-02-14 Thread Alexander Motin

On 02/14/12 11:19, Victor Balada Diaz wrote:

We're having some troubles with AHCI under FreeBSD 8.2 and 8-STABLE. The error 
is:

ahcich0: Timeout on slot 8
ahcich0: is  cs 0100 ss  rs 0100 tfd c0 serr 
ahcich0: AHCI reset...
ahcich0: SATA connect time=0ms status=0123
ahcich0: ready wait time=18ms
ahcich0: AHCI reset done: device found
(ada0:ahcich0:0:0:0): Request requeued
(ada0:ahcich0:0:0:0): Retrying command
(ada0:ahcich0:0:0:0): Command timed out
(ada0:ahcich0:0:0:0): Retrying command
ahcich0: Timeout on slot 8
ahcich0: is  cs 007ff000 ss 007fff00 rs 007fff00 tfd c0 serr 
ahcich0: AHCI reset...
ahcich0: SATA connect time=0ms status=0123
ahcich0: ready wait time=84ms
ahcich0: AHCI reset done: device found
(ada0:ahcich0:0:0:0): Request requeued
(ada0:ahcich0:0:0:0): Retrying command
(ada0:ahcich0:0:0:0): Command timed out
(ada0:ahcich0:0:0:0): Retrying command
(ada0:ahcich0:0:0:0): Request requeued
[...]

If we use the old ATA driver we have no problems. If we just use the first disk
(ada0) with ahci, no problems either. If we use both disks (ada0 and ada1) in a
gmirror setup with ahci, we get the above error. If we use both disks in gmirror
with the old ata driver, no problems.


In both cases the controller reports the command status as 0xc0, which means
the device is busy with the command. For NCQ commands it means that the device
is in the stage of processing the command itself, not head positioning or data
transfer. Enabling AHCI enables NCQ for the devices. That increases the load
on both the devices and the controller, and it is difficult to say whose
fault it is. SAMSUNG HD154UI disks, AFAIR, have 4k sectors, which may incur
big performance penalties when accessing small or misaligned data. I am not
sure how big that penalty can be in the worst case, especially since
disks cache writes by default, hiding the real load level. The relation
to gmirror is harder to explain. Depending on how you created it and the
partitions, it could cause more misaligned I/Os during rebuild. Using
gmirror also doubles the concurrent load on the controller, but at this point
I have nothing to blame it for.
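
The `tfd` value in the log is the AHCI task file data register; its low byte carries the ATA status register and the next byte the error register. A small decoder using the classic ATA status bit names makes values such as 0xc0 easier to read. This is an illustrative helper with made-up names, not part of any FreeBSD tool, and the meaning of the lower status bits has shifted across ATA revisions.

```python
# Classic ATA status register bits. The legacy names are used for the
# lower bits, whose meaning changed or became obsolete in later revisions.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete (legacy)
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data (obsolete)
    (0x02, "IDX"),   # index (obsolete)
    (0x01, "ERR"),   # error
]


def decode_tfd(tfd: int):
    """Split a TFD value into (status flag names, error register byte)."""
    status = tfd & 0xFF
    error = (tfd >> 8) & 0xFF
    flags = [name for bit, name in ATA_STATUS_BITS if status & bit]
    return flags, error


if __name__ == "__main__":
    for value in (0xC0, 0x50):
        flags, error = decode_tfd(value)
        print(f"tfd {value:02x}: status={'|'.join(flags)} error=0x{error:02x}")
```

Read this way, 0xc0 is BSY together with DRDY, matching the description of a device that is still busy with the command.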


--
Alexander Motin


Re: Disable DMA.

2012-02-11 Thread Alexander Motin

On 02/11/12 20:15, Peter Ankerstål wrote:

In FreeBSD 8 I used the loader variable hw.ata.ata_dma=0 to get my computer to
boot from a CF card. But in FreeBSD 9.0 it doesn't seem to work. Could it be
another variable, or is it something else that doesn't work in 9? The machine
boots the installer when the CF card is not present, but when it is present it
stops right after the Timecounter stuff.


On 9.0 you can do it with

hint.ata.X.mode=PIO4

where X is a bus number.

In recent 8/9-STABLE I've also resurrected hw.ata.ata_dma=0.

--
Alexander Motin


Re: siisch1: Error while READ LOG EXT

2012-02-08 Thread Alexander Motin
: siisch1: Error while READ LOG EXT


This indicates the underlying device was handed a READ LOG EXT ATA
command (command 0x2f) and the device did not respond promptly
(resulting in the timeout messages you see).


There are hours between the timeouts and the READ LOG EXT errors. They are not
directly related, but may have the same cause.



smartctl doesn't show any issues on the drives other than one that has some
historical errors from a while ago.  What are these errors and do I need to
worry about them? The READ LOG EXT ones are new.

{snipping SMART stats}


You're focused heavily on the READ LOG EXT command.  READ LOG EXT is
intended for accessing the GP Log section of a drive.  EXT stands for
Extended.  GP Log means General Purpose Log, and is where all
sorts of logging information regarding drive performance is stored.
It's usually stored within a reserved section of the platters, or in the
HPA area.  It's not within a standard user-accessible LBA/sector
region.  This is a completely separate log from that of SMART logs.


READ LOG EXT commands are used here to fetch the status of failed NCQ
commands. It is the normal (and only) way to get detailed error status in
that case. A failure of the READ LOG EXT command itself may mean that this is
not a regular media error, but a problem with communication, firmware or
something else.



You can review the different types of logs on a device by reviewing
the ATA8-ACS specification here.  See Annex A, section A.1, page 362:

http://www.t13.org/documents/UploadedDocuments/docs2007/D1699r4a-ATA8-ACS.pdf

This is almost certainly a lower level problem with the disk that cannot
be addressed/solved via normal means.  Thus, my recommendation is to
replace the disk.

If you would rather not replace the disk, I can try to step you through
looking at the GPLog sections of the disk to see if you can trigger the
problem -- and I have a feeling you'll be able to, but I won't
necessarily be able to tell you where the actual problem lies
hardware-wise, nor will I be able to solve the problem.

Regarding the repeated errors at semi-regular (but not entirely)
intervals: are you using smartd?  Do you have a cronjob that issues
smartctl -a or smartctl -x commands at intervals?  I imagine any of
these could be tickling something lower level.

Also, please upgrade your smartmontools to 5.42.  It does provide some
further enhancements that are useful.


--
Alexander Motin


Re: siisch1: Error while READ LOG EXT

2012-02-08 Thread Alexander Motin

On 09.02.2012 00:38, Jeremy Chadwick wrote:

On Thu, Feb 09, 2012 at 12:22:40AM +0200, Alexander Motin wrote:

On 08.02.2012 23:27, Jeremy Chadwick wrote:

On Wed, Feb 08, 2012 at 04:00:57PM -0500, Mike Tancsa wrote:

I have a 4-port eSATA PCIe card with 3 external port multipliers attached on an
AMD64 box (8G of RAM), RELENG8 from Feb 1st.

siis0@pci0:5:0:0:   class=0x010400 card=0x71241095 chip=0x31241095 rev=0x02 hdr=0x00
 vendor = 'Silicon Image Inc (Was: CMD Technology Inc)'
 device = 'PCI-X to Serial ATA Controller (SiI 3124)'
 class  = mass storage
 subclass   = RAID
 bar   [10] = type Memory, range 64, base 0xb4408000, size 128, enabled
 bar   [18] = type Memory, range 64, base 0xb440, size 32768, enabled
 bar   [20] = type I/O Port, range 32, base 0x3000, size 16, enabled
 cap 01[64] = powerspec 2  supports D0 D1 D2 D3  current D0
 cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 12 split transactions
 cap 05[54] = MSI supports 1 message, 64 bit enabled with 1 message

siis0:SiI3124 SATA controller   port 0x3000-0x300f mem 
0xb4408000-0xb440807f,0xb440-0xb4407fff irq 19 at device 0.0 on pci5
siis0: [ITHREAD]
siisch0:SIIS channel   at channel 0 on siis0
siisch0: [ITHREAD]
siisch1:SIIS channel   at channel 1 on siis0
siisch1: [ITHREAD]
siisch2:SIIS channel   at channel 2 on siis0
siisch2: [ITHREAD]
siisch3:SIIS channel   at channel 3 on siis0
siisch3: [ITHREAD]

# camcontrol devlist
WDC WD2001FASS-00U0B0 01.00101 at scbus0 target 0 lun 0 (pass0,ada0)
WDC WD2001FASS-00U0B0 01.00101 at scbus0 target 1 lun 0 (pass1,ada1)
WDC WD2001FASS-00U0B0 01.00101 at scbus0 target 2 lun 0 (pass2,ada2)
WDC WD2001FASS-00U0B0 01.00101 at scbus0 target 3 lun 0 (pass3,ada3)
Port Multiplier 47261095 1f06  at scbus0 target 15 lun 0 (pass4,pmp1)
WDC WD2002FAEX-007BA0 05.01D05 at scbus1 target 0 lun 0 (pass5,ada4)
WDC WD2002FAEX-007BA0 05.01D05 at scbus1 target 1 lun 0 (pass6,ada5)
WDC WD2002FAEX-007BA0 05.01D05 at scbus1 target 2 lun 0 (pass7,ada6)
WDC WD2002FAEX-007BA0 05.01D05 at scbus1 target 3 lun 0 (pass8,ada7)
WDC WD2002FAEX-007BA0 05.01D05 at scbus1 target 4 lun 0 (pass9,ada8)
Port Multiplier 37261095 1706  at scbus1 target 15 lun 0 (pass10,pmp0)
Areca usrvar R001  at scbus4 target 0 lun 0 (pass11,da0)
Areca backup1 R001 at scbus4 target 0 lun 1 (pass12,da1)
Areca RAID controller R001 at scbus4 target 16 lun 0 (pass13)
AMCC 9650SE-2LP DISK 4.10  at scbus5 target 0 lun 0 (pass14,da2)
ST31000333AS SD35  at scbus6 target 0 lun 0 (pass15,ada9)
ST31000528AS CC35  at scbus7 target 0 lun 0 (pass16,ada10)
ST31000340AS SD1A  at scbus8 target 0 lun 0 (pass17,ada11)
WDC WD1002FAEX-00Z3A0 05.01D05 at scbus11 target 0 lun 0 (pass18,ada12)


Ever since I added a new PM, I have been seeing a new error (READ LOG EXT)
along with the odd slot timeout error.


Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4700
Feb  7 23:49:32 backup3 kernel: siisch1: Timeout on slot 26
Feb  7 23:49:32 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:32 backup3 kernel: siisch1:  ... waiting for slots 4300
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 30
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0300
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 25
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068
Feb  7 23:49:34 backup3 kernel: siisch1:  ... waiting for slots 0100
Feb  7 23:49:34 backup3 kernel: siisch1: Timeout on slot 24
Feb  7 23:49:34 backup3 kernel: siisch1: siis_timeout is 0704 ss 7f17e8b9 rs 7f17e8b9 es  sts 801d2000 serr 0068


This indicates the controller on channel 1 (siisch1) is stalled
waiting for underlying communication with the device attached to it.


Feb  7 23:57:59 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:13:36 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:21:53 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:22:16 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 00:39:13 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:24:25 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:33:52 backup3 last message repeated 2 times
Feb  8 01:43:45 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 01:50:31 backup3 last message repeated 2 times
Feb  8 01:55:20 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 02:26:26 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 02:27:24 backup3 kernel: siisch1: Error while READ LOG EXT
Feb  8 03:16:28 backup3 kernel

Re: Kernel panics under 8.2 due to ATA timeouts

2012-02-01 Thread Alexander Motin

Hi.

On 01/30/12 22:46, Andrew Boyer wrote:

I have a system that appears to have a flaky SATA controller (one of the Intel 
ESB2 variants) and it seems to be exposing a weakness in the ATA driver (not 
using ATA_CAM).  If a command with ATA_R_DIRECT set times out, the channel gets 
reinitialized, but from the soft interrupt context.  It panics when it tries to 
sleep in ata_queue_request().

Timeouts work if ATA_R_DIRECT isn't set because in that case it uses a 
taskqueue to complete the request.

Here is the backtrace:

#0  kdb_enter (why=0x80962cfa panic, msg=0xaAddress 0xa out of 
bounds) at ../../../kern/subr_kdb.c:349
#1  0x805d6d0b in panic (fmt=Variable fmt is not available.
) at ../../../kern/kern_shutdown.c:689
#2  0x8061bc53 in sleepq_add (wchan=0xff00052c3e58, lock=0xff00052c3e38, 
wmesg=0x808fa213 ATA request done,
 flags=1, queue=0) at ../../../kern/subr_sleepqueue.c:320
#3  0x80590c95 in _cv_timedwait (cvp=0xff00052c3e58, 
lock=0xff00052c3e38, timo=4) at ../../../kern/kern_condvar.c:313
#4  0x805d61af in _sema_timedwait (sema=0xff00052c3e38, timo=4, 
file=0x808fa1f6 ../../../dev/ata/ata-queue.c,
 line=118) at ../../../kern/kern_sema.c:123
#5  0x8028559f in ata_queue_request (request=0xff00052c3dc0) at 
../../../dev/ata/ata-queue.c:117
#6  0x80286628 in ata_controlcmd (dev=0xff0002e83d00, command=239 '?', 
feature=Variable feature is not available.
) at ../../../dev/ata/ata-queue.c:153
#7  0x8027ffd3 in ata_setmode (dev=0xff0002e83d00) at 
../../../dev/ata/ata-all.c:637
#8  0x802a0af9 in ad_init (dev=0xff0002e83d00) at 
../../../dev/ata/ata-disk.c:405
#9  0x802a0c29 in ad_reinit (dev=0xff0002e83d00) at 
../../../dev/ata/ata-disk.c:221
#10 0x80280cad in ata_reinit (dev=0xff0002902800) at ata_if.h:79
#11 0x802856c4 in ata_completed (context=Variable context is not 
available.
) at ../../../dev/ata/ata-queue.c:313
#12 0x80285ffb in ata_finish (request=0xff00054ec8c0) at 
../../../dev/ata/ata-queue.c:265
#13 0x805ed419 in softclock (arg=Variable arg is not available.
) at ../../../kern/kern_timeout.c:430


This is very repeatable.  I'm not sure what's the best fix - always use a 
taskqueue on timeouts?  Don't reinit if direct commands fail?


This is one of the messiest parts of the old ata(4). The problem is that
reinit is implemented to work synchronously. It means that if some command
times out and starts a reinit, that reinit runs from the taskqueue,
blocking it. As a result, we can't use the taskqueue for completion there and
can't do a reinit when one of the reinit commands times out. That is handled
using the ATA_STALL_QUEUE flag. I remember that I intentionally blocked new
device detection on reinit to avoid problems with the taskqueue there.


As for ATA_R_DIRECT, sorry, I don't remember why it is used there
or why it is needed at all. It was done before me. The only place where
I see it set, apart from ataraid, is ata_getparam(), which should be called
only on initial bus probe.
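
The general constraint in this thread, that a timer callback must not block while a worker context may, can be modelled outside the kernel. The following is only a user-space analogy in Python, not ata(4) code: the timeout handler does nothing but enqueue the request, and a separate worker thread performs the slow, blocking recovery. Every name in it is hypothetical.

```python
import queue
import threading
import time

work = queue.Queue()   # stands in for a taskqueue
log = []               # records what the worker did


def on_timeout(request_id: int) -> None:
    """Timer-context handler: must return quickly, so it only enqueues."""
    work.put(request_id)


def worker() -> None:
    """Worker context: allowed to block while it reinitializes the channel."""
    while True:
        request_id = work.get()
        if request_id is None:      # sentinel: shut the worker down
            work.task_done()
            break
        time.sleep(0.01)            # stands in for a blocking reinit
        log.append(f"reinit after timeout of request {request_id}")
        work.task_done()


if __name__ == "__main__":
    thread = threading.Thread(target=worker)
    thread.start()
    # Fire the timeout handler from a timer thread, as a callout would.
    timer = threading.Timer(0.05, on_timeout, args=(7,))
    timer.start()
    timer.join()
    work.put(None)
    thread.join()
    print(log)
```

The sketch also hints at the limitation described above: while the single worker is busy with one blocking reinit, nothing else queued behind it can complete.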


--
Alexander Motin


Re: Timekeeping in stable/9

2012-01-21 Thread Alexander Motin

Hi.

On 01/21/12 11:18, Martin Sugioarto wrote:

On Wed, 18 Jan 2012 07:50:49 +0100,
Martin Sugioarto <mar...@sugioarto.com> wrote:


I can confirm this on VirtualBox. I've been running WinXP inside
VirtualBox and measured network I/O during downloads. It showed me
very high download rates (around 800kB/s) while it's physically
possible to download 200kB/s through DSL here (Germany sucks with
DSL, even in largest cities, btw!).

I correlated this behavior with high disk I/O on the host. That means
that the timer issues on the virtual host appear when I start a
larger cp job on the host. I also immediately thought that this has
something to do with timers.


I just want to add some information on this. I tested a few things with
VirtualBox yesterday.

I switched off ntpd on the host and tested if there are differences,
but the clock is working correctly on the host. I tested it a few times,
it is stable, as I expect it to be.

It seems to be rather a software problem with VirtualBox. I can see that
when the host is under heavy load (CPU!) the guest does not get enough
runtime to adjust the clock correctly. After a few minutes there has
been a difference of 50 seconds between the host and guest clock. And
furthermore, I don't quite understand how the real time clock works in
VirtualBox but it seems to slide in the different directions causing
weird results with progress bars on MS-Windows XP.

I just want to explain why I thought that I/O influences this. I have
got my hard disk encrypted, so it puts some load on the CPU, too.

If you want to test VirtualBox behavior, you can simply dd
from /dev/random and look at the weird results in VirtualBox.


I am not using VirtualBox right now, so I'll need to set it up to test
this. Meanwhile you could try to experiment with switching to different
timecounters and eventtimers. Maybe some change in 9.0 changed the default
timecounter for you, causing the problem.


Timecounter wrap should be the main cause of time drift (if the timecounter
hardware is emulated correctly at all). Different timecounters have
different wrap periods, which can be calculated by dividing
kern.timecounter.tc.X.mask by kern.timecounter.tc.X.frequency. In my
case these are: 300s for HPET, 5s for ACPI-fast, 2s for TSC and 55ms for
i8254. If the system doesn't get timer interrupts within half of that time,
time will drift. Start by looking at what you are using and how good it
is in your case.
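
The wrap-period arithmetic described above can be sketched in a few lines
of Python. This is only an illustration: the function names are made up
here, and the mask and frequency values are the HPET and i8254 figures from
a `sysctl kern.timecounter` listing posted in another thread on this list,
not necessarily those of the machine discussed in this message.

```python
def wrap_period(mask, frequency):
    """Seconds until a timecounter wraps: counter mask / frequency (Hz)."""
    return mask / frequency


def max_interrupt_gap(mask, frequency):
    """Longest tolerable gap between timer interrupts, taken as half the
    wrap period (the rule of thumb stated in the message above)."""
    return wrap_period(mask, frequency) / 2


# Illustrative values: (kern.timecounter.tc.X.mask, kern.timecounter.tc.X.frequency)
counters = {
    "HPET": (4294967295, 14318180),
    "i8254": (65535, 1193182),
}

for name, (mask, freq) in counters.items():
    print(f"{name}: wraps every {wrap_period(mask, freq):.3f} s, "
          f"needs an interrupt at least every "
          f"{max_interrupt_gap(mask, freq):.3f} s")
```

With these inputs HPET comes out at roughly 300 s and i8254 at roughly
55 ms, in line with the figures quoted above.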


--
Alexander Motin


Re: Timekeeping in stable/9

2012-01-21 Thread Alexander Motin

On 01/21/12 15:20, Martin Sugioarto wrote:

On Sat, 21 Jan 2012 14:30:53 +0200,
Alexander Motin <m...@freebsd.org> wrote:

I am not using VirtualBox right now, so I'll need to set it up to test
this. Meanwhile you could try to experiment with switching to
different timecounters and eventtimers. Maybe some change in 9.0
changed the default timecounter for you, causing the problem.


I think we have a misunderstanding here. The host (FreeBSD 9.0R) works
fine. The time is being updated under heavy load without problems.

I already said that this seems to be an application problem and this
email(s) should be rather seen by the VBox maintainer. The problem is
that VBox seems to stop working properly when you put heavy CPU load on
the host. It even does not keep the clock up-to-date.

I can desync the guest clock to -1 minute in a few seconds, just by
running openssl speed -multi 20.


Ah. I'm sorry. I was sure we were debugging FreeBSD inside VirtualBox. If
we are speaking about FreeBSD outside, then neither the timecounter nor the
eventtimer choice should affect the guest if the host is working fine. It is
more a question for VirtualBox and maybe the host system scheduler.


--
Alexander Motin


Re: FreeBSD hangs on boot after kernel upgrade to 9.0-R

2012-01-21 Thread Alexander Motin

Hi.

On 01/21/12 21:34, mato wrote:

I've used freebsd-update to upgrade from 8.2-R to 9.0-R and all looked nice
until the first reboot.  Now my FreeBSD always hangs midway through the boot
process and the last message output is:
uhub3:Intel EHCI root HUB...
I've tried safe boot option but that does not help at all.
When I disable USB support in BIOS the last message before hang is:
ata1: reset tp1 mask=03 ostat0=00 ostat1=00
(aprobe0:ata0:0:1:0): SIGNATURE: eb14

Any idea what might be wrong and how to fix it please ?


The last line is the ATAPI device detection. What ATA controller do you
have there? On one Core2Duo-class Supermicro system a similar hang was
caused by an ITE PATA controller. In that case it was worked around by
adding hint.ata.0.mode=PIO4 to /boot/loader.conf. You may try just setting
it from the loader prompt with the `set` command.


--
Alexander Motin


Re: Marvel 88SE9480

2012-01-20 Thread Alexander Motin

Hi.

On 01/21/12 00:19, Mike Tancsa wrote:

I tried this new controller
http://www.addonics.com/products/ad2ms6gpx8.php
which is based on the 88SE9480 chipset.  Does anyone have it working ?

I tried adding the PCI ID, but it does not attach unfortunately.

{0x94801b4b, 0x00, Addonics,  AHCI_Q_NOBSYRES},

ahci0:Addonics AHCI SATA controller  mem
0x4814-0x4815,0x4810-0x4813 irq 16 at device 0.0 on pci1
device_attach: ahci0 attach returned 6

pciconf shows

ahci0@pci0:1:0:0:   class=0x010400 card=0x94801b4b chip=0x94801b4b
rev=0x03 hdr=0x00
 vendor = 'Marvell Technology Group Ltd.'
 class  = mass storage
 subclass   = RAID
 bar   [10] = type Memory, range 64, base 0x4814, size 131072,
enabled
 bar   [18] = type Memory, range 64, base 0x4810, size 262144,
enabled
 cap 01[40] = powerspec 3  supports D0 D1 D3  current D0
 cap 05[50] = MSI supports 1 message, 64 bit
 cap 10[70] = PCI-Express 2 endpoint max data 128(4096) link x8(x8)
ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 corrected
ecap 0002[140] = VC 1 max VC0


I haven't seen SAS controllers compatible with AHCI yet.

--
Alexander Motin


Re: Strange 'hangs' with RELENG_9

2012-01-19 Thread Alexander Motin

On 19.01.2012 18:51, Oliver Pinter wrote:

CC: Alexander Motin

On 1/19/12, László KÁROLYI <las...@karolyi.hu> wrote:

László KÁROLYI wrote:

Ok, couldn't get it through... So here is it, uploaded:

http://www.freeimagehosting.net/s836i

Another screenshot here:

http://www.freeimagehosting.net/xv26d


I am not sure how freezes that can be cured with a key press could be
related to panics around storage. I would try two different approaches:
 - for panics, if dumping is not possible, I would try to resolve the
address of the instruction pointer from both messages with `addr2line
-e /path/to/kernel address`.
 - for freezes, I would look at the eventtimers(4) subsystem: check
what timer is used, try to switch to a different one, try to switch to
periodic mode.


Since the cause of the siis timeouts in SATA2 mode is also unclear, I also
can't exclude that it may be somehow related.


--
Alexander Motin


Re: Strange 'hangs' with RELENG_9

2012-01-19 Thread Alexander Motin

On 01/19/12 21:05, Ian Lepore wrote:

On Thu, 2012-01-19 at 19:14 +0200, Alexander Motin wrote:

On 19.01.2012 18:51, Oliver Pinter wrote:

CC: Alexander Motin

On 1/19/12, László KÁROLYI <las...@karolyi.hu> wrote:

László KÁROLYI wrote:

Ok, couldn't get it through... So here is it, uploaded:

http://www.freeimagehosting.net/s836i

Another screenshot here:

http://www.freeimagehosting.net/xv26d


I am not sure how freezes that can be cured with a key press could be
related to panics around storage. I would try two different approaches:
   - for panics, if dumping is not possible, I would try to resolve the
address of the instruction pointer from both messages with `addr2line
-e /path/to/kernel address`.
   - for freezes, I would look at the eventtimers(4) subsystem: check
what timer is used, try to switch to a different one, try to switch to
periodic mode.

Since the cause of the siis timeouts in SATA2 mode is also unclear, I also
can't exclude that it may be somehow related.


The new eventtimers was also the first thing that came to my mind, but I
couldn't quickly find the right way to boot with a different timer.

I saw in the eventtimers(7) manpage that there's a sysctl to change the
timer, but when I used it the system timing went completely wonky (ntpd
reported it was off by many seconds, a few seconds after I changed it).
When I just tried it again the system locked up and had to be power
cycled.  (I'm trying this on old hardware where my only choices are
i8254 and RTC, and changing to RTC apparently doesn't work well.)  So I
didn't want to recommend it to someone else. :)


That's strange. On all systems I have, I can safely set any event timer
in any way. Though for better precision it is better to set them using
a loader tunable.



For both eventtimers and timecounters, I think it'd be nice if a tunable
or hint could let the user override the quality number.  But maybe
there's already some better way of influencing the choices the kernel
makes?


kern.eventtimer.timer is both a sysctl and a loader tunable. You can set it
anywhere you want. Also, for most event timers there are documented
tunables to disable them.


Also, as I've already said, you may try to switch to the old periodic mode
by setting kern.eventtimer.periodic. On your old system with just i8254
and RTC it is always enabled automatically.


--
Alexander Motin


Re: Strange 'hangs' with RELENG_9

2012-01-19 Thread Alexander Motin

On 01/19/12 22:03, Andriy Gapon wrote:

on 19/01/2012 21:24 László Károlyi said the following:

On 2012.01.19., at 18:18, Andriy Gapon wrote:


Please provide output of the following sysctls:
sysctl kern.eventtimer
sysctl kern.timecounter



[root@sys ~]# sysctl kern.eventtimer
kern.eventtimer.choice: HPET(450) HPET1(450) HPET2(450) LAPIC(400) i8254(100) RTC(0)
kern.eventtimer.et.LAPIC.flags: 15
kern.eventtimer.et.LAPIC.frequency: 0
kern.eventtimer.et.LAPIC.quality: 400
kern.eventtimer.et.i8254.flags: 1
kern.eventtimer.et.i8254.frequency: 1193182
kern.eventtimer.et.i8254.quality: 100
kern.eventtimer.et.HPET.flags: 3
kern.eventtimer.et.HPET.frequency: 14318180
kern.eventtimer.et.HPET.quality: 450
kern.eventtimer.et.HPET1.flags: 3
kern.eventtimer.et.HPET1.frequency: 14318180
kern.eventtimer.et.HPET1.quality: 450
kern.eventtimer.et.HPET2.flags: 3
kern.eventtimer.et.HPET2.frequency: 14318180
kern.eventtimer.et.HPET2.quality: 450
kern.eventtimer.et.RTC.flags: 17
kern.eventtimer.et.RTC.frequency: 32768
kern.eventtimer.et.RTC.quality: 0
kern.eventtimer.periodic: 0
kern.eventtimer.timer: HPET
kern.eventtimer.idletick: 0
kern.eventtimer.singlemul: 2
[root@sys ~]# sysctl kern.timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC-low(800) HPET(950) i8254(0) ACPI-fast(900) dummy(-100)
kern.timecounter.hardware: HPET
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.ACPI-fast.mask: 4294967295
kern.timecounter.tc.ACPI-fast.counter: 3649705857
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.quality: 900
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 27536
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.HPET.counter: 1224089625
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.quality: 950
kern.timecounter.tc.TSC-low.mask: 4294967295
kern.timecounter.tc.TSC-low.counter: 1655163352
kern.timecounter.tc.TSC-low.frequency: 11772185
kern.timecounter.tc.TSC-low.quality: 800
kern.timecounter.smp_tsc: 1
kern.timecounter.invariant_tsc: 1


I wonder whether there could be an interference between HPET being used as
timecounter and HPET being used as an event timer.
Alexander, what do you think?


I don't expect interference between them. The HPET timecounter just reads
the same hardware counter that is also read by the comparators to generate
eventtimer interrupts. Theoretically they could interfere if that counter
was stopped during comparator programming, but it is not.



László, can you please try changing kern.timecounter.hardware to TSC-low or
ACPI-fast?
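
The `choice` lines in the sysctl output above list each available timer
with its quality in parentheses. As a rough aid for picking an alternative
to try, the following Python sketch parses such a string and ranks the
entries by quality. The function name is made up for illustration, and the
ranking only mirrors the advertised quality numbers; it does not claim to
reproduce how the kernel makes its selection.

```python
import re


def parse_choice(choice):
    """Parse a sysctl 'choice' string such as 'HPET(950) i8254(0)' into
    (name, quality) pairs sorted by descending quality."""
    pairs = re.findall(r"([\w-]+)\((-?\d+)\)", choice)
    return sorted(((name, int(quality)) for name, quality in pairs),
                  key=lambda item: item[1], reverse=True)


# Value of kern.timecounter.choice from the output above.
choice = "TSC-low(800) HPET(950) i8254(0) ACPI-fast(900) dummy(-100)"
ranked = parse_choice(choice)
for name, quality in ranked:
    print(f"{name:10s} {quality}")
```

For the string above this lists HPET first, followed by ACPI-fast and
TSC-low, which are the two alternatives suggested in the message.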


--
Alexander Motin


Re: FreeBSD 9.0 and Intel MatrixRAID RAID5

2012-01-17 Thread Alexander Motin

On 17.01.2012 12:53, Alexander Pyhalov wrote:

On my desktop I use Intel MatrixRAID RAID5 soft raid controller. RAID5
is configured over 3 disks. FreeBSD 8.2 sees this as:

ar0: 953874MB Intel MatrixRAID RAID5 (stripe 64 KB) status: READY
ar0: disk0 READY using ad4 at ata2-master
ar0: disk1 READY using ad6 at ata3-master
ar0: disk2 READY using ad12 at ata6-master

Root filesystem is on /dev/ar0s1.
Today I've tried to upgrade to 9.0.
It doesn't see this disk array. Here is dmesg. When I load geom_raid, it
finds something, but doesn't want to work with RAID:

GEOM_RAID: Intel-e922b201: Array Intel-e922b201 created.
GEOM_RAID: Intel-e922b201: No transformation module found for Volume0.
GEOM_RAID: Intel-e922b201: Volume Volume0 state changed from STARTING to
UNSUPPORTED.
GEOM_RAID: Intel-e922b201: Disk ada2 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-e922b201: Subdisk Volume0:2-ada2 state changed from
NONE to ACTIVE.
GEOM_RAID: Intel-e922b201: Disk ada1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-e922b201: Subdisk Volume0:1-ada1 state changed from
NONE to ACTIVE.
GEOM_RAID: Intel-e922b201: Disk ada0 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-e922b201: Subdisk Volume0:0-ada0 state changed from
NONE to ACTIVE.
GEOM_RAID: Intel-e922b201: Array started.

No new devices appear in /dev.
How could I solve this issue?


ataraid(4) had mostly read-only support for RAID5, because it doesn't
update parity data. I didn't think anybody was really using it in such
a condition. That's why geom_raid doesn't support RAID5 at all now.


--
Alexander Motin


Re: FreeBSD 9.0 and Intel MatrixRAID RAID5

2012-01-17 Thread Alexander Motin

On 17.01.2012 19:03, Vinny Abello wrote:

I had something similar on a software based RAID controller on my Intel S5000PSL
motherboard when I just went from 8.2-RELEASE to 9.0-RELEASE. After adding
geom_raid_load=YES to my /boot/loader.conf, it still didn't create the device
on bootup. I had to manually create the label with graid. After that it created
/dev/raid/ar0 for me and I could mount the volume. The only thing which I'm trying to
understand is the last message below about the integrity check failed. I've found other
posts on this but when I dig into my setup, I don't see the same problems that are
illustrated in the post and am at a loss for why that is being stated. Also, on other
posts I think it was (raid/r0, MBR) that people were getting and trying to fix. Mine is
(raid/r0, BSD) which I cannot find reference to. I have a feeling it has to do with the
geometry of the disk or something. Everything else seems fine... I admittedly only use
this volume for scratch space and didn't have anything important stored on it so I
wasn't worried about experimenting or losing data.

ada0 at ahcich0 bus 0 scbus2 target 0 lun 0
ada0:WDC WD4000YR-01PLB0 01.06A01  ATA-7 SATA 1.x device
ada0: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 381554MB (781422768 512 byte sectors: 16H 63S/T 16383C)
ada0: Previously was known as ad4
ada1 at ahcich1 bus 0 scbus3 target 0 lun 0
ada1:WDC WD4000YR-01PLB0 01.06A01  ATA-7 SATA 1.x device
ada1: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 381554MB (781422768 512 byte sectors: 16H 63S/T 16383C)
ada1: Previously was known as ad6

GEOM_RAID: Intel-8c840681: Array Intel-8c840681 created.
GEOM_RAID: Intel-8c840681: Disk ada0s1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-8c840681: Subdisk ar0:0-ada0s1 state changed from NONE to 
ACTIVE.
GEOM_RAID: Intel-8c840681: Disk ada1s1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-8c840681: Subdisk ar0:1-ada1s1 state changed from NONE to 
ACTIVE.
GEOM_RAID: Intel-8c840681: Array started.
GEOM_RAID: Intel-8c840681: Volume ar0 state changed from STARTING to OPTIMAL.
GEOM_RAID: Intel-8c840681: Provider raid/r0 for volume ar0 created.
GEOM_PART: integrity check failed (raid/r0, BSD)

Any ideas on the integrity check anyone?


It is not related to geom_raid, but to geom_part. There is something
wrong with your label. You may set the kern.geom.part.check_integrity sysctl
to zero to disable these checks. AFAIR it was mentioned in the 9.0 release
notes.



On 1/17/2012 6:57 AM, Matthias Gamsjager wrote:

Not sure if geom_raid is implemented with cam. I remember a post a while
back about this issue to happen with defaulting cam in 9. Did not follow it
so not sure if something has been done about it.

On Tue, Jan 17, 2012 at 11:53 AM, Alexander Pyhalov <a...@rsu.ru> wrote:


Hello.
On my desktop I use Intel MatrixRAID RAID5 soft raid controller. RAID5 is
configured over 3 disks. FreeBSD 8.2 sees this as:

ar0: 953874MB Intel MatrixRAID RAID5 (stripe 64 KB) status: READY
ar0: disk0 READY using ad4 at ata2-master
ar0: disk1 READY using ad6 at ata3-master
ar0: disk2 READY using ad12 at ata6-master

Root filesystem is on /dev/ar0s1.
Today I've tried to upgrade to 9.0.
It doesn't see this disk array. Here is dmesg. When I load geom_raid, it
finds something, but doesn't want to work with RAID:

GEOM_RAID: Intel-e922b201: Array Intel-e922b201 created.
GEOM_RAID: Intel-e922b201: No transformation module found for Volume0.
GEOM_RAID: Intel-e922b201: Volume Volume0 state changed from STARTING to
UNSUPPORTED.
GEOM_RAID: Intel-e922b201: Disk ada2 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-e922b201: Subdisk Volume0:2-ada2 state changed from NONE
to ACTIVE.
GEOM_RAID: Intel-e922b201: Disk ada1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-e922b201: Subdisk Volume0:1-ada1 state changed from NONE
to ACTIVE.
GEOM_RAID: Intel-e922b201: Disk ada0 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-e922b201: Subdisk Volume0:0-ada0 state changed from NONE
to ACTIVE.
GEOM_RAID: Intel-e922b201: Array started.

No new devices appear in /dev.
How could I solve this issue?


--
Alexander Motin


Re: FreeBSD 9.0 and Intel MatrixRAID RAID5

2012-01-17 Thread Alexander Motin

On 17.01.2012 23:35, Vinny Abello wrote:

On 1/17/2012 4:04 PM, Alexander Motin wrote:

On 17.01.2012 19:03, Vinny Abello wrote:

I had something similar on a software based RAID controller on my Intel S5000PSL
motherboard when I just went from 8.2-RELEASE to 9.0-RELEASE. After adding
geom_raid_load=YES to my /boot/loader.conf, it still didn't create the device
on bootup. I had to manually create the label with graid. After that it created
/dev/raid/ar0 for me and I could mount the volume. The only thing which I'm trying to
understand is the last message below about the integrity check failed. I've found other
posts on this but when I dig into my setup, I don't see the same problems that are
illustrated in the post and am at a loss for why that is being stated. Also, on other
posts I think it was (raid/r0, MBR) that people were getting and trying to fix. Mine is
(raid/r0, BSD) which I cannot find reference to. I have a feeling it has to do with the
geometry of the disk or something. Everything else seems fine... I admittedly only use
this volume for scratch space and didn't have anything important stored on it so I
wasn't worried about experimenting or losing data.

ada0 at ahcich0 bus 0 scbus2 target 0 lun 0
ada0:WDC WD4000YR-01PLB0 01.06A01   ATA-7 SATA 1.x device
ada0: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 381554MB (781422768 512 byte sectors: 16H 63S/T 16383C)
ada0: Previously was known as ad4
ada1 at ahcich1 bus 0 scbus3 target 0 lun 0
ada1:WDC WD4000YR-01PLB0 01.06A01   ATA-7 SATA 1.x device
ada1: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 381554MB (781422768 512 byte sectors: 16H 63S/T 16383C)
ada1: Previously was known as ad6

GEOM_RAID: Intel-8c840681: Array Intel-8c840681 created.
GEOM_RAID: Intel-8c840681: Disk ada0s1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-8c840681: Subdisk ar0:0-ada0s1 state changed from NONE to 
ACTIVE.
GEOM_RAID: Intel-8c840681: Disk ada1s1 state changed from NONE to ACTIVE.
GEOM_RAID: Intel-8c840681: Subdisk ar0:1-ada1s1 state changed from NONE to 
ACTIVE.
GEOM_RAID: Intel-8c840681: Array started.
GEOM_RAID: Intel-8c840681: Volume ar0 state changed from STARTING to OPTIMAL.
GEOM_RAID: Intel-8c840681: Provider raid/r0 for volume ar0 created.
GEOM_PART: integrity check failed (raid/r0, BSD)

Any ideas on the integrity check anyone?


It is not related to geom_raid, but to geom_part. There is something wrong with
your label. You may set the kern.geom.part.check_integrity sysctl to zero to
disable these checks. AFAIR it was mentioned in the 9.0 release notes.


Thanks for responding, Alexander. I also found that information about that 
sysctl variable, however I was trying to determine if something is actually 
wrong, how to determine what it is and ultimately how to fix it so it passes 
the check. I'd rather not ignore errors/warnings unless it's a bug. Again, I 
have no data of value on this partition, so I can do anything to fix it. Just 
not sure what to do or look at specifically.


The first thing I would check is that the partition is not bigger than the
RAID volume size. If the label was created before the RAID volume, that could
be the reason, because RAID cuts several sectors off the end of the disk to
store metadata.
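
The size check suggested above amounts to comparing the end of each
partition with the size of the RAID volume. A minimal Python sketch of
that comparison follows. Everything in it is illustrative: the function
name is invented, the disk size is the raw sector count from the dmesg
quoted above, and the number of sectors reserved for metadata is an
assumption chosen only to show the effect, not a real graid figure.

```python
def partitions_exceeding(volume_sectors, partitions):
    """Return the names of partitions whose end lies beyond the volume.

    partitions: iterable of (name, start_sector, size_sectors).
    """
    return [name for name, start, size in partitions
            if start + size > volume_sectors]


disk_sectors = 781422768            # raw disk size from the dmesg above
assumed_metadata_sectors = 2048     # assumption for illustration only
volume_sectors = disk_sectors - assumed_metadata_sectors

# A label written against the raw disk spans all of its sectors, so it
# no longer fits once the end of the disk is reserved for RAID metadata.
label_from_raw_disk = [("a", 0, disk_sectors)]
label_from_volume = [("a", 0, volume_sectors)]

print(partitions_exceeding(volume_sectors, label_from_raw_disk))
print(partitions_exceeding(volume_sectors, label_from_volume))
```

The first call reports partition `a` as too large and the second reports
nothing, which is the situation the integrity check would complain about
and the one it would accept, respectively.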


--
Alexander Motin


Re: ataraid and 9.0 RC-2

2011-11-29 Thread Alexander Motin

Hi.

On 27.11.2011 01:41, Adam Stylinski wrote:

I just ran freebsd-update to get up to 9.0-RC2 and discovered that ataraid does 
not work.  I realize I'm an edge case and my scenario is not ideal (I use an 
ITE controller and performance is actually impressively slow), but I cannot 
boot 9.0 from my stripe, even after manually loading ataraid from the loader 
prompt (after running an unload command).  I mention it mostly because other 
people using the fakeraid setup by their motherboards for whatever reason 
(perhaps to share a partition table with windows on the same mirror or stripe) 
may have a similar problem.  It seems like the ar0 device disappeared for me 
completely (even though it finds ada0 and ada1).  I'm using the following 
device:

atapci0@pci0:2:11:0:class=0x010400 card=0x chip=0x82121283 rev=0x13 
hdr=0x00
 vendor = 'Integrated Technology Express (ITE) Inc'
 device = 'ATA 133 IDE RAID Controller (IT8212F)'
 class  = mass storage
 subclass   = RAID
rl0@pci0:2:13:0:class=0x02 card=0x80ea104d chip=0x813910ec rev=0x10 
hdr=0x00

At first I figured because it may be loading AHCI (as per the device naming 
schemes ada0 and ada1).  I haven't looked too much into it (these devices are 
actually PATA not SATA, so AHCI doesn't even exist for these), but maybe 
there's an ATA/AHCI driver that's built into the default kernel that is
interfering with ataraid.ko?  Maybe this interferes with my stupidly slow and 
unpopular configuration.

Thanks for any help, I'll also have a gander at the new DEFAULTS for the 
generic kernel in the 9.0 source tree.


FreeBSD 9.x uses the new CAM-based ATA subsystem. The ataraid driver depends
on the old ATA infrastructure and does not work with the new one. Instead, a
new GEOM RAID class was implemented. Unfortunately, since ITE produced only
PATA controllers, there is no support for their metadata format in the
geom_raid module now. So, at the moment, the only option to access that RAID
volume is to build a custom kernel with the old ATA and use ataraid. The
respective kernel options are listed in the /usr/src/UPDATING item from 20110424.


--
Alexander Motin


Re: ataraid and 9.0 RC-2

2011-11-29 Thread Alexander Motin

On 30.11.2011 03:03, Adam Stylinski wrote:

On Tue, Nov 29, 2011 at 08:38:47PM +0200, Alexander Motin wrote:

On 27.11.2011 01:41, Adam Stylinski wrote:

I just ran freebsd-update to get up to 9.0-RC2 and discovered that ataraid does 
not work.  I realize I'm an edge case and my scenario is not ideal (I use an 
ITE controller and performance is actually impressively slow), but I cannot 
boot 9.0 from my stripe, even after manually loading ataraid from the loader 
prompt (after running an unload command).  I mention it mostly because other 
people using the fakeraid setup by their motherboards for whatever reason 
(perhaps to share a partition table with windows on the same mirror or stripe) 
may have a similar problem.  It seems like the ar0 device disappeared for me 
completely (even though it finds ada0 and ada1).  I'm using the following 
device:

atapci0@pci0:2:11:0:class=0x010400 card=0x chip=0x82121283 rev=0x13 
hdr=0x00
  vendor = 'Integrated Technology Express (ITE) Inc'
  device = 'ATA 133 IDE RAID Controller (IT8212F)'
  class  = mass storage
  subclass   = RAID
rl0@pci0:2:13:0:class=0x02 card=0x80ea104d chip=0x813910ec rev=0x10 
hdr=0x00

At first I figured because it may be loading AHCI (as per the device naming 
schemes ada0 and ada1).  I haven't looked too much into it (these devices are 
actually PATA not SATA, so AHCI doesn't even exist for these), but maybe 
there's an ATA/AHCI driver that's built into the default kernel that is
interfering with ataraid.ko?  Maybe this interferes with my stupidly slow and 
unpopular configuration.

Thanks for any help, I'll also have a gander at the new DEFAULTS for the 
generic kernel in the 9.0 source tree.


FreeBSD 9.x uses the new CAM-based ATA subsystem. The ataraid driver depends
on the old ATA infrastructure and does not work with the new one. Instead, a
new GEOM RAID class was implemented. Unfortunately, since ITE produced only
PATA controllers, there is no support for their metadata format in the
geom_raid module now. So, at the moment, the only option to access that RAID
volume is to build a custom kernel with the old ATA and use ataraid. The
respective kernel options are listed in the /usr/src/UPDATING item from 20110424.


Hmm, I may just as well dump the UFS and restore it to a totally geom based 
solution.  If anything it will likely help rather than hurt my performance.


Sure. You can't boot from GEOM STRIPE (you may want MIRROR or CONCAT),
but if your motherboard has at least one SATA port, a single modern hard
drive may give you even higher speeds than a stripe of old PATA drives on
a PCI controller.


--
Alexander Motin


Re: ATA/Cdrom(?) panic

2011-11-16 Thread Alexander Motin
Hi.

On 11/16/11 08:43, Bjoern A. Zeeb wrote:
 we have seen this or a very similar panic for about 1 year now once in
 a while and I think I reported it before; this is FreeBSD as guest on
 vmware.   Seems it was a double panic this time.   Could someone please
 see what's going on there?  It was on 8.x-STABLE in the past and this
 is 8.2-RELEASE-p4.

The part of the code that reports request completion directly is IMHO broken
by design. It returns request completion before the request is actually
completed by the lower levels, without any knowledge of what's going on
there. There is a kind of protection against double request completion,
but it looks like it does not always work. Maybe that is because this part
of the code is not locked and nothing prevents the semaphore timeout and the
normal request timeout/completion from happening simultaneously. It is
surprising to see two traps at the same time; I am not sure what
synchronized them so precisely.

Simply removing that semaphore timeout is not an option, because it would
cause a deadlock when this wait happens within the taskqueue thread that is
used to handle request completion and abort that wait. Avoiding waiting
inside the taskqueue is also impossible without a major rewrite. That's why
ATA_CAM drops that code completely.
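
To make the race described above concrete, here is a generic Python sketch
of a completion guard in which the "already completed?" check and the
update happen under one lock, so that a timeout path and a normal
completion path cannot both report the same request. This is not the
ata(4) code and is not how ATA_CAM handles it; the class and caller names
are hypothetical and only illustrate the general technique.

```python
import threading


class Request:
    """Toy request whose completion must be reported exactly once."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = False
        self.completed_by = []

    def complete(self, who):
        # Test and set the flag under the same lock, so two callers can
        # never both observe "not done yet".
        with self._lock:
            if self._done:
                return False
            self._done = True
            self.completed_by.append(who)
            return True


request = Request()
threads = [threading.Thread(target=request.complete, args=(who,))
           for who in ("semaphore timeout", "normal completion")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# Exactly one caller wins; which one depends on thread scheduling.
print(request.completed_by)
```

Without the lock around the check and the update, both paths could pass
the check before either sets the flag, which is the kind of double
completion the message suspects.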

-- 
Alexander Motin


Re: ATA/Cdrom(?) panic

2011-11-16 Thread Alexander Motin
On 11/16/11 16:14, Bjoern A. Zeeb wrote:
 On Wed, 16 Nov 2011, Alexander Motin wrote:
 
 Hi.

 On 11/16/11 08:43, Bjoern A. Zeeb wrote:
 we have seen this or a very similar panic for about 1 year now once in
 a while and I think I reported it before; this is FreeBSD as guest on
 vmware.   Seems it was a double panic this time.   Could someone please
 see what's going on there?  It was on 8.x-STABLE in the past and this
 is 8.2-RELEASE-p4.

 The part of the code that reports request completion directly is IMHO broken
 by design. It returns request completion before the request is actually
 completed by the lower levels, without any knowledge of what's going on
 there. There is a kind of protection against double request completion,
 but it looks like it does not always work. Maybe that is because this part
 of the code is not locked and nothing prevents the semaphore timeout and the
 normal request timeout/completion from happening simultaneously. It is
 surprising to see two traps at the same time; I am not sure what
 synchronized them so precisely.

 Simply removing that semaphore timeout is not an option, because it would
 cause a deadlock when this wait happens within the taskqueue thread that is
 used to handle request completion and abort that wait. Avoiding waiting
 inside the taskqueue is also impossible without a major rewrite. That's why
 ATA_CAM drops that code completely.
 
 So the bottom line of what you are saying is:
 1) it's hard to fix right in 8
 2) it's not an issue in 9 anymore at all?

Right.

-- 
Alexander Motin


Re: Trouble with SSD on SATA

2011-11-16 Thread Alexander Motin

Hi.

On 16.11.2011 18:12, Willem Jan Withagen wrote:

I'm getting these:

Nov 16 16:40:49 zfs kernel: ata6: port is not ready (timeout 15000ms)
tfd = 0080
Nov 16 16:40:49 zfs kernel: ata6: hardware reset timeout
Nov 16 16:41:50 zfs kernel: ata6: port is not ready (timeout 15000ms)
tfd = 0080
Nov 16 16:41:50 zfs kernel: ata6: hardware reset timeout

When inserting the tray with a SSD disk connected to that controller.

This is probably due to a BIOS upgrade; at least it started after
upgrading the BIOS. So I'm asking SuperMicro for an older version.

When this happens, the system sometimes panics. I haven't written the
details down yet; it is somewhere in get_devices...

After the panic I really need to powerdown the machine, otherwise it
boots but stalls at finding any disks. It does not just find no disks,
it freezes at the point it should report the found disks in the
bios-boot.
So apparently the ATA controller is left in a very confused state.

Why is the controller found at boot, working as it should,
and why does it later start generating these hardware resets?


Looking at the messages, I would say that you are using an AHCI controller
with the old ata(4) driver. I would recommend that you try the new ahci(4)
driver. It has better hot-plug support and also supports NCQ and some other
features. Note that disks connected to it will be reported as adaX
instead of adY.


--
Alexander Motin


Re: SATA 6g 4-port non-RAID controller ?

2011-07-25 Thread Alexander Motin

On 25.07.2011 12:34, Kurt Jaeger wrote:

What kind of SATA 6g 4-port non-RAID controller is currently suggested
for use in 8/9 setups with large RAM (64G) and ZFS ?


If you need exactly a SATA, not a SAS, 6g controller, then the choice is
not so big: either something integrated into the latest Intel (only two
ports) or AMD (6 ports) chipsets, or something based on Marvell 88SE91xx
chips. The latter also has only 2 ports, but you may install two cards, or
use a Highpoint RocketRAID 640, which is just two of the above chips
connected by a PCIe bridge.


--
Alexander Motin


Re: MFC: graid(8) (RAID GEOM) support

2011-06-22 Thread Alexander Motin
Jeremy Chadwick wrote:
 On Fri, Jun 17, 2011 at 05:51:24PM -0700, Jeremy Chadwick wrote:
 Sorry for the cross-post, but I thought both lists would want to know
 about this.

 Looks like mav@ just committed this ~17 hours ago:
 http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/geom/raid/g_raid.c

 Those who have historically wanted to use Intel MatrixRAID (now called
 Intel RST (Rapid Storage Technology)), but haven't due to the severe
 issues/risks with ataraid(4), will probably be very interested in
 this commit.  I know I am!

 I plan on stress-testing the Intel support on a 2-disk system with
 RAID-1 enabled, and will document my experiences, procedures, etc...

 Thanks, mav@ and imp@ !

 I'll be sending another mail momentarily asking about USB memory stick
 image building, since to accomplish the above, I want to do a
 bare-bones install on our test system (e.g. enable Intel RAID, set up
 2 disks in a RAID-1 mirror, boot a USB memory stick that contains this
 latest RELENG_8 build, and do sysinstall, etc.. the normal way).


 =
 MFC r219974, r220209, r220210, r220790:
 Add new RAID GEOM class, that is going to replace ataraid(4) in supporting
 various BIOS-based software RAIDs. Unlike ataraid(4) this implementation
 does not depend on legacy ata(4) subsystem and can be used with any disk
 drivers, including new CAM-based ones (ahci(4), siis(4), mvs(4), ata(4)
 with `options ATA_CAM`). To make code more readable and extensible, this
 implementation follows modular design, including core part and two sets
 of modules, implementing support for different metadata formats and RAID
 levels.

 Support for such popular metadata formats is now implemented:
 Intel, JMicron, NVIDIA, Promise (also used by AMD/ATI) and SiliconImage.

 Such RAID levels are now supported:
 RAID0, RAID1, RAID1E, RAID10, SINGLE, CONCAT.

 For all of these RAID levels and metadata formats this class supports
 full cycle of volume operations: reading, writing, creation, deletion,
 disk removal and insertion, rebuilding, dirty shutdown detection
 and resynchronization, bad sector recovery, faulty disks tracking,
 hot-spare disks. For Intel and Promise formats there is support for
 multiple volumes per disk set.

 See the graid(8) manual page for additional details.

 Co-authored by: imp
 Sponsored by:   Cisco Systems, Inc. and iXsystems, Inc.
 =
 
 By the way, it doesn't look like the graid(8) man page is being brought
 in to the base system on either of the two RELENG_8 systems I've rebuilt
 in the past few days.
 
 I'm thinking /usr/src/sbin/geom/class/raid/graid.8 isn't being noticed
 as a man page.
 
 /usr/src/sbin/geom/class/raid/Makefile doesn't have MAN8=graid.8 in it,
 is that the problem?

I've just rebuilt my test 8-STABLE system and it installed graid(8).

-- 
Alexander Motin


Re: PCIe SATA HBA for ZFS on -STABLE

2011-06-07 Thread Alexander Motin

On 07.06.2011 05:33, Matthew Dillon wrote:

 The absolute cheapest solution is to buy a Sil-3132 PCIe card
 (providing 2 E-SATA ports), and then connect an external port multiplier
 to each port.  External port multiplier enclosures typically support
 5 drives each so that would give you your 10 drives.

 Even though the 3132 is a piss-ant little card, it does support FIS-Based
 switching, so performance will be very good... it will just be limited
 to SATA-II speeds is all.


SiI3132 is indeed good for its price, and it is quite good for random
I/O. But at burst speeds it is limited to less than SATA-II, and even less
than the PCIe 1.0 x1 link it uses. IIRC I've seen about 150MB/s from one
port and about 170MB/s from two.


If burst rate is important, the SiI3124 chip is much better -- up to about
900MB/s measured from 4 ports. The only issue is its PCI-X interface: you
need either a motherboard with PCI-X or a card with a PCIe x8 bridge (like
these http://www.addonics.com/products/host_controller/adsa3gpx8-4e.asp),
but the latter is too expensive.


There are also much cheaper (~$50) PCIe x1 bridge SiI3124 cards
(http://www.sybausa.com/productInfo.php?iid=537). They are not so fast
-- about 200MB/s, but still more than SiI3132. And they still have 4
SATA ports.
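
Editor's illustration: the measured figures above can be compared against
the theoretical link ceilings. The sketch below only assumes that SATA and
PCIe 1.0 both use 8b/10b line coding (10 bits on the wire per data byte);
protocol and controller overhead are ignored, so real throughput is lower.
The helper names are made up for this example.

```python
def payload_mb_per_s(line_rate_gbit, encoded_bits=10, data_bits=8):
    """Usable MB/s of a serial link after 8b/10b line coding."""
    bits_per_second = line_rate_gbit * 1e9 * data_bits / encoded_bits
    return bits_per_second / 8 / 1e6


# Raw line rates in Gbit/s (GT/s for PCIe), per port or per lane.
LINKS = {
    "SATA-II": 3.0,
    "SATA-III": 6.0,
    "PCIe 1.0 x1": 2.5,
}


def ceilings():
    """Map each link name to its payload ceiling in MB/s."""
    return {name: payload_mb_per_s(rate) for name, rate in LINKS.items()}


if __name__ == "__main__":
    for name, mb in ceilings().items():
        print(f"{name:12s} {mb:6.0f} MB/s")
```

With these ceilings, a PCIe 1.0 x1 card cannot exceed 250 MB/s no matter how
many SATA ports sit behind it, which is consistent with the roughly 200MB/s
quoted for the x1 bridge cards.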



 For SSDs you want to directly connect the SSD to a mobo SATA port and
 then either mount the SSD in the case or mount it in a hot-swap gadget
 that you can screw into a PCI slot (it doesn't actually use the PCI
 connector, just the slot).  A SATA-III port with a SATA-III SSD really
 shines here and 400-500 MBytes/sec random read performance from a single
 SSD is possible, but it isn't an absolute requirement.  A SATA-II port
 will still work fine as long as you don't mind maxing out the bandwidth
 at 250 MBytes/sec.


Agreed. Intel on-board ports rock! Recently I've built a new system with
two OCZ Vertex 3 SSDs connected to 6Gbps SATA ports on an Intel Sandy
Bridge class motherboard. UFS on top of a graid RAID0 volume gives me
about 950MB/s on both read and write!



 To get robust hot-swap enclosures you either need to go with SAS or you
 need to go with discrete SATA ports (no port multiplication), and the
 ports have to support hot-swap.  The best hot-swap support for an AHCI
 port is if the AHCI chipset supports cold-presence-detect (CPD), and
 again Mobo AHCI chipsets usually don't.  Hot-swap is a bit hit or miss
 without CPD because power savings modes can effectively prevent hot-swap
 detect from working properly.  Drive disconnects will always be detected
 but drive connects might not be.


I would say it depends. In some cases it is easier to detect hot-plug
than hot-unplug, as the device sends COMINIT, which should wake up the port
even from a power-save state. With ICH10, for example, I've managed to make
both hot plug and unplug work even with power management enabled:
hot-plug via tracking COMINIT, unplug via its CPD capability. Without
PM it just works. :)


--
Alexander Motin


Re: 8-STABLE won't boot with ZFSv28

2011-06-03 Thread Alexander Motin
Hi.

Holger Kipp wrote:
 as yesterday was a bank holiday in Germany I wasn't in the office to
 try the patch linked in the email.
 Is it the consensus that I should try the patch located here:
 
 http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/chipsets/ata-intel.c.diff?r1=1.25;r2=1.26
 
 and report the result? Or do you need some additional discussion on
 this topic? I really don't know much about ata-intel chipset programming
 interface things, that's why I'm asking :-)

Yes, I want you to try it and report the result.

-- 
Alexander Motin


Re: 8-STABLE won't boot with ZFSv28

2011-06-02 Thread Alexander Motin
Hi.

Holger Kipp wrote:
 got the same messages over and over again - panic took some time:
 
 unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
 ata0: reinit done ..
 ata0: reiniting channel ..
 ata0: DISCONNECT requested
 
 short delay here
 
 ata0: p0: SATA connect time=0ms status=0113
 ata0: p1: SATA connect timeout status=
 ata0: reset tp1 mask=03 ostat0=00 ostat1=00
 ata0: stat0=0x00 err=0x01 lsb=0x14 msb=0xeb
 ata0: stat1=0x00 err=0x01 lsb=0x14 msb=0xeb
 ata0: reset tp2 stat0=00 stat1=00 devices=0x3
 unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
 ata0: reinit done ..
 ata0: reiniting channel ..
 ata0: DISCONNECT requested

I see two problems here:
 1. devices=0x3 means that two ATAPI devices were detected instead
of one. I can also reproduce it with other Intel chipsets. It looks like
a hardware bug to me. It can be worked around by reconnecting the ATAPI
device to an even (2 or 4) SATA port, or by connecting any other device
there.
 2. DISCONNECT requested means that the controller reported a PHY status
change for some device on the channel, triggering an infinite retry.
Unluckily, I have no ICH9 board, and I can't reproduce it with ICH10 or
above.

This patch should work around the first problem in software:
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/chipsets/ata-intel.c.diff?r1=1.25;r2=1.26
Please try it, and let's see whether, with some luck, it does something
about the second problem.
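
Editor's illustration: in the reset lines quoted above, both stat0 and stat1
report lsb=0x14 msb=0xeb, which is the standard ATAPI device signature, so
both the master and the slave position look like ATAPI devices. A small
sketch of decoding such lines follows. The signature values are the standard
ATA/SATA ones; the function names and the parsing regex are assumptions made
for this example, not part of ata(4).

```python
import re

# (cylinder low, cylinder high) signatures left in the task file after reset.
SIGNATURES = {
    (0x00, 0x00): "ATA disk",
    (0x14, 0xEB): "ATAPI device",
    (0x69, 0x96): "SATA port multiplier",
}

LSB_MSB = re.compile(r"lsb=0x([0-9a-fA-F]+)\s+msb=0x([0-9a-fA-F]+)")


def classify(lsb, msb):
    """Name the device type for a signature pair, or 'unknown'."""
    return SIGNATURES.get((lsb, msb), "unknown")


def parse_reset_line(line):
    """Classify an 'ataN: statX=... lsb=0x.. msb=0x..' line, else None."""
    match = LSB_MSB.search(line)
    if match is None:
        return None
    return classify(int(match.group(1), 16), int(match.group(2), 16))
```

A signature alone does not prove that a device is really there; as the
thread shows, the slave position can report an ATAPI signature although
nothing is attached.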

-- 
Alexander Motin


Re: 8-STABLE won't boot with ZFSv28

2011-06-02 Thread Alexander Motin
Jeremy Chadwick wrote:
 On Thu, Jun 02, 2011 at 09:53:58AM +0300, Alexander Motin wrote:
 Holger Kipp wrote:
 got the same messages over and over again - panic took some time:

 unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
 ata0: reinit done ..
 ata0: reiniting channel ..
 ata0: DISCONNECT requested

 short delay here

 ata0: p0: SATA connect time=0ms status=0113
 ata0: p1: SATA connect timeout status=
 ata0: reset tp1 mask=03 ostat0=00 ostat1=00
 ata0: stat0=0x00 err=0x01 lsb=0x14 msb=0xeb
 ata0: stat1=0x00 err=0x01 lsb=0x14 msb=0xeb
 ata0: reset tp2 stat0=00 stat1=00 devices=0x3
 unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
 ata0: reinit done ..
 ata0: reiniting channel ..
 ata0: DISCONNECT requested
 I see two problems here:
  1. devices=0x3 means that two ATAPI devices were detected instead
 of one. I can also reproduce it with other Intel chipsets. It looks like
 a hardware bug to me. It can be worked around by reconnecting the ATAPI
 device to an even (2 or 4) SATA port, or by connecting any other device
 there.
  2. DISCONNECT requested means that the controller reported a PHY status
 change for some device on the channel, triggering an infinite retry.
 Unluckily, I have no ICH9 board, and I can't reproduce it with ICH10 or
 above.

 This patch should work around the first problem in software:
 http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/chipsets/ata-intel.c.diff?r1=1.25;r2=1.26
 Please try it, and let's see whether, with some luck, it does something
 about the second problem.
 
 With regards to item #1: I don't see anything in the ICH9 errata that
 indicates a silicon bug if the only device attached to the controller is
 an ATAPI device and connected to SATA port 0 (presumably), or an
 odd-numbered port?  If this problem exists on other ICHxx and/or ESBxx
 chips, I sure would hope it'd be documented.
 
 I haven't tried confirming it myself, but if need be I can set up a test
 box with a SATA-based DVD drive hooked up to it + provide remote serial
 console/etc. if it'd be of any help.  I don't think it would be (sounds
 like you have lots of hardware :-) ), but I'm willing to help in any way
 I can.

Intel probably doesn't see an issue there, as the same behavior can be
found even on the latest chipsets. But according to my understanding of
the ATA specs and my analysis of real PATA device behavior, this
behavior is not correct. When an ATAPI device is connected to the first
of two SATA ports routed to the same legacy-/PATA-emulated ATA channel
(as the master device), the soft-reset sequence returns a false-positive
slave ATAPI device presence. The problem doesn't show up with ATA disk
devices, or if some other device is really attached to the slave port.
It looks like the problem was always there, but before ATA_CAM it
usually went unnoticed, due to the very small IDENTIFY command timeouts
in ata(4).

If somebody can give a better explanation or propose a better
workaround, that is welcome, as I don't much like this solution.

 With regards to item #2: could this be at all related to OOB (bit 15)
 somehow being set in PCS (SATA register offset 0x92)?  I'm doubting it
 but I thought I'd ask.  My thought process, which is probably wrong
 (consider it an educational discussion :-) ):
 
 The ICH9 specification states that the default value for this register
 is 0x, and b15=0 means SATA controller will not retry after an OOB
 failure, while b15=1 causes the controller to indefinitely retry after
 OOB failure.  I imagine system BIOSes and other things can change this
 default value, but we don't seem to print it anywhere in
 ata_intel_chipinit() during a verbose boot.
 
 Looking at chipsets/ata-intel.c, it looks like we only touch PCS in
 ata_intel_chipinit() and ata_intel_reset().  In the former, we avoid
 touching bits 4 through 15, and in the latter we mask out only what we
 want to adjust (e.g. the SATA port per ch variable).

As far as I can see, ata-intel.c should not change that bit if it was set
for some reason. Theoretically, OOB (Out-of-Band signaling) is a
function of the same state machine that sets that PHY status change
flag. But frankly speaking, I have no idea what the result of setting
this bit would be. In this legacy/PATA emulation mode there are too
many undocumented things to be sure of anything.
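
Editor's illustration: the bit arithmetic discussed above, as a minimal
sketch. It takes only two facts from the thread (the PCS register sits at
offset 0x92 and bit 15 selects indefinite retry after an OOB failure); the
function names and sample values are hypothetical, and nothing here reads
real PCI configuration space.

```python
PCS_OFFSET = 0x92          # PCS register offset, as quoted in the thread
PCS_OOB_RETRY = 1 << 15    # bit 15: retry indefinitely after OOB failure


def oob_retry_enabled(pcs):
    """True if bit 15 is set in a 16-bit PCS value."""
    return bool(pcs & PCS_OOB_RETRY)


def update_field(pcs, mask, value):
    """Read-modify-write: change only the bits in mask, keep the rest."""
    return (pcs & ~mask & 0xFFFF) | (value & mask)


def describe(pcs):
    """One-line, verbose-boot style summary of a PCS value."""
    state = "set" if oob_retry_enabled(pcs) else "clear"
    return f"PCS(0x{PCS_OFFSET:02x})=0x{pcs & 0xFFFF:04x} OOB retry bit {state}"
```

Updating only the low port bits through update_field leaves bit 15 as
the BIOS set it, which matches the observation that the driver masks out
only what it wants to adjust.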

-- 
Alexander Motin


Re: ICH9 panic/instability on recent kernel

2011-05-29 Thread Alexander Motin

On 29.05.2011 07:56, Jeremy Chadwick wrote:

On Sat, May 28, 2011 at 09:10:11PM -0700, Michael Sinatra wrote:

I have a core-2 system with a 3ware SATA RAID controller for the
main disks and the built-in Intel ICH9 4-port SATA controller that
is only used for the DVDR.  An 8-STABLE kernel csup'd and compiled
on April 25 works fine on this system.  Kernels from source csup'd
this week are extremely unstable and usually panic or hang just
minutes after booting.  The following warning messages appear after
the kernel probes the SATA controller and/or ICH9 USB controller and
continue about once per 1-2 seconds until the system crashes:

May 13 14:21:05 sonicyouth kernel: unknown: WARNING - ATAPI_IDENTIFY
requeued due to channel reset LBA=0

Disabling the ICH9 SATA controller in the BIOS allows the system to
boot and run normally.

Changes were made on April 28 to allow better support for 6-port
ICH9 controllers (SVN rev 221156) and I am wondering if my
controller is now being incorrectly recognized.

Here's the relevant kernel messages:

May 13 13:52:53 sonicyouth kernel: atapci1: <Intel ICH9 SATA300 controller>
port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1c40-0x1c4f,0x1c30-0x1c3f at device 31.2
on pci0
May 13 13:52:53 sonicyouth kernel: ata0: <ATA channel 0> on atapci1
May 13 13:52:53 sonicyouth kernel: ata0: [ITHREAD]
May 13 13:52:53 sonicyouth kernel: ata1: <ATA channel 1> on atapci1
May 13 13:52:53 sonicyouth kernel: ata1: [ITHREAD]
May 13 13:52:53 sonicyouth kernel: atapci2: <Intel ICH9 SATA300 controller>
port 0x1cb8-0x1cbf,0x1cac-0x1caf,0x1cb0-0x1cb7,0x1ca8-0x1cab,0x1c60-0x1c6f,0x1c50-0x1c5f
irq 18 at device 31.5 on pci0
May 13 13:52:53 sonicyouth kernel: atapci2: [ITHREAD]
May 13 13:52:53 sonicyouth kernel: ata3: <ATA channel 0> on atapci2
May 13 13:52:53 sonicyouth kernel: ata3: [ITHREAD]
May 13 13:52:53 sonicyouth kernel: ata4: <ATA channel 1> on atapci2
May 13 13:52:53 sonicyouth kernel: ata4: [ITHREAD]

If I csup the most recent kernel sources, I get the same problem.
However, if, after csuping the latest kernel sources, I then fetch
the version of sys/dev/ata/ata-all.c as of April 27, everything
works fine.  Here's the output of pciconf -l:


The only change to ata-all.c in 8-STABLE since April 27 was SVN rev
221155, but I don't see how it can cause problems. I would really like
to see the full _verbose_ dmesg output to better understand what is going
on there. If it even panics, I need to see exactly how.


--
Alexander Motin


Re: MPS driver: force bus rescan after remove SAS cable

2011-04-28 Thread Alexander Motin
Rumen Telbizov wrote:
  Also identify function doesn't work from the OS (no problem
  via the card BIOS). Don't remember having any luck with sg3_util
  package either but worth trying again.
 
 I don't use SAS myself, but wouldn't the command be inquiry and not
 identify?  identify is for ATA (specifically SATA via CAM), while
 inquiry is for SCSI.  Where SAS fits into this is unknown to me.
 
 
 Well I have SATA disks visible as /dev/da* . From camcontrol(8):
 
  inquiry     Send a SCSI inquiry command (0x12) to a device.  By default,
              camcontrol will print out the standard inquiry data, device
              serial number, and transfer rate information.  The user can
              specify that only certain types of inquiry data be printed:
 
 Example:
 
 # camcontrol inquiry /dev/da47
 pass48: ATA WDC WD2003FYYS-0 0D02 Fixed Direct Access SCSI-5 device 
 pass48: Serial Number  WD-WMAUR0408496
 pass48: 300.000MB/s transfers, Command Queueing Enabled 
 
 It's a SATA disk in this case attached to SAS/SATA backplane and SAS2008
 HBA chip (9211-8i)
 What I need is a way to light the fault LED on the disk that I want
 to identify (point to).
 This is usually what I need when I send a DC technician to replace a
 disk. For that I thought I should
 be using:
 
  identify    Send a ATA identify command (0xec) to a device.
 
 From my experience SAS or SATA disks - I always get those as /dev/da*
 disks. It's a combo controller and backplane.
 So which is the correct way of identifying a disk?

`camcontrol identify` sends the ATA IDENTIFY DEVICE command to an
ATA device. That command is roughly the analogue of the SCSI INQUIRY
command. It has nothing to do with LEDs. LEDs are most likely controlled
via a ses device or some similar management facility.

The fact that you see an ATA device as daX just means that your SAS
controller does protocol translation on the fly. It allows you to
communicate with the disk using SCSI commands _instead_ of ATA.
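
Editor's illustration: a small sketch of picking the matching camcontrol
subcommand from the device name, assuming the usual CAM naming where daX is
a disk seen through SCSI commands and adaX is a disk seen through ATA
commands. The helper camcontrol_query is hypothetical; it only builds the
argument list and does not run anything.

```python
import re


def camcontrol_query(device):
    """Build a camcontrol argv that matches how the disk is attached."""
    name = device.rsplit("/", 1)[-1]       # accept "/dev/da47" or "da47"
    if re.fullmatch(r"ada\d+", name):
        # ATA view of the disk: ATA IDENTIFY DEVICE
        return ["camcontrol", "identify", name]
    if re.fullmatch(r"da\d+", name):
        # SCSI view of the disk (e.g. behind a SAS HBA): SCSI INQUIRY
        return ["camcontrol", "inquiry", name]
    raise ValueError(f"not a da/ada disk name: {device!r}")
```

Neither command touches the fault LED; this only shows which query fits
which device name.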

-- 
Alexander Motin


Re: MPS driver: force bus rescan after remove SAS cable

2011-04-27 Thread Alexander Motin

On 27.04.2011 16:39, Denny Schierz wrote:

Am Mittwoch, den 27.04.2011, 05:57 -0700 schrieb Jeremy Chadwick:

camcontrol reset 0


0:22:0 is available, 0:46:0 not:

root@iscsihead-m:~# camcontrol reset 0:22:0
Reset of 0:22:0 was successful

root@iscsihead-m:~# camcontrol reset 0:46:0
camcontrol: cam_open_btl: no passthrough device found at 0:46:0


You should reset the whole bus, not a specific LUN. A full reset doesn't
need that passthrough device. IIRC it works via xpt0.
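
Editor's illustration: a sketch of turning a bus:target:lun address such as
0:46:0 into whole-bus camcontrol commands. It assumes only that camcontrol
reset and camcontrol rescan accept a bare bus number; the helper names are
made up, and the sketch builds argument lists without executing them.

```python
def bus_of(address):
    """Return the bus number from 'bus' or 'bus:target:lun'."""
    return int(address.split(":")[0])


def bus_reset_commands(address):
    """Commands to reset and then rescan the whole bus of an address."""
    bus = str(bus_of(address))
    return [
        ["camcontrol", "reset", bus],     # whole bus, no passthrough needed
        ["camcontrol", "rescan", bus],    # look for devices again afterwards
    ]
```

For the failing case above, bus_reset_commands("0:46:0") yields the commands
for bus 0 rather than for the missing LUN.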



We bought the LSI SAS6160 switch:

http://www.lsi.com/storage_home/products_home/sas_switch/sas6100/index.html

use the LSI 9200-8e host bus adapter and LSI JBODs 630j. We had a lot of
e-mail conversation with LSI, and they say that we need the switch for
a clear failover setup.
Another reason for the switch: increasing storage with more JBODs. Every
JBOD has its own cable to the (later) redundant switch. Otherwise we would
have to build a bus from JBOD to JBOD to JBOD to JBOD to host ... bad
idea ;-)

The question is, what does the driver do while FreeBSD starts?


CAM does exactly a full bus reset and, after a few seconds, a full rescan.
What the controller driver may do beyond that depends on the driver alone.


--
Alexander Motin


Re: Promise SATA controller issues...

2011-04-26 Thread Alexander Motin
George Kontostanos wrote:
 I have a Promise PDC40718 SATA300 controller running in a box that has
 gone from 8.0-RELEASE up to, now, 8.2-STABLE (FreeBSD 8.2-STABLE #3:
 Thu Apr 21 15:23:08 EEST 2011). The controller is in JBOD mode with 3 WD
 drives in RAIDZ1.
 
 ad6: 610480MB WDC WD6401AALS-00J7B1 05.00K05 at ata3-master UDMA100 SATA
 3Gb/s
 ad8: 610480MB WDC WD6402AAEX-00Y9A0 05.01D05 at ata4-master UDMA100 SATA
 3Gb/s
 ad10: 610480MB WDC WD6401AALS-00J7B1 05.00K05 at ata5-master UDMA100 SATA
 3Gb/s
 
 Today the box became unresponsive so I had to do a hard reset. From
 /var/log/messages:
 
 It appears from the logs that the problem lasted for a full day! However,
 after the reboot the drive did not perform any resilver and no data loss
 occurred.
 I have scrubbed my pool successfully and run smartmon tests also.
 
 It doesn't appear to be a drive issue so I was wondering if the recent
 changes in controllers that appeared a few days ago might be related.

There have been no changes specific to the Promise controllers for a long
time, mostly because I have no documentation for them. For the same
reason I can hardly say what could be wrong there. Some additional
information is definitely required.

-- 
Alexander Motin


Re: Promise SATA controller issues...

2011-04-26 Thread Alexander Motin
George Kontostanos wrote:
 Please let me know what kind of information might be also useful.

I don't know. What were the first messages before the problem? Was there
any specific activity? It would be most useful if you could reproduce
the problem in a controlled environment.

 There have been no changes specific to the Promise controllers for a long
 time, mostly because I have no documentation for them. For the same
 reason I can hardly say what could be wrong there. Some additional
 information is definitely required.

-- 
Alexander Motin


Re: Promise SATA controller issues...

2011-04-26 Thread Alexander Motin
George Kontostanos wrote:
 The system was up since April 21 when I upgraded to the latest world 
 kernel.
 There are 2 pools, a mirror with root on ZFS for the OS and a Raidz1
 just for the data.
 
 There was nothing out of the ordinary before this except some repeated
 power failures that were handled by the UPS, as you can see in the logs:
 
 Apr 25 13:05:21 hp apcupsd[870]: Power is back. UPS running on mains.
 Apr 25 13:32:48 hp apcupsd[870]: Power failure.
 Apr 25 13:32:50 hp apcupsd[870]: Power is back. UPS running on mains.
 Apr 25 13:35:06 hp apcupsd[870]: Power failure.
 Apr 25 22:08:35 hp kernel: ata4: SIGNATURE: 
 Apr 25 22:08:35 hp kernel: ata4: timeout waiting to issue command
 Apr 25 22:08:35 hp kernel: ata4: error issuing SETFEATURES SET TRANSFER
 MODE command
 .

I would enable verbose kernel messages in case it repeats. Maybe that
will give some more understanding, but that's not certain.

 On Tue, Apr 26, 2011 at 10:39 PM, Alexander Motin m...@freebsd.org wrote:
 
 George Kontostanos wrote:
  Please let me know what kind of information might be also useful.
 
  I don't know. What were the first messages before the problem? Was there
  any specific activity? It would be most useful if you could reproduce
  the problem in a controlled environment.
 
   There have been no changes specific to the Promise controllers for a
   long time, mostly because I have no documentation for them. For the
   same reason I can hardly say what could be wrong there. Some additional
   information is definitely required.

-- 
Alexander Motin


Re: Sense fetching [Was: cdrtools /devel ...]

2011-04-13 Thread Alexander Motin
Buganini wrote:
 does r22056{3,5,6,9} supersede these patches ?

Yes. They solve the problem from a different side.

 my dvd burning with ahci seems to be fixed by those commits,
 without these patches.
 
 I've just burned a DVD successfully, and it's readable.

Yea, I've also burned a few DVDs with cdrecord-devel for testing.

-- 
Alexander Motin

