Re: Can't use USB keyboard during boot menu

2010-02-23 Thread Andriy Gapon
on 23/02/2010 13:18 Renato Botelho said the following:
> On Mon, Feb 22, 2010 at 7:35 PM, Chris Hedley wrote:
[snip]
>> Do you have USB legacy support enabled in your BIOS?  I'm not sure if
>> there's an option for the loader to use USB devices natively, but the BIOS's
>> legacy option where it provides AT/PS2 emulation is probably the easiest way
>> to get the keyboard working.
> 
> Yes, I do, but it seems to be a regression in FreeBSD itself. I had this
> problem in the past, and I have again checked the same things I needed to
> check back then; everything is fine.

A more precise way to state that would be "a regression in FreeBSD boot/loader".
I think that you are referring to the issue that was fixed by r189017.
It might be worthwhile investigating what was done in that revision and what
happened in sys/boot code since then.

One possibility is that your BIOS uses memory above 1MB for USB emulation, but
doesn't mark that memory as used in system memory map.  In that case that memory
could be overwritten by the loader.  If that's true then the blame is on the
BIOS.  Alternatively, our code might be parsing the system memory map
incorrectly.
But I am just making wild guesses here.
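
To make the first guess a bit more concrete, here is a minimal sketch.  It
assumes an E820-style memory map (base, length, type, with type 1 meaning usable
RAM); the struct and function names are hypothetical and this is not the actual
sys/boot code.  The loader may only use a range that is fully covered by a
usable entry, so a BIOS that reports its USB emulation area as usable would
pass this check and get overwritten.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified E820-style entry; not the real loader structure. */
struct smap_entry {
	uint64_t base;
	uint64_t length;
	uint32_t type;		/* 1 = usable RAM, anything else = off limits */
};

#define	SMAP_TYPE_USABLE	1

/*
 * Return 1 if [start, start + len) lies entirely inside one usable entry,
 * 0 otherwise.  The comparisons are arranged so that nothing can overflow.
 */
int
range_is_usable(const struct smap_entry *map, size_t n, uint64_t start,
    uint64_t len)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (map[i].type != SMAP_TYPE_USABLE)
			continue;
		if (start >= map[i].base &&
		    len <= map[i].length &&
		    start - map[i].base <= map[i].length - len)
			return (1);
	}
	return (0);
}
```

If the BIOS did mark the area as reserved and the loader still overwrote it,
the bug would be on our side of such a check.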

-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Seeing the dreaded "ZFS: i/o error - all block copies unavailable" on 9.0-CURRENT

2010-02-25 Thread Andriy Gapon
on 25/02/2010 19:58 Chris said the following:
> On Thu, Feb 25, 2010 at 8:06 AM, John Baldwin  wrote:
>> On Wednesday 24 February 2010 10:12:25 pm Chris wrote:
>>> So it sounds like somehow my system is trying to use the old boot2
>>> method when I don't hit F12. I'm guessing the difference is due to how
>>> the hard drive is getting presented to the boot loader by the BIOS.
>>> How can I get rid of the legacy boot system and use only the ZFS
>>> bootloader?
>> Does F12 enable PXE booting or some such?
> 
> The only options I have when I press F12 are to either boot from my
> hard drive or to boot from my optical drive. Is there
> any way to more verbosely see what is happening at the bootloader level?

I guess that F12 that you describe is handled by BIOS.
Do you have other HDDs in this system?
What is your default boot order (configured in BIOS)?


-- 
Andriy Gapon


Re: build failures after stdlib update

2010-03-20 Thread Andriy Gapon
on 21/03/2010 02:55 Alexander Best said the following:
> ok. i think i finally solved this riddle. the cause for the problem seems to
> have been my CPUTYPE in /etc/make.conf. it is set to 'native'. actually i've
> been using the 'native' keyword for years now and never had any problems with
> it, but it seems a recent commit broke 'native' as CPUTYPE. for me this is
> 100% reproducible:
> 
> 1. put 'CPUTYPE = native' in /etc/make.conf
> 2. build and install gnu/usr.bin/cc
> 3. do 'buildkernel' or 'buildworld' and observe the segfault. for some reason
> this always occurs in lib/libc/string/strlen.c (r205108). i've tested this
> with older version of libc.so (built 22. Feb) and got the same result. so i
> assume this is not a libc problem, but a problem with gcc tripping over some
> code in libc. i have no clue however why this happened just now and not
> earlier. i don't think there has been a recent commit to gcc.

Interestingly enough, there have been recent commits to lib/libc/string/strlen.c.

> to solve this there are two ways:
> 
> 1. set CPUTYPE to 'nocona' (i'm running amd64). this will let you compile
> everything just fine even with a gcc that has itself been built with 'CPUTYPE
> = native'.
> 2. build and install gnu/usr.bin/cc with 'CPUTYPE = nocona'. the gcc version
> that has been built this way will compile everything just fine even with
> 'CPUTYPE = native'. the only way to break this is to go and compile gcc again
> with the CPUTYPE set to 'native'.
> 
> so to summarize: compiling gnu/usr.bin/cc with CPUTYPE set to 'native' will
> give you a broken gcc!

-- 
Andriy Gapon


Re: build failures after stdlib update

2010-03-21 Thread Andriy Gapon
on 21/03/2010 13:43 Garrett Cooper said the following:
> 
> Works for me *shrugs*:
> 
> $ gcc -v -x c -E -mtune=native /dev/null -o /dev/null 2>&1
> Using built-in specs.
> Target: amd64-undermydesk-freebsd
> Configured with: FreeBSD/amd64 system compiler
> Thread model: posix
> gcc version 4.2.1 20070719  [FreeBSD]
>  /usr/libexec/cc1 -E -quiet -v -D_LONGLONG /dev/null -o /dev/null -mtune=generic
> ignoring duplicate directory "/usr/include"
> #include "..." search starts here:
> #include <...> search starts here:
>  /usr/include
> End of search list.
> $ echo $?
> 0

Do you also have the latest version of libc _installed_ in the system?

-- 
Andriy Gapon


Re: build failures after stdlib update

2010-03-21 Thread Andriy Gapon
on 21/03/2010 14:35 Alexander Best said the following:
> Andriy Gapon schrieb am 2010-03-21:
>> on 21/03/2010 13:43 Garrett Cooper said the following:
> 
>>> Works for me *shrugs*:
> 
>>> $ gcc -v -x c -E -mtune=native /dev/null -o /dev/null 2>&1
>>> Using built-in specs.
>>> Target: amd64-undermydesk-freebsd
>>> Configured with: FreeBSD/amd64 system compiler
>>> Thread model: posix
>>> gcc version 4.2.1 20070719  [FreeBSD]
>>>  /usr/libexec/cc1 -E -quiet -v -D_LONGLONG /dev/null -o /dev/null -mtune=generic
>>> ignoring duplicate directory "/usr/include"
>>> #include "..." search starts here:
>>> #include <...> search starts here:
>>>  /usr/include
>>> End of search list.
>>> $ echo $?
>>> 0
> 
>> Do you also have the latest version of libc _installed_ in the
>> system?
> 
> i think so. i grabbed a fresh src copy yesterday and did
> buildworld/installworld a few hours ago.
> 
> i've attached the output of `ident /lib/libc.so.7`.

The question was for Garrett :)


-- 
Andriy Gapon


Re: Increasing MAXPHYS

2010-03-21 Thread Andriy Gapon
on 21/03/2010 16:05 Alexander Motin said the following:
> Ivan Voras wrote:
>> Hmm, it looks like it could be easy to spawn more g_* threads (and,
>> barring specific class behaviour, it has a fair chance of working out of
>> the box) but the incoming queue will need to also be broken up for
>> greater effect.
> 
> According to "notes", looks there is a good chance to obtain races, as
> some places expect only one up and one down thread.

I haven't given any deep thought to this issue, but I remember us discussing
it over beer :-)
I think one idea was making sure (somehow) that requests traveling over the same
edge of a geom graph (in the same direction) do it using the same queue/thread.
Another idea was to bring some netgraph-like optimization where some (carefully
chosen) geom vertices pass requests by a direct call instead of requeuing.

-- 
Andriy Gapon


Re: build failures after stdlib update

2010-03-21 Thread Andriy Gapon
on 21/03/2010 14:53 Alexander Best said the following:
> *lol* sorry. ;)

No worries.
BTW, when that crash happens, are you able to examine the core with gdb?
Is it possible to examine values of 's' and 'p' in strlen?

-- 
Andriy Gapon


Re: build failures after stdlib update

2010-03-21 Thread Andriy Gapon
on 21/03/2010 20:46 Alexander Best said the following:
> Andriy Gapon schrieb am 2010-03-21:
>> on 21/03/2010 14:53 Alexander Best said the following:
>>> *lol* sorry. ;)
> 
>> No worries.
>> BTW, when that crash happens, are you able to examine the core with
>> gdb?
>> Is it possible to examine values of 's' and 'p' in strlen?
> 
> 'p' is not available. i guess because the segfault happens before 'p' gets
> assigned.
> 
> but mask01 = 0x101010101010101 and lp = (const long unsigned int *) 0xc092d8.
> 
> but i'm not really familiar with gdb and debugging. so you might want to ask
> for certain commands. ;)

Not sure what I was dreaming of when I wrote my request.
I actually meant 'str' and 'lp'.
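
For those following along: the mask01 value quoted above is the constant used
by word-at-a-time strlen implementations, which load a whole word through a
pointer such as lp and test all of its bytes for a zero at once.  The sketch
below shows the general technique only.  has_zero_byte() and the 64-bit
constants are written out for illustration; this is not the
lib/libc/string/strlen.c source.

```c
#include <stdint.h>

#define	MASK01	UINT64_C(0x0101010101010101)
#define	MASK80	UINT64_C(0x8080808080808080)

/*
 * Return nonzero if any of the eight bytes in 'word' is zero.
 * Subtracting 0x01 from every byte sets bit 7 of a byte that was zero;
 * bytes that already had bit 7 set are filtered out by '& ~word'.
 */
int
has_zero_byte(uint64_t word)
{
	return (((word - MASK01) & ~word & MASK80) != 0);
}
```

Code of this kind dereferences lp directly, so the value of lp and what it
points at are the interesting things to look at in the core.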

-- 
Andriy Gapon


Re: build failures after stdlib update

2010-03-21 Thread Andriy Gapon
on 21/03/2010 23:11 Alexander Best said the following:
> *hehe* that makes more sense. well i already sent you lp. unfortunately str is
> not available to gdb:
> 
> (gdb) print str
> Variable "str" is not available.

In cases like this sometimes the argument is still available for examination as
a variable further down the stack (i.e. in one of the calling functions).

-- 
Andriy Gapon


Re: build failures after stdlib update

2010-03-22 Thread Andriy Gapon
on 22/03/2010 00:12 Alexander Best said the following:
> Andriy Gapon schrieb am 2010-03-21:
>> on 21/03/2010 23:11 Alexander Best said the following:
>>> *hehe* that makes more sense. well i already sent you lp.
>>> unfortunately str is
>>> not available to gdb:
> 
>>> (gdb) print str
>>> Variable "str" is not available.
> 
>> In cases like this sometimes the argument is still available for
>> examination as
>> a variable further down the stack (i.e. in one of the calling
>> functions).
> 
> hmmm...
> 
> str might be "-m" then, because of:
> 
> #1  0x00416782 in concat (first=0x417618 "-m") at
> /usr/src/gnu/usr.bin/cc/libiberty/../../../../contrib/gcclibs/libiberty/concat.c:76
> 

Perhaps, but it doesn't look like it.
Could you please do 'print *lp'?

-- 
Andriy Gapon


Re: [head tinderbox] failure on i386/i386

2010-03-22 Thread Andriy Gapon
on 21/03/2010 08:25 Garrett Cooper said the following:
> On Sat, Mar 20, 2010 at 9:58 PM, FreeBSD Tinderbox wrote:
>> TB --- 2010-03-21 04:53:25 - /usr/bin/make -B buildkernel KERNCONF=PAE
>>>>> Kernel build for PAE started on Sun Mar 21 04:53:25 UTC 2010
>>>>> stage 1: configuring the kernel
>>>>> stage 2.1: cleaning up the object tree
>>>>> stage 2.2: rebuilding the object tree
>>>>> stage 2.3: build tools
>>>>> stage 3.1: making dependencies
>>>>> stage 3.2: building everything
>> [...]
>> cc -c -O -pipe  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs 
>> -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline 
>> -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc  -I. 
>> -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS 
>> -include opt_global.h -fno-common -finline-limit=8000 --param 
>> inline-unit-growth=100 --param large-function-growth=1000  
>> -mno-align-long-strings -mpreferred-stack-boundary=2  -mno-mmx -mno-3dnow 
>> -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -fstack-protector -Werror  
>> /src/sys/libkern/qdivrem.c
>> cc -c -O -pipe  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs 
>> -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline 
>> -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc  -I. 
>> -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS 
>> -include opt_global.h -fno-common -finline-limit=8000 --param 
>> inline-unit-growth=100 --param large-function-growth=1000  
>> -mno-align-long-strings -mpreferred-stack-boundary=2  -mno-mmx -mno-3dnow 
>> -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -fstack-protector -Werror  
>> /src/sys/libkern/ucmpdi2.c
>> cc -c -O -pipe  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs 
>> -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline 
>> -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc  -I. 
>> -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS 
>> -include opt_global.h -fno-common -finline-limit=8000 --param 
>> inline-unit-growth=100 --param large-function-growth=1000  
>> -mno-align-long-strings -mpreferred-stack-boundary=2  -mno-mmx -mno-3dnow 
>> -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -fstack-protector -Werror  
>> /src/sys/libkern/udivdi3.c
>> cc -c -O -pipe  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs 
>> -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline 
>> -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc  -I. 
>> -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS 
>> -include opt_global.h -fno-common -finline-limit=8000 --param 
>> inline-unit-growth=100 --param large-function-growth=1000  
>> -mno-align-long-strings -mpreferred-stack-boundary=2  -mno-mmx -mno-3dnow 
>> -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -fstack-protector -Werror  
>> /src/sys/libkern/umoddi3.c
>> cc -c -O -pipe  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs 
>> -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline 
>> -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc  -I. 
>> -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS 
>> -include opt_global.h -fno-common -finline-limit=8000 --param 
>> inline-unit-growth=100 --param large-function-growth=1000  
>> -mno-align-long-strings -mpreferred-stack-boundary=2  -mno-mmx -mno-3dnow 
>> -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -fstack-protector -Werror  
>> /src/sys/compat/x86bios/x86bios.c
>> cc1: warnings being treated as errors
>> /src/sys/compat/x86bios/x86bios.c: In function 'x86bios_map_mem':
>> /src/sys/compat/x86bios/x86bios.c:558: warning: cast to pointer from integer of different size
>> *** Error code 1
>>
>> Stop in /obj/i386/src/sys/PAE.
>> *** Error code 1
>>
>> Stop in /src.
>> *** Error code 1
>>
>> Stop in /src.
>> TB --- 2010-03-21 04:58:08 - WARNING: /usr/bin/make returned exit code  1
>> TB --- 2010-03-21 04:58:08 - ERROR: failed to build PAE kernel
>> TB --- 2010-03-21 04:58:08 - 5080.43 user 936.95 system 6788.14 real
> 
> Hi Jung,
> Could you please resolve this error?
> Thanks,

It would have been nice to actually CC Jung-uk :-)
The problem seems to be that with PAE the type of x86bios_rom_phys, vm_paddr_t,
is 64-bit.
I am not sure what values x86bios_rom_phys can have, but most likely it can be
simply cast to a 32-bit value.
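
To illustrate the warning: on i386 a pointer is 32 bits wide, so converting a
64-bit physical address straight to a pointer triggers "cast to pointer from
integer of different size".  One common way to avoid it, assuming the value is
known to fit, is to go through uintptr_t.  The type and function names below
are stand-ins, not the x86bios.c code, and this is not necessarily the fix
that should be committed.

```c
#include <stdint.h>

/* Stand-in for vm_paddr_t as it is defined with PAE: a 64-bit integer. */
typedef uint64_t paddr64_t;

/*
 * Convert a physical address to a pointer without the size-mismatch warning.
 * The intermediate uintptr_t cast narrows the value to pointer width first;
 * the caller has to guarantee that no significant bits are lost.
 */
void *
paddr_to_ptr(paddr64_t pa)
{
	return ((void *)(uintptr_t)pa);
}
```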

-- 
Andriy Gapon


Re: newfs_msdos and DVD-RAM

2010-03-24 Thread Andriy Gapon
on 19/03/2010 20:26 Paul B Mahol said the following:
> On Fri, Mar 19, 2010 at 7:11 PM, Fabian Keil wrote:
>> Paul B Mahol  wrote:
>>
>>> FreeBSD 9.0 CURRENT panics when mounting file system created via
>>> newfs_msdos on DVD-RAM disc.
>>> Something to do about divide by zero.
>> I recently had a similar problem with a 16GB iPod. I still haven't
>> managed to actually mount it, but the patch below at least works
>> around the panic.
>>
>> Does it work for you, too?
> 
> Obviously it will fix the panic, but it will not allow mounting. The zero
> value should have been handled much earlier. It looks like the real bug is
> in newfs_msdos.
> 

Looking at the code in mountmsdosfs(), it seems that SecPerClust can have zero
value at the place of the crash only if pm_BlkPerSec is zero.
See this line and the check above it:
SecPerClust *= pmp->pm_BlkPerSec;
But that is impossible because of the same if statement.

In my opinion, the only possible explanation is an overflow of a SecPerClust
value.  Given that its type is u_int8_t, it seems plausible.
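
A small sketch of how such an overflow produces zero.  The operand values 64
and 8 in the comment are the ones reported later in this thread; the function
name is made up for the example, and uint8_t stands in for u_int8_t.

```c
#include <stdint.h>

/*
 * Mimic 'SecPerClust *= pm_BlkPerSec' with an 8-bit SecPerClust.
 * The product is computed at full integer width and then truncated to
 * 8 bits when it is stored back.
 */
uint8_t
scale_sec_per_clust(uint8_t sec_per_clust, unsigned int blk_per_sec)
{
	sec_per_clust *= blk_per_sec;	/* 64 * 8 = 512, and 512 % 256 = 0 */
	return (sec_per_clust);
}
```

A zero result here would explain the division by zero reported at the start of
the thread.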

It would be really nice if people who can reproduce this issue could either add
a couple of printfs before the line quoted above or examine a crash dump to
determine the values of SecPerClust and pm_BlkPerSec before the multiplication.

Could you guys please do it?
Thanks!
-- 
Andriy Gapon


Re: newfs_msdos and DVD-RAM

2010-03-29 Thread Andriy Gapon

If you want to make sure that I see your reply, please include me in the
recipient list.  FreeBSD mailing lists sometimes have high volume and it's easy
to miss a followup even if you are interested in reading it.

on 28/03/2010 18:25 Fabian Keil said the following:
> Andriy Gapon  wrote:
>> Looking at the code in mountmsdosfs(), it seems that SecPerClust can
>> have zero value at the place of the crash only if pm_BlkPerSec is zero.
>> See this line and the check above it:
>> SecPerClust *= pmp->pm_BlkPerSec;
>> But that is impossible because of the same if statement.
>>
>> In my opinion, the only possible explanation is an overflow of a
>> SecPerClust value.  Given that its type is u_int8_t, it seems plausible.
> 
> That seems to be indeed the case. Adding a printf before
>   SecPerClust *= pmp->pm_BlkPerSec;
> 
> Results in: Multiplying 64 with 8

Interesting.  See below.

> Using an unsigned int for SecPerClust allows to mount the file
> system and df -h correctly shows its size, but cd'ing into it
> and running ls -l leads to another panic:

I think that this local workaround cures only one local symptom and pushes the
problem further along in the code.  The panic you got is a symptom of a deeper
issue.

Could you please remind us how the filesystem was created?
Was it FreeBSD newfs_msdos?

I am not a FAT expert and I know to take Wikipedia with a grain of salt.
But please take a look at this:
http://en.wikipedia.org/wiki/File_Allocation_Table#Boot_Sector

In our formula:
SecPerClust *= pmp->pm_BlkPerSec;
we have the following parameters:
SecPerClust[in] - sectors per cluster
pm_BlkPerSec - bytes per sector divided by 512 (pm_BytesPerSec / DEV_BSIZE)
SecPerClust[out] - bytes per cluster divided by 512

So we have:
sectors per cluster: 64
bytes per sector: 4096

That Wikipedia article says: "However, the value must not be such that the
number of bytes per cluster becomes greater than 32 KB."
But in our case it's 256K, the same value that is passed as 'size' parameter to
bread() in the crash stack trace below.

By the way, that 32KB limit means that the value of SecPerClust[out] should
never be greater than 64 and SecPerClust[in] is limited to 128, so its current
type (u_int8_t) is of sufficient size to hold all allowed values.

Thus, clearly, it is a fault of a tool that formatted the media for FAT.
It should have picked correct values, or rejected incorrect values if those were
provided as overrides via command line options.

> f...@r500 /usr/crash $kgdb kernel.1/kernel.symbols vmcore.1
[snip]
> Unread portion of the kernel message buffer:
> panic: getblk: size(262144) > MAXBSIZE(65536)
[snip]
> #11 0x803bedfb in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:562
> #12 0x8042ecde in getblk (vp=0xff006dbfad20, blkno=992, 
> size=262144, slpflag=0, slptimeo=Variable "slptimeo" is not available.
> ) at /usr/src/sys/kern/vfs_bio.c:2523
> #13 0x8042f12f in breadn (vp=0xff006dbfad20, blkno=Variable 
> "blkno" is not available.
> ) at /usr/src/sys/kern/vfs_bio.c:800
> #14 0x8042f24e in bread (vp=Variable "vp" is not available.
> ) at /usr/src/sys/kern/vfs_bio.c:748
> #15 0x8035efc2 in msdosfs_readdir (ap=0xff803e71ca60) at 
> /usr/src/sys/fs/msdosfs/msdosfs_vnops.c:1641
[snip]

-- 
Andriy Gapon


Re: newfs_msdos and DVD-RAM

2010-03-29 Thread Andriy Gapon
on 29/03/2010 23:29 Fabian Keil said the following:
> Andriy Gapon  wrote:
>> Thus, clearly, it is a fault of a tool that formatted the media for FAT.
>> It should have picked correct values, or rejected incorrect values if
>> those were provided as overrides via command line options.
> 
> The kernel still shouldn't panic, though.

A quick reply to this point only - yes, I completely agree.
But remember that the panic happened only after the sources were modified :)
Jokes aside, mountmsdosfs() should reject incorrect combination of bytes/sector
and sectors/cluster and should produce proper diagnostics for that.

-- 
Andriy Gapon


Re: newfs_msdos and DVD-RAM

2010-03-30 Thread Andriy Gapon
on 30/03/2010 18:36 Fabian Keil said the following:
> Andriy Gapon  wrote:
> 
>> on 29/03/2010 23:29 Fabian Keil said the following:
>>> Andriy Gapon  wrote:
>>>> Thus, clearly, it is a fault of a tool that formatted the media for FAT.
>>>> It should have picked correct values, or rejected incorrect values if
>>>> those were provided as overrides via command line options.
>>> The kernel still shouldn't panic, though.
>> A quick reply to this point only - yes, I completely agree.
>> But remember that the panic happened only after the sources were modified :)
> 
> It wasn't clear from my message, but I was mainly referring to the
> division-by-zero panic mentioned at the beginning of the thread,
> for which I posted a work-around in <20100319191133.46fe2...@r500.local>.

Oh, yes, right.

-- 
Andriy Gapon


Re: newfs_msdos and DVD-RAM

2010-03-31 Thread Andriy Gapon
on 30/03/2010 18:41 Andriy Gapon said the following:
> on 30/03/2010 18:36 Fabian Keil said the following:
>> Andriy Gapon  wrote:
>>
>>> on 29/03/2010 23:29 Fabian Keil said the following:
>>>> Andriy Gapon  wrote:
>>>>> Thus, clearly, it is a fault of a tool that formatted the media for FAT.
>>>>> It should have picked correct values, or rejected incorrect values if
>>>>> those were provided as overrides via command line options.
>>>> The kernel still shouldn't panic, though.
>>> A quick reply to this point only - yes, I completely agree.
>>> But remember that the panic happened only after the sources were modified :)
>> It wasn't clear from my message, but I was mainly referring to the
>> division-by-zero panic mentioned at the beginning of the thread,
>> for which I posted a work-around in <20100319191133.46fe2...@r500.local>.
> 
> Oh, yes, right.

To clarify - I already forgot that the original problem was a division-by-zero
panic and for some reason thought that it was EINVAL.

Anyways, here is a patch that I would use.
Unfortunately, ENOTIME to understand the newfs_msdos code and fix it too.

--- a/sys/fs/msdosfs/msdosfs_vfsops.c
+++ b/sys/fs/msdosfs/msdosfs_vfsops.c
@@ -580,6 +580,7 @@ mountmsdosfs(struct vnode *devvp, struct mount *mp)
  || (pmp->pm_BytesPerSec & (pmp->pm_BytesPerSec - 1))
  || (pmp->pm_HugeSectors == 0)
  || (pmp->pm_FATsecs == 0)
+ || (SecPerClust * pmp->pm_BlkPerSec > MAXBSIZE / DEV_BSIZE)
) {
error = EINVAL;
goto error_exit;
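
For reference, the added condition can be exercised on its own.  The sketch
below assumes DEV_BSIZE is 512 and takes MAXBSIZE as 65536 from the panic
message quoted earlier in the thread; geometry_too_large() is a hypothetical
helper, not a function in msdosfs_vfsops.c.

```c
#include <stdint.h>

#define	DEV_BSIZE	512	/* assumed device block size */
#define	MAXBSIZE	65536	/* value shown in the getblk panic message */

/*
 * Return 1 if a cluster, measured in DEV_BSIZE blocks, would be larger than
 * the biggest buffer that getblk() accepts, 0 otherwise.  The operands are
 * widened to unsigned int before multiplying, so an 8-bit sec_per_clust
 * cannot wrap here.
 */
int
geometry_too_large(uint8_t sec_per_clust, unsigned int blk_per_sec)
{
	return ((unsigned int)sec_per_clust * blk_per_sec >
	    MAXBSIZE / DEV_BSIZE);
}
```

With 64 sectors per cluster and 4096-byte sectors (blk_per_sec of 8) the check
fires, so the mount should fail with EINVAL before the multiplication is ever
reached.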


-- 
Andriy Gapon


Re: newfs_msdos and DVD-RAM

2010-04-02 Thread Andriy Gapon
on 02/04/2010 13:57 Fabian Keil said the following:
> Andriy Gapon  wrote:
>> Anyways, here is a patch that I would use.
>> Unfortunately, ENOTIME to understand newfs_msdos code and fix it too,
>>
>> --- a/sys/fs/msdosfs/msdosfs_vfsops.c
>> +++ b/sys/fs/msdosfs/msdosfs_vfsops.c
>> @@ -580,6 +580,7 @@ mountmsdosfs(struct vnode *devvp, struct mount *mp)
>>|| (pmp->pm_BytesPerSec & (pmp->pm_BytesPerSec - 1))
>>|| (pmp->pm_HugeSectors == 0)
>>|| (pmp->pm_FATsecs == 0)
>> +  || (SecPerClust * pmp->pm_BlkPerSec > MAXBSIZE / DEV_BSIZE)
>>  ) {
>>  error = EINVAL;
>>  goto error_exit;
> 
> That works, too:
> 
> f...@r500 ~ $sudo mdconfig -a -t vnode -f /tank/ipod-image-formatiert
> md0
> f...@r500 ~ $sudo mount_msdosfs /dev/md0 /mnt/
> mount_msdosfs: /dev/md0: Invalid argument
> 
> Is there a chance that this, or some other workaround, could be committed?

Yes, there is 99.99% chance of this happening :-)
Now, if only someone could fix the newfs_msdos issue too.
I could easily reproduce it this way:

$ truncate -s 5G test.img
$ mdconfig -a -t vnode -f test.img -S 2048 -u 0
$ newfs_msdos -F 32 /dev/md0

-- 
Andriy Gapon


Re: newfs_msdos and DVD-RAM

2010-04-02 Thread Andriy Gapon
on 02/04/2010 14:09 Andriy Gapon said the following:
> on 02/04/2010 13:57 Fabian Keil said the following:
>> Andriy Gapon  wrote:
>>> Anyways, here is a patch that I would use.
>>> Unfortunately, ENOTIME to understand newfs_msdos code and fix it too,
>>>
>>> --- a/sys/fs/msdosfs/msdosfs_vfsops.c
>>> +++ b/sys/fs/msdosfs/msdosfs_vfsops.c
>>> @@ -580,6 +580,7 @@ mountmsdosfs(struct vnode *devvp, struct mount *mp)
>>>   || (pmp->pm_BytesPerSec & (pmp->pm_BytesPerSec - 1))
>>>   || (pmp->pm_HugeSectors == 0)
>>>   || (pmp->pm_FATsecs == 0)
>>> + || (SecPerClust * pmp->pm_BlkPerSec > MAXBSIZE / DEV_BSIZE)
>>> ) {
>>> error = EINVAL;
>>> goto error_exit;
>> That works, too:
>>
>> f...@r500 ~ $sudo mdconfig -a -t vnode -f /tank/ipod-image-formatiert
>> md0
>> f...@r500 ~ $sudo mount_msdosfs /dev/md0 /mnt/
>> mount_msdosfs: /dev/md0: Invalid argument
>>
>> Is there a chance that this, or some other workaround, could be committed?
> 
> Yes, there is 99.99% chance of this happening :-)
> Now, if someone could fix newfs_msdos issue too.
> I could easily reproduce it this way:
> 
> $ truncate -s 5G test.img
> $ mdconfig -a -t vnode -f test.img -S 2048 -u 0
> $ newfs_msdos -F 32 /dev/md0
> 

OK, I did it again.
I tested the patch below using the scenario described above.
Could you please review and/or test this patch?
If you like it and it works, I can commit it.
Thanks!

--- a/sbin/newfs_msdos/newfs_msdos.c
+++ b/sbin/newfs_msdos/newfs_msdos.c
@@ -427,6 +427,9 @@ main(int argc, char *argv[])
 if (bpb.bpbBytesPerSec < MINBPS)
errx(1, "bytes/sector (%u) is too small; minimum is %u",
     bpb.bpbBytesPerSec, MINBPS);
+bpb.bpbSecPerClust /= (bpb.bpbBytesPerSec / MINBPS);
+if (bpb.bpbSecPerClust == 0)
+   bpb.bpbSecPerClust = 1;
 if (!(fat = opt_F)) {
if (opt_f)
fat = 12;


-- 
Andriy Gapon


Re: newfs_msdos and DVD-RAM

2010-04-02 Thread Andriy Gapon
on 02/04/2010 22:26 Andriy Gapon said the following:
> 
> OK, I did it again.
> I tested the below patch using the scenario described above.
> Could you please review and/or test this patch?
> If you like it and it works, I can commit it.
> Thanks!
> 
> --- a/sbin/newfs_msdos/newfs_msdos.c
> +++ b/sbin/newfs_msdos/newfs_msdos.c
> @@ -427,6 +427,9 @@ main(int argc, char *argv[])
>  if (bpb.bpbBytesPerSec < MINBPS)
>   errx(1, "bytes/sector (%u) is too small; minimum is %u",
>bpb.bpbBytesPerSec, MINBPS);
> +bpb.bpbSecPerClust /= (bpb.bpbBytesPerSec / MINBPS);
> +if (bpb.bpbSecPerClust == 0)
> + bpb.bpbSecPerClust = 1;
>  if (!(fat = opt_F)) {
>   if (opt_f)
>   fat = 12;
> 

And here is a safer one (in case of a huge sector size > 32KB).
I will appreciate any testing with real media that you might have.

diff --git a/sbin/newfs_msdos/newfs_msdos.c b/sbin/newfs_msdos/newfs_msdos.c
index 955c3a5..3f2778d 100644
--- a/sbin/newfs_msdos/newfs_msdos.c
+++ b/sbin/newfs_msdos/newfs_msdos.c
@@ -427,6 +427,12 @@ main(int argc, char *argv[])
 if (bpb.bpbBytesPerSec < MINBPS)
errx(1, "bytes/sector (%u) is too small; minimum is %u",
 bpb.bpbBytesPerSec, MINBPS);
+bpb.bpbSecPerClust /= (bpb.bpbBytesPerSec / MINBPS);
+if (bpb.bpbSecPerClust == 0)
+   bpb.bpbSecPerClust = 1;
+if (bpb.bpbSecPerClust * bpb.bpbBytesPerSec > 32 * 1024)
+   errx(1, "bytes per cluster (%u) is greater than 32k",
+   bpb.bpbSecPerClust * bpb.bpbBytesPerSec);
 if (!(fat = opt_F)) {
if (opt_f)
fat = 12;
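
The intent of the change can be sketched outside of newfs_msdos.  The code
below assumes MINBPS is 512 and that the default sectors-per-cluster value was
picked with MINBPS-byte sectors in mind; adjust_sec_per_clust() is a
hypothetical helper that returns 0 where the patch calls errx().

```c
#define	MINBPS		512		/* assumed minimum bytes per sector */
#define	MAX_CLUSTER	(32 * 1024)	/* 32 KB bytes-per-cluster limit */

/*
 * Scale a sectors-per-cluster value chosen for MINBPS-byte sectors down to
 * the real sector size, never going below one sector per cluster.
 * Return the adjusted value, or 0 if the sector size is below MINBPS or the
 * resulting cluster would still be larger than 32 KB.
 */
unsigned int
adjust_sec_per_clust(unsigned int sec_per_clust, unsigned int bytes_per_sec)
{
	if (bytes_per_sec < MINBPS)
		return (0);
	sec_per_clust /= (bytes_per_sec / MINBPS);
	if (sec_per_clust == 0)
		sec_per_clust = 1;
	if (sec_per_clust * bytes_per_sec > MAX_CLUSTER)
		return (0);
	return (sec_per_clust);
}
```

For example, a default of 64 with 4096-byte sectors becomes 8, which gives
exactly 32 KB per cluster; only a sector size above 32 KB gets rejected.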


-- 
Andriy Gapon


Re: newfs_msdos and DVD-RAM

2010-04-03 Thread Andriy Gapon
on 03/04/2010 18:07 Tijl Coosemans said the following:
> 
> I'm not sure the second paragraph is worth supporting, but the first
> seems to say that 32k limit you have in your patch only applies to
> disks with 512 byte sectors. For disks with larger sectors it would
> be proportionally larger.

The last sentence is your own conclusion, I guess?
Please read this whole thread to see why it doesn't work that way in practice.
At least for present FreeBSD.

-- 
Andriy Gapon


Re: newfs_msdos and DVD-RAM

2010-04-03 Thread Andriy Gapon
on 03/04/2010 18:27 Andriy Gapon said the following:
> on 03/04/2010 18:07 Tijl Coosemans said the following:
>> I'm not sure the second paragraph is worth supporting, but the first
>> seems to say that 32k limit you have in your patch only applies to
>> disks with 512 byte sectors. For disks with larger sectors it would
>> be proportionally larger.
> 
> Last sentence is your own conclusion I guess?
> Please read this whole thread to see why it doesn't work that way in practice.
> At least for present FreeBSD.

OTOH, perhaps you are right and we should consider either bumping MAXBSIZE or
retiring it.

-- 
Andriy Gapon


Re: FreeBSD 9-CURRENT (and 8.0) on MacMini (rev. 3,1)

2010-04-04 Thread Andriy Gapon
on 04/04/2010 18:34 Phil Regnauld said the following:
> Attilio Rao (attilio) writes:
>> I would start by compiling a debugging kernel and using serial port
>> for capturing, starting reporting the ACPI bug in the latest case,
>> then we can get useful informations.
> 
>   Hi Attilio,
> 
>   Any pointers on how to achieve that on a machine with no serial ports ?
>   I've checked out:
>   
> http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-online-gdb.html
>   and
>   http://wiki.freebsd.org/DebugWithDcons (there is a recognized firewire port)
> 
>   I don't otherwise see how to get a core to disk halfway through the boot
>   process.

You could try to use a FireWire console.
See dcons(4).

Also, a good and quicker start is to report the actual panics that you get, as
Attilio has suggested.
When everything else fails, a digital camera can still be used to get screen
captures :-)


-- 
Andriy Gapon


Re: [RFC] Rewriting sade(8)

2010-04-08 Thread Andriy Gapon
on 08/04/2010 12:05 Dag-Erling Smørgrav said the following:
> Alexander Leidinger  writes:
>> Please consider using SVN instead. A lot more users will be able to
>> check out from there.
> 
> We don't grant non-committers access to the Subversion repo.

But nothing stops Andrey from creating his own svn/hg/git/etc repo _just_ for
his sade bits.  It's easy.  This is what I would do even just for my own sake.

-- 
Andriy Gapon


Re: gpart and sector size

2010-04-09 Thread Andriy Gapon
on 09/04/2010 14:00 Alexey Tarasov said the following:
> I've booted from dvd to fixit mode and got the following:
> 
> FreeBSD  8.0-STABLE-201002 FreeBSD 8.0-STABLE-201002 #0: Tue Feb 16 21:05:59 UTC 2010 r...@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64
> 
> ATA channel 0:
> Master:  ad0  ATA/ATAPI revision 0
> Slave:   no device present
> ATA channel 2:
> Master:  ad4  SATA revision 2.x
> Slave:   no device present
> ATA channel 3:
> Master:  ad6  SATA revision 2.x
> Slave:   no device present
> ATA channel 4:
> Master:  ad8  SATA revision 2.x
> Slave:   no device present
> ATA channel 5:
> Master: ad10  SATA revision 2.x
> Slave:   no device present
> 
> /dev/ad4
>512 # sectorsize
>1500301910016   # mediasize in bytes (1.4T)
>2930277168  # mediasize in sectors
>0   # stripesize
>0   # stripeoffset
>2907021 # Cylinders according to firmware.
>16  # Heads according to firmware.
>63  # Sectors according to firmware.
>WD-WMAVU1512579 # Disk ident.
> 
> Seems that mav@ commit doesn't work? o_O

Or the disk doesn't actually report 4096 anywhere anyhow...  Have you considered
that?  If yes, can you verify using any tools of any OS that the disk reports 4K
in any way?

P.S. DES's name looks strange in headers :-)

P.P.S.
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

> On 09.04.2010, at 0:44, Dimitry Andric wrote:
> 
>> That said, if the physical sector size is larger than the logical
>> sector size, the d_stripesize field is initialized with it.  So if you
>> run "diskinfo -v" on the disk, what is the output for stripesize?
> 
> --
> Alexey Tarasov
> 
> (\__/) 
> (='.'=) 
> E[: | | | | :]З 
> (")_(")
> 
> 


-- 
Andriy Gapon


Re: gpart and sector size

2010-04-09 Thread Andriy Gapon
on 09/04/2010 14:27 Alexey Tarasov said the following:
>> Or the disk doesn't actually report 4096 anywhere anyhow...  Have you
>> considered that?  If yes, can you verify using any tools of any OS that the
>> disk reports 4K in any way?
> 
> In the previous discussion we found that the disk reports 512 sector size, but
> there are additional ATA commands to determine if it has real sector size
> larger than 4k. I will try to confirm this.

Thank you.  I think that this would be an interesting detail.

-- 
Andriy Gapon


Re: gpart and sector size

2010-04-09 Thread Andriy Gapon
on 09/04/2010 14:33 Alexey Tarasov said the following:
> On 09.04.2010, at 15:32, Andriy Gapon wrote:
> 
>> on 09/04/2010 14:27 Alexey Tarasov said the following:
>>>> Or the disk doesn't actually report 4096 anywhere anyhow...  Have you
>>>> considered that?  If yes, can you verify using any tools of any OS that the
>>>> disk reports 4K in any way?
>>> In the previous discussion we found that the disk reports 512 sector size,
>>> but there are additional ATA commands to determine if it has real sector
>>> size larger than 4k. I will try to confirm this.
>> Thank you.  I think that this would be an interesting detail.
>>
> 
> Here is the reference:
> 
> http://www.wdc.com/wdproducts/library/WhitePapers/ENG/2579-771430.pdf

I saw it, but I want to see what's reported in reality.

-- 
Andriy Gapon


Re: gpart and sector size

2010-04-09 Thread Andriy Gapon
on 09/04/2010 14:31 Dag-Erling Smørgrav said the following:
> Andriy Gapon  writes:
>> P.S. DES's name looks strange in headers :-)
> 
> Get a better MUA.  MIME quoted-printable has been around for what, 15
> years?

The advice is misdirected.  Right, Dmitry? :-)

-- 
Andriy Gapon


Re: ZFS raidz and 4k sector disks

2010-04-09 Thread Andriy Gapon
on 09/04/2010 14:14 Alexey Tarasov said the following:
> Hello.
> 
> I see considerably increased performance when creating over gnop -S 4096
> virtual disk. Even when I create zpool over raw disks the performance is very
> bad and concurrent writes stall. When using gnop, zfs works VERY fast!
> 
> Btw, here is another discussion, may be there is a bug in a mav@ commit,
> because he has added support for >512 sector size:
> http://lists.freebsd.org/pipermail/freebsd-current/2010-April/016495.html


Looks like I was wrong:
/*
 * Determine the device's minimum transfer size.
 */
*ashift = highbit(MAX(pp->sectorsize, SPA_MINBLOCKSIZE)) - 1;

This is in vdev_geom_open and SPA_MINBLOCKSIZE is 512.
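
As a rough illustration of what that expression evaluates to, here is a small
Python sketch (not the kernel code).  It assumes that highbit() returns the
1-based index of the highest set bit; the helper name ashift_for is made up
for this example.

```python
# Sketch of the ashift computation quoted above; this is not the kernel code.
SPA_MINBLOCKSIZE = 512  # value stated in the message


def highbit(value):
    """1-based index of the highest set bit, 0 for zero (assumed semantics)."""
    return value.bit_length()


def ashift_for(sectorsize):
    """Hypothetical helper mirroring highbit(MAX(sectorsize, MINBLOCKSIZE)) - 1."""
    return highbit(max(sectorsize, SPA_MINBLOCKSIZE)) - 1


for size in (512, 4096):
    print(size, ashift_for(size))
```

So a provider that reports 512-byte sectors ends up with ashift 9, while one
that reports 4096 (for example via gnop -S 4096) ends up with ashift 12.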

-- 
Andriy Gapon


Re: Cross compilling kernel i386/amd64 is failed

2010-04-15 Thread Andriy Gapon
on 16/04/2010 01:27 Arseny Nasokin said the following:
> I get error in same point when I try compile amd64 current GENERIC on
> i386 machine. Svn revision is 206597
> 
> Error at src/sys/amd64/amd64/genassym.c:1: code model 'kernel' not
> supported in the 32 bit mode.
> 
> how to cross compile it?
> 
> PS: I build only kernel at this point. Should I rebuild whole world to
> build kernel?

Build the kernel-toolchain target first.
See build(7).

-- 
Andriy Gapon


Re: Cross compilling kernel i386/amd64 is failed

2010-04-15 Thread Andriy Gapon
on 16/04/2010 04:19 Arseny Nasokin said the following:
> 
> 
> On 16 Apr 2010, at 03:03, Andriy Gapon  wrote:
> 
>> on 16/04/2010 01:27 Arseny Nasokin said the following:
>>> I get error in same point when I try compile amd64 current GENERIC on
>>> i386 machine. Svn revision is 206597
>>>
>>> Error at src/sys/amd64/amd64/genassym.c:1: code model 'kernel' not
>>> supported in the 32 bit mode.
>>>
>>> how to cross compile it?
>>>
>>> PS: I build only kernel at this point. Should I rebuild whole world to
>>> build kernel?
>>
>> kernel-toolchain
>> See build(7)
> 
> Thanks, I'll create bug with patch

Please don't create any new bugs, bug reports are welcome though :-)
BTW, what do you want to report?

>>
>> -- 
>> Andriy Gapon


-- 
Andriy Gapon


Re: Cross compilling kernel i386/amd64 is failed

2010-04-16 Thread Andriy Gapon
on 16/04/2010 10:38 Arseny Nasokin said the following:
> 
> kernel-toolchain target must be called on cross-compilling, even you
> making cross-world (where toolchain is called)

I am still not sure what the problem is.
Before cross-compiling a kernel you have to cross-build either the corresponding
world or the kernel toolchain.

-- 
Andriy Gapon


Re: ISO9660 4GB directory structures boundary limit and growisofs

2010-04-17 Thread Andriy Gapon
on 17/04/2010 02:07 Paul B Mahol said the following:
> Hi,
> 
> It is apparently not possible to make use of -use-the-force-luke=4gms
> on FreeBSD when appending new session after 4GB. Mounted disk
> afterwards  show nothing.

Is it expected that everyone knows what -use-the-force-luke=4gms is?

> Should we allow it like linux does?

What exactly is disallowed on FreeBSD?

-- 
Andriy Gapon


Re: ISO9660 4GB directory structures boundary limit and growisofs

2010-04-19 Thread Andriy Gapon
on 17/04/2010 19:31 Tim Kientzle said the following:
> Paul B Mahol wrote:
>>
>> It is apparently not possible to make use of -use-the-force-luke=4gms
>> on FreeBSD when appending new session after 4GB. Mounted disk
>> afterwards  show nothing.
>>
>> Should we allow it like linux does?
> 
> Are you claiming there is a problem when FreeBSD reads such
> images or a problem with creating such images?  What
> programs are you using?
> 
> This sounds like a pretty unsurprising 32-bit truncation
> bug:  the filesystem structures in ISO9660 are all sector
> numbers so 8TB should be the natural limit (4G sectors
> times 2k bytes/sector).

I don't think that the problem here is a limit on the sector count.
I think it's a limitation on size/offset in bytes somewhere in the cd9660 fs
driver.
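
To show why a byte offset could be the weak point while a sector number is not,
here is a hypothetical Python sketch.  It is not taken from the cd9660 code; it
only assumes that some byte offset is held in a 32-bit unsigned integer, and
the function name is invented for the example.

```python
SECTOR_SIZE = 2048  # ISO9660 sector size, as in the quoted message


def byte_offset(sector, width_bits):
    """Byte offset of a sector, truncated to an unsigned integer of the given width."""
    return (sector * SECTOR_SIZE) & ((1 << width_bits) - 1)


# First sector that starts at or beyond the 4GB boundary.
first_sector_past_4g = (4 * 1024 ** 3) // SECTOR_SIZE

print(byte_offset(first_sector_past_4g, 32))  # 32-bit arithmetic wraps around
print(byte_offset(first_sector_past_4g, 64))  # 64-bit arithmetic does not
```

The sector number itself is small here, so a 32-bit sector field is not the
issue; only the multiplication by the sector size overflows.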

-- 
Andriy Gapon


Re: MCA messages in /var/log/message?

2010-04-22 Thread Andriy Gapon
on 23/04/2010 01:28 Steve Kargl said the following:
> How does one interpret the following MCA message?
> 
> MCA: Bank 4, Status 0x945a4000d6080a13
> MCA: Global Cap 0x0105, Status 0x
> MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 0
> MCA: CPU 0 COR BUSLG Responder RD Memory
> MCA: Address 0x70c42280
> MCA: Bank 4, Status 0x942140012a080813
> MCA: Global Cap 0x0105, Status 0x
> MCA: Vendor "AuthenticAMD", ID 0xf5a, APIC ID 1
> MCA: CPU 1 COR BUSLG Source RD Memory
> MCA: Address 0x1b97ca578
> 
> It appears that these messages coincide with a 15 to 30
> second period where my USB mouse inexplicably loses a
> large number of button clicks, (which is quite noticable
> with firefox3).

This very much looks like a DRAM ECC error.
You seem to have a family Fh AMD processor, so I am not entirely sure.
But for 10h processors, BKDG table 80 (NB error signatures) definitely specifies
that an extended error code of 8 (in bits 20:16) means an ECC error.
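
For reference, extracting that field from the two Status values quoted above
can be sketched in Python as follows.  The bit position (20:16) is the one
given for 10h processors; whether the same layout applies to family Fh is an
assumption, and the function name is made up for the example.

```python
def extended_error_code(status):
    """Return bits 20:16 of an MCA bank status value."""
    return (status >> 16) & 0x1F


# The two "Bank 4" status values from the quoted log.
for status in (0x945a4000d6080a13, 0x942140012a080813):
    print(hex(status), extended_error_code(status))
```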


-- 
Andriy Gapon


Re: Switchover to CAM ATA?

2010-04-22 Thread Andriy Gapon
on 23/04/2010 07:48 Szilveszter Adam said the following:
> There is one interesting tidbit though: previously it used to be
> possible to run cdda2wav also as non-root, provided the user running it
> had read access to the /dev/cd0 device. This seems to no longer work.

Probably you also need access to the corresponding passX device, which you can
find from the output of 'camcontrol devlist'.
You didn't need that with *a*cd0.

-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Switchover to CAM ATA?

2010-04-23 Thread Andriy Gapon
on 23/04/2010 12:28 Alexander Best said the following:
> has anybody thought about adding scsi support to burncd(8)? i've been using
> ATA CAM for quite a while now and really love it. however i miss burncd(8). i
> found it to be much easier to use and less buggy than cdrecord(1).

burncd for CAM (SCSI, ATAPI) will be something very close to cdrecord or
growisofs.

-- 
Andriy Gapon


Re: HEADS UP: SUJ Going in to head today

2010-04-26 Thread Andriy Gapon
on 26/04/2010 16:42 dikshie said the following:
> Hi Jeff,
> thanks for SUJ.
> btw, why there is nan% utilization? and what does it mean?

0/0 I guess. Floating point allows that :-)

> --
> ** SU+J Recovering /dev/ad0s1g
> ** Reading 33554432 byte journal from inode 4.
> ** Building recovery table.
> ** Resolving unreferenced inode list.
> ** Processing journal entries.
> ** 0 journal records in 0 bytes for nan% utilization <
> ** Freed 0 inodes (0 dirs) 0 blocks, and 0 frags.
> --




-- 
Andriy Gapon


Re: HEADS UP: SUJ Going in to head today

2010-04-27 Thread Andriy Gapon
on 27/04/2010 09:00 Jeff Roberson said the following:
> I think some people are enabling after returning to single user from a
> live system rather than booting into single user.  This is a different
> path in the filesystem as booting directly just mounts read-only while
> the other option updates a mount from read/write.  I believe this is the
> path that is broken.

Yes, this seems to be broken and perhaps by design.
g_vfs_open() calls g_access like this: g_access(cp, 1, wr, 1);
That means that 'e' count (exclusive) is always bumped, even for R/O mounts, and
that prevents opening the provider for writing.

ffs_mountfs has this special code:
/*
 * If we are a root mount, drop the E flag so fsck can do its magic.
 * We will pick it up again when we remount R/W.
 */
if (error == 0 && ronly && (mp->mnt_flag & MNT_ROOTFS))
	error = g_access(cp, 0, 0, -1);

So, basically for read-only UFS root mount we allow concurrent open, even for
writing.  This is needed primarily for fsck, but also helps tunefs.

But I believe that this code is exercised only during original mount.
Remounting to R/O at later time doesn't drop 'e' count.

I think that this is by design, to prevent foot-shooting.
We either should document this behavior or re-consider it.
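
The interaction described above can be modelled with a toy Python sketch.  It
is a simplified model and not the GEOM code: it assumes only two conflict
rules (an exclusive request fails if someone else has the provider open for
writing, and a write request fails if someone else holds exclusive access),
and the class names are invented for the example.

```python
import errno


class Provider:
    """Toy model of a GEOM provider's access counts (read, write, exclusive)."""

    def __init__(self):
        self.acr = self.acw = self.ace = 0


class Consumer:
    def __init__(self, provider):
        self.provider = provider
        self.acr = self.acw = self.ace = 0

    def access(self, dcr, dcw, dce):
        """Apply count deltas; return 0 or an errno (simplified rules)."""
        pp = self.provider
        # Counts held by the *other* consumers of the same provider.
        pw = pp.acw - self.acw
        pe = pp.ace - self.ace
        if dce > 0 and pw > 0:  # want exclusive, someone else writes
            return errno.EPERM
        if dcw > 0 and pe > 0:  # want write, someone else is exclusive
            return errno.EPERM
        self.acr += dcr
        self.acw += dcw
        self.ace += dce
        pp.acr += dcr
        pp.acw += dcw
        pp.ace += dce
        return 0


disk = Provider()
mount = Consumer(disk)  # the filesystem
fsck = Consumer(disk)   # a userland tool such as fsck or tunefs

mount.access(1, 0, 1)           # R/O mount: read plus exclusive
blocked = fsck.access(1, 1, 0)  # write open is refused
mount.access(0, 0, -1)          # root-mount special case drops 'e'
allowed = fsck.access(1, 1, 0)  # write open now succeeds
print(blocked, allowed)
```

In this model a remount from R/W to R/O that keeps the 'e' count corresponds to
skipping the third step, so the tool stays blocked.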

-- 
Andriy Gapon


Re: config(8) dumps core

2010-04-29 Thread Andriy Gapon
on 29/04/2010 18:31 Michael Moll said the following:
[snip]
> Assertion failed: (r != '\0' && ("Char present in the configuration " "string
> mustn't be equal to 0")), function kernconfdump, file
> /usr/src/usr.sbin/config/main.c, line 721.
[snip]
> Any ideas on this?

Yes, one idea: verify what the message above says.
You can use hd to see whether you indeed have a '\0' (0x00) byte somewhere
within your kernel config file.
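
If scanning hd output by eye is tedious, the same check can be sketched in
Python.  The function name and the sample data below are made up for
illustration; for a real file you would pass the bytes read from it in binary
mode.

```python
def first_nul_offset(data):
    """Return the offset of the first 0x00 byte in data, or -1 if there is none."""
    return data.find(b"\x00")


# Made-up sample contents, not a real kernel config.
sample_ok = b"ident\tGENERIC\ndevice\tdcons\n"
sample_bad = b"ident\tGENERIC\x00\n"

print(first_nul_offset(sample_ok), first_nul_offset(sample_bad))
```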

-- 
Andriy Gapon


Re: config(8) dumps core

2010-04-29 Thread Andriy Gapon
on 30/04/2010 00:12 Michael Moll said the following:
> Hi,
> 
> On Thu, Apr 29, 2010 at 11:33:30PM +0300, Andriy Gapon wrote:
>> on 29/04/2010 18:31 Michael Moll said the following:
>> You can use hd to see if you indeed have '\0' (0x00) symbol somewhere within
>> your kernel config file.
> 
> Thanks, I checked this and there are no 0x00s in the config file itself,

Then that assert message is strange.
Or there is something else to this situation.

> but a hd to /boot/kernel/kernel reveals:
> 
> 09 66 77 69 70 0a 64 65  76 69 63 65 09 64 63 6f |.fwip.device.dco|
> 6e 73 0a 64 65 76 69 63  65 09 64 63 6f 6e 73 5f |ns.device.dcons_|
> 63 72 6f 6d 0a 00 00 00  00 00 00 00 00 00 00 00 |crom............|
> 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 |................|
> 
> This also explains why a recent config-binary worked against the old
> kernel... The were some commits to /src/usr.sbin/config/* in the last
> weeks, maybe one of them broke this.

Actually I think that this doesn't mean anything.
/boot/kernel/kernel is a binary, an executable, it is expected to have a fair
amount of 0x00 in it.
That assert was specifically about kernel _config_ file.


-- 
Andriy Gapon


Re: 8.1-release + zfs v15 df(1) hangs

2010-08-18 Thread Andriy Gapon
on 18/08/2010 11:23 Marian Hettwer said the following:
>  Hi All,
> 
> i installed freebsd 8.1-release on my workstation (based on the
> 8.1-release mfsbsd isos) and I'm now experiencing some strange effects.
> 
> A df(1) doesn't return and is not killable and while taking a look
> around in my process table, I could find several find's hanging around too.
> 
> mhettwer  5976  0.0  0.0  6896  1088  13  D+5:55PM   0:00.00 df -h
> mhettwer  5351  0.0  0.0  6896  1088  19  D+1:49PM   0:00.00 df -h

Can you run procstat -k to see where exactly the processes are stuck?

-- 
Andriy Gapon


Re: STABLE kernel panic: privileged instruction fault

2010-08-18 Thread Andriy Gapon
on 13/08/2010 00:45 Alexey Tarasov said the following:
> Fatal trap 1: privileged instruction fault while in kernel mode
> cpuid = 1; apic id = 01
> instruction pointer = 0x20:0xff8040d2cc83
> stack pointer   = 0x28:0xff8040d2ca80
> frame pointer   = 0x28:0xff0060c0b740

I suspect that either the stack is corrupted or non-code is executed (or both).
The stack pointer seems to be too close to the instruction pointer and too far
from the frame pointer.

Can you try to use kgdb and disassemble code (or examine data) near instruction
pointer address and also near frame pointer address?
Also, you might want to rebuild kgdb with a recent change from head, so that it
better maps symbols and addresses in kernel modules.
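
The distances in question can be worked out with a few lines of Python.  The
register values are copied exactly as they are printed in the quoted trap
output; the variable names are only for this example.

```python
# Register values exactly as printed in the quoted trap output.
ip = 0xff8040d2cc83
sp = 0xff8040d2ca80
fp = 0xff0060c0b740

ip_sp_gap = abs(ip - sp)  # instruction pointer vs. stack pointer
sp_fp_gap = abs(sp - fp)  # stack pointer vs. frame pointer

print(hex(ip_sp_gap), hex(sp_fp_gap))
```

The first gap is only a few hundred bytes, which is what suggests that the
instruction pointer is pointing into stack memory, while the second gap is far
larger than any plausible stack frame.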

> code segment= base 0x0, limit 0xf, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags= interrupt enabled, resume, IOPL = 0
> current process = 9388 (nginx)
> trap number = 1
> panic: privileged instruction fault
> cpuid = 1
> Uptime: 17d15h48m49s
> Physical memory: 2032 MB
> Dumping 1485 MB: 1470 1454 1438 1422 1406 1390 1374 1358 1342 1326 1310 1294 
> 1278 1262 1246 1230 1214 1198 1182 1166 1150 1134 1118 1102 1086 1070 1054 
> 1038 1022 1006 990 974 958 942 926 910 894 878 862 846 830 814 798 782 766 
> 750 734 718 702 686 670 654 638 622 606 590 574 558 542 526 510 494 478 462 
> 446 430 414 398 382 366 350 334 318 302 286 270 254 238 222 206 190 174 158 
> 142 126 110 94 78 62 46 30 14
> 
> 
> (kgdb) #0  doadump () at pcpu.h:223
> #1  0x80590c59 in boot (howto=260)
> at /usr/src/sys/kern/kern_shutdown.c:416
> #2  0x8059108c in panic (fmt=0x80951fc4 "%s")
> at /usr/src/sys/kern/kern_shutdown.c:579
> #3  0x80878fd8 in trap_fatal (frame=0xff0060c0b740, eva=Variable 
> "eva" is not available.
> )
> at /usr/src/sys/amd64/amd64/trap.c:857
> #4  0x808799ea in trap (frame=0xff8040d2c9d0)
> at /usr/src/sys/amd64/amd64/trap.c:644
> #5  0x8085f983 in calltrap ()
> at /usr/src/sys/amd64/amd64/exception.S:224
> #6  0xff8040d2cc83 in ?? ()
> #7  0xff8040d2cb50 in ?? ()
> #8  0xff8040d2caf0 in ?? ()
> #9  0xff8040d2cbf0 in ?? ()
> #10 0xff0060c0b740 in ?? ()
> #11 0x80b83c60 in sysent ()
> #12 0xff8040d2cc80 in ?? ()
> #13 0xff8040d2cae0 in ?? ()
> #14 0x8059c431 in bintime (bt=0x80ad3140)
> at /usr/src/sys/kern/kern_tc.c:200
> Previous frame inner to this frame (corrupt stack?)
> (kgdb) 



-- 
Andriy Gapon


Re: 8.1-release + zfs v15 df(1) hangs

2010-08-18 Thread Andriy Gapon
on 18/08/2010 22:07 Marian Hettwer said the following:
> I'll try and reproduce that tomorrow. I would say, a hanging nfs mount
> shouldn't lead to a hanging around df(1).

See df -n.

-- 
Andriy Gapon


Re: Runaway intr, not flash related

2010-08-19 Thread Andriy Gapon
on 12/08/2010 23:57 Doug Barton said the following:
> My "runaway intr" problem with flash has been continuing all along, but
> since no one has been interested in helping with it I haven't reported
> it for a while. However, today, for the first time, it happened when I
> had not run flash at all since I booted.
> 
> My system:
> Dell D620, C2D, i386, SMP, r210908
> 
> swi4: clock is the culprit again this time, but when flash triggers this
> problem I sometimes see hdac as the culprit, FYI.
> 
> 
> last pid: 19763;  load averages:  1.05,  1.40,  1.18    up 0+01:58:20  13:41:19
> 129 processes: 3 running, 106 sleeping, 20 waiting
> CPU 0: 20.8% user,  0.0% nice,  6.9% system,  8.5% interrupt, 63.8% idle
> CPU 1: 56.9% user,  0.0% nice,  8.5% system,  1.5% interrupt, 33.1% idle
> Mem: 182M Active, 1279M Inact, 187M Wired, 18M Cache, 112M Buf, 334M Free
> Swap: 1024M Total, 1024M Free
> 
>   PID USERNAME   PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>    10 root       171 ki31     0K    16K RUN     1  87:55 63.72% {idle: cpu1}
>    10 root       171 ki31     0K    16K RUN     0  88:03 60.69% {idle: cpu0}
>  1621 dougb      102    0   162M   141M select  0  14:19 29.54% Xorg
>    11 root       -32    -     0K   160K WAIT    0   0:33  5.76% {swi4: clock}
>  1668 dougb       97    0 36808K 20864K select  0   0:38  3.61% {initial thread
>  1692 dougb        8    0 11136K  2284K nanslp  0   2:13  2.15% wmwlmon
> 19763 dougb       96    0  9912K  2076K CPU1    1   0:01  1.57% top
>    17 root        96    -     0K     8K syncer  1   0:48  1.17% syncer
>  1684 dougb       96    0 11020K  2108K select  1   1:10  1.12% wmbsdbatt
>  1762 dougb       96    0 36284K 15540K select  0   0:04  0.39% {initial thread
>    11 root       -64    -     0K   160K WAIT    0   0:03  0.15% {irq22: uhci2}
>   783 root        96    0  9684K  1232K select  0   0:21  0.10% moused
>  1663 dougb       96    0 21388K  8912K select  1   0:15  0.10% openbox
>    11 root       -32    -     0K   160K WAIT    1   0:17  0.05% {swi4: clock}
>  1817 dougb       96    0 90820K 53672K select  0   3:23  0.00% {initial thread
>     0 root       -16    0     0K    64K sched   0   0:26  0.00% {swapper}

I am sorry, but I don't see anything dramatically wrong here.
So "swi4: clock" uses 5.76% of WCPU, is that such a big deal to be called
"runaway intr"?
A lot of CPU time is idle and a lot is used by userland processes (e.g. Xorg).
Can you provide data that better illustrate your problem?

-- 
Andriy Gapon


Re: Runaway intr, not flash related

2010-08-19 Thread Andriy Gapon
on 19/08/2010 20:30 Doug Barton said the following:
> On 08/19/2010 08:24, Andriy Gapon wrote:
>> I am sorry, but I don't see anything dramatically wrong here. So
>> "swi4: clock" uses 5.76% of WCPU, is that such a big deal to be
>> called "runaway intr"?
> 
> That's the symptom.

OK, I see.

Perhaps you will find this message (and its ancestor thread) interesting:
http://lists.freebsd.org/pipermail/freebsd-hackers/2008-February/023447.html
I believe that your issue is different, but perhaps that stuff will inspire you
to use ktr(4) and schedgraph to properly debug this issue.  I strongly believe
that you have some sort of a scheduling issue and ktr seems to be the way to
investigate it.

Perhaps, you can first try the following dtrace script.
It should give a better view of what statclock sees, but I am not sure if that
information will be sufficient.
//
fbt::statclock:entry
/curthread->td_oncpu == 0/
{
    @stacks0[stack()] = count();
    counts0++;
}

fbt::statclock:entry
/curthread->td_oncpu == 1/
{
    @stacks1[stack()] = count();
    counts1++;
}

fbt::statclock:entry
{
    @stacks[pid, tid, stack()] = count();
    counts++;
}

END
{
    printf("\n");
    printf("* CPU 0:\n");
    normalize(@stacks0, counts0 / 100);
    trunc(@stacks0, 5);
    printa("%...@u\n\n", @stacks0);

    printf("\n\n");
    printf("* CPU 1:\n");
    normalize(@stacks1, counts1 / 100);
    trunc(@stacks1, 5);
    printa("%...@u\n\n", @stacks1);

    printf("\n\n");
    printf("* Top Processes:\n");
    normalize(@stacks, counts / 200);
    trunc(@stacks, 20);
    printa(@stacks);
}
//
You would run this script when the problem hits; a few seconds should be
sufficient.
You may want to play with the values in the trunc() calls, and you may also
want to filter the gathered statistics (using conditions in /.../) by pid/tid
if you spot anything interesting or unusual.

-- 
Andriy Gapon


Re: Latest intr problems

2010-08-20 Thread Andriy Gapon
on 21/08/2010 03:03 Doug Barton said the following:
> Here are the results of a vmstat -i, the old dtrace script, and Andriy's
> new one.

I think that for such an amount of data it is better to use links (perhaps a
service like pastebin) rather than inlining it.
BTW, it seems that there are no followups/comments on the results of the old
dtrace script, so I am not sure if there is any point in continuing to post it.
Personally, I find it useless.

Back to the data.
Could you please report results of
procstat -k 10
procstat -k 11
?

Thanks.

-- 
Andriy Gapon


Re: Latest intr problems

2010-08-21 Thread Andriy Gapon
on 21/08/2010 09:50 Andriy Gapon said the following:
> on 21/08/2010 03:03 Doug Barton said the following:
>> Here are the results of a vmstat -i, the old dtrace script, and Andriy's
>> new one.
> 
> I think that for such amount of data it is better to use links (perhaps a
> service like pastebin) rather than inlining it.
> BTW, it seems that there are no followups/comments on results of the old 
> dtrace
> script, so I am not sure if there is any point in continuing to post it.
> It is useless personally for me.
> 
> Back to the data.
> Could you please report results of
> procstat -k 10
> procstat -k 11
> ?

Some additional stuff.

Could you please remind me what the "old dtrace script" is? :-)

Please also provide the output of sysctl dev.cpu (both normally and when the
problem hits).

Another dtrace script:
profile:::profile-1001
{
    @stacks[curthread->td_oncpu, pid, tid, stack()] = count();
}

END
{
    trunc(@stacks, 20);
    printa(@stacks);
}

-- 
Andriy Gapon


Re: Latest intr problems

2010-08-21 Thread Andriy Gapon
on 21/08/2010 10:33 Doug Barton said the following:
> On Sat, 21 Aug 2010, Andriy Gapon wrote:
> 
>>> I think that for such amount of data it is better to use links
>>> (perhaps a
>>> service like pastebin) rather than inlining it.
> 
> No problem:
> http://people.freebsd.org/~dougb/intr-out.txt

Thanks a lot!
Can you try, for the sake of experiment, to reproduce the problem with
hw.acpi.cpu.cx_lowest=C1 ?
I feel like you might be having a problem with clocks...

BTW, if you run procstat -k 11 a few times during the condition, does TID 16
typically have or not have "lock_mtx softclock" substring in its stack?

-- 
Andriy Gapon


Re: Latest intr problems

2010-08-21 Thread Andriy Gapon
on 21/08/2010 12:35 Andriy Gapon said the following:
> I feel like you might be having a problem with clocks...

FWIW, I am reading this document http://edc.intel.com/Link.aspx?id=1484
and I see this sentence: "All of the clocks in the processor core are
stopped in the C3 state".

I see that you have C3 state enabled and it's regularly entered:
dev.cpu.0.cx_usage: 0.00% 5.51% 94.48% last 305us

Also I see that LAPIC timer is used as timer1 (hardclock) on your system:
kern.eventtimer.timer1: LAPIC

I believe that this might be the explanation of what you see, but I am not sure.
One indication that this might be the case is high degree of aliasing between
hardclock and statclock interrupts as per my interpretation of the dtrace
information.

You can test by either avoiding C3 state (via cx_lowest) or configuring some
other timer as kern.eventtimer.timer1

P.S. I think that timer selection code and/or Cx configuration code could/should
be smarter about things like that.  After all ET_FLAGS_C3STOP is set for your
LAPIC timer.
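
The kind of sanity check meant here can be sketched in Python, working on the
sysctl values quoted above.  This is only an illustration, not existing
FreeBSD code; the function names are invented, and it assumes that the third
percentage in cx_usage corresponds to C3.

```python
def parse_cx_usage(value):
    """Return the per-state residency percentages from a cx_usage string."""
    return [float(tok.rstrip("%")) for tok in value.split() if tok.endswith("%")]


def c3_may_stop_hardclock(cx_usage, timer1):
    """True if C3 is actually entered while the LAPIC timer drives hardclock."""
    usage = parse_cx_usage(cx_usage)
    c3_entered = len(usage) >= 3 and usage[2] > 0.0
    return c3_entered and timer1 == "LAPIC"


reported = "0.00% 5.51% 94.48% last 305us"
print(parse_cx_usage(reported))
print(c3_may_stop_hardclock(reported, "LAPIC"))
```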

-- 
Andriy Gapon


Re: Latest intr problems

2010-08-21 Thread Andriy Gapon
on 21/08/2010 16:04 b. f. said the following:
> Andriy Gapon wrote:
>> on 21/08/2010 12:35 Andriy Gapon said the following:
>>> I feel like you might be having a problem with clocks...
>> FWIW, I am reading this document http://edc.intel.com/Link.aspx?id=1484
>> and I see this sentence: "All of the clocks in the processor core are
>> stopped in the C3 state".
>>
>> I see that you have C3 state enabled and it's regularly entered:
>> dev.cpu.0.cx_usage: 0.00% 5.51% 94.48% last 305us
> 
> I don't think this accounts for all of his problems, unless his
> machine has an unusual configuration. 

Well, let's try to not muddy the waters prematurely.

> Alexander and I recommended
> that he try different clocks, and just recently, for example, he wrote
> that he had used:
> 
> loader.conf
> hint.apic.0.clock="0"
> hint.atrtc.0.clock="0"
> hint.attimer.0.clock="0"
> hint.hpet.0.legacy_route="1"

Well, I don't see much point in doing the above in this situation.

> machdep.disable_rtc_set="1"
> kern.eventtimer.timer2="HPET"
> kern.eventtimer.timer1="NONE" (Or, if available, HPET1, ...)

So, what was actually used here?
I don't think that NONE is a good idea.

> kern.eventtimer.singlemul="1"
> 
> sysctl.conf:
> kern.timecounter.hardware=HPET
> 
> and reported that it did not help.  The HPET doesn't usually suffer
> from the problem that you are describing, right?

Right.
Still I would prefer that Doug would do the cleaner experiment(s) that I
suggested.  And if the problem persists then elimination of LAPIC timer would
make the picture clearer (for me).

P.S.
I still think that KTR+schedgraph would be the best tool here.

-- 
Andriy Gapon


Re: amd64 panic snd_hda - hdac_get_capabilities: Invalid corb size (0)

2010-08-21 Thread Andriy Gapon
on 30/07/2010 17:36 Anton Shterenlikht said the following:
> On Fri, Jul 30, 2010 at 04:31:44PM +0300, Andriy Gapon wrote:
>> Just a one thing to try - can you please add hdac_reset(sc, 1) call in
>> hdac_attach() right before hdac_get_capabilities() call?
>> The idea is to reset the controller before trying to get its capabilities.
> 
> OSS became 1, no other change:
> 
> % dmesg | fgrep -i hda
> hdac0:  irq 16 at device 20.2 on pci0
> hdac0: HDA Driver Revision: 20100226_0142
> hdac0: Lazy allocation of 0x4000 bytes rid 0x10 type 3 at 0xb7fb
> hdac0: [MPSAFE]
> hdac0: [ITHREAD]
> hdac0: hdac_get_capabilities: Invalid corb size (0)
> hdac0: Resetting corb size to 256
> hdac0: hdac_get_capabilities: Invalid rirb size (0)
> hdac0: Resetting rirb size to 256
> hdac0: Caps: OSS 1, ISS 0, BSS 0, NSDO 1, CORB 256, RIRB 256
> hdac0: 

Just a note that I don't have any further ideas, sorry.

-- 
Andriy Gapon


Re: runaway intr problems: powerd and/or hw.acpi.cpu.cx_lowest related

2010-08-22 Thread Andriy Gapon
on 23/08/2010 06:17 Doug Barton said the following:
> I also got another interesting set of data today from a "runaway intr"
> situation that did not involve swi:4. The symptoms were the same as
> previously, but the devices involved were totally different. This may have
> to do with the fact that I switched back to ULE for the testing today,
> and/or I hadn't set cx_lowest=C3.

Yes, ULE rules.  4BSD usually only makes things better when there is some real
problem and 4BSD masks it due to its design.

> http://people.freebsd.org/~dougb/intr-out-3.txt

So, hm, npviewer.bin eats all the CPU time?
Just google that name and see that you are not alone.
Can't help with that though.

> This was with ULE + USB in the kernel, LAPIC/HPET, cx_lowest=C1, but running
> powerd with the following:
> powerd_flags="-a adaptive -b adaptive -n adaptive"

-- 
Andriy Gapon


Re: runaway intr problems: powerd and/or hw.acpi.cpu.cx_lowest related

2010-08-23 Thread Andriy Gapon
on 23/08/2010 09:48 Doug Barton said the following:
> On 08/22/2010 23:20, Andriy Gapon wrote:
>> on 23/08/2010 06:17 Doug Barton said the following:
>>
>>> http://people.freebsd.org/~dougb/intr-out-3.txt
>>
>> So, hm, npviewer.bin eats all the CPU time?
> 
> No, the odd bits of that one are the fact that the intr threads irq17,
> irq256, and irq20 are showing up at all, and/or showing up with more than a
> fraction of a percent of cpu time.

DTrace output doesn't show anything abnormal for those, but it does show that
those interrupts do happen and those drivers do work.
E.g. there is hdac (sound) activity [irq256: hdac0] and wireless activity
[irq17: wpi0]. irq20 is hpet + usb.

So did you do anything wireless?  Did you play sound?

The %WCPU may be _reported_ higher than what it actually is due to other issues
with your system (high load by npviewer.bin, HPET+USB interrupt sharing, C3 with
LAPIC timer).

> Usually they don't, and the fact that they did at that
> point in time was indicative of the fact that the "runaway intr" problem was
> happening. _Incidentally_ npviewer.bin was taking up more cpu than it usually
> does, but I think that's another symptom of the underlying problem.

In complex systems it's not always trivially obvious what's incidental and
what's not.

> Here is a typical, non-problematic top output while running a flash video:
> 
> last pid: 10841;  load averages:  0.22,  0.12,  0.19    up 0+04:15:49  23:46:11
> 171 processes: 3 running, 148 sleeping, 20 waiting
> CPU 0: 14.8% user,  0.0% nice,  3.1% system,  0.0% interrupt, 82.0% idle
> CPU 1: 18.8% user,  0.0% nice,  0.0% system,  0.0% interrupt, 81.3% idle
> Mem: 342M Active, 1397M Inact, 168M Wired, 49M Cache, 112M Buf, 45M Free
> Swap: 1024M Total, 1444K Used, 1022M Free
> 
>   PID USERNAME  PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>    10 root      171 ki31     0K    16K CPU0    0 203:29 82.86% {idle: cpu0}
>    10 root      171 ki31     0K    16K RUN     1 191:24 81.05% {idle: cpu1}
> 10813 dougb      54    0   420M 78196K select  0   0:18 17.77% npviewer.bin
> 10822 dougb      47    0   420M 78196K futex   1   0:05  6.30% npviewer.bin
> 10839 dougb      45    0   420M 78196K futex   1   0:03  3.66% npviewer.bin
> 10840 dougb      45    0   420M 78196K futex   1   0:03  3.66% npviewer.bin
> 10832 dougb      45    0   420M 78196K pcmwrv  1   0:03  2.88% npviewer.bin
>  1598 dougb      44    0   163M   142M select  1  12:06  1.56% Xorg
>    11 root      -68    -     0K   160K WAIT    1   1:10  0.49% {irq17: wpi0}
> 10770 dougb      44    0   178M   136M ucond   0   0:00  0.39% {firefox-bin}
> 10770 dougb      45    0   178M   136M select  1   0:15  0.29% {initial thread
>    11 root      -80    -     0K   160K WAIT    0   0:45  0.10% {irq256: hdac0}

Well, notice that in this case your npviewer.bin processes are not "run away"
either.  They spend most of the time waiting for something and use only a
fraction of CPU time.  In the report that I commented on they were mostly
running on CPU (and who knows what else they were doing, like driving sound card
crazy etc).

> I really wish people would stop focusing on flash here. :)  It's simply the
> easiest and most consistent way that I have triggered this problem, it's not
> the only one.

Well, I just interpreted the DTrace output you gave.  No prejudice against
flash, although all those reports/complaints by Linux folks are suspicious.

-- 
Andriy Gapon


Re: runaway intr problems: powerd and/or hw.acpi.cpu.cx_lowest related

2010-08-23 Thread Andriy Gapon
on 23/08/2010 10:55 Doug Barton said the following:
> Meanwhile, with the combination of ULE, no powerd, and cx_lowest=C1 I was able
> to watch 2 movies streaming over flash, plus do backups to various USB drives,
> read mail, etc. etc. for several hours; all without a hiccup. So clearly (in
> my mind at least) there is a problem with powerd, at least in the situation
> like mine where there is IRQ contention for HPET. I forgot to mention that in
> my previous testing today I tried running just powerd (not changing
> cx_lowest) and when intr started running away I could "solve" the problem by
> killing powerd. The intr process went back to normal, and I could go back to
> doing what I was doing without having to reboot again.


Speaking of which, you seem to have too many powerd levels.
What cpufreq drivers are in use on your system?
Maybe you'd want to stick to just one of them?
E.g.:
http://lists.freebsd.org/pipermail/freebsd-stable/2010-March/055666.html
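
For illustration only, here is a rough Python sketch (not kernel code) of why
stacking throttling drivers on top of an absolute driver such as est can
multiply the number of levels that powerd sees.  It assumes a simplified model
in which every absolute frequency is combined with every throttling step; the
eight equal steps, the function name combined_levels and the example
frequencies are assumptions of the sketch, not something taken from the
cpufreq code.

```python
def combined_levels(absolute_mhz, throttle_steps=8):
    """Combine absolute frequencies with relative throttling steps.

    Simplified model (an assumption of this sketch): every absolute
    frequency is scaled by k/throttle_steps for k = 1..throttle_steps.
    Returns the distinct effective frequencies, highest first.
    """
    levels = set()
    for freq in absolute_mhz:
        for k in range(1, throttle_steps + 1):
            levels.add(freq * k // throttle_steps)
    return sorted(levels, reverse=True)


if __name__ == "__main__":
    est = [2333, 2000, 1667, 1333, 1000]  # example absolute levels, in MHz
    print(len(combined_levels(est, throttle_steps=1)))  # absolute driver alone
    print(len(combined_levels(est)))                    # with throttling stacked
    print(min(combined_levels(est)))                    # lowest effective MHz
```

With a single driver the list stays at the handful of absolute levels; with
throttling stacked on top it grows several times over and reaches far below
the lowest absolute frequency.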

-- 
Andriy Gapon


Re: sched_pin() bug in SCHED_ULE

2010-08-26 Thread Andriy Gapon
on 27/08/2010 00:20 m...@freebsd.org said the following:
> 
> I tried making sched_pin() a real function which used
> intr_disable/intr_restore around saving off td->td_oncpu,
> td->td_lastcpu and ts->ts_cpu, and the stack at the time of call.  In
> sched_switch when I saw an unexpected migration I printed all that
> out.  I found that on my boxes, at sched_pin() time ts_cpu was already
> different from td->td_oncpu, so the specific problem I was having was
> that while another thread can change ts_cpu it has no way to force
> that thread to immediately migrate.

Like e.g. in sched_affinity where ts_cpu is first changed and then the old cpu
is ipi-ed?

> I believe the below patch fixes the issue, though I'm open to other methods:
> 
> 
> Index: kern/sched_ule.c
> ===================================================================
> --- kern/sched_ule.c  (.../head/src/sys)  (revision 158279)
> +++ kern/sched_ule.c  (.../branches/BR_BUG_67957/src/sys) (revision 158279)
> @@ -112,6 +112,7 @@
>  /* flags kept in ts_flags */
>  #define  TSF_BOUND   0x0001  /* Thread can not migrate. */
>  #define  TSF_XFERABLE        0x0002  /* Thread was added as transferable. */
> +#define  TSF_BINDING 0x0004  /* Thread is being bound. */

I don't really follow why TSF_BINDING is needed, i.e. why TSF_BOUND is not
sufficient in this case?

-- 
Andriy Gapon


Re: runaway intr problems: powerd and/or hw.acpi.cpu.cx_lowest related

2010-08-27 Thread Andriy Gapon
on 23/08/2010 11:43 Doug Barton said the following:
> Ok, so it seems that you're suggesting to disable throttling, so I added the
> following to /boot/loader.conf:
> 
> hint.p4tcc.0.disabled="1"
> hint.p4tcc.1.disabled="1"
> hint.acpi_throttle.0.disabled="1"
> hint.acpi_throttle.1.disabled="1"
> 
> Not sure the .1.'s are necessary, but I wanted to be thorough. With that I
> get:
> dev.cpu.0.freq_levels: 2333/31000 2000/26000 1667/22000 1333/17000 1000/13000
> dev.est.0.freq_settings: 2333/31000 2000/26000 1667/22000 1333/17000 1000/13000
> dev.est.1.freq_settings: 2333/31000 2000/26000 1667/22000 1333/17000 1000/13000
> 
> hopefully that's more in line with what it should be? I'd really like to be
> able to at least use powerd since it does seem to help with heat when the
> system is idle (and by extension, power consumption as well).
> 
> Unless you say differently when I get up tomorrow I'll try this configuration
> for a little while and see how it goes.

So, how did this go?
Did the change make any difference?

-- 
Andriy Gapon


Re: runaway intr problems: powerd and/or hw.acpi.cpu.cx_lowest related

2010-08-27 Thread Andriy Gapon
on 27/08/2010 19:19 Doug Barton said the following:
> Yes, it improved things greatly. I first ran with just powerd for several
> hours and that worked fine. The next day I was able to use powerd and
> cx_lowest=C2 for the better part of a day (including watching a few flash
> videos). By the end of the day intr started to run away again, so not out of
> the woods yet, but at least this shows we're going in the right direction.
> Also, while poking around in the BIOS settings I noticed in one of the
> "information only" screens that I don't usually visit one line about the
> "minimum cpu speed" is 1.00 GHz, which the sysctl output above seems to
> verify. So where the throttling code was getting all those other numbers I
> don't know.
> 
> Meanwhile I've actually not been running FreeBSD for most of this week I've
> been working on re-partitioning my new disk and running ubuntu. So 2
> interesting pieces of information there, first the "CPU Frequency Scaling
> Monitor" for the gnome that comes with ubuntu never goes below 1 GHz, so that
> bit seems extra verified. Second, I can watch all the flash videos I want
> while doing other stuff in the background (like restoring the backups of my
> data) without any problems, so add that to windows in terms of OS' that work
> on this same hardware. Now that I have finally figured out how to boot
> windows, linux, and 2 FreeBSDs on the same disk I'll be able to set up
> 7-stable i386 and 9-current amd64 to see how they compare to the 9-current
> i386 I was using previously; so I should have more information in a few days.

Cool!
Meanwhile, can you double-check which timers Linux uses there?
(No idea how to do that, especially if it's a NO_HZ kernel.)

-- 
Andriy Gapon


Re: [CFT] Improved ZFS metaslab code (faster write speed)

2010-08-28 Thread Andriy Gapon
on 28/08/2010 04:24 jhell said the following:
>   I must have missed the uma defrag patches but according to the code
> those patches should not have any effect on your implementation of ZFS
> on your system because vfs.zfs.zio.use_uma defaults to off unless you
> have manually turned this on or the patch reverts that facility back to
> its original form.

ZFS uses UMA even for other things besides ARC and those are not controlled by
vfs.zfs.zio.use_uma.  Those zones also happen to be the most fragmented ones for
me.  To name a few: dnode_t, dmu_buf_impl_t, arc_buf_hdr_t.
So the patch should help there.

-- 
Andriy Gapon


Re: [CFT] Improved ZFS metaslab code (faster write speed)

2010-08-28 Thread Andriy Gapon
on 28/08/2010 04:24 jhell said the following:
> The modified patch from avg@ (portion patch) is:
> 
> #ifdef _KERNEL
> if (arc_reclaim_needed()) {
> needfree = 0;
> wakeup(&needfree);
> }
> #endif
> 
>   I still moved that down to below _KERNEL for the obvious reasons.  But
> when I was using the original patch with if (needfree) I noticed a
> performance degradation after ~12 hours of use with and without UMA
> turned on. So far with ~48 hours of testing with the top half of that
> being with the above change, I have not seen more degradation of

This is quite unexpected.
needfree should be checked as the very first thing in arc_reclaim_needed()
[unless you have patched it locally].  So if needfree is 1 then
arc_reclaim_needed() should also return 1.  But the converse is not true,
arc_reclaim_needed() may return 1 even if needfree is zero.

So if your testing results are conclusive then it must mean that some extra
wakeups on needfree are needed.  I.e. needfree is zero, so there shouldn't be
anything waiting on it (see arc_lowmem) and no notification should be needed,
but issuing one somehow does make a difference.
Hmm...
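
To make the relationship between the two conditions concrete, here is a small
Python model (a sketch, not the ARC code).  It assumes only what is stated
above: needfree is checked first, and other conditions can also make
arc_reclaim_needed() true.  The parameter other_pressure is a hypothetical
stand-in for all of those other conditions.

```python
def arc_reclaim_needed(needfree, other_pressure):
    """Model of the check: needfree comes first, other conditions follow."""
    if needfree:
        return True
    return other_pressure


def wakes_with_needfree_check(needfree, other_pressure):
    # Original variant of the patch: wake up only when needfree is set.
    return bool(needfree)


def wakes_with_reclaim_check(needfree, other_pressure):
    # Modified variant: wake up whenever arc_reclaim_needed() is true.
    return arc_reclaim_needed(needfree, other_pressure)


def differing_cases():
    """List the (needfree, other_pressure) inputs where the variants differ."""
    cases = []
    for needfree in (0, 1):
        for other_pressure in (False, True):
            a = wakes_with_needfree_check(needfree, other_pressure)
            b = wakes_with_reclaim_check(needfree, other_pressure)
            if a != b:
                cases.append((needfree, other_pressure))
    return cases


if __name__ == "__main__":
    print(differing_cases())
```

In this model the two variants differ in exactly one case: needfree is zero
while some other condition asks for reclaim.  That is the "extra wakeup" case.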

-- 
Andriy Gapon


Re: Difficulty playing DVDs under AHCI/CAM?

2010-08-28 Thread Andriy Gapon
on 28/08/2010 20:30 Garrett Wollman said the following:
> After a recent upgrade, I switched to AHCI/CAM for my SATA devices,
> including a new DVD drive.  Now I find that nothing can play DVDs any
> more.  For example, here's what mplayer does:

[snip]

>   8469 initial thread CALL  close(0x4)
>   8469 initial thread RET   close 0
>   8469 initial thread CALL  read(0x3,0x7fffb4d0,0x800)
>   8469 initial thread RET   read -1 errno 6 Device not configured
> 
> ...say what?  Why is the cd driver suddenly returning ENXIO?

Strange indeed.
Can you dtrace this read?  You can use a combination of the syscall and fbt
providers with an execname predicate.

-- 
Andriy Gapon


patch for topology detection of Intel CPUs

2010-08-29 Thread Andriy Gapon

[Reposted from stable@; edited]

The patch below is against the sources in the FreeBSD tree; it should be applied
to either sys/amd64/amd64/mp_machdep.c or sys/i386/i386/mp_machdep.c, depending
on the desired architecture:
http://people.freebsd.org/~avg/intel-cpu-topo.diff

The patch is substantially based on the Junk-uk's patch, but with some changes
and additions:
- topo_probe_0x4() is rewritten so that it does APIC ID matching against masks
as described in the Intel article.  The code still heavily depends on the
assumption of a uniform topology: it discovers the number of cores in the BSP
package and the number of threads in the BSP core and extrapolates that to the
global topology.  The difference from the current code and Junk-uk's patch is
that actual APIC ID matching is done, as opposed to deriving counts purely from
max. values.

- topo_probe_0x4() is invoked for 1 <= cpu_high < 4 case as well as for 4 <=
cpu_high < 11 case as done in the current code, but unlike Junk-uk's patch.  The
code should be able to properly handle that class of CPUs and either detect
hyperthreading topology or fall back to one processor per package topology.

- added a few comments that describe the uniformity assumption, plus a couple of
other useful things.

- changed "final fallback" code, so that each logical CPU is considered to be in
its own physical package as opposed to current code placing all logical CPUs as
cores of a single package.

The rest is Junk-uk's work.
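
As a rough illustration of the APIC ID matching idea in the first item above,
here is a Python sketch.  It is not the code from the patch: the names
mask_width, decompose and probe_bsp_topology are hypothetical, the mask-width
rule is a simplified reading of the Intel approach, and the "max logical CPUs
per package" and "max cores per package" inputs are example values rather than
anything read from CPUID.

```python
def mask_width(count):
    """Number of bits needed to number `count` distinct items (0 for 1)."""
    return max(count - 1, 0).bit_length()


def decompose(apic_id, smt_bits, core_bits):
    """Split an APIC ID into (package, core, thread) fields."""
    smt_id = apic_id & ((1 << smt_bits) - 1)
    core_id = (apic_id >> smt_bits) & ((1 << core_bits) - 1)
    pkg_id = apic_id >> (smt_bits + core_bits)
    return pkg_id, core_id, smt_id


def probe_bsp_topology(apic_ids, bsp_apic_id,
                       max_logical_per_pkg, max_cores_per_pkg):
    """Count cores in the BSP's package and threads in the BSP's core.

    The counts come from matching actual APIC IDs against the masks,
    not from the maximum values themselves.
    """
    core_bits = mask_width(max_cores_per_pkg)
    smt_bits = mask_width(max_logical_per_pkg // max_cores_per_pkg)
    bsp_pkg, bsp_core, _ = decompose(bsp_apic_id, smt_bits, core_bits)

    cores = set()
    threads = 0
    for apic_id in apic_ids:
        pkg, core, _ = decompose(apic_id, smt_bits, core_bits)
        if pkg != bsp_pkg:
            continue            # a different package, ignore it
        cores.add(core)
        if core == bsp_core:
            threads += 1        # a sibling thread of the BSP's core
    return len(cores), threads


if __name__ == "__main__":
    # Example: three CPUs with APIC IDs 0, 2 and 4; the maximum values
    # (16 logical CPUs and 8 cores per package) are assumed for the example.
    print(probe_bsp_topology([0, 2, 4], 0, 16, 8))
```

Under the uniformity assumption, the pair returned for the BSP is then taken
to describe every package in the system.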

Concerns:
- about my code: ilog2_round_pow2 name is ugly; looking for suggestions on a
better name or re-arranging/writing that code, so that the function is not
needed.
- about current code: logical_cpus variable (don't confuse with cpu_logical)
doesn't seem to be consistently used; e.g. it is not set at all by
topo_probe_0xb(); also, the method of using it for setting logical_cpus_mask
doesn't seem to be reliable - BSP may be missed.

Reviews, comments and test reports are very welcome!
Please test the patch if you have any problems with how CPU topology is reported
by the current code.  Please test even if everything is OK, to avoid
regressions.

Thanks!
-- 
Andriy Gapon



Re: patch for topology detection of Intel CPUs

2010-08-30 Thread Andriy Gapon
on 30/08/2010 12:15 pluknet said the following:
> On 29 August 2010 13:25, Andriy Gapon  wrote:
>>
>> [Reposted from stable@; edited]
>>
>> The below patch is against sources in FreeBSD tree, it should be applied
>> either to sys/amd64/amd64/mp_machdep.c or sys/i386/i386/mp_machdep.c
>> depending on the desired architecture:
>> http://people.freebsd.org/~avg/intel-cpu-topo.diff
>>
> 
> Hi, Andriy.
> 
> I tried your patch and see no regression on Xeon 50xx, 55xx, 54xx.
> It also improved CPU detection on Xeon 54xx (as well as original
> Junk-uk's patch).
> 
> It also improved CPU detection on Xen HVM @ Xeon 55xx @ 3 cores:
> 
> FreeBSD/SMP: Multiprocessor System Detected: 3 CPUs
> FreeBSD/SMP: 1 package(s) x 3 core(s)
>  cpu0 (BSP): APIC ID:  0
>  cpu1 (AP): APIC ID:  2
>  cpu2 (AP): APIC ID:  4
> 

Thanks a lot for testing!

-- 
Andriy Gapon


Re: patch for topology detection of Intel CPUs

2010-08-30 Thread Andriy Gapon
on 29/08/2010 12:25 Andriy Gapon said the following:
> The patch is substantially based on the Junk-uk's patch, but with some changes

I several times mistyped Jung-uk's name, my sincere apologies.
Probably should have used jkim instead :)
Thanks to rdivacky for pointing this out to me.

-- 
Andriy Gapon


Re: symbol versioning on libgcc?

2010-08-30 Thread Andriy Gapon
on 30/08/2010 20:32 Steve Kargl said the following:
> I know that several libraries in FreeBSD
> uses symbol versioning.  In looking through
> src/ I was unable to determine whether 
> symbol versioning is used on libgcc.  Any
> guidance would be appreciated.

Check out output of e.g. objdump -T /usr/lib/libgcc_s.so

-- 
Andriy Gapon


Re: svn commit: r211908 - head/sys/dev/ichwd

2010-08-30 Thread Andriy Gapon
on 30/08/2010 20:18 Mike Tancsa said the following:
> At 12:51 PM 8/30/2010, Olivier Smedts wrote:
> 
>> By any chance, is it disabled in BIOS ?
> 
> Hi,
> There are a couple of options in the BIOS. There is a "reboot the box if the
> bios does not post within 6min" as well as "Fire the watchdog if the dog has
> not been patted after 5,10 or 15min after the BIOS post".  I tried all
> combinations without luck. If I have the "reboot after x min post post", the
> box will reboot on its own.

I'd guess that this kind of option would enable OS use of the watchdog.
Perhaps you can contact Intel about this issue, either via their official
support service or via jfv (who is CCed as I see) or both.

-- 
Andriy Gapon


stable/8 build broken on head without WITH_CTF

2010-09-01 Thread Andriy Gapon

stable/8 build seems to be broken for me on head without WITH_CTF:
...
cc -c -x assembler-with-cpp -DLOCORE -O2 -fno-strict-aliasing -pipe -march=k8
-std=c99 -g -O -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes
-Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef
-Wno-pointer-sign -fformat-extensions -nostdinc  -I.
-I/usr/devel/svn/base/stable/8/sys
-I/usr/devel/svn/base/stable/8/sys/contrib/altq -D_KERNEL
-DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common
-finline-limit=8000 --param inline-unit-growth=100 --param
large-function-growth=1000  -fno-omit-frame-pointer -mcmodel=kernel
-mno-red-zone  -mfpmath=387 -mno-sse -mno-sse2 -mno-sse3 -mno-mmx -mno-3dnow
-msoft-float -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector
-Werror /usr/devel/svn/base/stable/8/sys/amd64/amd64/locore.S
: No such file or directory
*** Error code 1

The reason is that kernel Makefile (generated by config I assume) has these in
it (just two examples):
...
cam.o: $S/cam/cam.c
	${NORMAL_C}
	@${NORMAL_CTFCONVERT}
...
locore.o: $S/amd64/amd64/locore.S
	${NORMAL_S}
	@${NORMAL_CTFCONVERT}
...

The issue is that NORMAL_CTFCONVERT has an empty value unless WITH_CTF is used,
see sys/conf/kern.pre.mk of stable/8.
I guess this happens because config is from head, but the mk files are from
stable/8.

If we don't try to support such build configurations, then sorry for the noise.
But if this can be made to work without much hassle, then I'd appreciate it.
Thanks!

-- 
Andriy Gapon


Re: stable/8 kernel build broken on head without WITH_CTF

2010-09-01 Thread Andriy Gapon

The subject should have been what it is now, sorry.

-- 
Andriy Gapon


Re: Difficulty playing DVDs under AHCI/CAM?

2010-09-01 Thread Andriy Gapon

Just want to draw the attention of those who use ahci, have hald running and
burn optical media to a couple of known issues:

1.
http://thread.gmane.org/gmane.os.freebsd.devel.gnome/29636/focus=29652
2.
http://thread.gmane.org/gmane.os.freebsd.devel.scsi/5128
3. k3b (and k3b-kde4) has a bug in its internal code which results in incorrect
SCSI command(s) that may confuse some drive models at the device or media
probing stage.

-- 
Andriy Gapon


Re: Difficulty playing DVDs under AHCI/CAM?

2010-09-02 Thread Andriy Gapon
on 01/09/2010 21:21 Garrett Wollman said the following:
> Andriy Gapon said:
> 
>> Just want to draw attention of those who use ahci, have hald running and burn
>> optical media to couple of known issues:
> 
> What about those of us who use ahci, don't have hald running, and just
> want to read their DVDs?

I am not aware of any known but unresolved issues in this context.
But I think that I gave you good advice.

> I never heard any response from you when I
> asked for a more specific debugging procedure.

Sorry about that, forgot to tell you about google.
Now, apologies about the joke, no offense meant :-)
These links should give a good overview for the start:
http://wiki.freebsd.org/DTrace
http://wiki.freebsd.org/DTrace/Examples
http://www.freebsd.org/doc/handbook/dtrace.html
http://wikis.sun.com/display/DTrace/Documentation

And, oh, here is a script that I used to investigate a somewhat similar problem
with a failing ioctl.  I think you should be able to easily adapt it to your
needs.

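/* Start tracing when k3b issues the ioctl of interest (arg1 is the request). */
syscall::ioctl:entry
/execname == "k3b" && arg1 == 3299349762/
{
    self->trace = 1;
}

/* Log every kernel function entered while the flag is set. */
fbt:::entry
/self->trace/
{
}

/* Log every kernel function return together with its return value. */
fbt:::return
/self->trace/
{
    printf("%d", arg1);
}

/* Stop tracing once the ioctl returns. */
syscall::ioctl:return
{
    self->trace = 0;
}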

> My next step was going
> to be enabling CAMDEBUG and trying to find out which specific
> operation was failing, but I'm not a SCSI expert by any means

Not sure if debugging with CAMDEBUG would be easier or not.
There could be lots of output.

-- 
Andriy Gapon


panic in get_next_dirent

2010-09-02 Thread Andriy Gapon

Brian,

after I upgraded from a beginning-of-June kernel to an end-of-August one
(r211758) I get a panic in get_next_dirent which happens during parallel access
to FS like during buildworld with -jN.
I am upgrading the kernel to the latest revision as of today.

Could this be something that you accidentally broke and then fixed while
pursuing your NFS issue?

-- 
Andriy Gapon


panic in get_next_dirent

2010-09-02 Thread Andriy Gapon

Brian,

after I upgraded my kernel from beginning of July version to end of August
version I started to get panics in get_next_dirent under parallel FS load, like
e.g. during buildworld with -jN.

Is this something that might have been broken by accident and then fixed later?
I've seen that you were making some changes in the related code while working on
your NFS problem.

I am upgrading kernel to the latest version now to see if that helps.

Here is panic information:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0xff80151b8abb
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x803f6f54
stack pointer   = 0x28:0xff8124353580
frame pointer   = 0x28:0xff8124353650
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 12295 (sh)
trap number = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at 0x801b84ba = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0x803a2c62 = kdb_backtrace+0x32
panic() at 0x8036cb54 = panic+0x1b4
trap_fatal() at 0x805471ad = trap_fatal+0x39d
trap_pfault() at 0x805473bd = trap_pfault+0x1ed
trap() at 0x805479a4 = trap+0x484
calltrap() at 0x80531428 = calltrap+0x8
--- trap 0xc, rip = 0x803f6f54, rsp = 0xff8124353580, rbp =
0xff8124353650 ---
get_next_dirent() at 0x803f6f54 = get_next_dirent+0x164
vop_stdvptocnp() at 0x803f749a = vop_stdvptocnp+0x31a
VOP_VPTOCNP_APV() at 0x805a3af8 = VOP_VPTOCNP_APV+0xe8
vn_vptocnp_locked() at 0x803f339c = vn_vptocnp_locked+0x1fc
vn_fullpath1() at 0x803f36b8 = vn_fullpath1+0x1e8
kern___getcwd() at 0x803f3b4a = kern___getcwd+0xda
__getcwd() at 0x803f3cd4 = __getcwd+0x14
syscallenter() at 0x803b088e = syscallenter+0x26e
syscall() at 0x80547432 = syscall+0x42
Xfast_syscall() at 0x80531702 = Xfast_syscall+0xe2
--- syscall (326, FreeBSD ELF64, __getcwd), rip = 0x800939cfc, rsp =
0x7fffe0b8, rbp = 0x800c2a208 ---

-- 
Andriy Gapon


problem with amd64 minidump

2010-09-02 Thread Andriy Gapon

Not sure if this is some local issue or a problem in FreeBSD code.
I remember minidumps working perfectly well for me, but now I can not get data
from them.
Example:
dmesg -M /var/crash/vmcore.4
dmesg: _kvm_vatop: direct map address 0xff012fe0 not in minidump
dmesg: kvm_read: invalid address (0xff012fe0)

Needless to say kgdb refuses to work with that core too.

With kgdb on a live system I can access that address:
(gdb) x/a 0xff012fe0
0xff012fe0: 0xff012ffe

Looks like perhaps we do not include something that we should in the dump?
Thanks!
-- 
Andriy Gapon


Re: panic in get_next_dirent

2010-09-02 Thread Andriy Gapon
on 02/09/2010 13:01 Andriy Gapon said the following:
> 
> Brian,
> 
> after I upgraded my kernel from beginning of July version to end of August
> version I started to get panics in get_next_dirent under parallel FS load,
> like e.g. during buildworld with -jN.
> 
> Is this something that might have been broken by accident and then fixed
> later?  I've seen that you were making some changes in the related code while
> working on your NFS problem.
> 
> I am upgrading kernel to the latest version now to see if that helps.
>

Update to r212138 seems to have helped.
Sorry if my report is useless.

> Here is panic information:
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> fault virtual address   = 0xff80151b8abb
> fault code  = supervisor read data, page not present
> instruction pointer = 0x20:0x803f6f54
> stack pointer   = 0x28:0xff8124353580
> frame pointer   = 0x28:0xff8124353650
> code segment= base 0x0, limit 0xf, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags= interrupt enabled, resume, IOPL = 0
> current process = 12295 (sh)
> trap number = 12
> panic: page fault
> cpuid = 1
> KDB: stack backtrace:
> db_trace_self_wrapper() at 0x801b84ba = db_trace_self_wrapper+0x2a
> kdb_backtrace() at 0x803a2c62 = kdb_backtrace+0x32
> panic() at 0x8036cb54 = panic+0x1b4
> trap_fatal() at 0x805471ad = trap_fatal+0x39d
> trap_pfault() at 0x805473bd = trap_pfault+0x1ed
> trap() at 0x805479a4 = trap+0x484
> calltrap() at 0x80531428 = calltrap+0x8
> --- trap 0xc, rip = 0x803f6f54, rsp = 0xff8124353580, rbp =
> 0xff8124353650 ---
> get_next_dirent() at 0x803f6f54 = get_next_dirent+0x164
> vop_stdvptocnp() at 0x803f749a = vop_stdvptocnp+0x31a
> VOP_VPTOCNP_APV() at 0x805a3af8 = VOP_VPTOCNP_APV+0xe8
> vn_vptocnp_locked() at 0x803f339c = vn_vptocnp_locked+0x1fc
> vn_fullpath1() at 0x803f36b8 = vn_fullpath1+0x1e8
> kern___getcwd() at 0x803f3b4a = kern___getcwd+0xda
> __getcwd() at 0x803f3cd4 = __getcwd+0x14
> syscallenter() at 0x803b088e = syscallenter+0x26e
> syscall() at 0x80547432 = syscall+0x42
> Xfast_syscall() at 0xffff80531702 = Xfast_syscall+0xe2
> --- syscall (326, FreeBSD ELF64, __getcwd), rip = 0x800939cfc, rsp =
> 0x7fffe0b8, rbp = 0x800c2a208 ---
> 


-- 
Andriy Gapon


Re: problem with amd64 minidump

2010-09-02 Thread Andriy Gapon
on 02/09/2010 13:10 Andriy Gapon said the following:
> 
> Not sure if this is some local issue or a problem in FreeBSD code.
> I remember minidumps working perfectly well for me, but now I can not get data
> from them.
> Example:
> dmesg -M /var/crash/vmcore.4
> dmesg: _kvm_vatop: direct map address 0xff012fe0 not in minidump
> dmesg: kvm_read: invalid address (0xff012fe0)

Not sure if it can help, but it seems that this virtual address in DMAP
corresponds to a physical address in the last page of RAM.
Do we use that for anything special?  Message buffer?
I had a quick look at getmemsize() function in sys/amd64/amd64/machdep.c and it
looks like the following code in the function could be doing just that:

Maxmem = atop(phys_avail[pa_indx]);

/* Trim off space for the message buffer. */
phys_avail[pa_indx] -= round_page(MSGBUF_SIZE);

/* Map the message buffer. */
msgbufp = (struct msgbuf *)PHYS_TO_DMAP(phys_avail[pa_indx]);

Oh, and yeah:
(gdb) p msgbufp
$4 = (struct msgbuf *) 0xff012fe0

But we do dump the message buffer.
But somehow its dmap address is not resolved correctly.

This should ring a bell for someone knowledgeable of minidump and libkvm code, I
believe.
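
For readers following along, the trimming quoted above can be sketched in
standalone C.  This is only an illustration under assumptions: the 4 KB page
size, the direct-map base constant and the sketch_* names are made up for the
example and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_DMAP_BASE 0xffffff0000000000ULL	/* hypothetical direct-map base */

/* Round a byte count up to a whole number of pages. */
uint64_t
sketch_round_page(uint64_t x)
{
	return ((x + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1));
}

/*
 * Mimic "phys_avail[pa_indx] -= round_page(MSGBUF_SIZE)": the message
 * buffer takes the top pages of the last chunk of physical memory.
 */
uint64_t
sketch_msgbuf_phys(uint64_t chunk_end, uint64_t msgbuf_size)
{
	assert(chunk_end % SKETCH_PAGE_SIZE == 0);
	assert(sketch_round_page(msgbuf_size) <= chunk_end);
	return (chunk_end - sketch_round_page(msgbuf_size));
}

/* Mimic PHYS_TO_DMAP(): a fixed offset into the direct map. */
uint64_t
sketch_phys_to_dmap(uint64_t pa)
{
	return (SKETCH_DMAP_BASE + pa);
}
```

With a hypothetical chunk ending at 0x130000000, a 32768-byte buffer starts at
physical 0x12fff8000, while a 100000-byte one is rounded up to 25 pages and
starts at 0x12ffe7000.  So the buffer size decides both where the buffer starts
and how many pages anything that saves it has to cover.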

> Needless to say kgdb refuses to work with that core too.
> 
> With kgdb on live system I can access that address:
> (gdb) x/a 0xff012fe0
> 0xff012fe0: 0xff012ffe



-- 
Andriy Gapon


Re: Trouble with a atapi-cam backup..

2010-09-02 Thread Andriy Gapon
on 02/09/2010 23:23 Randy Stewart said the following:
> Hi all:
> 
> So I finally upgraded my 7.3stable main server to 8.1stable...
> 
> And now my backup to atapi-cam is failing.. I get:
> 
> r...@lakerest /usr/tmp]# /usr/local/bin/growisofs -Z /dev/cd0 -R -J
> backup_init.08-31-2010.gz
> :-( unable to CAMGETPASSTHRU for /dev/cd0: Inappropriate ioctl for device

You can try to use DTrace to see where exactly in the kernel the ioctl request
fails.


-- 
Andriy Gapon


Re: problem with amd64 minidump

2010-09-04 Thread Andriy Gapon
on 02/09/2010 16:08 Andriy Gapon said the following:
> on 02/09/2010 13:10 Andriy Gapon said the following:
>>
>> Not sure if this is some local issue or a problem in FreeBSD code.
>> I remember minidumps working perfectly well for me, but now I can not get data
>> from them.
>> Example:
>> dmesg -M /var/crash/vmcore.4
>> dmesg: _kvm_vatop: direct map address 0xff012fe0 not in minidump
>> dmesg: kvm_read: invalid address (0xff012fe0)
> 
> Not sure if it can help, but it seems that this virtual address in DMAP
> corresponds to a physical address in the last page of RAM.
> Do we use that for anything special?  Message buffer?
> I had a quick look at getmemsize() function in sys/amd64/amd64/machdep.c and it
> looks like the following code in the function could be doing just that:
> 
> Maxmem = atop(phys_avail[pa_indx]);
> 
> /* Trim off space for the message buffer. */
> phys_avail[pa_indx] -= round_page(MSGBUF_SIZE);
> 
> /* Map the message buffer. */
> msgbufp = (struct msgbuf *)PHYS_TO_DMAP(phys_avail[pa_indx]);
> 
> Oh, and yeah:
> (gdb) p msgbufp
> $4 = (struct msgbuf *) 0xff012fe0
> 
> But we do dump the message buffer.
> But somehow its dmap address is not resolved correctly.
> 
> This should ring a bell for someone knowledgeable of minidump and libkvm code, I
> believe.

Just for the record: this was triggered by having non-default MSGBUF_SIZE, see
r212174 for the fix.

-- 
Andriy Gapon


Re: patch for topology detection of Intel CPUs

2010-09-06 Thread Andriy Gapon
on 29/08/2010 12:25 Andriy Gapon said the following:
> The below patch is against sources in FreeBSD tree, it should be applied
> either to sys/amd64/amd64/mp_machdep.c or sys/i386/i386/mp_machdep.c depending
> on the desired architecture:
> http://people.freebsd.org/~avg/intel-cpu-topo.diff

I see that I am not getting as many testers as I expected, so I am going to commit
the patch.

You still have a short while to either objectively object to the patch or to
voluntarily test it :-)
-- 
Andriy Gapon


Re: patch for topology detection of Intel CPUs

2010-09-06 Thread Andriy Gapon
on 06/09/2010 15:23 Jeremy Chadwick said the following:
> On Mon, Sep 06, 2010 at 03:17:42PM +0300, Andriy Gapon wrote:
>> on 29/08/2010 12:25 Andriy Gapon said the following:
>>> The below patch is against sources in FreeBSD tree, it should be applied
>>> either to sys/amd64/amd64/mp_machdep.c or sys/i386/i386/mp_machdep.c depending
>>> on the desired architecture:
>>> http://people.freebsd.org/~avg/intel-cpu-topo.diff
>>
>> I see that I am not getting as many testers as I expected, so I am going to commit
>> the patch.
>>
>> You still have a short while to either objectively object to the patch or to
>> voluntary test it :-)
> 
> I would gladly assist in testing this, except there doesn't appear to be
> an authoritative statement that it will apply to RELENG_8; when I see
> WIP, I assume -CURRENT/HEAD only.

patch -C is much better than any statement :)

> Let me know, since all the systems I have are Intel multi-core.

Yes, the patch should be applicable to stable/8 without any issues.

-- 
Andriy Gapon


Re: patch for topology detection of Intel CPUs

2010-09-06 Thread Andriy Gapon
on 06/09/2010 16:12 Jeremy Chadwick said the following:
> Great, thanks!  I'll be testing this out on two separate systems, both
> RELENG_8:
> 
> - Supermicro X7SBA + Intel C2D E8400 (stepping 10)
> - Supermicro X7SBL-LN2 + Intel C2D E6600 (stepping 6)
> 
> I'll make sure to provide what the topology looks like before and after.
> Is CPU-relevant dmesg output sufficient?

If you mean something like the below, then yes.  Thanks!

CPU: Intel(R) Core(TM)2 Duo CPU E7300  @ 2.66GHz (2653.35-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x10676  Family = 6  Model = 17  Stepping = 6
  Features=0xbfebfbff
  Features2=0x8e39d
  AMD Features=0x20100800
  AMD Features2=0x1
  TSC: P-state invariant
[snip]
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1


-- 
Andriy Gapon


Re: patch for topology detection of Intel CPUs

2010-09-06 Thread Andriy Gapon
on 06/09/2010 19:22 Jeremy Chadwick said the following:
> On Mon, Sep 06, 2010 at 04:28:02PM +0300, Andriy Gapon wrote:
>> on 06/09/2010 16:12 Jeremy Chadwick said the following:
>>> Great, thanks!  I'll be testing this out on two separate systems, both
>>> RELENG_8:
>>>
>>> - Supermicro X7SBA + Intel C2D E8400 (stepping 10)
>>> - Supermicro X7SBL-LN2 + Intel C2D E6600 (stepping 6)
>>>
>>> I'll make sure to provide what the topology looks like before and after.
>>> Is CPU-relevant dmesg output sufficient?
>>
>> If you mean something like the below, then yes.  Thanks!
>> [...]
> 
> All done.  Good news (I think): there's no difference in the CPU-related
> topology on either system with your patch, aside from kernel build date.
> The topologies are still detected correctly.  In case you want them:
> 

Thanks a lot for the test!

[test results snipped]

> All other systems I have are C2D and C2Q-based, but I can't easily test
> on those given their production roles.  If there's a particular Intel
> processor family/model you're interested in, let me know and I can dig
> around to see if I have access to one.

No particular models in mind.
If you have systems with more complex topologies, like multiple physical packages
or HTT enabled, I will be interested in seeing test results for those.
Thanks again.
-- 
Andriy Gapon


Re: patch for topology detection of Intel CPUs

2010-09-06 Thread Andriy Gapon
on 06/09/2010 20:12 Olivier Smedts said the following:
> Here is mine : no difference before and after the patch :

Thanks!

[snip]

> The only thing I noticed is this, after the patch :
> ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
> ada1:  ATA-7 SATA 2.x device
> ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> ada1: Command Queueing enabled
> ada1: 152627MB (312581808 512 byte sectors: 16H 63S/T 16383C)
> SMP: AP CPU #1 Launched!cd0 at ahcich2 bus 0 scbus2 target 0 lun 0
> 
> cd0:  Removable CD-ROM SCSI-0 device
> cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
> cd0: Attempt to query device size failed: NOT READY, Medium not present
> SMP: AP CPU #3 Launched!
> SMP: AP CPU #2 Launched!
> Trying to mount root from zfs:tank/freebsd
> 
> 
> Before the patch, all the "SMP: AP CPU #X Launched!" were correctly
> displayed, with carriage returns. Yes, I use "options
> PRINTF_BUFR_SIZE=128". And I don't know if that's related to the
> patch.

No, it's not related, it's a probabilistic thing.
Those "Launched!" messages are printed from threads running on freshly started
APs and thus are executed truly in parallel with the BSP.

BTW, is the above snippet from /var/log/messages or from the actual console (e.g.
ttyv0)?  If it's from messages, then could you please check how it looks on the
console?


-- 
Andriy Gapon


Re: [head tinderbox] failure on amd64/amd64

2010-09-09 Thread Andriy Gapon
on 10/09/2010 01:57 Doug Barton said the following:
> On 9/8/2010 7:39 AM, Mike Tancsa wrote:
>> Perhaps as an interim measure a local procmail rule to filter out cvsup
>> failures from going to the list ?
> 
> That's a particularly unhelpful response. Not only is it borderline rude to
> attempt to shift the responsibility for this to the users, it's a violation of
> the robustness principle.

My impression is that the suggestion was to do the filtering on the sending end,
not the recipients' end.

-- 
Andriy Gapon


Re: CPU C-state storange on Panasonic TOUGH BOOK CF-R9

2010-09-12 Thread Andriy Gapon
on 11/09/2010 21:30 Nate Lawson said the following:
>> PROCESSOR-0311 [255895] cpu_attach: acpi_cpu3: P_BLK at 0x410/6
>> PROCESSOR-0696 [257314] cpu_cx_cst: acpi_cpu3: C2[1] not available.
>> PROCESSOR-0730 [257314] cpu_cx_cst: acpi_cpu3: Got C3 - 245 latency
> 
> I think the issue is that C2 is not available for some reason and thus
> C3 can't be used either. The way to tell is to use acpidump and look for
> the CPU objects' _CST fields.
> 

The "not available" message means that transition latency is defined too high.
That is, in this case latency is greater than 100 for C2.
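
As a rough sketch of that check (the thresholds of 100 and 1000 are the ones
visible in the acpi_cpu.c diffs posted in this thread; the function and
constant names here are hypothetical and the real code does more than this):

```c
#include <assert.h>
#include <stdbool.h>

/* ACPI C-state types as plain integers, for this sketch only. */
enum { SKETCH_C1 = 1, SKETCH_C2 = 2, SKETCH_C3 = 3 };

/*
 * Decide whether a C-state reported by _CST would be accepted, judging
 * only by its transition latency and the "no C3" quirk.
 */
bool
sketch_cst_state_usable(int type, unsigned int trans_lat, bool quirk_no_c3)
{
	assert(type >= SKETCH_C1);

	switch (type) {
	case SKETCH_C1:
		return (true);
	case SKETCH_C2:
		/* A latency above 100 yields "C2[n] not available". */
		return (trans_lat <= 100);
	default:
		/* C3 and deeper: a latency above 1000 or the quirk rejects it. */
		return (trans_lat <= 1000 && !quirk_no_c3);
	}
}
```

With the latencies reported elsewhere in this thread, a C2 entry with latency
205 is rejected while a C3 entry with latency 245 is accepted.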

-- 
Andriy Gapon


Re: CPU C-state storange on Panasonic TOUGH BOOK CF-R9

2010-09-12 Thread Andriy Gapon
on 11/09/2010 21:30 Nate Lawson said the following:
> I think the issue is that C2 is not available for some reason and thus
> C3 can't be used either. The way to tell is to use acpidump and look for
> the CPU objects' _CST fields.

From reading of the code, C3 should be used in this case even if C2 is not
available.
But I think that it might get removed for a different reason: PM2_CNT_BLK length
seems to be zero.  With ACPI_DB_INFO enabled there should be "no BM control"
message in the log.

-- 
Andriy Gapon


Re: CPU C-state storange on Panasonic TOUGH BOOK CF-R9

2010-09-12 Thread Andriy Gapon
on 12/09/2010 02:14 Norikatsu Shigemura said the following:
>   According to acpidump -dt, I could find CPU0CST table, but
>   not found _CST.
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> Scope (\)
> {
> Name (SSDT, Package (0x0C)
> {
>   :
> "CPU0CST ", 
> 0xDA9AB618, 
> 0x05CD, 
>   :
> })
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> 
>   Hum... ACPI CA 20100806 has a bug?

How do you conclude that?  Does a different version work?
It seems that our acpidump doesn't dump a dynamically loaded table.
That the table was loaded we can see from these messages:

ACPI: Dynamic OEM Table Load:
ACPI: SSDT 0 005CD (v01  PmRef  Cpu0Cst 3001 INTL 20061109)

-- 
Andriy Gapon


Re: CPU C-state storange on Panasonic TOUGH BOOK CF-R9

2010-09-12 Thread Andriy Gapon
on 12/09/2010 11:12 Alexander Motin said the following:
> Just an idea. Limits of 100 and 1000 are defined for detection of
> C-states using P_LVLx_LAT registers. Because _CST explicitly specifies
> which states are available, these limitations may not apply there. I
> would try to comment these checks in acpi_cpu_cx_cst() and look what
> happen. At least I haven't found in ACPI 3.0 specification any latency
> limits applied to _CST.

Not 100% sure, but what you said does make sense.
I also couldn't find any such wording in the ACPI 4.0 spec.

-- 
Andriy Gapon


Re: CPU C-state storange on Panasonic TOUGH BOOK CF-R9

2010-09-12 Thread Andriy Gapon
on 12/09/2010 12:26 Norikatsu Shigemura said the following:
> Hi avg and mav.
> 
> On Sun, 12 Sep 2010 11:12:20 +0300
> Alexander Motin  wrote:
>>>>> PROCESSOR-0696 [257314] cpu_cx_cst: acpi_cpu3: C2[1] not available.
>>>>> PROCESSOR-0730 [257314] cpu_cx_cst: acpi_cpu3: Got C3 - 245 latency
>>>> I think the issue is that C2 is not available for some reason and thus
>>>> C3 can't be used either. The way to tell is to use acpidump and look for
>>>> the CPU objects' _CST fields.
>>> The "not available" message means that transition latency is defined too high.
>>> That is, in this case latency is greater than 100 for C2.
>> Just an idea. Limits of 100 and 1000 are defined for detection of
>> C-states using P_LVLx_LAT registers. Because _CST explicitly specifies
> 
>   Oops! I forgot. Thank you, I tried.
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> --- sys/dev/acpica/acpi_cpu.c.orig 2010-09-12 01:31:38.144243000 +0900
> +++ sys/dev/acpica/acpi_cpu.c 2010-09-12 18:06:14.651938193 +0900
> @@ -597,7 +597,7 @@
>  /* Validate and allocate resources for C2 (P_LVL2). */
>  gas.SpaceId = ACPI_ADR_SPACE_SYSTEM_IO;
>  gas.BitWidth = 8;
> -if (AcpiGbl_FADT.C2Latency <= 100) {
> +if (AcpiGbl_FADT.C2Latency <= 1000) {
>   gas.Address = sc->cpu_p_blk + 4;
>   acpi_bus_alloc_gas(sc->cpu_dev, &cx_ptr->res_type, &sc->cpu_rid,
>   &gas, &cx_ptr->p_lvlx, RF_SHAREABLE);
> @@ -613,7 +613,7 @@
>   return;
>  
>  /* Validate and allocate resources for C3 (P_LVL3). */
> -if (AcpiGbl_FADT.C3Latency <= 1000 && !(cpu_quirks & CPU_QUIRK_NO_C3)) {
> +if (AcpiGbl_FADT.C3Latency <= 1 && !(cpu_quirks & CPU_QUIRK_NO_C3)) {
>   gas.Address = sc->cpu_p_blk + 5;
>   acpi_bus_alloc_gas(sc->cpu_dev, &cx_ptr->res_type, &sc->cpu_rid, &gas,
>   &cx_ptr->p_lvlx, RF_SHAREABLE);

The above changes are incorrect.

> @@ -690,7 +690,7 @@
>   sc->cpu_cx_count++;
>   continue;
>   case ACPI_STATE_C2:
> - if (cx_ptr->trans_lat > 100) {
> + if (cx_ptr->trans_lat > 1000) {
>   ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>"acpi_cpu%d: C2[%d] not available.\n",
>device_get_unit(sc->cpu_dev), i));
> @@ -700,7 +700,7 @@
>   break;
>   case ACPI_STATE_C3:
>   default:
> - if (cx_ptr->trans_lat > 1000 ||
> + if (cx_ptr->trans_lat > 1 ||
>   (cpu_quirks & CPU_QUIRK_NO_C3) != 0) {
>  
>   ACPI_DEBUG_PRINT((ACPI_DB_INFO,

You should simply remove the check instead of bumping the threshold.

> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> 
>   But cx_lowest is not changed:

Why do you expect it to be changed?

> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> $ sysctl -a | grep cx
> hw.acpi.cpu.cx_lowest: C1
> dev.cpu.0.cx_supported: C1/3 C2/245

cx_supported has C2 now though.

> dev.cpu.0.cx_lowest: C1
> dev.cpu.0.cx_usage: 100.00% 0.00% last 3641us
> dev.cpu.1.cx_supported: C1/3 C2/245
> dev.cpu.1.cx_lowest: C1
> dev.cpu.1.cx_usage: 100.00% 0.00% last 798us
> dev.cpu.2.cx_supported: C1/3 C2/245
> dev.cpu.2.cx_lowest: C1
> dev.cpu.2.cx_usage: 100.00% 0.00% last 158us
> dev.cpu.3.cx_supported: C1/3 C2/245
> dev.cpu.3.cx_lowest: C1
> dev.cpu.3.cx_usage: 100.00% 0.00% last 227us


-- 
Andriy Gapon


Re: CPU C-state storange on Panasonic TOUGH BOOK CF-R9

2010-09-12 Thread Andriy Gapon
on 12/09/2010 12:29 Alexander Motin said the following:
> hw.acpi.cpu.cx_lowest defaults to C1. Have you tried to raise it via
> sysctl?
> 

And also check performance_cx_lowest, economy_cx_lowest in
/etc/defaults/rc.conf.  And /etc/rc.d/power_profile.

-- 
Andriy Gapon


Re: CPU C-state storange on Panasonic TOUGH BOOK CF-R9

2010-09-12 Thread Andriy Gapon
on 12/09/2010 13:25 Norikatsu Shigemura said the following:
> Hi avg.
> 
> On Sun, 12 Sep 2010 19:09:52 +0900
> Norikatsu Shigemura  wrote:
>>  Logic is mistake.  I'll re-make a patch and retry.
> 
>   I re-tried following patch:
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> --- sys/dev/acpica/acpi_cpu.c.orig 2010-09-12 01:31:38.144243000 +0900
> +++ sys/dev/acpica/acpi_cpu.c 2010-09-12 19:11:00.906223222 +0900
> @@ -690,19 +690,11 @@
>   sc->cpu_cx_count++;
>   continue;
>   case ACPI_STATE_C2:
> - if (cx_ptr->trans_lat > 100) {
> - ACPI_DEBUG_PRINT((ACPI_DB_INFO,
> -  "acpi_cpu%d: C2[%d] not available.\n",
> -  device_get_unit(sc->cpu_dev), i));
> - continue;
> - }
>   sc->cpu_non_c3 = i;
>   break;
>   case ACPI_STATE_C3:
>   default:
> - if (cx_ptr->trans_lat > 1000 ||
> - (cpu_quirks & CPU_QUIRK_NO_C3) != 0) {
> -
> + if (cpu_quirks & CPU_QUIRK_NO_C3) {
>   ACPI_DEBUG_PRINT((ACPI_DB_INFO,
>"acpi_cpu%d: C3[%d] not available.\n",
>device_get_unit(sc->cpu_dev), i));

The above looks good.

> @@ -731,6 +723,9 @@
>   cx_ptr++;
>   sc->cpu_cx_count++;
>   }
> +else {
> +device_printf(sc->cpu_dev, "DEBUG: cx_ptr->p_lvlx IS NULL.\n");
> +}
>  }
>  AcpiOsFree(buf.Pointer);

What's this?  The indentation is messed up too :-)

> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> 
>   Test is OK:
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> # sysctl hw.acpi.cpu.cx_lowest=C2 && sleep 10 && sysctl dev.cpu.0.cx_usage dev.cpu.1.cx_usage dev.cpu.2.cx_usage dev.cpu.3.cx_usage
> hw.acpi.cpu.cx_lowest: C3 -> C2
> dev.cpu.0.cx_usage: 2.37% 97.62% last 3028us
> dev.cpu.1.cx_usage: 0.87% 99.12% last 4379us
> dev.cpu.2.cx_usage: 0.54% 99.45% last 14314us
> dev.cpu.3.cx_usage: 1.36% 98.63% last 16982us
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> 
>   But I don't know how I couldn't get C3:-(.
>   Not reachable my DEBUG code.
> 
>   Thank you.

acpi_lid0: Lid closed
em0: Link is up 1000 Mbps Full Duplex
PROCESSOR-0722 [402244] cpu_cx_cst: acpi_cpu0: Got C2 - 245 latency
PROCESSOR-0722 [403097] cpu_cx_cst: acpi_cpu1: Got C2 - 245 latency
PROCESSOR-0722 [403855] cpu_cx_cst: acpi_cpu2: Got C2 - 245 latency
PROCESSOR-0722 [405022] cpu_cx_cst: acpi_cpu3: Got C2 - 245 latency

Maybe because of this?
It seems like you do something and ACPI disables C3, leaving only C2.

-- 
Andriy Gapon


Re: CPU C-state storange on Panasonic TOUGH BOOK CF-R9

2010-09-12 Thread Andriy Gapon
on 12/09/2010 13:37 Alexander Motin said the following:
> Andriy Gapon wrote:
>> acpi_lid0: Lid closed
>> em0: Link is up 1000 Mbps Full Duplex
>> PROCESSOR-0722 [402244] cpu_cx_cst: acpi_cpu0: Got C2 - 245 latency
>> PROCESSOR-0722 [403097] cpu_cx_cst: acpi_cpu1: Got C2 - 245 latency
>> PROCESSOR-0722 [403855] cpu_cx_cst: acpi_cpu2: Got C2 - 245 latency
>> PROCESSOR-0722 [405022] cpu_cx_cst: acpi_cpu3: Got C2 - 245 latency
>>
>> Maybe because of this?
>> It seems like you do something and ACPI disables C3, leaving only C2.
> 
> One strange thing. During boot it can be seen:
> acpi_cpu0: Got C2 - 205 latency
> acpi_cpu0: Got C3 - 245 latency
> , but after boot in sysctl we can see:
> dev.cpu.0.cx_supported: C1/3 C2/245
> 
> Judging by the latency, it looks like it is not C3 that got lost, but C2. AFAIR,
> sysctl numbers C-states completely abstractly, just as array indexes. So
> the thing reported as C2 could instead be C3, while C2 is absent for some
> reason.

Observations are correct, but incomplete; the conclusions are wrong.
At the end of the boot there are messages like this one:
PROCESSOR-0722 [402244] cpu_cx_cst: acpi_cpu0: Got C2 - 245 latency
This is a result of re-evaluation of _CST because of a notification from ACPI.

-- 
Andriy Gapon


Re: CPU C-state storange on Panasonic TOUGH BOOK CF-R9

2010-09-12 Thread Andriy Gapon
on 12/09/2010 18:22 Andriy Gapon said the following:
> Observations are correct, but incomplete; the conclusions are wrong.
> At the end of the boot there are messages like this one:
> PROCESSOR-0722 [402244] cpu_cx_cst: acpi_cpu0: Got C2 - 245 latency
> This is a result of re-evaluation of _CST because of a notification from ACPI.
> 

But still, as you suggest, a patch like the following should be tested and
committed:

--- a/sys/dev/acpica/acpi_cpu.c
+++ b/sys/dev/acpica/acpi_cpu.c
@@ -828,7 +828,8 @@ acpi_cpu_cx_list(struct acpi_cpu_softc *sc)
 sbuf_new(&sb, sc->cpu_cx_supported, sizeof(sc->cpu_cx_supported),
SBUF_FIXEDLEN);
 for (i = 0; i < sc->cpu_cx_count; i++) {
-   sbuf_printf(&sb, "C%d/%d ", i + 1, sc->cpu_cx_states[i].trans_lat);
+   sbuf_printf(&sb, "C%d/%d ", sc->cpu_cx_states[i].type,
+   sc->cpu_cx_states[i].trans_lat);
if (sc->cpu_cx_states[i].type < ACPI_STATE_C3)
sc->cpu_non_c3 = i;
 }

P.S. I restored acpi@ cc: which I think is quite relevant, but was somehow lost.
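
To make the effect of the labeling change concrete, here is a small standalone
sketch.  It is not the driver code: the structure, the sketch_* names and the
sample state list are hypothetical, chosen so that one C-state type is skipped.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct sketch_cx_state {
	int type;	/* ACPI C-state type: 1 = C1, 2 = C2, 3 = C3 */
	int trans_lat;	/* transition latency */
};

/* Append one "C<label>/<latency> " entry and return the new offset. */
static size_t
sketch_append(char *buf, size_t len, size_t off, int label, int lat)
{
	int w;

	if (off >= len)
		return (off);
	w = snprintf(buf + off, len - off, "C%d/%d ", label, lat);
	return (w < 0 ? off : off + (size_t)w);
}

/* Old behaviour: label each entry by its position in the array. */
void
sketch_cx_list_by_index(const struct sketch_cx_state *st, int n, char *buf,
    size_t len)
{
	size_t off = 0;
	int i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < n; i++)
		off = sketch_append(buf, len, off, i + 1, st[i].trans_lat);
}

/* Patched behaviour: label each entry by its actual C-state type. */
void
sketch_cx_list_by_type(const struct sketch_cx_state *st, int n, char *buf,
    size_t len)
{
	size_t off = 0;
	int i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < n; i++)
		off = sketch_append(buf, len, off, st[i].type, st[i].trans_lat);
}
```

For a list holding only a type 1 state with latency 3 and a type 3 state with
latency 245, the index-based variant produces "C1/3 C2/245 " while the
type-based one produces "C1/3 C3/245 ", so a reader of the sysctl no longer has
to guess which state the second entry really is.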
-- 
Andriy Gapon


Re: regarding pciids

2010-09-15 Thread Andriy Gapon
on 14/09/2010 03:59 Alexander Best said the following:
> hi there,
> 
> any thoughts on using http://pciids.sourceforge.net/ for pciids instead of the
> Hart and Boemler lists. the SF site seems to be updated more regularly and
> would get rid of the need to decide for each entry, whether to take the Hart or
> Boemler one.

+1 FWIW
Especially given that the format is what we use too (modulo subvendor, sub-etc)

-- 
Andriy Gapon


amd64: VM_KMEM_SIZE_SCALE changed to 1

2010-09-17 Thread Andriy Gapon

[re-post, my address book was polluted with cu_rrr_ent@ entry, sorry]

on 09/09/2010 11:01 Andriy Gapon said the following:
> on 26/07/2010 19:07 Andriy Gapon said the following:
>>
>> Anyone knows any reason why VM_KMEM_SIZE_SCALE on amd64 should not be set to 1?
>> I mean things potentially breaking, or some unpleasant surprise for an
>> administrator/user...
> 
> So, after having the discussion, what is our collective conclusion?
> a) Go for it!
> or
> b) Don't do it, fool!
> or
> c) Let's wait another year...

Nobody said (b), so:
http://svn.freebsd.org/viewvc/base?view=revision&revision=212784

This thread in Gmane for your convenience:
http://thread.gmane.org/gmane.os.freebsd.architechture/13419/focus=13551

-- 
Andriy Gapon


Re: Multiple hpet messages during boot

2010-09-17 Thread Andriy Gapon
on 17/09/2010 18:36 M. Warner Losh said the following:
> 
> so is there support for the following:

Aye.

> Index: subr_bus.c
> ===
> --- subr_bus.c(revision 212791)
> +++ subr_bus.c(working copy)
> @@ -3996,9 +3996,11 @@
>   arg, cookiep);
>   if (error != 0)
>   return (error);
> + if (bootverbose == 0)
> + return (0);
>   if (handler != NULL && !(flags & INTR_MPSAFE))
>   device_printf(dev, "[GIANT-LOCKED]\n");
> - if (bootverbose && (flags & INTR_MPSAFE))
> + if (flags & INTR_MPSAFE)
>   device_printf(dev, "[MPSAFE]\n");
>   if (filter != NULL) {
>   if (handler == NULL)

-- 
Andriy Gapon


if_epair as module

2010-09-18 Thread Andriy Gapon

Does anybody use if_epair compiled as a _module_ on any platform other than amd64?
If yes, could you please respond to me in private?
Big thanks in advance.

-- 
Andriy Gapon


Re: Crash during boot of current (rev 212885)

2010-09-19 Thread Andriy Gapon
on 20/09/2010 06:38 Randall Stewart said the following:
> Hi Lawrence:
> 
> I am currently doing a binary search..
> 
> I know that 212660 shows the break.
> 
> I am just about to try 212560 ;-)
> 
> If that works I will update to 212646 and see if it works.. ;-)

Randall,

please also make sure that you have sufficiently recent ld as described in
UPDATING from 20100915.
I'd be interested to see output of readelf -a -W for your kernel that crashes.


> On Sep 19, 2010, at 7:31 PM, Lawrence Stewart wrote:
> 
>> Hiya Randall!
>>
>> On 09/20/10 08:56, Randall Stewart wrote:
>>> Hey all:
>>>
>>> I am now seeing a crash when I boot my Intel (in 64-bit mode)...
>>>
>>> Its very early in the boot process.. and thus no crash dump ;-0
>>>
>>> Its in
>>>
>>> netisr_start_swi()
>>>
>>> When it initializes netisr_mtx with a mtx_init() it crashes saying
>>> that netisr_mtx is unaligned... (the address ddb shows for netisr_mtx ends
>>> with c ... so it definitely is unaligned...
>>>
>>> Looking at the netisr_workstream structure (where netisr_mtx is) it
>>> appears to be in theory aligned right (follows 2 pointers)... so
>>> did something change the DP_CPU Define stuff to cause us to get unaligned
>>> access?
>>>
>>> Just curious... If I don't hear from anyone I will start backing things
>>> out 1 rev at a time until I find what did it I guess ;-)
>>
>> My guess would be r212647. Try backing that rev out and if it fixes
>> things, hopefully Andriy will have some thoughts on how to fix the
>> problem. Apologies if my guess is a red herring.
>>
>> Cheers,
>> Lawrence
>>
> 
> --
> Randall Stewart
> 803-317-4952 (cell)
> 


-- 
Andriy Gapon


Re: Crash during boot of current (rev 212885)

2010-09-19 Thread Andriy Gapon
on 20/09/2010 08:33 Randall Stewart said the following:
> Andrly:
> 
> Ok..
> 
> I can do that.
> 
> I can positively say that when I have a kernel with 212646.. all is well.
> 
> But a kernel with 212647 crashes as described below...
> 
> I will ship you the read-elf offlist

I assume you have checked that ld is fresh, but would like to be sure.


-- 
Andriy Gapon


[HEADS UP] recent ld is needed to build kernel

2010-09-19 Thread Andriy Gapon

Sorry, I should have sent this out earlier.

Please note an entry in UPDATING from 20100915:

A workaround for a fixed ld bug has been removed in kernel code,
so make sure that your system ld is built from sources after
revision 210245 (r211583 if building head kernel on stable/8,
r211584 for stable/7).  A symptom of incorrect ld version is
different addresses for set_pcpu section and __start_set_pcpu
symbol in kernel and/or modules.

Apologies for any problems caused by the late notice.
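
One way to picture why a mismatch between the section address and the start
symbol matters is the toy model below.  It is only a guess at the mechanism and
not the kernel's per-CPU code; the sketch_* names and all addresses are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of align (align must be a power of two). */
bool
sketch_is_aligned(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((addr & (align - 1)) == 0);
}

/*
 * Toy relocation: a variable keeps its offset from the start of the
 * linker set when the set is copied to a new base address.
 */
uintptr_t
sketch_relocate(uintptr_t new_base, uintptr_t var_addr, uintptr_t set_start)
{
	assert(var_addr >= set_start);
	return (new_base + (var_addr - set_start));
}
```

In this model a variable at 0x1010 in a set starting at 0x1000 lands at 0x8010
when the set is copied to 0x8000, which is 8-byte aligned.  If the start symbol
wrongly says 0x1004, the same variable lands at 0x800c, which is not.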
-- 
Andriy Gapon


Re: gptboot rewrite, bootonce, etc.

2010-09-20 Thread Andriy Gapon
on 20/09/2010 15:47 Pawel Jakub Dawidek said the following:
> No, it doesn't. ZFS works a bit differently. ZFS operate on pools, not
> really on partitions. One ZFS file system can span multiple
> disks/partitions. I'm not yet sure how to implement it, so it is
> intuitive, but I also haven't spend much time thinking about it. We
> needed UFS and that is what I implemented. It took me much more time
> than I expected anyway:)

Maybe reserve some area inside zfs boot2 and put relevant information there.
Similarly to how boot0cfg modifies data within boot0.
The information could include "nextboot-pool" and "nextboot-fs".

-- 
Andriy Gapon


Re: gptboot rewrite, bootonce, etc.

2010-09-20 Thread Andriy Gapon
on 20/09/2010 16:37 John Hay said the following:
> On Mon, Sep 20, 2010 at 03:59:20PM +0300, Andriy Gapon wrote:
>> on 20/09/2010 15:47 Pawel Jakub Dawidek said the following:
>>> No, it doesn't. ZFS works a bit differently. ZFS operate on pools, not
>>> really on partitions. One ZFS file system can span multiple
>>> disks/partitions. I'm not yet sure how to implement it, so it is
>>> intuitive, but I also haven't spend much time thinking about it. We
>>> needed UFS and that is what I implemented. It took me much more time
>>> than I expected anyway:)
>>
>> Maybe reserve some area inside zfs boot2 and put relevant information there.
>> Similarly to how boot0cfg modifies data within boot0.
>> The information could include "nextboot-pool" and "nextboot-fs".
> 
> nextboot-fs sounds nice. I use the bootfs property of zpool and it would
> be nice if one can override it from the boot2 commandline.

I have a patch for doing that from loader(8) prompt.
I.e. you can change a filesystem from which to load kernel+modules and you can
still set root filesystem of course.
http://people.freebsd.org/~avg/zfsboot.diff

This can be extended (I think rather easily) to override from where boot2 loads
loader.

-- 
Andriy Gapon

