Re: [osol-discuss] Re: Re: How To Mount CDROM?
> > There's at least one bug in Solaris Express snv_22 with the SUNWvolr
> > package's preinstall script:
>
> I'll file a bug and fix it. Thanks for your analysis and fix.

I already submitted it yesterday, under category volmgt/other:

CR 6339683: SUNWvolr preinstall script broken, smf "smserver" service disabled
___
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org
Re: [osol-discuss] Re: Solaris on Intel Macs??
> > The next try was with more console_putchar calls added to the
> > gateA20() code. This narrowed it down to the loop waiting for an
> > empty keyboard controller input buffer.
>
> So how far did you get after that ?

Well, it doesn't hang any more after printing "Loading stage2 ", after I added some code [*] to grub's gateA20() function.

grub is now able to load and display the menu.lst file and the splashimage. grub reports reasonable base and upper memory.

I can type exactly *one* character on the usb keyboard. When trying to read a second character from the keyboard, the system hangs.

Or I can wait until the grub menu timeout expires. This starts loading the default entry. multiboot and the boot_archive are loaded. The text screen is cleared, and the system hangs before printing the "SunOS Release 5.xx" copyright string.

[*]
diff -rub ../opensolaris-20060404/usr/src/grub/grub-0.95/stage2/asm.S usr/src/grub/grub-0.95/stage2/asm.S
--- ../opensolaris-20060404/usr/src/grub/grub-0.95/stage2/asm.S 2006-04-05 23:28:21.0 +0200
+++ usr/src/grub/grub-0.95/stage2/asm.S 2006-04-10 18:11:44.152925307 +0200
@@ -1787,7 +1787,30 @@
        jnz     3f
        ret
 
-3:     /* use keyboard controller */
+3:     /*
+        * try to switch gateA20 using PORT92, the "Fast A20 and Init"
+        * register
+        */
+       mov     $0x92, %dx
+       inb     %dx, %al
+       /* skip the port92 code if it's unimplemented (read returns 0xff) */
+       cmpb    $0xff, %al
+       jz      6f
+
+       /* set or clear bit1, the ALT_A20_GATE bit */
+       movb    4(%esp), %ah
+       testb   %ah, %ah
+       jz      4f
+       orb     $2, %al
+       jmp     5f
+4:     and     $0xfd, %al
+
+       /* clear the INIT_NOW bit; don't accidently reset the machine */
+5:     and     $0xfe, %al
+       outb    %al, %dx
+
+
+6:     /* use keyboard controller */
        pushl   %eax
 
        call    gloop1
@@ -1797,9 +1820,12 @@
 
 gloopint1:
        inb     $K_STATUS
+       cmpb    $0xff, %al
+       jz      gloopint1_done
        andb    $K_IBUF_FUL, %al
        jnz     gloopint1
 
+gloopint1_done:
        movb    $KB_OUTPUT_MASK, %al
        cmpb    $0, 0x8(%esp)
        jz      gdoit
@@ -1820,6 +1846,8 @@
 
 gloop1:
        inb     $K_STATUS
+       cmpb    $0xff, %al
+       jz      gloop2ret
        andb    $K_IBUF_FUL, %al
        jnz     gloop1
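For readers who don't read AT&T assembler fluently, the port 0x92 part of the patch above can be sketched in C. This is only an illustration of the bit manipulation, not grub code: the function name and the macro names are made up for the example, and the actual port I/O (inb/outb) is left out, so the function only computes the value that would be written back. In the patch, the code falls through to the keyboard controller path afterwards in either case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the bit positions follow the comments in the patch. */
#define PORT92_INIT_NOW      0x01u /* bit 0: setting it resets the machine */
#define PORT92_ALT_A20_GATE  0x02u /* bit 1: set = A20 enabled */
#define PORT92_UNIMPLEMENTED 0xffu /* a read of 0xff is treated as "no such port" */

/*
 * Given the value read from port 0x92, compute the value to write back.
 * Returns false (and leaves *next untouched) when the port looks
 * unimplemented, which corresponds to the "jz 6f" skip in the patch.
 */
bool
port92_next_value(uint8_t current, bool enable_a20, uint8_t *next)
{
	uint8_t v = current;

	if (current == PORT92_UNIMPLEMENTED)
		return (false);

	/* set or clear the ALT_A20_GATE bit */
	if (enable_a20)
		v = (uint8_t)(v | PORT92_ALT_A20_GATE);
	else
		v = (uint8_t)(v & ~PORT92_ALT_A20_GATE);

	/* never write the INIT_NOW bit back as 1 */
	v = (uint8_t)(v & ~PORT92_INIT_NOW);

	*next = v;
	return (true);
}
```

A caller would read the port, pass the value in, and write *next back only when the function returns true.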
Re: [osol-discuss] Re: Solaris on Intel Macs??
> > Or I can wait until the grub menu timeout expires. This starts loading the
> > default entry. multiboot and the boot_archive is loaded. Text screen is
> > cleared, and the system hangs - before printing the
> > "SunOS Release 5.xx" copyright string.
>
> A completely wild guess but maybe we got into multiboot's main and
> got to here:
>

Hmm, I've already tried to search for "gateA20" code in multiboot, but haven't found such a piece of code...

> http://cvs.opensolaris.org/source/xref/on/usr/src/psm/stand/boot/i386/common/keyboard.c#kb_init

These loops (called via ischar() / getchar() => kb_ischar() / kb_getchar()) look interesting and would hang on systems that don't have a ps/2 keyboard controller, so they could be a problem with the intel macs.

But would the kernel call them when it was not started with the "-a" flag? I guess when started with "-a" the kernel would try to read various parameters from the console. Is there any reason to call ischar() / getchar() when "-a" is not passed to the kernel?
Re: [osol-discuss] 76B Cannot find install software
Joerg Schilling wrote:
> Jürgen Keil <[EMAIL PROTECTED]> wrote:
>
> > On some boards you can also change the configuration of the
> > s-ata controller to p-ata "legacy ide" (instead of ahci mode). In
> > legacy mode, Solaris should be able to find both the (s-ata) disks
> > and the (s-ata) optical device.
>
> What is the disadvantage from the legacy mode?

- no cfgadm_sata support, so you can't disconnect/connect/configure/unconfigure
  s-ata devices while the kernel is up and running

- no native command queuing

- no s-ata port multiplier support

- afaik: zfs is unable to read a s-ata hdd's SMART data, so things like
  automatically replacing a failing s-ata disk with a hot spare probably
  don't work (I guess that could be added to the ata driver)

- it seems some sata controllers support dma access to memory >= 4GB,
  so the kernel doesn't have to use bounce buffers. The ata driver can
  only access 32-bit addresses via dma.
Re: [osol-discuss] 'more' broken in b77 miniroot?
> Date: Wed, 28 Nov 2007 13:27:38 -0500
> From: Kyle McDonald <[EMAIL PROTECTED]>
> To: James Carlson <[EMAIL PROTECTED]>
> CC: Jürgen Keil <[EMAIL PROTECTED]>, opensolaris-discuss@opensolaris.org
> Subject: Re: [osol-discuss] 'more' broken in b77 miniroot?
>
> James Carlson wrote:
> > Jürgen Keil writes:
> >
> >> In snv_75a, the miniroot /sbin/sulogin shell script contains this line:
> >>
> >> exec 0<> /dev/console 1>&0 2>&0
> >>
> >> The miniroot /sbin/sulogin from snv_75a has SCCS ID
> >> "@(#)sulogin.sh 1.5". Has that changed for snv_77?
> >>
> >
> > It's still the same in the gate.
> >
> >
> This might be the difference.
>
> I didn't choose 'Single User Shell' from the menu.
>
> The machine is configured to do Custom Jumpstart automatically, and to
> see the environment the Begin script would run in, I temporarily changed
> the begin script to just call 'exit 1'. This made JumpStart give up and
> leave me a shell prompt.
>
> Is this prompt JumpStart left me at supposed to be the same as 'sulogin'?

Maybe not.

Can you try "ls -lR / | truss more" ? What kind of error does it get (when it tries to read from stderr fd#2) ?

You may also want to check the shell's file descriptor flags with "pfiles $$". And in case stderr isn't opened O_RDWR, check the process tree with "ptree $$" and use "pfiles {pid}" on the parents to find out where the readability of stderr is lost.
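The check that "pfiles" does for the open mode can also be sketched from inside a process with fcntl(F_GETFL). This is a minimal sketch, not taken from the more(1) source; the helper name fd_is_readable() is made up for the example.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: report whether a file descriptor was opened
 * with an access mode that permits reading.
 * Returns 1 if readable (O_RDONLY or O_RDWR), 0 if write-only,
 * and -1 if fcntl() fails (e.g. the descriptor is not open).
 */
int
fd_is_readable(int fd)
{
	int flags = fcntl(fd, F_GETFL);
	int mode;

	if (flags == -1)
		return (-1);

	/* O_ACCMODE masks out everything except the access mode */
	mode = flags & O_ACCMODE;
	return (mode == O_RDONLY || mode == O_RDWR);
}
```

Calling fd_is_readable(2) in the shell left behind by JumpStart and in the sulogin shell would show whether stderr is readable in one environment but not in the other.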
Re: [osol-discuss] [driver-discuss] CPU temperature and fan
> Is there any existing tools or interface on the solaris can monitor CPU
> temperature and control fan status?

I'm using the following dtrace script to monitor cpu temperatures on a Tecra S1 centrino laptop (it monitors some dtrace probes in the tzmon kernel module).

Unfortunately it's not very useful on ASUS mainboards with the Q-Fan feature enabled: the ASUS BIOS controls the cpu fan speed, and ASUS' ACPI code always reports a cpu temperature of 40.0°C:

#!/usr/sbin/dtrace -s

#pragma D option quiet

sdt:tzmon:tzmon_eval_zone:tz-temp
{
        printf("temp %d.%1u°K/%d.%1u°C",
            arg0 / 10, arg0 % 10,
            (arg0 - 2732) / 10, (arg0 - 2732) % 10);
}

sdt:tzmon:tzmon_eval_zone:tz-temp
/(int)arg1 > 0/
{
        printf(", crit hot %d.%1u°K/%d.%1u°C",
            arg1 / 10, arg1 % 10,
            (arg1 - 2732) / 10, (arg1 - 2732) % 10);
}

sdt:tzmon:tzmon_eval_zone:tz-temp
/(int)arg2 > 0/
{
        printf(", hot %d.%1u°K/%d.%1u°C",
            arg2 / 10, arg2 % 10,
            (arg2 - 2732) / 10, (arg2 - 2732) % 10);
}

sdt:tzmon:tzmon_eval_zone:tz-temp
{
        printf(", %s\n", stringof(arg3));
}
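The arithmetic in the script may be easier to follow outside of D. ACPI thermal zones report temperatures in tenths of a degree Kelvin, and the script splits such a value into whole degrees and one decimal digit, using 2732 (273.2 K) as the offset to Celsius. A small C sketch of the same computation; the struct and function names are invented for the example and are not part of tzmon:

```c
/* A temperature split into whole degrees and one decimal digit. */
struct temp_parts {
	int whole;
	int tenth;
};

/* Offset used by the script: 273.2 K, expressed in tenths of a degree. */
#define KELVIN_OFFSET_TENTHS 2732

/* Split a reading in tenths of Kelvin into Kelvin parts. */
struct temp_parts
tenths_to_kelvin(int tenths_k)
{
	struct temp_parts t = { tenths_k / 10, tenths_k % 10 };
	return (t);
}

/*
 * Split a reading in tenths of Kelvin into Celsius parts.
 * Like the script, this assumes a reading at or above 273.2 K;
 * for colder readings C integer division would yield negative parts.
 */
struct temp_parts
tenths_to_celsius(int tenths_k)
{
	int tenths_c = tenths_k - KELVIN_OFFSET_TENTHS;
	struct temp_parts t = { tenths_c / 10, tenths_c % 10 };
	return (t);
}
```

With these definitions a raw reading of 3132 comes out as 313.2 K and 40.0 C, which is the constant value the ASUS ACPI code reports.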
Re: [osol-discuss] [ufs-discuss] PANIC! mounting cdrom slice on b78
Robert William Fuller wrote:
> [EMAIL PROTECTED] wrote:
> > Hi Kyle,
> >
> > given that what happens looks ever-so-slightly different each time, a
> > hardware glitch could be possible; to exclude this, would you happen to
> > know whether these panics occurred before build 78 as well ? If they occur
> > if you use the b77 hsfs module on your post-b78 system ? Does the machine
> > you're using have a history of hardware issues, or other symptoms that'd
> > point at flaky hardware (such as e.g. ZFS block checksumming errors) ?
>
> Did anybody else notice they're all NULL pointer de-references??? It's
> probably not a hardware problem. For example, if it's a memory
> problem, then you'll often see random pointers, but not 3 NULL pointers
> in a row

Yep, I noticed that, too.

IIRC a bug like ``kmem_free(NULL, size)'' somewhere in the kernel can have the effect that a subsequent ``kmem_alloc(size, KM_SLEEP)'' somewhere else in the kernel will return with a NULL pointer! (Assuming you run release bits)

For that reason I suggested that Kyle try to reproduce this hsfs mount panic with kmem heap checking enabled.

Add the following line to /etc/system, reboot, and try again to reproduce the hsfs mount panic:

    set kmem_flags=0xf
Re: [osol-discuss] [ufs-discuss] PANIC! mounting cdrom slice on b78
Frank Hofmann wrote:
> On Mon, 16 Jun 2008, Juergen Keil wrote:
>
> > IIRC a bug like ``kmem_free(NULL, size)'' somewhere in the kernel can have the
> > effect that a subsequent ``kmem_alloc(size, KM_SLEEP)'' somewhere else in the
> > kernel will return with a NULL pointer! (Assuming you run release bits)
>
> If this is so, then it's a bug and should be fixed. Quote kmem_alloc(9F):
>
> NOTES
>      kmem_alloc(0, flag) always returns NULL. kmem_free(NULL, 0)
>      is legal.
>
> That's manpage - consider it a spec ...

Well, it says kmem_free with a ptr == NULL and size == 0 is legal; but what about ptr == NULL and size > 0?

Quick test with ::call in kmdb, when booted with kmem_flags=0xf:

- kmem_alloc::call 8 0
  kmem_free::call 8

  works, as expected

- kmem_free::call 0 8

  kmdb fails this call, with "caught a trap"
Re: [osol-discuss] [ufs-discuss] PANIC! mounting cdrom slice on b78
Frank Hofmann wrote:
> On Mon, 16 Jun 2008, Juergen Keil wrote:
>
> > For that reason I did suggest to Kyle to try to reproduce this hsfs mount
> > panic with kmem heap checking enabled.
> >
> > Add the following line to /etc/system, reboot, retry to reproduce the hsfs
> > mount panic:
> >
> > set kmem_flags=0xf
>
> Good idea.

Ok, I can actually reproduce that panic using last week's opensolaris bits.

All I have to do is try and "mount -F hsfs" a non-existent slice; e.g. using a CD containing OpenSolaris 2008.05,

    mount -F hsfs /dev/dsk/c1t1d0s4 /mnt

("mount -F hsfs /dev/dsk/c1t1d0p0 /mnt" should work, though):

panic[cpu1]/thread=ff0348445720: BAD TRAP: type=e (#pf Page fault) rp=ff00108bb990 addr=40 occurred in module "genunix" due to a NULL pointer dereference

mount: #pf Page fault
Bad kernel fault at addr=0x40
pid=19108, pc=0xfba92633, sp=0xff00108bba80, eflags=0x10207
cr0: 80050033 cr4: 6f8
cr2: 40 cr3: 22f819000 cr8: c
        rdi: fbca88a0 rsi: 1 rdx: 8
        rcx: 0 r8: fbca8a70 r9: 0
        rax: 0 rbx: 0 rbp: ff00108bbaa0
        r10: ff02d24a6500 r11: ff00108bb680 r12: 1b0103
        r13: ff00108bbc08 r14: 1b0103 r15: 10
        fsb: 0 gsb: ff02d2e75540 ds: 4b
        es: 4b fs: 0 gs: 1c3
        trp: e err: 0 rip: fba92633
        cs: 30 rfl: 10207 rsp: ff00108bba80
        ss: 38

ff00108bb870 unix:die+c8 ()
ff00108bb980 unix:trap+13c3 ()
ff00108bb990 unix:_cmntrap+e9 ()
ff00108bbaa0 genunix:vfs_devismounted+23 ()
ff00108bbbc0 hsfs:hs_getmdev+176 ()
ff00108bbc60 hsfs:hsfs_mount+195 ()
ff00108bbc90 genunix:fsop_mount+21 ()
ff00108bbe00 genunix:domount+9ff ()
ff00108bbe80 genunix:mount+d2 ()
ff00108bbec0 genunix:syscall_ap+8f ()
ff00108bbf10 unix:brand_sys_syscall32+197 ()

syncing file systems... done
dumping to /dev/dsk/c9t0d0s1, offset 860356608, content: kernel

> $C
ff00108bbaa0 vfs_devismounted+0x23(1b0103)
ff00108bbbc0 hs_getmdev+0x176(ff02dcf8a508, 804729e, 101, ff00108bbc08, ff00108bbc3c, ff0315246708)
ff00108bbc60 hsfs_mount+0x195(ff02dcf8a508, ff02ffea2c00, ff00108bbe30, ff0315246708)
ff00108bbc90 fsop_mount+0x21(ff02dcf8a508, ff02ffea2c00, ff00108bbe30, ff0315246708)
ff00108bbe00 domount+0x9ff(0, ff00108bbe30, ff02ffea2c00, ff0315246708, ff00108bbe28)
ff00108bbe80 mount+0xd2(ff0347a60fd8, ff00108bbeb8)
ff00108bbec0 syscall_ap+0x8f()
ff00108bbf10 sys_syscall32+0x101()

The panic with "kmem_flags=0xf" is more interesting:

> ::status
debugging crash dump vmcore.5 (64-bit) from tiger2
operating system: 5.11 snv_93_jk (i86pc)
panic message: kernel heap corruption detected
dump content: kernel pages only

kernel memory allocator: invalid free: buffer not in cache
buffer=ff0010455e30 bufctl=0 cache: kmem_alloc_256

panic[cpu1]/thread=ff03a05ad060: kernel heap corruption detected

ff0010455a20 genunix:kmem_error+497 ()
ff0010455a40 genunix:kmem_free+d6 ()
ff0010455bb0 hsfs:hs_mountfs+8b9 ()
ff0010455c60 hsfs:hsfs_mount+1e9 ()
ff0010455c90 genunix:fsop_mount+21 ()
ff0010455e00 genunix:domount+9ff ()
ff0010455e80 genunix:mount+d2 ()
ff0010455ec0 genunix:syscall_ap+8f ()
ff0010455f10 unix:brand_sys_syscall32+197 ()

syncing file systems... done
dumping to /dev/dsk/c9t0d0s1, offset 860356608, content: kernel

> $C
ff0010455980 vpanic()
ff0010455a20 kmem_error+0x497(1, ff02ce62b020, ff0010455e30)
ff0010455a40 kmem_free+0xd6(ff0010455e30, e8)
ff0010455bb0 hs_mountfs+0x8b9(ff03a5096dc8, 1b0104, ff03a2b9f140, 6100, 0, ff034ed39978, 0)
ff0010455c60 hsfs_mount+0x1e9(ff03a5096dc8, ff02f09e8900, ff0010455e30, ff034ed39978)
ff0010455c90 fsop_mount+0x21(ff03a5096dc8, ff02f09e8900, ff0010455e30, ff034ed39978)
ff0010455e00 domount+0x9ff(0, ff0010455e30, ff02f09e8900, ff034ed39978, ff0010455e28)
ff0010455e80 mount+0xd2(ff02e97cce38, ff0010455eb8)
ff0010455ec0 syscall_ap+0x8f()
ff0010455f10 sys_syscall32+0x101()

> hs_mountfs+0x8b9::dis
hs_mountfs+0x88f: movq -0x78(%rbp),%r8
hs_mountfs+0x893: xorq %r9,%r9
hs_mountfs+0x896: call +0x34c9f65
hs_mountfs+0x89b: movq 0x30(%rsp),%rdi
hs_mountfs+0x8a0: call +0x34c700b
hs_mountfs+0x8a5: testq %r13,%r
Re: [osol-discuss] [ufs-discuss] PANIC! mounting cdrom slice on b78
Hmm, in usr/src/uts/common/fs/hsfs/hsfs_vfsops.c function hs_mountfs(), whenever we use one of the first three |goto cleanup|, the local variables |svp| and |jvp| are uninitialized. That should corrupt the kernel heap when we kmem_free() with an uninitialized local (stack) pointer in the cleanup section ...

        struct hs_volume *svp;  /* Supplemental VD for ISO-9660:1999 */
        struct hs_volume *jvp;  /* Joliet VD */

        ...

        /*
         * Refuse to go any further if this
         * device is being used for swapping
         */
        if (IS_SWAPVP(common_specvp(devvp))) {
                error = EBUSY;
                goto cleanup;
        }

        vap.va_mask = AT_SIZE;
        if ((error = VOP_GETATTR(devvp, &vap, ATTR_COMM, cr, NULL)) != 0) {
                cmn_err(CE_NOTE, "Cannot get attributes of the CD-ROM driver");
                goto cleanup;
        }

        /*
         * Make sure we have a nonzero size partition.
         * The current version of the SD driver will *not* fail the open
         * of such a partition so we have to check for it here.
         */
        if (vap.va_size == 0) {
                error = ENXIO;
                goto cleanup;
        }

        /*
         * Init a new hsfs structure.
         */
        fsp = kmem_zalloc(sizeof (*fsp), KM_SLEEP);
        svp = kmem_zalloc(sizeof (*svp), KM_SLEEP);
        jvp = kmem_zalloc(sizeof (*jvp), KM_SLEEP);

        ...

cleanup:
        (void) VOP_CLOSE(devvp, FREAD, 1, (offset_t)0, cr, NULL);
        VN_RELE(devvp);
        if (fsp)
                kmem_free(fsp, sizeof (*fsp));
        if (svp)
                kmem_free(svp, sizeof (*svp));
        if (jvp)
                kmem_free(jvp, sizeof (*jvp));
        return (error);
Re: [osol-discuss] [ufs-discuss] PANIC! mounting cdrom slice on b78
I filed a bug at http://bugs.opensolaris.org/; the Bug-ID is not yet known.

The fix is obvious:

diff --git a/usr/src/uts/common/fs/hsfs/hsfs_vfsops.c b/usr/src/uts/common/fs/hsfs/hsfs_vfsops.c
--- a/usr/src/uts/common/fs/hsfs/hsfs_vfsops.c
+++ b/usr/src/uts/common/fs/hsfs/hsfs_vfsops.c
@@ -596,8 +596,8 @@ hs_mountfs(
        size_t pathbufsz = strlen(path) + 1;
        int redo_rootvp;
 
-       struct hs_volume *svp;          /* Supplemental VD for ISO-9660:1999 */
-       struct hs_volume *jvp;          /* Joliet VD */
+       struct hs_volume *svp = NULL;   /* Supplemental VD for ISO-9660:1999 */
+       struct hs_volume *jvp = NULL;   /* Joliet VD */
 
        /*
         * The rules for which extension will be used are:

> Hmm, in usr/src/uts/common/fs/hsfs/hsfs_vfsops.c function hs_mountfs(),
> whenever we use one of the first three |goto cleanup|, the local variables
> |svp| and |jvp| are uninitialized. That should corrupt the kernel heap
> when we kmem_free() with an uninitialized local (stack) pointer in the
> cleanup section ...
>
>         struct hs_volume *svp;  /* Supplemental VD for ISO-9660:1999 */
>         struct hs_volume *jvp;  /* Joliet VD */
>
>         ...
>
>         /*
>          * Refuse to go any further if this
>          * device is being used for swapping
>          */
>         if (IS_SWAPVP(common_specvp(devvp))) {
>                 error = EBUSY;
>                 goto cleanup;
>         }
>
>         vap.va_mask = AT_SIZE;
>         if ((error = VOP_GETATTR(devvp, &vap, ATTR_COMM, cr, NULL)) != 0) {
>                 cmn_err(CE_NOTE, "Cannot get attributes of the CD-ROM driver");
>                 goto cleanup;
>         }
>
>         /*
>          * Make sure we have a nonzero size partition.
>          * The current version of the SD driver will *not* fail the open
>          * of such a partition so we have to check for it here.
>          */
>         if (vap.va_size == 0) {
>                 error = ENXIO;
>                 goto cleanup;
>         }
>
>         /*
>          * Init a new hsfs structure.
>          */
>         fsp = kmem_zalloc(sizeof (*fsp), KM_SLEEP);
>         svp = kmem_zalloc(sizeof (*svp), KM_SLEEP);
>         jvp = kmem_zalloc(sizeof (*jvp), KM_SLEEP);
>
>         ...
>
> cleanup:
>         (void) VOP_CLOSE(devvp, FREAD, 1, (offset_t)0, cr, NULL);
>         VN_RELE(devvp);
>         if (fsp)
>                 kmem_free(fsp, sizeof (*fsp));
>         if (svp)
>                 kmem_free(svp, sizeof (*svp));
>         if (jvp)
>                 kmem_free(jvp, sizeof (*jvp));
>         return (error);
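The same pattern can be shown in a few lines of user-space C. This is only a sketch of the "initialize to NULL before the first goto cleanup" idiom, not hsfs code: struct volume, mount_sketch() and the device_size parameter are invented for the example, and calloc()/free() stand in for kmem_zalloc()/kmem_free().

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct hs_volume. */
struct volume {
	char data[64];
};

/*
 * Sketch of a function with an early "goto cleanup" that runs before
 * any allocation has happened. Returns 0 on success or an errno value.
 */
int
mount_sketch(unsigned long device_size)
{
	int error = 0;
	/*
	 * Initialized to NULL so the cleanup section can tell
	 * "never allocated" apart from "allocated". Without the
	 * initializers, the early exit below would pass whatever
	 * happens to be on the stack to free().
	 */
	struct volume *svp = NULL;
	struct volume *jvp = NULL;

	/* early exit, taken before svp and jvp are allocated */
	if (device_size == 0) {
		error = ENXIO;
		goto cleanup;
	}

	svp = calloc(1, sizeof (*svp));
	jvp = calloc(1, sizeof (*jvp));
	if (svp == NULL || jvp == NULL) {
		error = ENOMEM;
		goto cleanup;
	}

	/* ... the real work would go here ... */

cleanup:
	if (svp != NULL)
		free(svp);
	if (jvp != NULL)
		free(jvp);
	return (error);
}
```

In user space free(NULL) is harmless, so the guards are redundant there; they are kept to mirror the kernel code, where kmem_free() also needs the size and is only specified as legal for NULL with size 0.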