Hi
The docs say that tran_start must return only after the command has
completed when FLAG_NOINTR is set in the scsi_pkt. However, I do not
understand how this works in some of the drivers that come with the
open source tree, mainly isp and mfi. For example, in isp.c I see:
    while ((sp->cmd_flags & CFLAG_COMPLETED) == 0) {
        drv_usecwait(ISP_NOINTR_POLL_DELAY_TIME);
        --delay_loops;
        /* If the count is 0, take drastic action like a reset */
    }
    /* end of while, polled cmd is done */
What I do NOT understand is that the only way the above flag can be
set is through a function that is called from the driver's ISR, in
interrupt context. During a dump, interrupts are NOT enabled, so how
would this work? Or am I missing something here?
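One possibility I can think of (an assumption on my part, not something
I have confirmed in isp.c or mfi) is that the polled path calls the
driver's own interrupt routine by hand from inside the wait loop, so
the completion flag gets set without a real interrupt being delivered.
A minimal user-space sketch of that idea follows. Every name in it
(fake_hba, fake_cmd, fake_isr, fake_hw_tick, fake_polled_start) is
hypothetical, the value of CFLAG_COMPLETED is made up, and
fake_hw_tick() merely stands in for drv_usecwait() plus the hardware
making progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative value only; not taken from any real driver header. */
#define CFLAG_COMPLETED 0x1u

/* Hypothetical per-command state. */
struct fake_cmd {
    unsigned cmd_flags;
};

/* Hypothetical adapter state standing in for real hardware. */
struct fake_hba {
    bool intr_pending;        /* models the chip's "interrupt pending" bit */
    int ticks_until_done;     /* models how long the hardware takes */
    struct fake_cmd *active;  /* command currently on the hardware */
};

/* Models one poll interval of hardware progress (drv_usecwait stand-in). */
static void fake_hw_tick(struct fake_hba *hba)
{
    if (hba->active != NULL && hba->ticks_until_done > 0 &&
        --hba->ticks_until_done == 0)
        hba->intr_pending = true;
}

/* The routine that would normally run on a real interrupt. */
static void fake_isr(struct fake_hba *hba)
{
    if (!hba->intr_pending)
        return;               /* nothing to service */
    hba->intr_pending = false;
    hba->active->cmd_flags |= CFLAG_COMPLETED;
    hba->active = NULL;
}

/*
 * Polled start: submit the command, then wait for completion by
 * calling the ISR directly. Returns 0 on completion, -1 on timeout.
 */
static int fake_polled_start(struct fake_hba *hba, struct fake_cmd *cmd,
    int delay_loops)
{
    assert(hba != NULL && cmd != NULL);
    cmd->cmd_flags = 0;
    hba->active = cmd;

    while ((cmd->cmd_flags & CFLAG_COMPLETED) == 0) {
        if (delay_loops-- <= 0)
            return -1;        /* a real driver might reset here */
        fake_hw_tick(hba);    /* wait one poll interval */
        fake_isr(hba);        /* interrupts are off: run the ISR by hand */
    }
    return 0;
}
```

With a fake_hba whose ticks_until_done is 3, fake_polled_start(&hba,
&cmd, 10) returns 0 with CFLAG_COMPLETED set. With ticks_until_done of
50 and only 5 loops it returns -1, which is the point where a real
driver would consider a reset.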
Regarding my own efforts at testing crash dump support, here is the
dumpadm output for my boot device:

      Dump content: kernel pages
       Dump device: /dev/dsk/c1t0d0s1 (swap)
Savecore directory: /var/crash/unknown
  Savecore enabled: yes

Console output from my 'Dummy Driver', which is designed to panic for
this test:
NOTICE: Inside dummy_attach
fffffe8000aa6aa0 dummy:dummy_attach+52 ()
fffffe8000aa6b00 genunix:devi_attach+8f ()
fffffe8000aa6b30 genunix:attach_node+71 ()
fffffe8000aa6b60 genunix:i_ndi_config_node+ab ()
fffffe8000aa6b80 genunix:i_ddi_attachchild+41 ()
fffffe8000aa6bb0 genunix:devi_attach_node+71 ()
fffffe8000aa6bf0 genunix:config_immediate_children+d7 ()
fffffe8000aa6c20 genunix:devi_config_common+66 ()
fffffe8000aa6c60 genunix:mt_config_thread+11a ()
fffffe8000aa6c70 unix:thread_start+8 ()
syncing file systems... 1
*** ALL PRINTS BELOW COMING FROM MY TRAN_START ROUTINE ***
WARNING: FLAG_NOINTR is set..... cdb is 02a,EDTL=1024
WARNING: *******Busy wait.....
WARNING: completion in FLAG_NOINTR case scsistatus=0
WARNING: coming out of while....polled cmd done.
1 done
dumping to /dev/dsk/c1t0d0s1 , offset 213909504, content: kernel
WARNING: Inside scsi_reset
.
.
.
dumping to /dev/dsk/c1t0d0s1 , offset 213905904
0% done: 0 pages dumped, compression ratio 0.00, dump failed: error 5
The above is the error I am getting (error 5 is EIO), even though I am
polling for completion. This is the first and last scsi_pkt I seem to
get that has its FLAG_NOINTR bit set.

NOTE: I have NOT implemented tran_reset/tran_abort for now, as I
believe that should not have a bearing on the actual dump file write.
Is that correct?

Please advise.
Thanks,
Som
>
>
> --- Somnath kotur <[EMAIL PROTECTED]> wrote:
>
> > Yes Garrett,
> > Since I am writing a driver that would hold a boot device, I was
> > wondering what requirements I must follow. Are there any guidelines
> > as to how the polling must be done? For example, in the ISP driver
> > and the mfi driver (written by David Gwyne), when NOINTR is set they
> > both seem to
> > a) send the IO and
> > b) immediately poll for IO completion, all within the context of
> > tran_start().
> > Is that the right way to go about it, or should there be a periodic
> > timer that can do the polling outside of tran_start()?
> > To test this 'crashdump support' feature, I have written a dummy
> > driver that panics on attach which is loaded on my boot device.
> >
> > Thanks
> > Som
> >
> >
> > --- Garrett D'Amore <[EMAIL PROTECTED]> wrote:
> >
> > > James C. McPherson wrote:
> > > > Somnath kotur wrote:
> > > >> Woah, so that is indeed scary esp if the driver that is
> > > >> panicking is the one holding the bootdevice as the integrity
> > > >> of the dump can no longer be guaranteed as you suggested
> > > >> yourself too right?
> > > >
> > > > This is why it's *essential* that drivers which provide
> > > > access to the dump device are Correct.
> > >
> > > Or at least the portion of the code path corresponding to dump(9E).
> > > :-) It's not uncommon during development to see errors caused by
> > > bugs in the storage stack, and still get a reasonable dump. But at
> > > the end of the day, getting anything meaningful from dump(9E) is
> > > just a nicety to assist support/engineering, and no actual promises
> > > are made. A bug in the kernel can easily cause a hard hang or hard
> > > reset in certain chips which will prevent a dump from ever being
> > > made.
> > >
> > > At the end of the day, anytime the system panics, hangs, or resets
> > > unintentionally, it's a Bug, and needs to be fixed. Customers
> > > should never see such behavior. Correct function of parts of the
> > > system after such an event would be nice, but it's far more
> > > important to ensure that the event itself doesn't occur.
> > >
> > > -- Garrett
> > > >
> > > >
> > > >
> > > >> --- "James C. McPherson" <[EMAIL PROTECTED]> wrote:
> > > >>
> > > >>> Somnath kotur wrote:
> > > >>>> James,
> > > >>>> Here is the link for windows
> > > >>>> http://blogs.technet.com/askperf/archive/2008/01/08/understanding-crash-dump-files.aspx
> > > >>>> Particularly grep for 'dump_'
> > > >>> Thankyou for that link, it's quite informative.
> > > >>>
> > > >>> If, on Solaris, your system panics *after* we've setup the dump
> > > >>> device, then you'll get a crash dump. Assuming that you've
> > > >>> implemented the routines that Garrett mentioned in an earlier
> > > >>> email.
> > > >>>
> > > >>> If the system panics *before* that point, then you'll see
> > > >>>
> > > >>> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/dumpsubr.c#564
> > > >>>
> > > >>> 564 /*
> > > >>> 565 * Dump the system.
> > > >>> 566 */
> > > >>> 567 void
> > > >>> 568 dumpsys(void)
> > > >>> 569 {
> > > >>> 570 pfn_t pfn;
> > > >>> 571 pgcnt_t bitnum;
> > > >>> 572 int npages = 0;
> > > >>> 573 int percent_done = 0;
> > > >>> 574 uint32_t csize;
> > > >>> 575 u_offset_t total_csize = 0;
> > > >>> 576 int compress_ratio;
> > > >>> 577 proc_t *p;
> > > >>> 578 pid_t npids, pidx;
> > > >>> 579 char *content;
> > > >>> 580
> > > >>> 581 if (dumpvp == NULL || dumphdr ==
> NULL)
> > {
> > > >>> 582 uprintf("skipping system dump -
> no
> > > dump device
> > > >>> configured\n");
> > > >>> 583 return;
> > > >>> 584 }
> > > >>> ......
> > > >>>
> > > >>>
> > > >>>
> > > >>> There is still no reloading of the driver that provides the boot
> > > >>> device. If the driver cannot handle the dump, then as I mentioned,
> > > >>> you'll either drop to kmdb or you'll see a reboot (x86) or drop to
> > > >>> obp (sparc).
> > > >>>
> > > >>>
> > > >>> James C. McPherson
> > > >>> --
> > > >>> Senior Kernel Software Engineer, Solaris
> > > >>> Sun Microsystems
> > > >>> http://blogs.sun.com/jmcp
> > > >>> http://www.jmcp.homeunix.com/blog
> > > >>>
_______________________________________________
driver-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/driver-discuss