Juergen,

Yes, I did compile it on the target platform, and I even tried defining the
scsi_pkt_wrapper in my own file (to avoid structure padding issues), still to
no avail.

When I tried the other experiment you had suggested, which is looking at the
boot_archive using the installation CD/DVD, I found that the command

    gunzip < /a/platform/i86pc/boot_archive > tmp/boot.img

itself gave an error saying

    gunzip: stdin: Invalid compressed data = crc error
    gunzip: stdin: Invalid compressed data = length error

So I guess the archive itself is corrupt. All root-caused to the kernel heap
corruption, you think?

Thanks
Som

--- Juergen Keil <[EMAIL PROTECTED]> wrote:

> Som,
>
> > Yes, there is no explicit hdr file that could be included, so I
> > computed the size just as it is done in the open solaris src code,
> > same place as you mentioned
> >
> >     cmdlen = ROUNDUP(cmdlen);
> >     tgtlen = ROUNDUP(tgtlen);
> >     hbalen = ROUNDUP(hbalen);
> >     statuslen = ROUNDUP(statuslen);
> >     pktlen = sizeof (struct scsi_pkt_wrapper)=>
> >         sizeof(scsi_pkt) + sizeof(int)
> >         + cmdlen + tgtlen + hbalen +
> >         statuslen;
> >
> > So yes, this is more of a hack, but I thought it would be OK in order
> > to determine at what point exactly the heap corruption happens, sound
> > OK? (of course assuming this is how the scsi_hba_pkt_alloc on the
> > installed OS actually happens!)
>
> Hmm, yes, sounds ok -- at least when you compile this
> on the target platform where the driver will be used.
>
> But I do see that the "struct scsi_pkt" has changed
> between S10u2 and current OpenSolaris:
>
> S10U2:
> ======
>
> struct scsi_pkt {
>     opaque_t pkt_ha_private;          /* private data for host adapter */
>     struct scsi_address pkt_address;  /* destination packet is for */
>     opaque_t pkt_private;             /* private data for target driver */
>     void (*pkt_comp)(struct scsi_pkt *); /* completion routine */
>     uint_t pkt_flags;                 /* flags */
>     int pkt_time;                     /* time allotted to complete command */
>     uchar_t *pkt_scbp;                /* pointer to status block */
>     uchar_t *pkt_cdbp;                /* pointer to command block */
>     ssize_t pkt_resid;                /* data bytes not transferred */
>     uint_t pkt_state;                 /* state of command */
>     uint_t pkt_statistics;            /* statistics */
>     uchar_t pkt_reason;               /* reason completion called */
> };
>
>
> OpenSolaris snv_85:
> ===================
>
> has more fields after pkt_reason:
>
> struct scsi_pkt {
>     opaque_t pkt_ha_private;          /* private data for host adapter */
>     struct scsi_address pkt_address;  /* destination packet is for */
>     opaque_t pkt_private;             /* private data for target driver */
>     void (*pkt_comp)(struct scsi_pkt *); /* completion routine */
>     uint_t pkt_flags;                 /* flags */
>     int pkt_time;                     /* time allotted to complete command */
>     uchar_t *pkt_scbp;                /* pointer to status block */
>     uchar_t *pkt_cdbp;                /* pointer to command block */
>     ssize_t pkt_resid;                /* data bytes not transferred */
>     uint_t pkt_state;                 /* state of command */
>     uint_t pkt_statistics;            /* statistics */
>     uchar_t pkt_reason;               /* reason completion called */
>     uint_t pkt_cdblen;
>     uint_t pkt_tgtlen;
>     uint_t pkt_scblen;
>     ddi_dma_handle_t pkt_handle;
>     uint_t pkt_numcookies;
>     off_t pkt_dma_offset;
>     size_t pkt_dma_len;
>     uint_t pkt_dma_flags;
>     ddi_dma_cookie_t *pkt_cookies;
> };
>
>
> So, in case you compiled that "sizeof(struct scsi_pkt)" on opensolaris,
> and use it on S10, it'll use the wrong size...
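
A rough user-space sketch of the size mismatch described above, for anyone who
wants to see the effect without a kernel build. It is only an illustration
under assumptions: the struct names (pkt_s10u2, pkt_snv85, addr_standin) and
the helper functions are hypothetical, plain pointers and integers stand in
for kernel types such as opaque_t, struct scsi_address and ddi_dma_handle_t,
and only the field lists quoted in this thread are mirrored. The absolute
byte counts will therefore not match the kernel's; the point is just that a
sizeof taken from the newer layout is larger than one taken from the older
layout, so a hand-computed buffer end (and the redzone offset derived from
it) moves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct scsi_address (an assumption, not the kernel type). */
struct addr_standin {
    void *a_hba_tran;
    unsigned short a_target;
    unsigned char a_lun;
    unsigned char a_sublun;
};

/* Mirrors the S10U2 field list quoted in this thread, with stand-in types. */
struct pkt_s10u2 {
    void *pkt_ha_private;
    struct addr_standin pkt_address;
    void *pkt_private;
    void (*pkt_comp)(struct pkt_s10u2 *);
    unsigned int pkt_flags;
    int pkt_time;
    unsigned char *pkt_scbp;
    unsigned char *pkt_cdbp;
    long pkt_resid;                 /* stand-in for ssize_t */
    unsigned int pkt_state;
    unsigned int pkt_statistics;
    unsigned char pkt_reason;
};

/* Mirrors the snv_85 field list: same prefix, extra members after pkt_reason. */
struct pkt_snv85 {
    void *pkt_ha_private;
    struct addr_standin pkt_address;
    void *pkt_private;
    void (*pkt_comp)(struct pkt_snv85 *);
    unsigned int pkt_flags;
    int pkt_time;
    unsigned char *pkt_scbp;
    unsigned char *pkt_cdbp;
    long pkt_resid;                 /* stand-in for ssize_t */
    unsigned int pkt_state;
    unsigned int pkt_statistics;
    unsigned char pkt_reason;
    unsigned int pkt_cdblen;
    unsigned int pkt_tgtlen;
    unsigned int pkt_scblen;
    void *pkt_handle;               /* stand-in for ddi_dma_handle_t */
    unsigned int pkt_numcookies;
    long pkt_dma_offset;            /* stand-in for off_t */
    size_t pkt_dma_len;
    unsigned int pkt_dma_flags;
    void *pkt_cookies;              /* stand-in for ddi_dma_cookie_t * */
};

/* Size seen by a driver compiled against the older-style layout. */
size_t size_old_layout(void) { return sizeof (struct pkt_s10u2); }

/* Size seen by a driver compiled against the newer-style layout. */
size_t size_new_layout(void) { return sizeof (struct pkt_snv85); }

/* Bytes by which a hand-computed buffer end is off when the newer
 * sizeof is used against a buffer laid out with the older struct. */
size_t size_mismatch(void)
{
    assert(size_new_layout() >= size_old_layout());
    return size_new_layout() - size_old_layout();
}

/* Prints the sizes for the compilation model in use (they differ for -m32 and -m64). */
void report_sizes(void)
{
    printf("older-style layout: %zu bytes\n", size_old_layout());
    printf("newer-style layout: %zu bytes\n", size_new_layout());
    printf("offset error:       %zu bytes\n", size_mismatch());
}
```

Calling report_sizes() from a small main(), once built 32-bit and once 64-bit,
shows how the numbers shift with the compilation model as well as with the
header version, which is why the size has to come from the headers and
compiler flags of the kernel the driver actually runs on.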
>
> Maybe you could verify the contents of the redzone byte by looking at
> scsi_pkt->pkt_cdbp[ROUNDUP(cmdlen)] ??
>
> Looking at the code in scsi_hba_pkt_alloc(), the data buffer for
> pkt_cdbp is the final piece of data in the kmem alloced buffer, so
> the redzone byte should be right after the cdb/cmd buffer.
>
> > Thanks
> > Som
> >
> > --- Juergen Keil <[EMAIL PROTECTED]> wrote:
> >
> > > Som,
> > >
> > > > Your guess was right, thanks a ton again.. yes, when I ran
> > > > installation with kernel heap checking enabled, I ran into a
> > > > panic reporting 'redzone violation: write past end of buffer',
> > > > and this was for the 'scsi_pkt' structure during
> > > > scsi_hba_pkt_free()
> > > >
> > > > Oddly (or maybe not considering 32/64-bit issues) this problem
> > > > does not happen with the normal 64bit driver, only with 32 bit
> > > > since that's what the installation kernel is also running
> > > >
> > > > In order to debug exactly where the error might be happening I
> > > > tried to insert the same few lines as in kmem_free (which is
> > > > called during scsi_hba_pkt_free) at different points in my code
> > > > where scsi_pkt is being accessed
> > > >
> > > >     if (((uint8_t *)buf)[size] != KMEM_REDZONE_BYTE) {
> > > >         /// LOG ERROR or panic here!
> > > >     }
> > > >
> > > > where buf-> scsi_pkt structure in my driver
> > > > size = scsi_pkt_wrapper_len
> > > >
> > > > For some reason I found the error right after
> > > > scsi_hba_pkt_alloc() was invoked in my tran_init_pkt(), what
> > > > does this mean .. how could that happen ??
> > >
> > > How does your driver read the value of "scsi_pkt_wrapper_len" ?
> > >
> > > This is an implementation detail from
> > > usr/src/uts/common/io/scsi/impl/scsi_hba.c, and I don't think
> > > there is an official header file that you can include to get
> > > the value from the struct scsi_pkt_wrapper member
> > > pkt_wrapper_len
> > >
> > >
> > > Looking at the scsi_hba_pkt_alloc(), I don't see how
=== message truncated ===

_______________________________________________
driver-discuss mailing list
driver-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/driver-discuss