Megaraid2: Dell PERC 4/DC connected to Dell Powervault 220S, poor performance.
Hi, Im wondering if anyone can help me out with the megaraid2 driver. Ive got a Dell PERC 4/DC connected to a Dell powervault 220S, cluster mode is enabled (write cache OFF) because eventually there will be two machines connected to the powervault for shared scsi. Ive tried the Dell PERC 4 card in 2 different servers; a Dell 1850 and currently its all set up on a Dual P3 1Ghz and I get the same results with each server. Ive tried a few recent 2.6 kernels but currently Im running kernel 2.6.21.5. The problem is that the performance of this set up is incredibly poor. Ive done some experimenting with the megaraid2 module params (Ive set cmd_per_lun=126 and max_sectors=1024 after reading various forum posts) and managed to get the transfer rate up a bit but it still performs very badly. I see some reservation conflict and unit not ready messages when the megaraid2 module loads which I am unsure of. After doing some googling I found someone saying that the megaraid2 driver in kernel 2.4.31 is the only version that they got any decent performance out of the Dell PERC 4. I tried kernel 2.4.31 and indeed the performance is much more like what you would expect (only thing I have to go by is a windows box with a PERC 4 in it). Ive tested both kernel 2.4.31 and 2.6.21.5 with our application, dd, bonnie and iozone on RAID 5 and RAID 1 arrays - kernel 2.4.31 performs better in each case. The partitions are formatted as ext3 but Ive tried a few other filesystems such as XFS with various options and Im getting similar differences between the results for each kernel. Heres an example comparison of bonnie benchmark results when using kernel 2.4.31 and kernel 2.6.21.5, I used the following command; bonnie -d ./ -s 512 -r 256 -x 5 -u 0 -g 0 Kernel 2.4.31 / RAID 5 -- Version 1.03 --Sequential Output-- --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP saasfs1512M 11324 99 27539 27 6236 6 9113 67 42385 15 1392.7 4 --Sequential Create-- Random Create -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files:max:min/sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP saasfs1 16 686 97 + +++ + +++ 635 96 + +++ 2097 88 Kernel 2.6.21.5 / RAID 5 Version 1.03 --Sequential Output-- --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP saasfs1512M 10357 64 10649 12 4957 5 11079 59 72623 39 1761.0 4 --Sequential Create-- Random Create -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files:max:min/sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP saasfs1 16 19748 96 + +++ 27202 100 20052 97 + +++ 26070 99 Kernel 2.4.31 / RAID 1 -- Version 1.03 --Sequential Output-- --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP saasfs1512M 11303 99 44858 45 12319 10 8862 65 51459 17 2201.2 7 --Sequential Create-- Random Create -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files:max:min/sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP saasfs1 16 631 96 + +++ + +++ 621 94 + +++ 2349 97 Kernel 2.6.21.5 / RAID 1 Version 1.03 --Sequential Output-- --Sequential Input- --Random- -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks-- MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP saasfs1512M 15803 98 29338 34 10693 10 14016 75 95105 49 2172.1 5 --Sequential Create-- Random Create -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- files:max:min/sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP saasfs1 16 651 98 + +++ + +++ 631 97 + +++ 2137 96 I dont really want to have to stick with kernel 2.4.31 so if anyone could shed some light on this issue it would be much appreciated. Maybe Ive got the values Im passing into the module parameters completely wrong or theres something else I need to tweak to get the Dell PERC 4 to work better with the kernel 2.6 megaraid2 driver. Just below Ive put together the details of the scsi raid card and both kernels 2.4.31 and 2.6.21.5 (including kernel options,
[PATCH] sg: increase sglist_len of the sg_scatter_hold structure
unsigned short is too small for sizeof(struct scatterlist) * min(q-max_hw_segments, q-max_phys_segments). This fixes memory leak with 4096 segments since 16 (likely sg size with x86) * 4096 sets sglist_len to zero. This might not happen without sg chaining support. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/sg.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 2fc24e7..2c44bb0 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -114,7 +114,7 @@ static struct class_interface sg_interface = { typedef struct sg_scatter_hold { /* holding area for scsi scatter gather info */ unsigned short k_use_sg; /* Count of kernel scatter-gather pieces */ - unsigned short sglist_len; /* size of malloc'd scatter-gather list ++ */ + unsigned sglist_len; /* size of malloc'd scatter-gather list ++ */ unsigned bufflen; /* Size of (aggregate) data buffer */ unsigned b_malloc_len; /* actual len malloc'ed in buffer */ struct scatterlist *buffer;/* scatter list */ -- 1.5.2.4 - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][RESEND] Fix a potential NULL pointer deref in the aic7xxx, ahc_print_register() function
On 04/08/07, James Bottomley [EMAIL PROTECTED] wrote: On Sat, 2007-08-04 at 20:30 +0200, Jesper Juhl wrote: (resend of patch previously submitted on 28-Jul-2007 23:06) Ehlo, The Coverity checker noticed that we have a potential NULL pointer deref in drivers/scsi/aic7xxx/aic7xxx_core.c::ahc_print_register(). This patch handles it by adding the same test against NULL that is used elsewhere in the same function. It's on my list of things to look at ... but not very high. I suspect it actually isn't triggerable, but if you can tell me how, it will save me from looking. Here's what Coverity reported : ... 6525int 6526ahc_print_register(ahc_reg_parse_entry_t *table, u_int num_entries, 6527 const char *name, u_int address, u_int value, 6528 u_int *cur_column, u_int wrap_point) 6529{ 6530int printed; 6531u_int printed_mask; 6532 Event var_compare_op: Added cur_column due to comparison cur_column != 0 Also see events: [var_deref_op] At conditional (1): cur_column != 0 taking false path 6533if (cur_column != NULL *cur_column = wrap_point) { 6534printf(\n); 6535*cur_column = 0; 6536} 6537printed = printf(%s[0x%x], name, value); At conditional (2): table == 0 taking true path 6538if (table == NULL) { 6539printed += printf( ); Event var_deref_op: Variable cur_column tracked as NULL was dereferenced. Also see events: [var_compare_op] 6540*cur_column += printed; 6541return (printed); 6542} ... So it requires a NULL 'table' and a != NULL 'cur_column' to trigger. Whether or not that's actually possible I'm not sure, but it seems safer to guard against it :) By the way; if this can actually be triggered, then ahd_print_register() has the same problem. -- Jesper Juhl [EMAIL PROTECTED] Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html Plain text mails only, please http://www.expita.com/nomime.html - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHSET 0/5] Peaceful co-existence of scsi_sgtable and Large IO sg-chaining
From: Boaz Harrosh [EMAIL PROTECTED] Subject: Re: [PATCHSET 0/5] Peaceful co-existence of scsi_sgtable and Large IO sg-chaining Date: Tue, 31 Jul 2007 23:12:26 +0300 Boaz Harrosh wrote: FUJITA Tomonori wrote: From: Benny Halevy [EMAIL PROTECTED] Subject: Re: [PATCHSET 0/5] Peaceful co-existence of scsi_sgtable and Large IO sg-chaining Date: Wed, 25 Jul 2007 11:26:44 +0300 However, I'm perfectly happy to go with whatever the empirical evidence says is best .. and hopefully, now we don't have to pick this once and for all time ... we can alter it if whatever is chosen proves to be suboptimal. I agree. This isn't a catholic marriage :) We'll run some performance experiments comparing the sgtable chaining implementation vs. a scsi_data_buff implementation pointing at a possibly chained sglist and let's see if we can measure any difference. We'll send results as soon as we have them. I did some tests with your sgtable patchset and the approach to use separate buffer for sglists. As expected, there was no performance difference with small I/Os. I've not tried very large I/Os, which might give some difference. Next week I will try to mount lots of scsi_debug devices and use large parallel IO to try and find a difference. I will test Jens's sglist-arch tree against above sglist-arch+scsi_sgtable I was able to run some tests here are my results. The results: PPT - is Pages Per Transfer (sg_count) The numbers are accumulated time of 20 transfers of 32GB each, and the average of 4 such runs. (Lower time is better) Transfers are sg_dd into scsi_debug Kernel | total time 128-PPT | total time 2048-PPT ---||- sglist-arch| 47.26 | Test Failed scsi_data_buff | 41.68 | 35.05 scsi_sgtable | 42.42 | 36.45 The test: 1. scsi_debug I mounted the scsi_debug module which was converted and fixed for chaining with the following options: $ modprobe scsi_debug virtual_gb=32 delay=0 dev_size_mb=32 fake_rw=1 32 GB of virtual drive on 32M of memory with 0 delay and read/write do nothing with the fake_rw=1. After that I just enabled chained IO on the device So what I'm actually testing is only sg + scsi-ml request queuing and sglist allocation/deallocation. Which is what I want to test. 2. sg_dd In the test script (see prof_test_scsi_debug attached) I use sg_dd in direct io mode to send a direct scsi-command to above device. Your script is doing: sg_dd blk_sgio=1 dio=1 if=/dev/zero of=/dev/sdb ... dio works for SG_IO? Only sg suuports it, I think. And sd_read is more appropriate than sg_dd for performance tests. I'm not sure about the results. How can sglist-arch be slower than scsi_data_buff and scsi_sgtable. I did 2 tests, in both I transfer 32GB of data. 1st test with 128 (4K) pages IO size. 2nd test with 2048 pages IO size. The second test will successfully run only if chaining is enabled and working. Otherwise it will fail. The tested Kernels: 1. Jens's sglist-arch I was not able to pass all tests with this Kernel. For some reason when bigger than 256 pages commands are queued the Machine will run out of memory and will kill the test. After the test is killed the system is left with 10M of memory and can hardly reboot. sglist-arch works for me. I hit memory leak due to the sg (I used sg instead of SG_IO) bug though the bug happens even with scsi_data_buff and sgtable. 2. scsi_data_buff This tree is what I posted last. It is basically: 0. sglist-arch 1. revert of scsi-ml support for chaining. 2. sg-pools cleanup [PATCH AB1] I don't think this is the proper way. As I said, we can implement scsi_data_buffer stuff on the top of sglist cleanly. No need to revert something in sglist. I don't think that sg-pools cleanup is necessary. We can live without the sg segment size hack due to sglist. My scsi data buffer patch is at: git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-bidi.git sdbuffer The differences are: I use 'scsi_data_buffer structure' instead of scsi_data_buff; it's on the top of sglist cleanly. I also put your scsi_data_buff and sgtable patchset: git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-bidi.git boaz-sdbuff git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-bidi.git boaz-sgtable - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH][RESEND] Fix a potential NULL pointer deref in the aic7xxx, ahc_print_register() function
All of this logic was simplified back in '05 in the BSD drivers by adding this to the top of the function: u_int dummy_column; if (cur_column == NULL) { dummy_column = 0; cur_column = dummy_column; } and then stripping out the cur_column == NULL checks in the routine. -- Justin Jesper Juhl wrote: On 04/08/07, James Bottomley [EMAIL PROTECTED] wrote: On Sat, 2007-08-04 at 20:30 +0200, Jesper Juhl wrote: (resend of patch previously submitted on 28-Jul-2007 23:06) Ehlo, The Coverity checker noticed that we have a potential NULL pointer deref in drivers/scsi/aic7xxx/aic7xxx_core.c::ahc_print_register(). This patch handles it by adding the same test against NULL that is used elsewhere in the same function. It's on my list of things to look at ... but not very high. I suspect it actually isn't triggerable, but if you can tell me how, it will save me from looking. Here's what Coverity reported : ... 6525int 6526ahc_print_register(ahc_reg_parse_entry_t *table, u_int num_entries, 6527 const char *name, u_int address, u_int value, 6528 u_int *cur_column, u_int wrap_point) 6529{ 6530int printed; 6531u_int printed_mask; 6532 Event var_compare_op: Added cur_column due to comparison cur_column != 0 Also see events: [var_deref_op] At conditional (1): cur_column != 0 taking false path 6533if (cur_column != NULL *cur_column = wrap_point) { 6534printf(\n); 6535*cur_column = 0; 6536} 6537printed = printf(%s[0x%x], name, value); At conditional (2): table == 0 taking true path 6538if (table == NULL) { 6539printed += printf( ); Event var_deref_op: Variable cur_column tracked as NULL was dereferenced. Also see events: [var_compare_op] 6540*cur_column += printed; 6541return (printed); 6542} ... So it requires a NULL 'table' and a != NULL 'cur_column' to trigger. Whether or not that's actually possible I'm not sure, but it seems safer to guard against it :) By the way; if this can actually be triggered, then ahd_print_register() has the same problem. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3/3] 2.6.23-rc2: known regressions
Hi all, Here is a list of some known regressions in 2.6.23-rc2. Feel free to add new regressions/remove fixed etc. http://kernelnewbies.org/known_regressions List of Aces NameRegressions fixed since 21-Jun-2007 Adrian Bunk6 Andi Kleen 4 Andrew Morton 4 Linus Torvalds 4 Al Viro3 Cornelia Huck 3 Jens Axboe 3 Tejun Heo 3 David Woodhouse2 Hugh Dickins 2 Trent Piepho 2 SCSI Subject : lpfc_sli.c: off-by-10 References : http://lkml.org/lkml/2007/7/22/284 Last known good : ? Submitter : Adrian Bunk [EMAIL PROTECTED] Caused-By : ? Handled-By : ? Status : unknown USB Subject : 2.6.23-rc1: uhci_hcd. irq 4: nobody cared References : http://lkml.org/lkml/2007/7/29/75 Last known good : ? Submitter : Mark Hindley [EMAIL PROTECTED] Caused-By : ? Handled-By : Alan Stern [EMAIL PROTECTED] Status : unknown Subject : 2.6.23-rc1: USB hard disk broken References : http://lkml.org/lkml/2007/7/25/62 Last known good : ? Submitter : Tino Keitel [EMAIL PROTECTED] Caused-By : ? Handled-By : Oliver Neukum [EMAIL PROTECTED] Status : unknown Virtualization Subject : drivers/net/xen-netfront.c: bogus code References : http://lkml.org/lkml/2007/7/22/256 Last known good : ? Submitter : Adrian Bunk [EMAIL PROTECTED] Caused-By : ? Handled-By : Jeremy Fitzhardinge [EMAIL PROTECTED] Status : unknown Regards, Michal -- LOG http://www.stardust.webpages.pl/log/ - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [2/2] 2.6.23-rc2: known regressions with patches
Hi all, Here is a list of some known regressions in 2.6.23-rc2 with patches available. Feel free to add new regressions/remove fixed etc. http://kernelnewbies.org/known_regressions List of Aces NameRegressions fixed since 21-Jun-2007 Adrian Bunk6 Andi Kleen 4 Andrew Morton 4 Linus Torvalds 4 Al Viro3 Cornelia Huck 3 Jens Axboe 3 Tejun Heo 3 David Woodhouse2 Hugh Dickins 2 Trent Piepho 2 Memory management Subject : [bug] SLUB freeing locks References : http://lkml.org/lkml/2007/7/26/90 Last known good : ? Submitter : Ingo Molnar [EMAIL PROTECTED] Caused-By : ? Handled-By : Peter Zijlstra [EMAIL PROTECTED] Patch : http://lkml.org/lkml/2007/7/26/97 Status : patch available Modpost Subject : modpost bug breaks ia64 cross compilation References : http://lkml.org/lkml/2007/7/27/30 http://lkml.org/lkml/2007/7/27/418 Last known good : ? Submitter : Jan Dittmer [EMAIL PROTECTED] Caused-By : Thomas Renninger [EMAIL PROTECTED] commit 29b71a1ca74491fab9fed09e9d835d840d042690 Handled-By : ? Patch : http://lkml.org/lkml/2007/8/2/211 Status : patch was suggested Networking Subject : tg3 dead after s2ram References : http://lkml.org/lkml/2007/7/31/121 Last known good : ? Submitter : Joachim Deguara [EMAIL PROTECTED] Caused-By : ? Handled-By : Michael Chan [EMAIL PROTECTED] Patch : http://lkml.org/lkml/2007/8/2/288 Status : patch available SCSI Subject : qla2xyz broken in current Linus tree References : http://marc.info/?l=linux-scsim=118581420308892w=2 Last known good : ? Submitter : Matthew Wilcox [EMAIL PROTECTED] Caused-By : Seokmann Ju [EMAIL PROTECTED] commit 281afe1947d855661754850de29d7530b2ff Handled-By : Seokmann Ju [EMAIL PROTECTED] Patch : http://marc.info/?l=linux-scsim=118581884800073w=2 Status : patch available Regards, Michal -- LOG http://www.stardust.webpages.pl/log/ - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [2/2] 2.6.23-rc2: known regressions with patches
On Sun, 2007-08-05 at 18:26 +0200, Michal Piotrowski wrote: Hi all, Modpost Subject : modpost bug breaks ia64 cross compilation References : http://lkml.org/lkml/2007/7/27/30 http://lkml.org/lkml/2007/7/27/418 Last known good : ? Submitter : Jan Dittmer [EMAIL PROTECTED] Caused-By : Thomas Renninger [EMAIL PROTECTED] commit 29b71a1ca74491fab9fed09e9d835d840d042690 Handled-By : ? Patch : http://lkml.org/lkml/2007/8/2/211 Status : patch was suggested A patch was sent to Tony. AFAIK it got accepted, not sure whether it already is in any and which git tree... Networking Subject : tg3 dead after s2ram References : http://lkml.org/lkml/2007/7/31/121 Last known good : ? Submitter : Joachim Deguara [EMAIL PROTECTED] Caused-By : ? Handled-By : Michael Chan [EMAIL PROTECTED] Patch : http://lkml.org/lkml/2007/8/2/288 Status : patch available David Miller sent a message that it got applied. Thanks, Thomas - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] sg: increase sglist_len of the sg_scatter_hold structure
FUJITA Tomonori wrote: unsigned short is too small for sizeof(struct scatterlist) * min(q-max_hw_segments, q-max_phys_segments). This fixes memory leak with 4096 segments since 16 (likely sg size with x86) * 4096 sets sglist_len to zero. This might not happen without sg chaining support. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/sg.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 2fc24e7..2c44bb0 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -114,7 +114,7 @@ static struct class_interface sg_interface = { typedef struct sg_scatter_hold { /* holding area for scsi scatter gather info */ unsigned short k_use_sg; /* Count of kernel scatter-gather pieces */ - unsigned short sglist_len; /* size of malloc'd scatter-gather list ++ */ + unsigned sglist_len; /* size of malloc'd scatter-gather list ++ */ unsigned bufflen; /* Size of (aggregate) data buffer */ unsigned b_malloc_len; /* actual len malloc'ed in buffer */ struct scatterlist *buffer;/* scatter list */ Tomo, Thanks. Signed-off-by: Douglas Gilbert [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [2/2] 2.6.23-rc2: known regressions with patches
* Michal Piotrowski [EMAIL PROTECTED] wrote: Memory management Subject : [bug] SLUB freeing locks References : http://lkml.org/lkml/2007/7/26/90 Last known good : ? Submitter : Ingo Molnar [EMAIL PROTECTED] Caused-By : ? Handled-By : Peter Zijlstra [EMAIL PROTECTED] Patch : http://lkml.org/lkml/2007/7/26/97 Status : patch available fixed by commit 2208b764c14d0f1ad63da64b1a42db6077b6fe42. Ingo - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [3/3] 2.6.23-rc2: known regressions
Michal Piotrowski wrote: Virtualization Subject : drivers/net/xen-netfront.c: bogus code References : http://lkml.org/lkml/2007/7/22/256 Last known good : ? Submitter : Adrian Bunk [EMAIL PROTECTED] Caused-By : ? Handled-By : Jeremy Fitzhardinge [EMAIL PROTECTED] Status : unknown This is just a chunk of dead code. It probably doesn't even affect the generated code. I've got a patch here for it, and I'll probably post it in the next day or so, but this seems like a non-issue. J - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] sg: use idr to replace static arrays
sg uses a scheme to reallocate a single contiguous array of all its pointers for lookup and management. This didn't matter too much when sg could only attach 256 nodes, but now the maximum has been bumped up to 32k we're starting to push the limits of the maximum allocatable contiguous memory. The solution to this is to eliminate the static array and do everything via idr, which this patch does. James Index: BUILD-2.6/drivers/scsi/sg.c === --- BUILD-2.6.orig/drivers/scsi/sg.c2007-08-05 11:26:02.0 -0500 +++ BUILD-2.6/drivers/scsi/sg.c 2007-08-05 13:26:04.0 -0500 @@ -43,6 +43,7 @@ static int sg_version_num = 30534;/* 2 #include linux/poll.h #include linux/moduleparam.h #include linux/cdev.h +#include linux/idr.h #include linux/seq_file.h #include linux/blkdev.h #include linux/delay.h @@ -99,12 +100,11 @@ static int scatter_elem_sz_prev = SG_SCA #define SG_SECTOR_SZ 512 #define SG_SECTOR_MSK (SG_SECTOR_SZ - 1) -#define SG_DEV_ARR_LUMP 32 /* amount to over allocate sg_dev_arr by */ - static int sg_add(struct class_device *, struct class_interface *); static void sg_remove(struct class_device *, struct class_interface *); -static DEFINE_RWLOCK(sg_dev_arr_lock); /* Also used to lock +static DEFINE_IDR(sg_index_idr); +static DEFINE_RWLOCK(sg_index_lock); /* Also used to lock file descriptor list for device */ static struct class_interface sg_interface = { @@ -162,6 +162,7 @@ typedef struct sg_device { /* holds the struct scsi_device *device; wait_queue_head_t o_excl_wait; /* queue open() when O_EXCL in use */ int sg_tablesize; /* adapter's max scatter-gather table size */ + u32 index; /* device index number */ Sg_fd *headfp; /* first open fd belonging to this device */ volatile char detached; /* 0-attached, 1-detached pending removal */ volatile char exclude; /* opened for exclusive access */ @@ -209,10 +210,6 @@ static Sg_device *sg_get_dev(int dev); static int sg_last_dev(void); #endif -static Sg_device **sg_dev_arr = NULL; -static int sg_dev_max; -static int sg_nr_dev; - #define SZ_SG_HEADER sizeof(struct sg_header) #define SZ_SG_IO_HDR sizeof(sg_io_hdr_t) #define SZ_SG_IOVEC sizeof(sg_iovec_t) @@ -1331,40 +1328,35 @@ static struct class *sg_sysfs_class; static int sg_sysfs_valid = 0; -static int sg_alloc(struct gendisk *disk, struct scsi_device *scsidp) +static Sg_device *sg_alloc(struct gendisk *disk, struct scsi_device *scsidp) { struct request_queue *q = scsidp-request_queue; Sg_device *sdp; unsigned long iflags; - void *old_sg_dev_arr = NULL; - int k, error; + int error; + u32 k; sdp = kzalloc(sizeof(Sg_device), GFP_KERNEL); if (!sdp) { printk(KERN_WARNING kmalloc Sg_device failure\n); - return -ENOMEM; + return ERR_PTR(-ENOMEM); + } + error = -ENOMEM; + if (!idr_pre_get(sg_index_idr, GFP_KERNEL)) { + printk(KERN_WARNING idr expansion Sg_device failure\n); + goto out; } - write_lock_irqsave(sg_dev_arr_lock, iflags); - if (unlikely(sg_nr_dev = sg_dev_max)) {/* try to resize */ - Sg_device **tmp_da; - int tmp_dev_max = sg_nr_dev + SG_DEV_ARR_LUMP; - write_unlock_irqrestore(sg_dev_arr_lock, iflags); - - tmp_da = kzalloc(tmp_dev_max * sizeof(Sg_device *), GFP_KERNEL); - if (unlikely(!tmp_da)) - goto expand_failed; - - write_lock_irqsave(sg_dev_arr_lock, iflags); - memcpy(tmp_da, sg_dev_arr, sg_dev_max * sizeof(Sg_device *)); - old_sg_dev_arr = sg_dev_arr; - sg_dev_arr = tmp_da; - sg_dev_max = tmp_dev_max; + write_lock_irqsave(sg_index_lock, iflags); + error = idr_get_new(sg_index_idr, sdp, k); + write_unlock_irqrestore(sg_index_lock, iflags); + + if (error) { + printk(KERN_WARNING idr allocation Sg_device failure: %d\n, + error); + goto out; } - for (k = 0; k sg_dev_max; k++) - if (!sg_dev_arr[k]) - break; if (unlikely(k = SG_MAX_DEVS)) goto overflow; @@ -1375,25 +1367,17 @@ static int sg_alloc(struct gendisk *disk sdp-device = scsidp; init_waitqueue_head(sdp-o_excl_wait); sdp-sg_tablesize = min(q-max_hw_segments, q-max_phys_segments); + sdp-index = k; - sg_nr_dev++; - sg_dev_arr[k] = sdp; - write_unlock_irqrestore(sg_dev_arr_lock, iflags); - error = k; - + error = 0; out: - if (error 0) + if (error) { kfree(sdp); - kfree(old_sg_dev_arr); -
Re: [PATCH] sg: increase sglist_len of the sg_scatter_hold structure
On Sun, 05 Aug 2007 12:55:16 -0400 Douglas Gilbert [EMAIL PROTECTED] wrote: FUJITA Tomonori wrote: unsigned short is too small for sizeof(struct scatterlist) * min(q-max_hw_segments, q-max_phys_segments). This fixes memory leak with 4096 segments since 16 (likely sg size with x86) * 4096 sets sglist_len to zero. This might not happen without sg chaining support. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/sg.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 2fc24e7..2c44bb0 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -114,7 +114,7 @@ static struct class_interface sg_interface = { typedef struct sg_scatter_hold { /* holding area for scsi scatter gather info */ unsigned short k_use_sg; /* Count of kernel scatter-gather pieces */ - unsigned short sglist_len; /* size of malloc'd scatter-gather list ++ */ + unsigned sglist_len; /* size of malloc'd scatter-gather list ++ */ unsigned bufflen; /* Size of (aggregate) data buffer */ unsigned b_malloc_len; /* actual len malloc'ed in buffer */ struct scatterlist *buffer;/* scatter list */ Tomo, Thanks. Signed-off-by: Douglas Gilbert [EMAIL PROTECTED] Thanks for the quick reply. Allocating 64K contiguous memory is not good so the next thing to do is converting sg to use the sg chaining support fully. Or it might be time to finish the overdue task, to convert sg to use the block layer functions. - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html