Re: [patch] [SCSI] bnx2fc: zero out sense buffer properly
On Sat, 18 Aug 2012 11:46:37 +0300 Dan Carpenter wrote: > ->sense_buffer used to be an array but it changed to pointer in > de25deb180 "[SCSI] use dynamically allocated sense buffer". This call > to memset() needs to be updated as well. > > Signed-off-by: Dan Carpenter > > diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c > index 73f231c..8d4626c 100644 > --- a/drivers/scsi/bnx2fc/bnx2fc_io.c > +++ b/drivers/scsi/bnx2fc/bnx2fc_io.c > @@ -1807,7 +1807,7 @@ static void bnx2fc_parse_fcp_rsp(struct bnx2fc_cmd > *io_req, > fcp_sns_len = SCSI_SENSE_BUFFERSIZE; > } > > - memset(sc_cmd->sense_buffer, 0, sizeof(sc_cmd->sense_buffer)); > + memset(sc_cmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE); I guess that you can remove the line instead. IIRC, scsi-ml does it for LLDs. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch] [SCSI] bnx2fc: zero out sense buffer properly
On Sat, 18 Aug 2012 11:46:37 +0300 Dan Carpenter dan.carpen...@oracle.com wrote: -sense_buffer used to be an array but it changed to pointer in de25deb180 [SCSI] use dynamically allocated sense buffer. This call to memset() needs to be updated as well. Signed-off-by: Dan Carpenter dan.carpen...@oracle.com diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c index 73f231c..8d4626c 100644 --- a/drivers/scsi/bnx2fc/bnx2fc_io.c +++ b/drivers/scsi/bnx2fc/bnx2fc_io.c @@ -1807,7 +1807,7 @@ static void bnx2fc_parse_fcp_rsp(struct bnx2fc_cmd *io_req, fcp_sns_len = SCSI_SENSE_BUFFERSIZE; } - memset(sc_cmd-sense_buffer, 0, sizeof(sc_cmd-sense_buffer)); + memset(sc_cmd-sense_buffer, 0, SCSI_SENSE_BUFFERSIZE); I guess that you can remove the line instead. IIRC, scsi-ml does it for LLDs. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH RFC] swiotlb: Remove SWIOTLB overflow buffer support
On Mon, 9 Jul 2012 16:25:05 -0400 Konrad Rzeszutek Wilk wrote: > On Fri, Jul 06, 2012 at 05:06:12PM -0600, Shuah Khan wrote: >> Remove SWIOTLB overflow buffer support and return DMA_ERROR_CODE >> (a value of zero) to make it consistent with iommu implementation >> on Intel, AMD, and swiotlb-xen. > > While this is a good forward step and this needs to be done eventually, > you should first send out patches for the drivers that don't check > for the DMA_ERROR_CODE when doing mapping. In other words for the > drivers that map but don't call dma_mapping_error to check. > > When that is fixed and *all the drivers that don't call dma_mapping_error > are fixed, then this patch makes sense. > > So for right now, NACK. Yeah, I'm not sure we could fix (or remove) *all the drivers though. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH RFC] swiotlb: Remove SWIOTLB overflow buffer support
On Mon, 9 Jul 2012 16:25:05 -0400 Konrad Rzeszutek Wilk konrad.w...@oracle.com wrote: On Fri, Jul 06, 2012 at 05:06:12PM -0600, Shuah Khan wrote: Remove SWIOTLB overflow buffer support and return DMA_ERROR_CODE (a value of zero) to make it consistent with iommu implementation on Intel, AMD, and swiotlb-xen. While this is a good forward step and this needs to be done eventually, you should first send out patches for the drivers that don't check for the DMA_ERROR_CODE when doing mapping. In other words for the drivers that map but don't call dma_mapping_error to check. When that is fixed and *all the drivers that don't call dma_mapping_error are fixed, then this patch makes sense. So for right now, NACK. Yeah, I'm not sure we could fix (or remove) *all the drivers though. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH -mm 4/4] alpha: remove unused DEBUG_FORCEDAC define in IOMMU
This just removes unused DEBUG_FORCEDAC define in the IOMMU code. Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]> Cc: Richard Henderson <[EMAIL PROTECTED]> Cc: Ivan Kokshaysky <[EMAIL PROTECTED]> Cc: Andrew Morton <[EMAIL PROTECTED]> --- arch/alpha/kernel/pci_iommu.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c index 54540c3..be6fa10 100644 --- a/arch/alpha/kernel/pci_iommu.c +++ b/arch/alpha/kernel/pci_iommu.c @@ -31,7 +31,6 @@ #endif #define DEBUG_NODIRECT 0 -#define DEBUG_FORCEDAC 0 #define ISA_DMA_MASK 0x00ff -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH -mm 3/4] alpha: make IOMMU respect the segment boundary limits
This patch makes the IOMMU code not allocate a memory area spanning LLD's segment boundary. is_span_boundary() judges whether a memory area spans LLD's segment boundary. If iommu_arena_find_pages() finds such a area, it tries to find the next available memory area. Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]> Cc: Richard Henderson <[EMAIL PROTECTED]> Cc: Ivan Kokshaysky <[EMAIL PROTECTED]> Cc: Andrew Morton <[EMAIL PROTECTED]> --- arch/alpha/kernel/pci_iommu.c | 40 ++-- 1 files changed, 34 insertions(+), 6 deletions(-) diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c index e54f829..54540c3 100644 --- a/arch/alpha/kernel/pci_iommu.c +++ b/arch/alpha/kernel/pci_iommu.c @@ -126,13 +126,34 @@ iommu_arena_new(struct pci_controller *hose, dma_addr_t base, return iommu_arena_new_node(0, hose, base, window_size, align); } +static inline int is_span_boundary(unsigned int index, unsigned int nr, + unsigned long shift, + unsigned long boundary_size) +{ + shift = (shift + index) & (boundary_size - 1); + return shift + nr > boundary_size; +} + /* Must be called with the arena lock held */ static long -iommu_arena_find_pages(struct pci_iommu_arena *arena, long n, long mask) +iommu_arena_find_pages(struct device *dev, struct pci_iommu_arena *arena, + long n, long mask) { unsigned long *ptes; long i, p, nent; int pass = 0; + unsigned long base; + unsigned long boundary_size; + + BUG_ON(arena->dma_base & ~PAGE_MASK); + base = arena->dma_base >> PAGE_SHIFT; + if (dev) + boundary_size = ALIGN(dma_get_max_seg_size(dev) + 1, PAGE_SIZE) + >> PAGE_SHIFT; + else + boundary_size = ALIGN(1UL << 32, PAGE_SIZE) >> PAGE_SHIFT; + + BUG_ON(!is_power_of_2(boundary_size)); /* Search forward for the first mask-aligned sequence of N free ptes */ ptes = arena->ptes; @@ -142,6 +163,11 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, long n, long mask) again: while (i < n && p+i < nent) { + if (!i && is_span_boundary(p, n, base, boundary_size)) { + p = ALIGN(p + 1, mask + 1); + goto again; + } + if (ptes[p+i]) p = ALIGN(p + i + 1, mask + 1), i = 0; else @@ -170,7 +196,8 @@ again: } static long -iommu_arena_alloc(struct pci_iommu_arena *arena, long n, unsigned int align) +iommu_arena_alloc(struct device *dev, struct pci_iommu_arena *arena, long n, + unsigned int align) { unsigned long flags; unsigned long *ptes; @@ -181,7 +208,7 @@ iommu_arena_alloc(struct pci_iommu_arena *arena, long n, unsigned int align) /* Search for N empty ptes */ ptes = arena->ptes; mask = max(align, arena->align_entry) - 1; - p = iommu_arena_find_pages(arena, n, mask); + p = iommu_arena_find_pages(dev, arena, n, mask); if (p < 0) { spin_unlock_irqrestore(>lock, flags); return -1; @@ -231,6 +258,7 @@ pci_map_single_1(struct pci_dev *pdev, void *cpu_addr, size_t size, unsigned long paddr; dma_addr_t ret; unsigned int align = 0; + struct device *dev = pdev ? >dev : NULL; paddr = __pa(cpu_addr); @@ -278,7 +306,7 @@ pci_map_single_1(struct pci_dev *pdev, void *cpu_addr, size_t size, /* Force allocation to 64KB boundary for ISA bridges. */ if (pdev && pdev == isa_bridge) align = 8; - dma_ofs = iommu_arena_alloc(arena, npages, align); + dma_ofs = iommu_arena_alloc(dev, arena, npages, align); if (dma_ofs < 0) { printk(KERN_WARNING "pci_map_single failed: " "could not allocate dma page tables\n"); @@ -565,7 +593,7 @@ sg_fill(struct device *dev, struct scatterlist *leader, struct scatterlist *end, paddr &= ~PAGE_MASK; npages = calc_npages(paddr + size); - dma_ofs = iommu_arena_alloc(arena, npages, 0); + dma_ofs = iommu_arena_alloc(dev, arena, npages, 0); if (dma_ofs < 0) { /* If we attempted a direct map above but failed, die. */ if (leader->dma_address == 0) @@ -832,7 +860,7 @@ iommu_reserve(struct pci_iommu_arena *arena, long pg_count, long align_mask) /* Search for N empty ptes. */ ptes = arena->ptes; - p = iommu_arena_find_pages(arena, pg_count, align_mask); + p = iommu_arena_find_pages(NULL, arena, pg_count, align_mask); if (p < 0) { spin_unlock_irqrestore(>lock, flags);
[PATCH -mm 1/4] alpha: convert IOMMU to use ALIGN()
This patch is preparation for modifications to fix the IOMMU segment boundary problem. Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]> Cc: Richard Henderson <[EMAIL PROTECTED]> Cc: Ivan Kokshaysky <[EMAIL PROTECTED]> Cc: Andrew Morton <[EMAIL PROTECTED]> --- arch/alpha/kernel/pci_iommu.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c index 26d3789..bbf9990 100644 --- a/arch/alpha/kernel/pci_iommu.c +++ b/arch/alpha/kernel/pci_iommu.c @@ -136,11 +136,11 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, long n, long mask) /* Search forward for the first mask-aligned sequence of N free ptes */ ptes = arena->ptes; nent = arena->size >> PAGE_SHIFT; - p = (arena->next_entry + mask) & ~mask; + p = ALIGN(arena->next_entry, mask + 1); i = 0; while (i < n && p+i < nent) { if (ptes[p+i]) - p = (p + i + 1 + mask) & ~mask, i = 0; + p = ALIGN(p + i + 1, mask + 1), i = 0; else i = i + 1; } @@ -153,7 +153,7 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, long n, long mask) p = 0, i = 0; while (i < n && p+i < nent) { if (ptes[p+i]) - p = (p + i + 1 + mask) & ~mask, i = 0; + p = ALIGN(p + i + 1, mask + 1), i = 0; else i = i + 1; } -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH -mm 0/4] fix iommu segment boundary problems (alpha)
This patchset is another sequel to my patchset to fix iommu segment boundary problems, IOMMUs allocate memory areas without considering a low level driver's segment boundary limits: http://www.mail-archive.com/[EMAIL PROTECTED]/msg11919.html This patchset fixes the Alpha IOMMU code. There are four patches in this patchset. The first two patches are preparation for the third patch, which fixes the IOMMU segment boundary problem. The fourth patch just a cleanup, which removes an unused code. This is against 2.6.25-rc2-mm1. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH -mm 2/4] alpha: IOMMU had better access to the free space bitmap at only one place
iommu_arena_find_pages duplicates the code to access to the bitmap for free space management. This patch convert the IOMMU code to have only one place to access the bitmap, in the popular way that other IOMMUs (e.g. POWER and SPARC) do. This patch is preparation for modifications to fix the IOMMU segment boundary problem. Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]> Cc: Richard Henderson <[EMAIL PROTECTED]> Cc: Ivan Kokshaysky <[EMAIL PROTECTED]> Cc: Andrew Morton <[EMAIL PROTECTED]> --- arch/alpha/kernel/pci_iommu.c | 28 +++- 1 files changed, 15 insertions(+), 13 deletions(-) diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c index bbf9990..e54f829 100644 --- a/arch/alpha/kernel/pci_iommu.c +++ b/arch/alpha/kernel/pci_iommu.c @@ -132,12 +132,15 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, long n, long mask) { unsigned long *ptes; long i, p, nent; + int pass = 0; /* Search forward for the first mask-aligned sequence of N free ptes */ ptes = arena->ptes; nent = arena->size >> PAGE_SHIFT; p = ALIGN(arena->next_entry, mask + 1); i = 0; + +again: while (i < n && p+i < nent) { if (ptes[p+i]) p = ALIGN(p + i + 1, mask + 1), i = 0; @@ -146,19 +149,18 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, long n, long mask) } if (i < n) { -/* Reached the end. Flush the TLB and restart the - search from the beginning. */ - alpha_mv.mv_pci_tbi(arena->hose, 0, -1); - - p = 0, i = 0; - while (i < n && p+i < nent) { - if (ptes[p+i]) - p = ALIGN(p + i + 1, mask + 1), i = 0; - else - i = i + 1; - } - - if (i < n) + if (pass < 1) { + /* +* Reached the end. Flush the TLB and restart +* the search from the beginning. + */ + alpha_mv.mv_pci_tbi(arena->hose, 0, -1); + + pass++; + p = 0; + i = 0; + goto again; + } else return -1; } -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH -mm 2/4] alpha: IOMMU had better access to the free space bitmap at only one place
iommu_arena_find_pages duplicates the code to access to the bitmap for free space management. This patch convert the IOMMU code to have only one place to access the bitmap, in the popular way that other IOMMUs (e.g. POWER and SPARC) do. This patch is preparation for modifications to fix the IOMMU segment boundary problem. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Cc: Richard Henderson [EMAIL PROTECTED] Cc: Ivan Kokshaysky [EMAIL PROTECTED] Cc: Andrew Morton [EMAIL PROTECTED] --- arch/alpha/kernel/pci_iommu.c | 28 +++- 1 files changed, 15 insertions(+), 13 deletions(-) diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c index bbf9990..e54f829 100644 --- a/arch/alpha/kernel/pci_iommu.c +++ b/arch/alpha/kernel/pci_iommu.c @@ -132,12 +132,15 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, long n, long mask) { unsigned long *ptes; long i, p, nent; + int pass = 0; /* Search forward for the first mask-aligned sequence of N free ptes */ ptes = arena-ptes; nent = arena-size PAGE_SHIFT; p = ALIGN(arena-next_entry, mask + 1); i = 0; + +again: while (i n p+i nent) { if (ptes[p+i]) p = ALIGN(p + i + 1, mask + 1), i = 0; @@ -146,19 +149,18 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, long n, long mask) } if (i n) { -/* Reached the end. Flush the TLB and restart the - search from the beginning. */ - alpha_mv.mv_pci_tbi(arena-hose, 0, -1); - - p = 0, i = 0; - while (i n p+i nent) { - if (ptes[p+i]) - p = ALIGN(p + i + 1, mask + 1), i = 0; - else - i = i + 1; - } - - if (i n) + if (pass 1) { + /* +* Reached the end. Flush the TLB and restart +* the search from the beginning. + */ + alpha_mv.mv_pci_tbi(arena-hose, 0, -1); + + pass++; + p = 0; + i = 0; + goto again; + } else return -1; } -- 1.5.3.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH -mm 0/4] fix iommu segment boundary problems (alpha)
This patchset is another sequel to my patchset to fix iommu segment boundary problems, IOMMUs allocate memory areas without considering a low level driver's segment boundary limits: http://www.mail-archive.com/[EMAIL PROTECTED]/msg11919.html This patchset fixes the Alpha IOMMU code. There are four patches in this patchset. The first two patches are preparation for the third patch, which fixes the IOMMU segment boundary problem. The fourth patch just a cleanup, which removes an unused code. This is against 2.6.25-rc2-mm1. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH -mm 1/4] alpha: convert IOMMU to use ALIGN()
This patch is preparation for modifications to fix the IOMMU segment boundary problem. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Cc: Richard Henderson [EMAIL PROTECTED] Cc: Ivan Kokshaysky [EMAIL PROTECTED] Cc: Andrew Morton [EMAIL PROTECTED] --- arch/alpha/kernel/pci_iommu.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c index 26d3789..bbf9990 100644 --- a/arch/alpha/kernel/pci_iommu.c +++ b/arch/alpha/kernel/pci_iommu.c @@ -136,11 +136,11 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, long n, long mask) /* Search forward for the first mask-aligned sequence of N free ptes */ ptes = arena-ptes; nent = arena-size PAGE_SHIFT; - p = (arena-next_entry + mask) ~mask; + p = ALIGN(arena-next_entry, mask + 1); i = 0; while (i n p+i nent) { if (ptes[p+i]) - p = (p + i + 1 + mask) ~mask, i = 0; + p = ALIGN(p + i + 1, mask + 1), i = 0; else i = i + 1; } @@ -153,7 +153,7 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, long n, long mask) p = 0, i = 0; while (i n p+i nent) { if (ptes[p+i]) - p = (p + i + 1 + mask) ~mask, i = 0; + p = ALIGN(p + i + 1, mask + 1), i = 0; else i = i + 1; } -- 1.5.3.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH -mm 3/4] alpha: make IOMMU respect the segment boundary limits
This patch makes the IOMMU code not allocate a memory area spanning LLD's segment boundary. is_span_boundary() judges whether a memory area spans LLD's segment boundary. If iommu_arena_find_pages() finds such a area, it tries to find the next available memory area. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Cc: Richard Henderson [EMAIL PROTECTED] Cc: Ivan Kokshaysky [EMAIL PROTECTED] Cc: Andrew Morton [EMAIL PROTECTED] --- arch/alpha/kernel/pci_iommu.c | 40 ++-- 1 files changed, 34 insertions(+), 6 deletions(-) diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c index e54f829..54540c3 100644 --- a/arch/alpha/kernel/pci_iommu.c +++ b/arch/alpha/kernel/pci_iommu.c @@ -126,13 +126,34 @@ iommu_arena_new(struct pci_controller *hose, dma_addr_t base, return iommu_arena_new_node(0, hose, base, window_size, align); } +static inline int is_span_boundary(unsigned int index, unsigned int nr, + unsigned long shift, + unsigned long boundary_size) +{ + shift = (shift + index) (boundary_size - 1); + return shift + nr boundary_size; +} + /* Must be called with the arena lock held */ static long -iommu_arena_find_pages(struct pci_iommu_arena *arena, long n, long mask) +iommu_arena_find_pages(struct device *dev, struct pci_iommu_arena *arena, + long n, long mask) { unsigned long *ptes; long i, p, nent; int pass = 0; + unsigned long base; + unsigned long boundary_size; + + BUG_ON(arena-dma_base ~PAGE_MASK); + base = arena-dma_base PAGE_SHIFT; + if (dev) + boundary_size = ALIGN(dma_get_max_seg_size(dev) + 1, PAGE_SIZE) +PAGE_SHIFT; + else + boundary_size = ALIGN(1UL 32, PAGE_SIZE) PAGE_SHIFT; + + BUG_ON(!is_power_of_2(boundary_size)); /* Search forward for the first mask-aligned sequence of N free ptes */ ptes = arena-ptes; @@ -142,6 +163,11 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, long n, long mask) again: while (i n p+i nent) { + if (!i is_span_boundary(p, n, base, boundary_size)) { + p = ALIGN(p + 1, mask + 1); + goto again; + } + if (ptes[p+i]) p = ALIGN(p + i + 1, mask + 1), i = 0; else @@ -170,7 +196,8 @@ again: } static long -iommu_arena_alloc(struct pci_iommu_arena *arena, long n, unsigned int align) +iommu_arena_alloc(struct device *dev, struct pci_iommu_arena *arena, long n, + unsigned int align) { unsigned long flags; unsigned long *ptes; @@ -181,7 +208,7 @@ iommu_arena_alloc(struct pci_iommu_arena *arena, long n, unsigned int align) /* Search for N empty ptes */ ptes = arena-ptes; mask = max(align, arena-align_entry) - 1; - p = iommu_arena_find_pages(arena, n, mask); + p = iommu_arena_find_pages(dev, arena, n, mask); if (p 0) { spin_unlock_irqrestore(arena-lock, flags); return -1; @@ -231,6 +258,7 @@ pci_map_single_1(struct pci_dev *pdev, void *cpu_addr, size_t size, unsigned long paddr; dma_addr_t ret; unsigned int align = 0; + struct device *dev = pdev ? pdev-dev : NULL; paddr = __pa(cpu_addr); @@ -278,7 +306,7 @@ pci_map_single_1(struct pci_dev *pdev, void *cpu_addr, size_t size, /* Force allocation to 64KB boundary for ISA bridges. */ if (pdev pdev == isa_bridge) align = 8; - dma_ofs = iommu_arena_alloc(arena, npages, align); + dma_ofs = iommu_arena_alloc(dev, arena, npages, align); if (dma_ofs 0) { printk(KERN_WARNING pci_map_single failed: could not allocate dma page tables\n); @@ -565,7 +593,7 @@ sg_fill(struct device *dev, struct scatterlist *leader, struct scatterlist *end, paddr = ~PAGE_MASK; npages = calc_npages(paddr + size); - dma_ofs = iommu_arena_alloc(arena, npages, 0); + dma_ofs = iommu_arena_alloc(dev, arena, npages, 0); if (dma_ofs 0) { /* If we attempted a direct map above but failed, die. */ if (leader-dma_address == 0) @@ -832,7 +860,7 @@ iommu_reserve(struct pci_iommu_arena *arena, long pg_count, long align_mask) /* Search for N empty ptes. */ ptes = arena-ptes; - p = iommu_arena_find_pages(arena, pg_count, align_mask); + p = iommu_arena_find_pages(NULL, arena, pg_count, align_mask); if (p 0) { spin_unlock_irqrestore(arena-lock, flags); return -1; -- 1.5.3.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http
[PATCH -mm 4/4] alpha: remove unused DEBUG_FORCEDAC define in IOMMU
This just removes unused DEBUG_FORCEDAC define in the IOMMU code. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Cc: Richard Henderson [EMAIL PROTECTED] Cc: Ivan Kokshaysky [EMAIL PROTECTED] Cc: Andrew Morton [EMAIL PROTECTED] --- arch/alpha/kernel/pci_iommu.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c index 54540c3..be6fa10 100644 --- a/arch/alpha/kernel/pci_iommu.c +++ b/arch/alpha/kernel/pci_iommu.c @@ -31,7 +31,6 @@ #endif #define DEBUG_NODIRECT 0 -#define DEBUG_FORCEDAC 0 #define ISA_DMA_MASK 0x00ff -- 1.5.3.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: more iommu sg merging fallout
On Sun, 17 Feb 2008 23:41:42 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote: > From: FUJITA Tomonori <[EMAIL PROTECTED]> > Date: Sat, 16 Feb 2008 15:03:43 +0900 > > > [PATCH] sparc64: make IOMMU code respect the segment boundary limits > > > > Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]> > > Looks good, but I think it will break sound for some ALI chips. > > Please see arch/sparc64/kernel/pci.c:ali_sound_dma_hack() > and it's caller pci_dma_supported(). Could you explain the problem a little more? The shift argument is only used as an offset when iommu-helper decides whether a memory area (index plus npages) spanning LLD's segment boudnary size or not. For example, if a device's segment boudary size is 64K, the helper see the following value is larger than 64K or not: ((the offset + index of the IOMMU table) ((64K / 8K) - 1) + npages -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: more iommu sg merging fallout
On Sun, 17 Feb 2008 23:41:42 -0800 (PST) David Miller [EMAIL PROTECTED] wrote: From: FUJITA Tomonori [EMAIL PROTECTED] Date: Sat, 16 Feb 2008 15:03:43 +0900 [PATCH] sparc64: make IOMMU code respect the segment boundary limits Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Looks good, but I think it will break sound for some ALI chips. Please see arch/sparc64/kernel/pci.c:ali_sound_dma_hack() and it's caller pci_dma_supported(). Could you explain the problem a little more? The shift argument is only used as an offset when iommu-helper decides whether a memory area (index plus npages) spanning LLD's segment boudnary size or not. For example, if a device's segment boudary size is 64K, the helper see the following value is larger than 64K or not: ((the offset + index of the IOMMU table) ((64K / 8K) - 1) + npages -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] iommu-helper: segment boundary limit should be a power of 2
The segment boundary limit should be a power of 2 so let's make it clear. Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]> --- lib/iommu-helper.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c index 495575a..de0eced 100644 --- a/lib/iommu-helper.c +++ b/lib/iommu-helper.c @@ -54,6 +54,8 @@ unsigned long iommu_area_alloc(unsigned long *map, unsigned long size, unsigned long align_mask) { unsigned long index; + + BUG_ON(!is_power_of_2(boundary_size)); again: index = find_next_zero_area(map, size, start, nr, align_mask); if (index != -1) { -- 1.5.3.7 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] iommu-helper: segment boundary limit should be a power of 2
The segment boundary limit should be a power of 2 so let's make it clear. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- lib/iommu-helper.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c index 495575a..de0eced 100644 --- a/lib/iommu-helper.c +++ b/lib/iommu-helper.c @@ -54,6 +54,8 @@ unsigned long iommu_area_alloc(unsigned long *map, unsigned long size, unsigned long align_mask) { unsigned long index; + + BUG_ON(!is_power_of_2(boundary_size)); again: index = find_next_zero_area(map, size, start, nr, align_mask); if (index != -1) { -- 1.5.3.7 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: more iommu sg merging fallout
Sorry for the late response, On Mon, 11 Feb 2008 21:40:36 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote: > From: FUJITA Tomonori <[EMAIL PROTECTED]> > Date: Thu, 07 Feb 2008 10:38:01 +0900 > > > Great, thanks! SPARC IOMMUs use bitmap for the free area management > > like POWER IOMMUs so it could use lib/iommu-helper as POWER does. > > Please look at Linus's current tree, I believe I have things > in a working state, and the DMA mask portions should be easier > to add now. Thanks! It looks great. iommu->page_table_map_base is on IO_PAGE_SIZE boundary? If so, the following patch works, I think (sorry, it's only compile tested again). Thanks, = >From 91af83802c30d071f98444846e85310a8d58ab3d Mon Sep 17 00:00:00 2001 From: FUJITA Tomonori <[EMAIL PROTECTED]> Date: Sat, 16 Feb 2008 14:29:02 +0900 Subject: [PATCH] sparc64: make IOMMU code respect the segment boundary limits Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]> --- arch/sparc64/kernel/iommu.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/arch/sparc64/kernel/iommu.c b/arch/sparc64/kernel/iommu.c index d3276eb..ffefbf7 100644 --- a/arch/sparc64/kernel/iommu.c +++ b/arch/sparc64/kernel/iommu.c @@ -134,7 +134,8 @@ unsigned long iommu_range_alloc(struct device *dev, else boundary_size = ALIGN(1UL << 32, 1 << IO_PAGE_SHIFT); - n = iommu_area_alloc(arena->map, limit, start, npages, 0, + n = iommu_area_alloc(arena->map, limit, start, npages, +iommu->page_table_map_base >> IO_PAGE_SHIFT, boundary_size >> IO_PAGE_SHIFT, 0); if (n == -1) { if (likely(pass < 1)) { -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: more iommu sg merging fallout
Sorry for the late response, On Mon, 11 Feb 2008 21:40:36 -0800 (PST) David Miller [EMAIL PROTECTED] wrote: From: FUJITA Tomonori [EMAIL PROTECTED] Date: Thu, 07 Feb 2008 10:38:01 +0900 Great, thanks! SPARC IOMMUs use bitmap for the free area management like POWER IOMMUs so it could use lib/iommu-helper as POWER does. Please look at Linus's current tree, I believe I have things in a working state, and the DMA mask portions should be easier to add now. Thanks! It looks great. iommu-page_table_map_base is on IO_PAGE_SIZE boundary? If so, the following patch works, I think (sorry, it's only compile tested again). Thanks, = From 91af83802c30d071f98444846e85310a8d58ab3d Mon Sep 17 00:00:00 2001 From: FUJITA Tomonori [EMAIL PROTECTED] Date: Sat, 16 Feb 2008 14:29:02 +0900 Subject: [PATCH] sparc64: make IOMMU code respect the segment boundary limits Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- arch/sparc64/kernel/iommu.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/arch/sparc64/kernel/iommu.c b/arch/sparc64/kernel/iommu.c index d3276eb..ffefbf7 100644 --- a/arch/sparc64/kernel/iommu.c +++ b/arch/sparc64/kernel/iommu.c @@ -134,7 +134,8 @@ unsigned long iommu_range_alloc(struct device *dev, else boundary_size = ALIGN(1UL 32, 1 IO_PAGE_SHIFT); - n = iommu_area_alloc(arena-map, limit, start, npages, 0, + n = iommu_area_alloc(arena-map, limit, start, npages, +iommu-page_table_map_base IO_PAGE_SHIFT, boundary_size IO_PAGE_SHIFT, 0); if (n == -1) { if (likely(pass 1)) { -- 1.5.3.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6.24 REGRESSION] BUG: Soft lockup - with VFS
On Fri, 8 Feb 2008 23:46:19 -0800 Pete Zaitcev <[EMAIL PROTECTED]> wrote: > On Tue, 5 Feb 2008 14:05:06 -0800, Andrew Morton <[EMAIL PROTECTED]> wrote: > > > > > http://students.zipernowsky.hu/~oliverp/kernel/regression_2624/ > > > I think ub.c is basically abandoned in favour of usb-storage. > > If so, perhaps we should remove or disble ub.c? > > Looks like it's just Tomo or Jens made a mistake when converting to > the new s/g API. Nothing to be too concerned about. I know I should've > reviewed their patch closer, but it seemed too simple... I guess I can put the blame for this on Jens' commit (45711f1a) ;) On a serious note, it seems that two scatter lists per request leaded to this bug. Can the scatter list in struct ub_request be removed? Thanks, > -- Pete > > Fix up the conversion to sg_init_table(). > > Signed-off-by: Pete Zaitcev <[EMAIL PROTECTED]> > > --- a/drivers/block/ub.c > +++ b/drivers/block/ub.c > @@ -657,7 +657,6 @@ static int ub_request_fn_1(struct ub_lun *lun, struct > request *rq) > if ((cmd = ub_get_cmd(lun)) == NULL) > return -1; > memset(cmd, 0, sizeof(struct ub_scsi_cmd)); > - sg_init_table(cmd->sgv, UB_MAX_REQ_SG); > > blkdev_dequeue_request(rq); > > @@ -668,6 +667,7 @@ static int ub_request_fn_1(struct ub_lun *lun, struct > request *rq) > /* >* get scatterlist from block layer >*/ > + sg_init_table(>sgv[0], UB_MAX_REQ_SG); > n_elem = blk_rq_map_sg(lun->disk->queue, rq, >sgv[0]); > if (n_elem < 0) { > /* Impossible, because blk_rq_map_sg should not hit ENOMEM. */ > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6.24 REGRESSION] BUG: Soft lockup - with VFS
On Fri, 8 Feb 2008 23:46:19 -0800 Pete Zaitcev [EMAIL PROTECTED] wrote: On Tue, 5 Feb 2008 14:05:06 -0800, Andrew Morton [EMAIL PROTECTED] wrote: http://students.zipernowsky.hu/~oliverp/kernel/regression_2624/ I think ub.c is basically abandoned in favour of usb-storage. If so, perhaps we should remove or disble ub.c? Looks like it's just Tomo or Jens made a mistake when converting to the new s/g API. Nothing to be too concerned about. I know I should've reviewed their patch closer, but it seemed too simple... I guess I can put the blame for this on Jens' commit (45711f1a) ;) On a serious note, it seems that two scatter lists per request leaded to this bug. Can the scatter list in struct ub_request be removed? Thanks, -- Pete Fix up the conversion to sg_init_table(). Signed-off-by: Pete Zaitcev [EMAIL PROTECTED] --- a/drivers/block/ub.c +++ b/drivers/block/ub.c @@ -657,7 +657,6 @@ static int ub_request_fn_1(struct ub_lun *lun, struct request *rq) if ((cmd = ub_get_cmd(lun)) == NULL) return -1; memset(cmd, 0, sizeof(struct ub_scsi_cmd)); - sg_init_table(cmd-sgv, UB_MAX_REQ_SG); blkdev_dequeue_request(rq); @@ -668,6 +667,7 @@ static int ub_request_fn_1(struct ub_lun *lun, struct request *rq) /* * get scatterlist from block layer */ + sg_init_table(urq-sgv[0], UB_MAX_REQ_SG); n_elem = blk_rq_map_sg(lun-disk-queue, rq, urq-sgv[0]); if (n_elem 0) { /* Impossible, because blk_rq_map_sg should not hit ENOMEM. */ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PATCH] final SCSI updates for 2.6.24 merge window
On Thu, 07 Feb 2008 19:37:07 -0600 James Bottomley <[EMAIL PROTECTED]> wrote: > > On Thu, 2008-02-07 at 18:56 -0600, James Bottomley wrote: > > Quite a bit of this is fixing things broken previously (the advansys fix > > is still pending resolution, but I'll send it as an -rc fix when we have > > it). There's the final elimination of all drivers that are esp based > > but don't use the scsi_esp core (that's mostly m68k and alpha). Plus > > the usual bunch of driver updates and the addition of a new enclosure > > services driver and the corresponding ULD. > > OK, the advansys fix came in. I've added it to the patch. > > James > > > > >From f983323fea178352ed3b69c70561a13825a3ce59 Mon Sep 17 00:00:00 2001 > From: FUJITA Tomonori <[EMAIL PROTECTED]> > Date: Fri, 8 Feb 2008 09:50:08 +0900 > Subject: [SCSI] advansys: fix overrun_buf aligned bug Seems that it was a bit late, Linus pulled scsi-misc before the patch was added. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Latest git oopses during boot
On Thu, 7 Feb 2008 23:24:00 +0100 "Harald Arnesen" <[EMAIL PROTECTED]> wrote: > On 2/7/08, Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > > > > On Thu, 7 Feb 2008, Harald Arnesen wrote: > > > > > > I'll try applying the patch to a freshly downloaded git-tree. > > > > Ok, good. > > > > > Shall I try another compiler? I have at least these two: > > > > > > gcc version 3.4.6 (Ubuntu 3.4.6-6ubuntu2) > > > gcc version 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2) > > > > I would suggest a patch mis-application problem first (or possibly even > > the patch itself being broken - I simply didn't look very closely at the > > patch, but it *looked* ok). > > > > If it's a compiler bug, it's a pretty big one, and quite frankly, I doubt > > it. Compiler bugs do happen, but they are pretty rare, and they tend to > > have more subtle effects than the one you see. > > > > However: > > > > > in addition to the self-compiled 4.2.3 I used for the tests. > > > > 4.2.3? Really? That's pretty damn recent, and so almost totally untested. > > That does make a compiler bug at least more likely. > > > > So yes, if you already have other compilers installed, you should try > > them. If it really is a compiler bug, it's a really bad one, and you would > > want to let the gcc people know. > > > > Still, I'd double-check that the > > > > asc_dvc_varp->overrun_buf = kzalloc(ASC_OVERRUN_BSIZE, GFP_KERNEL); > > > > line was added properly first. You should see it way after the point where > > it did > > > > asc_dvc_varp = >dvc_var.asc_dvc_var; > > > > to initialize it (and both statements should be inside a > > > > if (ASC_NARROW_BOARD(boardp)) { > > > > conditional - please check that the source code looks sane too). > > > > Linus > > I just re-downloaded an re-patched and re-compiled (with gcc 4.2.3), > and now the kernel boots. I must have screwed up the previous > patching. > > It now works, with Fujita's patch applied. Thanks Harald and Linus, The bug has been in the advansys driver. 2.6.23 and 2.6.24 works just because the size of Scsi_Host structure was multiples of 8. After 2.6.24, some patches change Scsi_Host structure and now the size is not multiples of 8. So we hit this bug. I'll resend the patch with a proper description. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Latest git oopses during boot
On Thu, 7 Feb 2008 12:14:56 +0100 "Harald Arnesen" <[EMAIL PROTECTED]> wrote: > On 2/7/08, Andrew Morton <[EMAIL PROTECTED]> wrote: > > > > (cc's restored, and expanded a bit) > > Ah, sorry, not used to gmail's web interface. Clicked the wrong button. > > > > Seems to be the advansys driver, so I tried to remove it - and indeed, > > > the kernel now boots. So I guess it's either that driver or my ancient > > > Nikon Coolscan II that is the only thing attached to the board. > > > > Thanks. I uploaded the oops picture to > > http://userweb.kernel.org/~akpm/oops.jpg > > > > > Cc to the Matthew Wilcox added. > > > > mm... looks like all Matthew's changes were in 2.6.23. And 2.6.23 worked > > OK, yes? > > Both 2.6.23 and 2.6.24 are ok. > > > The only recent changes to drivers/scsi/advansys.c are > > > > commit b80ca4f7ee36c26d300c5a8f429e73372d153379 > > Author: FUJITA Tomonori <[EMAIL PROTECTED]> > > Date: Sun Jan 13 15:46:13 2008 +0900 > > > > [SCSI] replace sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE > > > > This replaces sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE in > > several LLDs. It's a preparation for the future changes to remove > > sense_buffer array in scsi_cmnd structure. > > > > Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]> > > Signed-off-by: James Bottomley <[EMAIL PROTECTED]> > > > > :100644 100644 9dd3952... 492702b... M drivers/scsi/advansys.c > > > > commit 747d016e7e25e216b31022fe2b012508d99fb682 > > Author: Randy Dunlap <[EMAIL PROTECTED]> > > Date: Mon Jan 14 00:55:18 2008 -0800 > > > > advansys: fix section mismatch warning > > > > Fix section mismatch warning: > > > > WARNING: vmlinux.o(.exit.text+0x152a): Section mismatch: reference to > > .init. > > > > Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]> > > Cc: Matthew Wilcox <[EMAIL PROTECTED]> > > Cc: James Bottomley <[EMAIL PROTECTED]> > > Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> > > Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]> > > > > which seem fairly benign. > > > > > > gcc inlining is going to make it rather a lot of work to find out which > > statement has actually oopsed there. > -- Can you try this? Thanks, diff --git a/drivers/scsi/advansys.c b/drivers/scsi/advansys.c index 374ed02..f5dde12 100644 --- a/drivers/scsi/advansys.c +++ b/drivers/scsi/advansys.c @@ -566,7 +566,7 @@ typedef struct asc_dvc_var { ASC_SCSI_BIT_ID_TYPE unit_not_ready; ASC_SCSI_BIT_ID_TYPE queue_full_or_busy; ASC_SCSI_BIT_ID_TYPE start_motor; - uchar overrun_buf[ASC_OVERRUN_BSIZE] __aligned(8); + uchar *overrun_buf; dma_addr_t overrun_dma; uchar scsi_reset_wait; uchar chip_no; @@ -13833,6 +13833,12 @@ static int __devinit advansys_board_found(struct Scsi_Host *shost, */ if (ASC_NARROW_BOARD(boardp)) { ASC_DBG(2, "AscInitAsc1000Driver()\n"); + + asc_dvc_varp->overrun_buf = kzalloc(ASC_OVERRUN_BSIZE, GFP_KERNEL); + if (!asc_dvc_varp->overrun_buf) { + ret = -ENOMEM; + goto err_free_wide_mem; + } warn_code = AscInitAsc1000Driver(asc_dvc_varp); if (warn_code || asc_dvc_varp->err_code) { @@ -13840,8 +13846,10 @@ static int __devinit advansys_board_found(struct Scsi_Host *shost, "warn 0x%x, error 0x%x\n", asc_dvc_varp->init_state, warn_code, asc_dvc_varp->err_code); - if (asc_dvc_varp->err_code) + if (asc_dvc_varp->err_code) { ret = -ENODEV; + kfree(asc_dvc_varp->overrun_buf); + } } } else { if (advansys_wide_init_chip(shost)) @@ -13891,9 +13899,11 @@ static int advansys_release(struct Scsi_Host *shost) free_dma(shost->dma_channel); } if (ASC_NARROW_BOARD(board)) { + ASC_DVC_VAR *asc_dvc_varp = >dvc_var.asc_dvc_var; dma_unmap_single(board->dev, board->dvc_var.asc_dvc_var.overrun_dma, ASC_OVERRUN_BSIZE, DMA_FROM_DEVICE); + kfree(asc_dvc_varp->overrun_buf); } else { iounmap(board->ioremap_addr); advansys_wide_free_mem(board); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Latest git oopses during boot
On Thu, 7 Feb 2008 12:14:56 +0100 Harald Arnesen [EMAIL PROTECTED] wrote: On 2/7/08, Andrew Morton [EMAIL PROTECTED] wrote: (cc's restored, and expanded a bit) Ah, sorry, not used to gmail's web interface. Clicked the wrong button. Seems to be the advansys driver, so I tried to remove it - and indeed, the kernel now boots. So I guess it's either that driver or my ancient Nikon Coolscan II that is the only thing attached to the board. Thanks. I uploaded the oops picture to http://userweb.kernel.org/~akpm/oops.jpg Cc to the Matthew Wilcox added. mm... looks like all Matthew's changes were in 2.6.23. And 2.6.23 worked OK, yes? Both 2.6.23 and 2.6.24 are ok. The only recent changes to drivers/scsi/advansys.c are commit b80ca4f7ee36c26d300c5a8f429e73372d153379 Author: FUJITA Tomonori [EMAIL PROTECTED] Date: Sun Jan 13 15:46:13 2008 +0900 [SCSI] replace sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE This replaces sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE in several LLDs. It's a preparation for the future changes to remove sense_buffer array in scsi_cmnd structure. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Signed-off-by: James Bottomley [EMAIL PROTECTED] :100644 100644 9dd3952... 492702b... M drivers/scsi/advansys.c commit 747d016e7e25e216b31022fe2b012508d99fb682 Author: Randy Dunlap [EMAIL PROTECTED] Date: Mon Jan 14 00:55:18 2008 -0800 advansys: fix section mismatch warning Fix section mismatch warning: WARNING: vmlinux.o(.exit.text+0x152a): Section mismatch: reference to .init. Signed-off-by: Randy Dunlap [EMAIL PROTECTED] Cc: Matthew Wilcox [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] Signed-off-by: Linus Torvalds [EMAIL PROTECTED] which seem fairly benign. gcc inlining is going to make it rather a lot of work to find out which statement has actually oopsed there. -- Can you try this? Thanks, diff --git a/drivers/scsi/advansys.c b/drivers/scsi/advansys.c index 374ed02..f5dde12 100644 --- a/drivers/scsi/advansys.c +++ b/drivers/scsi/advansys.c @@ -566,7 +566,7 @@ typedef struct asc_dvc_var { ASC_SCSI_BIT_ID_TYPE unit_not_ready; ASC_SCSI_BIT_ID_TYPE queue_full_or_busy; ASC_SCSI_BIT_ID_TYPE start_motor; - uchar overrun_buf[ASC_OVERRUN_BSIZE] __aligned(8); + uchar *overrun_buf; dma_addr_t overrun_dma; uchar scsi_reset_wait; uchar chip_no; @@ -13833,6 +13833,12 @@ static int __devinit advansys_board_found(struct Scsi_Host *shost, */ if (ASC_NARROW_BOARD(boardp)) { ASC_DBG(2, AscInitAsc1000Driver()\n); + + asc_dvc_varp-overrun_buf = kzalloc(ASC_OVERRUN_BSIZE, GFP_KERNEL); + if (!asc_dvc_varp-overrun_buf) { + ret = -ENOMEM; + goto err_free_wide_mem; + } warn_code = AscInitAsc1000Driver(asc_dvc_varp); if (warn_code || asc_dvc_varp-err_code) { @@ -13840,8 +13846,10 @@ static int __devinit advansys_board_found(struct Scsi_Host *shost, warn 0x%x, error 0x%x\n, asc_dvc_varp-init_state, warn_code, asc_dvc_varp-err_code); - if (asc_dvc_varp-err_code) + if (asc_dvc_varp-err_code) { ret = -ENODEV; + kfree(asc_dvc_varp-overrun_buf); + } } } else { if (advansys_wide_init_chip(shost)) @@ -13891,9 +13899,11 @@ static int advansys_release(struct Scsi_Host *shost) free_dma(shost-dma_channel); } if (ASC_NARROW_BOARD(board)) { + ASC_DVC_VAR *asc_dvc_varp = board-dvc_var.asc_dvc_var; dma_unmap_single(board-dev, board-dvc_var.asc_dvc_var.overrun_dma, ASC_OVERRUN_BSIZE, DMA_FROM_DEVICE); + kfree(asc_dvc_varp-overrun_buf); } else { iounmap(board-ioremap_addr); advansys_wide_free_mem(board); -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Latest git oopses during boot
On Thu, 7 Feb 2008 23:24:00 +0100 Harald Arnesen [EMAIL PROTECTED] wrote: On 2/7/08, Linus Torvalds [EMAIL PROTECTED] wrote: On Thu, 7 Feb 2008, Harald Arnesen wrote: I'll try applying the patch to a freshly downloaded git-tree. Ok, good. Shall I try another compiler? I have at least these two: gcc version 3.4.6 (Ubuntu 3.4.6-6ubuntu2) gcc version 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2) I would suggest a patch mis-application problem first (or possibly even the patch itself being broken - I simply didn't look very closely at the patch, but it *looked* ok). If it's a compiler bug, it's a pretty big one, and quite frankly, I doubt it. Compiler bugs do happen, but they are pretty rare, and they tend to have more subtle effects than the one you see. However: in addition to the self-compiled 4.2.3 I used for the tests. 4.2.3? Really? That's pretty damn recent, and so almost totally untested. That does make a compiler bug at least more likely. So yes, if you already have other compilers installed, you should try them. If it really is a compiler bug, it's a really bad one, and you would want to let the gcc people know. Still, I'd double-check that the asc_dvc_varp-overrun_buf = kzalloc(ASC_OVERRUN_BSIZE, GFP_KERNEL); line was added properly first. You should see it way after the point where it did asc_dvc_varp = boardp-dvc_var.asc_dvc_var; to initialize it (and both statements should be inside a if (ASC_NARROW_BOARD(boardp)) { conditional - please check that the source code looks sane too). Linus I just re-downloaded an re-patched and re-compiled (with gcc 4.2.3), and now the kernel boots. I must have screwed up the previous patching. It now works, with Fujita's patch applied. Thanks Harald and Linus, The bug has been in the advansys driver. 2.6.23 and 2.6.24 works just because the size of Scsi_Host structure was multiples of 8. After 2.6.24, some patches change Scsi_Host structure and now the size is not multiples of 8. So we hit this bug. I'll resend the patch with a proper description. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PATCH] final SCSI updates for 2.6.24 merge window
On Thu, 07 Feb 2008 19:37:07 -0600 James Bottomley [EMAIL PROTECTED] wrote: On Thu, 2008-02-07 at 18:56 -0600, James Bottomley wrote: Quite a bit of this is fixing things broken previously (the advansys fix is still pending resolution, but I'll send it as an -rc fix when we have it). There's the final elimination of all drivers that are esp based but don't use the scsi_esp core (that's mostly m68k and alpha). Plus the usual bunch of driver updates and the addition of a new enclosure services driver and the corresponding ULD. OK, the advansys fix came in. I've added it to the patch. James From f983323fea178352ed3b69c70561a13825a3ce59 Mon Sep 17 00:00:00 2001 From: FUJITA Tomonori [EMAIL PROTECTED] Date: Fri, 8 Feb 2008 09:50:08 +0900 Subject: [SCSI] advansys: fix overrun_buf aligned bug Seems that it was a bit late, Linus pulled scsi-misc before the patch was added. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: more iommu sg merging fallout
On Wed, 06 Feb 2008 16:01:47 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote: > From: FUJITA Tomonori <[EMAIL PROTECTED]> > Date: Thu, 07 Feb 2008 08:53:33 +0900 > > > On Wed, 06 Feb 2008 15:18:55 -0800 (PST) > > David Miller <[EMAIL PROTECTED]> wrote: > > > > > I intend to put merging back in, perhaps something similar to > > > powerpc's merging logic but without the expensive (in my opinion) > > > IOMMU allocation every loop. I think it is better to determine the > > > segment breaks in one pass, allocate that many IOMMU entries in one > > > allocation, then fill them all in. > > > > I thought about asking you if I can modify the SPARC IOMMUs to do > > allocation in every loop. > > > > The reason why I need the allocation in every loop is that I also need > > to fix the problem that IOMMUs allocate memory areas without > > considering a low level driver's segment boundary limits. > > > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-11/msg07616.html > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-12/msg02286.html > > > > As far as I know, all the IOMMUs except for SPARC allocate a free area > > in every loop but if it's too expensive for SPARC, then we need to > > find a different way to handle segment boundary limits. > > Ok then what I'll do is adopt some version of powerpc's merging > allocator into the sparc64 code. Great, thanks! SPARC IOMMUs use bitmap for the free area management like POWER IOMMUs so it could use lib/iommu-helper as POWER does. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: more iommu sg merging fallout
On Wed, 06 Feb 2008 15:18:55 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote: > From: FUJITA Tomonori <[EMAIL PROTECTED]> > Date: Thu, 07 Feb 2008 08:12:36 +0900 > > > Really sorry about it. > > I am happy to test patches you send to me in the future :-) Thanks a lot. > > PARISC, Alpha, and IA64 IOMMUs use the two-pass algorithm like SPARC > > but their first pass decides how to merge sg entires (and stores that > > information in the sg entries), then the second pass simpliy follows > > it (Hopefully I understand these IOMMUs correctly, or else I break > > them too). > > For now I've removed all of the merging code from the sparc64 IOMMU > support so that other users do not get corrupt filesystems. It > basically mimicks how the intel-iommu code works, ie. no attempts to > merge anything. I've just saw it. > I intend to put merging back in, perhaps something similar to > powerpc's merging logic but without the expensive (in my opinion) > IOMMU allocation every loop. I think it is better to determine the > segment breaks in one pass, allocate that many IOMMU entries in one > allocation, then fill them all in. I thought about asking you if I can modify the SPARC IOMMUs to do allocation in every loop. The reason why I need the allocation in every loop is that I also need to fix the problem that IOMMUs allocate memory areas without considering a low level driver's segment boundary limits. http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-11/msg07616.html http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-12/msg02286.html As far as I know, all the IOMMUs except for SPARC allocate a free area in every loop but if it's too expensive for SPARC, then we need to find a different way to handle segment boundary limits. > Ideally, we should have some generic code that does all of this. > Then you would only need to test one implementation. > > It is definitely doable and increasingly necessary as we have so > many reimplementations of what is essentially identical core code. Agreed though it's a very hard task. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Ordering of commits and breaking bisecting
On Wed, 6 Feb 2008 09:38:17 +1100 Stephen Rothwell <[EMAIL PROTECTED]> wrote: > Hi all, > > Can we be a bit more careful with the ordering of commits, please? > Commit d22a6966b8029913fac37d078ab2403898d94c63 (iommu sg merging: add > accessors for segment_boundary_mask in device_dma_parameters()) > introduces the dma_get_seg_boundary accessor which is used in the > *preceding* commits fb3475e9b6bfa666107512fbd6006c26014f04b8 (for > powerpc), and 1b39b077789955c8389488d53d075518fdcd582e and > fde9a1094ddf2892188a8a0eccda527de47cba8e (for x86). > > Thus we have another set of commits that will break bisection. This one > was easily avoidable. > > A little more care is needed. Sorry about it. I submitted the patches in the proper order but the order seemed to be reversed in the -mm. http://www.mail-archive.com/[EMAIL PROTECTED]/msg11919.html I could have noticed the issue since it has been in the -mm for a long time, sorry. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: more iommu sg merging fallout
On Tue, 05 Feb 2008 20:41:38 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote: > > The sparc64 change: > > commit fde6a3c82d67f592eb587be4d1b0ae6d4321 > Author: FUJITA Tomonori <[EMAIL PROTECTED]> > Date: Mon Feb 4 22:28:02 2008 -0800 > > iommu sg merging: sparc64: make iommu respect the segment size limits > > This patch makes iommu respect segment size limits when merging sg > lists. > > Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]> > Cc: Jeff Garzik <[EMAIL PROTECTED]> > Cc: James Bottomley <[EMAIL PROTECTED]> > Acked-by: Jens Axboe <[EMAIL PROTECTED]> > Cc: "David S. Miller" <[EMAIL PROTECTED]> > Signed-off-by: Andrew Morton <[EMAIL PROTECTED]> > Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]> > > has significant errors and is going to eat people's disks, as it just > nearly did to mine. Really sorry about it. > Typically what you'll see are NULL pointer derefernces in > dma_4v_map_sg() and dma_4u_map_sg() and then the kernel usually craps > on your superblock very shortly thereafter. > > The changeset above modified only prepare_sg() but that is only the > first pass of the SG mapping algorithm of the sparc64 IOMMU layer. > > The second pass that fills in the entries depends upon how the first > pass does things. So if you change the first pass decision making you > have to update the second pass's as well. > > That second pass is implemented in fill_sg() (there is a version in > both arch/sparc64/kernel/iommu.c and arch/sparc64/kernel/pci_sun4v.c), > which probably needs new logic as was added to prepare_sg() to handle > dma_get_max_seg_size(). PARISC, Alpha, and IA64 IOMMUs use the two-pass algorithm like SPARC but their first pass decides how to merge sg entires (and stores that information in the sg entries), then the second pass simpliy follows it (Hopefully I understand these IOMMUs correctly, or else I break them too). I'll try this work again for 2.6.26. The SPARC IOMMUs are the most complicated and seems that few people tests the -mm SPARC code. So I guess that I need to find a SPARC box... Thanks, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: more iommu sg merging fallout
On Tue, 05 Feb 2008 20:41:38 -0800 (PST) David Miller [EMAIL PROTECTED] wrote: The sparc64 change: commit fde6a3c82d67f592eb587be4d1b0ae6d4321 Author: FUJITA Tomonori [EMAIL PROTECTED] Date: Mon Feb 4 22:28:02 2008 -0800 iommu sg merging: sparc64: make iommu respect the segment size limits This patch makes iommu respect segment size limits when merging sg lists. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] Cc: Jeff Garzik [EMAIL PROTECTED] Cc: James Bottomley [EMAIL PROTECTED] Acked-by: Jens Axboe [EMAIL PROTECTED] Cc: David S. Miller [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] Signed-off-by: Linus Torvalds [EMAIL PROTECTED] has significant errors and is going to eat people's disks, as it just nearly did to mine. Really sorry about it. Typically what you'll see are NULL pointer derefernces in dma_4v_map_sg() and dma_4u_map_sg() and then the kernel usually craps on your superblock very shortly thereafter. The changeset above modified only prepare_sg() but that is only the first pass of the SG mapping algorithm of the sparc64 IOMMU layer. The second pass that fills in the entries depends upon how the first pass does things. So if you change the first pass decision making you have to update the second pass's as well. That second pass is implemented in fill_sg() (there is a version in both arch/sparc64/kernel/iommu.c and arch/sparc64/kernel/pci_sun4v.c), which probably needs new logic as was added to prepare_sg() to handle dma_get_max_seg_size(). PARISC, Alpha, and IA64 IOMMUs use the two-pass algorithm like SPARC but their first pass decides how to merge sg entires (and stores that information in the sg entries), then the second pass simpliy follows it (Hopefully I understand these IOMMUs correctly, or else I break them too). I'll try this work again for 2.6.26. The SPARC IOMMUs are the most complicated and seems that few people tests the -mm SPARC code. So I guess that I need to find a SPARC box... Thanks, -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Ordering of commits and breaking bisecting
On Wed, 6 Feb 2008 09:38:17 +1100 Stephen Rothwell [EMAIL PROTECTED] wrote: Hi all, Can we be a bit more careful with the ordering of commits, please? Commit d22a6966b8029913fac37d078ab2403898d94c63 (iommu sg merging: add accessors for segment_boundary_mask in device_dma_parameters()) introduces the dma_get_seg_boundary accessor which is used in the *preceding* commits fb3475e9b6bfa666107512fbd6006c26014f04b8 (for powerpc), and 1b39b077789955c8389488d53d075518fdcd582e and fde9a1094ddf2892188a8a0eccda527de47cba8e (for x86). Thus we have another set of commits that will break bisection. This one was easily avoidable. A little more care is needed. Sorry about it. I submitted the patches in the proper order but the order seemed to be reversed in the -mm. http://www.mail-archive.com/[EMAIL PROTECTED]/msg11919.html I could have noticed the issue since it has been in the -mm for a long time, sorry. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: more iommu sg merging fallout
On Wed, 06 Feb 2008 16:01:47 -0800 (PST) David Miller [EMAIL PROTECTED] wrote: From: FUJITA Tomonori [EMAIL PROTECTED] Date: Thu, 07 Feb 2008 08:53:33 +0900 On Wed, 06 Feb 2008 15:18:55 -0800 (PST) David Miller [EMAIL PROTECTED] wrote: I intend to put merging back in, perhaps something similar to powerpc's merging logic but without the expensive (in my opinion) IOMMU allocation every loop. I think it is better to determine the segment breaks in one pass, allocate that many IOMMU entries in one allocation, then fill them all in. I thought about asking you if I can modify the SPARC IOMMUs to do allocation in every loop. The reason why I need the allocation in every loop is that I also need to fix the problem that IOMMUs allocate memory areas without considering a low level driver's segment boundary limits. http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-11/msg07616.html http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-12/msg02286.html As far as I know, all the IOMMUs except for SPARC allocate a free area in every loop but if it's too expensive for SPARC, then we need to find a different way to handle segment boundary limits. Ok then what I'll do is adopt some version of powerpc's merging allocator into the sparc64 code. Great, thanks! SPARC IOMMUs use bitmap for the free area management like POWER IOMMUs so it could use lib/iommu-helper as POWER does. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Tue, 05 Feb 2008 18:09:15 +0100 Matteo Tescione <[EMAIL PROTECTED]> wrote: > On 5-02-2008 14:38, "FUJITA Tomonori" <[EMAIL PROTECTED]> wrote: > > > On Tue, 05 Feb 2008 08:14:01 +0100 > > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote: > > > >> James Bottomley schrieb: > >> > >>> These are both features being independently worked on, are they not? > >>> Even if they weren't, the combination of the size of SCST in kernel plus > >>> the problem of having to find a migration path for the current STGT > >>> users still looks to me to involve the greater amount of work. > >> > >> I don't want to be mean, but does anyone actually use STGT in > >> production? Seriously? > >> > >> In the latest development version of STGT, it's only possible to stop > >> the tgtd target daemon using KILL / 9 signal - which also means all > >> iSCSI initiator connections are corrupted when tgtd target daemon is > >> started again (kernel upgrade, target daemon upgrade, server reboot etc.). > > > > I don't know what "iSCSI initiator connections are corrupted" > > mean. But if you reboot a server, how can an iSCSI target > > implementation keep iSCSI tcp connections? > > > > > >> Imagine you have to reboot all your NFS clients when you reboot your NFS > >> server. Not only that - your data is probably corrupted, or at least the > >> filesystem deserves checking... > > Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat) > rebooting the primary server doesn't affect my iscsi traffic, SCST correctly > manages stop/crash, by sending unit attention to clients on reconnect. > Drbd+heartbeat correctly manages those things too. > Still from an end-user POV, i was able to reboot/survive a crash only with > SCST, IETD still has reconnect problems and STGT are even worst. Please tell us on stgt-devel mailing list if you see problems. We will try to fix them. Thanks, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Tue, 05 Feb 2008 17:07:07 +0100 Tomasz Chmielewski <[EMAIL PROTECTED]> wrote: > FUJITA Tomonori schrieb: > > On Tue, 05 Feb 2008 08:14:01 +0100 > > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote: > > > >> James Bottomley schrieb: > >> > >>> These are both features being independently worked on, are they not? > >>> Even if they weren't, the combination of the size of SCST in kernel plus > >>> the problem of having to find a migration path for the current STGT > >>> users still looks to me to involve the greater amount of work. > >> I don't want to be mean, but does anyone actually use STGT in > >> production? Seriously? > >> > >> In the latest development version of STGT, it's only possible to stop > >> the tgtd target daemon using KILL / 9 signal - which also means all > >> iSCSI initiator connections are corrupted when tgtd target daemon is > >> started again (kernel upgrade, target daemon upgrade, server reboot etc.). > > > > I don't know what "iSCSI initiator connections are corrupted" > > mean. But if you reboot a server, how can an iSCSI target > > implementation keep iSCSI tcp connections? > > The problem with tgtd is that you can't start it (configured) in an > "atomic" way. > Usually, one will start tgtd and it's configuration in a script (I > replaced some parameters with "..." to make it shorter and more readable): Thanks for the details. So the way to stop the daemon is not related with your problem. It's easily fixable. Can you start a new thread about this on stgt-devel mailing list? When we agree on the interface to start the daemon, I'll implement it. > tgtd > tgtadm --op new ... > tgtadm --lld iscsi --op new ... (snip) > So the only way to start/restart tgtd reliably is to do hacks which are > needed with yet another iSCSI kernel implementation (IET): use iptables. > > iptables > tgtd > sleep 1 > tgtadm --op new ... > tgtadm --lld iscsi --op new ... > iptables > > > A bit ugly, isn't it? > Having to tinker with a firewall in order to start a daemon is by no > means a sign of a well-tested and mature project. > > That's why I asked how many people use stgt in a production environment > - James was worried about a potential migration path for current users. I don't know how many people use stgt in a production environment but I'm not sure that this problem prevents many people from using it in a production environment. You want to reboot a server running target devices while initiators connect to it. Rebooting the target server behind the initiators seldom works. System adminstorators in my workplace reboot storage devices once a year and tell us to shut down the initiator machines that use them before that. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Mon, 4 Feb 2008 20:07:01 -0600 "Chris Weiss" <[EMAIL PROTECTED]> wrote: > On Feb 4, 2008 11:30 AM, Douglas Gilbert <[EMAIL PROTECTED]> wrote: > > Alan Cox wrote: > > >> better. So for example, I personally suspect that ATA-over-ethernet is > > >> way > > >> better than some crazy SCSI-over-TCP crap, but I'm biased for simple and > > >> low-level, and against those crazy SCSI people to begin with. > > > > > > Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP > > > would probably trash iSCSI for latency if nothing else. > > > > And a variant that doesn't do ATA or IP: > > http://www.fcoe.com/ > > > > however, and interestingly enough, the open-fcoe software target > depends on scst (for now anyway) STGT also supports software FCoE target driver though it's still experimental stuff. http://www.mail-archive.com/[EMAIL PROTECTED]/msg12705.html It works in user space like STGT's iSCSI (and iSER) target driver (i.e. no kernel/user space interaction). -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Tue, 05 Feb 2008 05:43:10 +0100 Matteo Tescione <[EMAIL PROTECTED]> wrote: > Hi all, > And sorry for intrusion, i am not a developer but i work everyday with iscsi > and i found it fantastic. > Altough Aoe, Fcoe and so on could be better, we have to look in real world > implementations what is needed *now*, and if we look at vmware world, > virtual iron, microsoft clustering etc, the answer is iSCSI. > And now, SCST is the best open-source iSCSI target. So, from an end-user > point of view, what are the really problems to not integrate scst in the > mainstream kernel? Currently, the best open-source iSCSI target implemenation in Linux is Nicholas's LIO, I guess. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Tue, 05 Feb 2008 08:14:01 +0100 Tomasz Chmielewski <[EMAIL PROTECTED]> wrote: > James Bottomley schrieb: > > > These are both features being independently worked on, are they not? > > Even if they weren't, the combination of the size of SCST in kernel plus > > the problem of having to find a migration path for the current STGT > > users still looks to me to involve the greater amount of work. > > I don't want to be mean, but does anyone actually use STGT in > production? Seriously? > > In the latest development version of STGT, it's only possible to stop > the tgtd target daemon using KILL / 9 signal - which also means all > iSCSI initiator connections are corrupted when tgtd target daemon is > started again (kernel upgrade, target daemon upgrade, server reboot etc.). I don't know what "iSCSI initiator connections are corrupted" mean. But if you reboot a server, how can an iSCSI target implementation keep iSCSI tcp connections? > Imagine you have to reboot all your NFS clients when you reboot your NFS > server. Not only that - your data is probably corrupted, or at least the > filesystem deserves checking... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Tue, 05 Feb 2008 05:43:10 +0100 Matteo Tescione [EMAIL PROTECTED] wrote: Hi all, And sorry for intrusion, i am not a developer but i work everyday with iscsi and i found it fantastic. Altough Aoe, Fcoe and so on could be better, we have to look in real world implementations what is needed *now*, and if we look at vmware world, virtual iron, microsoft clustering etc, the answer is iSCSI. And now, SCST is the best open-source iSCSI target. So, from an end-user point of view, what are the really problems to not integrate scst in the mainstream kernel? Currently, the best open-source iSCSI target implemenation in Linux is Nicholas's LIO, I guess. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Tue, 05 Feb 2008 08:14:01 +0100 Tomasz Chmielewski [EMAIL PROTECTED] wrote: James Bottomley schrieb: These are both features being independently worked on, are they not? Even if they weren't, the combination of the size of SCST in kernel plus the problem of having to find a migration path for the current STGT users still looks to me to involve the greater amount of work. I don't want to be mean, but does anyone actually use STGT in production? Seriously? In the latest development version of STGT, it's only possible to stop the tgtd target daemon using KILL / 9 signal - which also means all iSCSI initiator connections are corrupted when tgtd target daemon is started again (kernel upgrade, target daemon upgrade, server reboot etc.). I don't know what iSCSI initiator connections are corrupted mean. But if you reboot a server, how can an iSCSI target implementation keep iSCSI tcp connections? Imagine you have to reboot all your NFS clients when you reboot your NFS server. Not only that - your data is probably corrupted, or at least the filesystem deserves checking... -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Mon, 4 Feb 2008 20:07:01 -0600 Chris Weiss [EMAIL PROTECTED] wrote: On Feb 4, 2008 11:30 AM, Douglas Gilbert [EMAIL PROTECTED] wrote: Alan Cox wrote: better. So for example, I personally suspect that ATA-over-ethernet is way better than some crazy SCSI-over-TCP crap, but I'm biased for simple and low-level, and against those crazy SCSI people to begin with. Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP would probably trash iSCSI for latency if nothing else. And a variant that doesn't do ATA or IP: http://www.fcoe.com/ however, and interestingly enough, the open-fcoe software target depends on scst (for now anyway) STGT also supports software FCoE target driver though it's still experimental stuff. http://www.mail-archive.com/[EMAIL PROTECTED]/msg12705.html It works in user space like STGT's iSCSI (and iSER) target driver (i.e. no kernel/user space interaction). -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Tue, 05 Feb 2008 17:07:07 +0100 Tomasz Chmielewski [EMAIL PROTECTED] wrote: FUJITA Tomonori schrieb: On Tue, 05 Feb 2008 08:14:01 +0100 Tomasz Chmielewski [EMAIL PROTECTED] wrote: James Bottomley schrieb: These are both features being independently worked on, are they not? Even if they weren't, the combination of the size of SCST in kernel plus the problem of having to find a migration path for the current STGT users still looks to me to involve the greater amount of work. I don't want to be mean, but does anyone actually use STGT in production? Seriously? In the latest development version of STGT, it's only possible to stop the tgtd target daemon using KILL / 9 signal - which also means all iSCSI initiator connections are corrupted when tgtd target daemon is started again (kernel upgrade, target daemon upgrade, server reboot etc.). I don't know what iSCSI initiator connections are corrupted mean. But if you reboot a server, how can an iSCSI target implementation keep iSCSI tcp connections? The problem with tgtd is that you can't start it (configured) in an atomic way. Usually, one will start tgtd and it's configuration in a script (I replaced some parameters with ... to make it shorter and more readable): Thanks for the details. So the way to stop the daemon is not related with your problem. It's easily fixable. Can you start a new thread about this on stgt-devel mailing list? When we agree on the interface to start the daemon, I'll implement it. tgtd tgtadm --op new ... tgtadm --lld iscsi --op new ... (snip) So the only way to start/restart tgtd reliably is to do hacks which are needed with yet another iSCSI kernel implementation (IET): use iptables. iptables block iSCSI traffic tgtd sleep 1 tgtadm --op new ... tgtadm --lld iscsi --op new ... iptables unblock iSCSI traffic A bit ugly, isn't it? Having to tinker with a firewall in order to start a daemon is by no means a sign of a well-tested and mature project. That's why I asked how many people use stgt in a production environment - James was worried about a potential migration path for current users. I don't know how many people use stgt in a production environment but I'm not sure that this problem prevents many people from using it in a production environment. You want to reboot a server running target devices while initiators connect to it. Rebooting the target server behind the initiators seldom works. System adminstorators in my workplace reboot storage devices once a year and tell us to shut down the initiator machines that use them before that. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel
On Tue, 05 Feb 2008 18:09:15 +0100 Matteo Tescione [EMAIL PROTECTED] wrote: On 5-02-2008 14:38, FUJITA Tomonori [EMAIL PROTECTED] wrote: On Tue, 05 Feb 2008 08:14:01 +0100 Tomasz Chmielewski [EMAIL PROTECTED] wrote: James Bottomley schrieb: These are both features being independently worked on, are they not? Even if they weren't, the combination of the size of SCST in kernel plus the problem of having to find a migration path for the current STGT users still looks to me to involve the greater amount of work. I don't want to be mean, but does anyone actually use STGT in production? Seriously? In the latest development version of STGT, it's only possible to stop the tgtd target daemon using KILL / 9 signal - which also means all iSCSI initiator connections are corrupted when tgtd target daemon is started again (kernel upgrade, target daemon upgrade, server reboot etc.). I don't know what iSCSI initiator connections are corrupted mean. But if you reboot a server, how can an iSCSI target implementation keep iSCSI tcp connections? Imagine you have to reboot all your NFS clients when you reboot your NFS server. Not only that - your data is probably corrupted, or at least the filesystem deserves checking... Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat) rebooting the primary server doesn't affect my iscsi traffic, SCST correctly manages stop/crash, by sending unit attention to clients on reconnect. Drbd+heartbeat correctly manages those things too. Still from an end-user POV, i was able to reboot/survive a crash only with SCST, IETD still has reconnect problems and STGT are even worst. Please tell us on stgt-devel mailing list if you see problems. We will try to fix them. Thanks, -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Integration of SCST in the mainstream Linux kernel
On Wed, 30 Jan 2008 14:10:47 +0100 "Bart Van Assche" <[EMAIL PROTECTED]> wrote: > On Jan 30, 2008 11:56 AM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote: > > On Wed, 30 Jan 2008 09:38:04 +0100 > > "Bart Van Assche" <[EMAIL PROTECTED]> wrote: > > > > > > Please specify which parameters you are referring to. As you know I > > > > Sorry, I can't say. I don't know much about iSER. But seems that Pete > > and Robin can get the better I/O performance - line speed ratiwo with > > STGT. > > Robin Humble was using a DDR InfiniBand network, while my tests were > performed with an SDR InfiniBand network. Robin's results can't be > directly compared to my results. I know that you use different hardware. I used 'ratio' word. BTW, you said the performance difference of dio READ is 38% but I think it's 27.3 %, though it's still large. > Pete Wyckoff's results > (http://www.osc.edu/~pw/papers/wyckoff-iser-snapi07-talk.pdf) are hard > to interpret. I have asked Pete which of the numbers in his test can > be compared with what I measured, but Pete did not reply. > > > The version of OpenIB might matters too. For example, Pete said that > > STGT reads loses about 100 MB/s for some transfer sizes for some > > transfer sizes due to the OpenIB version difference or other unclear > > reasons. > > > > http://article.gmane.org/gmane.linux.iscsi.tgt.devel/135 > > Pete wrote about a degradation from 600 MB/s to 500 MB/s for reads > with STGT+iSER. In my tests I measured 589 MB/s for reads (direct > I/O), which matches with the better result obtained by Pete. I don't know he used the same benchmark software so I don't think that we can compare them. All I tried to say is the OFED version might has big effect on the performance. So you might need to find the best one. > Note: the InfiniBand kernel modules I used were those from the > 2.6.22.9 kernel, not from the OFED distribution. I'm talking about a target machine (I think that Pete was also talking about OFED on his target machine). STGT uses OFED libraries, I think. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Integration of SCST in the mainstream Linux kernel
On Wed, 30 Jan 2008 09:38:04 +0100 "Bart Van Assche" <[EMAIL PROTECTED]> wrote: > On Jan 30, 2008 12:32 AM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote: > > > > iSER has parameters to limit the maximum size of RDMA (it needs to > > repeat RDMA with a poor configuration)? > > Please specify which parameters you are referring to. As you know I Sorry, I can't say. I don't know much about iSER. But seems that Pete and Robin can get the better I/O performance - line speed ratio with STGT. The version of OpenIB might matters too. For example, Pete said that STGT reads loses about 100 MB/s for some transfer sizes for some transfer sizes due to the OpenIB version difference or other unclear reasons. http://article.gmane.org/gmane.linux.iscsi.tgt.devel/135 It's fair to say that it takes long time and need lots of knowledge to get the maximum performance of SAN, I think. I think that it would be easier to convince James with the detailed analysis (e.g. where does it take so long, like Pete did), not just 'dd' performance results. Pushing iSCSI target code into mainline failed four times: IET, SCST, STGT (doing I/Os in kernel in the past), and PyX's one (*1). iSCSI target code is huge. You said SCST comprises 14,000 lines, but it's not iSCSI target code. The SCSI engine code comprises 14,000 lines. You need another 10,000 lines for the iSCSI driver. Note that SCST's iSCSI driver provides only basic iSCSI features. PyX's iSCSI target code implemenents more iSCSI features (like MC/S, ERL2, etc) and comprises about 60,000 lines and it still lacks some features like iSER, bidi, etc. I think that it's reasonable to say that we need more than 'dd' results before pushing about possible more than 60,000 lines to mainline. (*1) http://linux-iscsi.org/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Integration of SCST in the mainstream Linux kernel
On Wed, 30 Jan 2008 09:38:04 +0100 Bart Van Assche [EMAIL PROTECTED] wrote: On Jan 30, 2008 12:32 AM, FUJITA Tomonori [EMAIL PROTECTED] wrote: iSER has parameters to limit the maximum size of RDMA (it needs to repeat RDMA with a poor configuration)? Please specify which parameters you are referring to. As you know I Sorry, I can't say. I don't know much about iSER. But seems that Pete and Robin can get the better I/O performance - line speed ratio with STGT. The version of OpenIB might matters too. For example, Pete said that STGT reads loses about 100 MB/s for some transfer sizes for some transfer sizes due to the OpenIB version difference or other unclear reasons. http://article.gmane.org/gmane.linux.iscsi.tgt.devel/135 It's fair to say that it takes long time and need lots of knowledge to get the maximum performance of SAN, I think. I think that it would be easier to convince James with the detailed analysis (e.g. where does it take so long, like Pete did), not just 'dd' performance results. Pushing iSCSI target code into mainline failed four times: IET, SCST, STGT (doing I/Os in kernel in the past), and PyX's one (*1). iSCSI target code is huge. You said SCST comprises 14,000 lines, but it's not iSCSI target code. The SCSI engine code comprises 14,000 lines. You need another 10,000 lines for the iSCSI driver. Note that SCST's iSCSI driver provides only basic iSCSI features. PyX's iSCSI target code implemenents more iSCSI features (like MC/S, ERL2, etc) and comprises about 60,000 lines and it still lacks some features like iSER, bidi, etc. I think that it's reasonable to say that we need more than 'dd' results before pushing about possible more than 60,000 lines to mainline. (*1) http://linux-iscsi.org/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Integration of SCST in the mainstream Linux kernel
On Wed, 30 Jan 2008 14:10:47 +0100 Bart Van Assche [EMAIL PROTECTED] wrote: On Jan 30, 2008 11:56 AM, FUJITA Tomonori [EMAIL PROTECTED] wrote: On Wed, 30 Jan 2008 09:38:04 +0100 Bart Van Assche [EMAIL PROTECTED] wrote: Please specify which parameters you are referring to. As you know I Sorry, I can't say. I don't know much about iSER. But seems that Pete and Robin can get the better I/O performance - line speed ratiwo with STGT. Robin Humble was using a DDR InfiniBand network, while my tests were performed with an SDR InfiniBand network. Robin's results can't be directly compared to my results. I know that you use different hardware. I used 'ratio' word. BTW, you said the performance difference of dio READ is 38% but I think it's 27.3 %, though it's still large. Pete Wyckoff's results (http://www.osc.edu/~pw/papers/wyckoff-iser-snapi07-talk.pdf) are hard to interpret. I have asked Pete which of the numbers in his test can be compared with what I measured, but Pete did not reply. The version of OpenIB might matters too. For example, Pete said that STGT reads loses about 100 MB/s for some transfer sizes for some transfer sizes due to the OpenIB version difference or other unclear reasons. http://article.gmane.org/gmane.linux.iscsi.tgt.devel/135 Pete wrote about a degradation from 600 MB/s to 500 MB/s for reads with STGT+iSER. In my tests I measured 589 MB/s for reads (direct I/O), which matches with the better result obtained by Pete. I don't know he used the same benchmark software so I don't think that we can compare them. All I tried to say is the OFED version might has big effect on the performance. So you might need to find the best one. Note: the InfiniBand kernel modules I used were those from the 2.6.22.9 kernel, not from the OFED distribution. I'm talking about a target machine (I think that Pete was also talking about OFED on his target machine). STGT uses OFED libraries, I think. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Integration of SCST in the mainstream Linux kernel
On Tue, 29 Jan 2008 13:31:52 -0800 Roland Dreier <[EMAIL PROTECTED]> wrote: > > . . STGT read SCST read.STGT read > SCST read. > > . . performance performance . performance > performance . > > . . (0.5K, MB/s) (0.5K, MB/s) . (1 MB, > MB/s) (1 MB, MB/s) . > > . iSER (8 Gb/s network) . 250N/A . 360 > N/A . > > . SRP (8 Gb/s network) . N/A421 . N/A > 683 . > > > On the comparable figures, which only seem to be IPoIB they're showing a > > 13-18% variance, aren't they? Which isn't an incredible difference. > > Maybe I'm all wet, but I think iSER vs. SRP should be roughly > comparable. The exact formatting of various messages etc. is > different but the data path using RDMA is pretty much identical. So > the big difference between STGT iSER and SCST SRP hints at some big > difference in the efficiency of the two implementations. iSER has parameters to limit the maximum size of RDMA (it needs to repeat RDMA with a poor configuration)? Anyway, here's the results from Robin Humble: iSER to 7G ramfs, x86_64, centos4.6, 2.6.22 kernels, git tgtd, initiator end booted with mem=512M, target with 8G ram direct i/o dd write/read 800/751 MB/s dd if=/dev/zero of=/dev/sdc bs=1M count=5000 oflag=direct dd of=/dev/null if=/dev/sdc bs=1M count=5000 iflag=direct http://www.mail-archive.com/[EMAIL PROTECTED]/msg13502.html I think that STGT is pretty fast with the fast backing storage. I don't think that there is the notable perfornace difference between kernel-space and user-space SRP (or ISER) implementations about moving data between hosts. IB is expected to enable user-space applications to move data between hosts quickly (if not, what can IB provide us?). I think that the question is how fast user-space applications can do I/Os ccompared with I/Os in kernel space. STGT is eager for the advent of good asynchronous I/O and event notification interfances. One more possible optimization for STGT is zero-copy data transfer. STGT uses pre-registered buffers and move data between page cache and thsse buffers, and then does RDMA transfer. If we implement own caching mechanism to use pre-registered buffers directly with (AIO and O_DIRECT), then STGT can move data without data copies. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Integration of SCST in the mainstream Linux kernel
On Tue, 29 Jan 2008 13:31:52 -0800 Roland Dreier [EMAIL PROTECTED] wrote: . . STGT read SCST read.STGT read SCST read. . . performance performance . performance performance . . . (0.5K, MB/s) (0.5K, MB/s) . (1 MB, MB/s) (1 MB, MB/s) . . iSER (8 Gb/s network) . 250N/A . 360 N/A . . SRP (8 Gb/s network) . N/A421 . N/A 683 . On the comparable figures, which only seem to be IPoIB they're showing a 13-18% variance, aren't they? Which isn't an incredible difference. Maybe I'm all wet, but I think iSER vs. SRP should be roughly comparable. The exact formatting of various messages etc. is different but the data path using RDMA is pretty much identical. So the big difference between STGT iSER and SCST SRP hints at some big difference in the efficiency of the two implementations. iSER has parameters to limit the maximum size of RDMA (it needs to repeat RDMA with a poor configuration)? Anyway, here's the results from Robin Humble: iSER to 7G ramfs, x86_64, centos4.6, 2.6.22 kernels, git tgtd, initiator end booted with mem=512M, target with 8G ram direct i/o dd write/read 800/751 MB/s dd if=/dev/zero of=/dev/sdc bs=1M count=5000 oflag=direct dd of=/dev/null if=/dev/sdc bs=1M count=5000 iflag=direct http://www.mail-archive.com/[EMAIL PROTECTED]/msg13502.html I think that STGT is pretty fast with the fast backing storage. I don't think that there is the notable perfornace difference between kernel-space and user-space SRP (or ISER) implementations about moving data between hosts. IB is expected to enable user-space applications to move data between hosts quickly (if not, what can IB provide us?). I think that the question is how fast user-space applications can do I/Os ccompared with I/Os in kernel space. STGT is eager for the advent of good asynchronous I/O and event notification interfances. One more possible optimization for STGT is zero-copy data transfer. STGT uses pre-registered buffers and move data between page cache and thsse buffers, and then does RDMA transfer. If we implement own caching mechanism to use pre-registered buffers directly with (AIO and O_DIRECT), then STGT can move data without data copies. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/2] ch: remove forward declarations
This moves ch_template and changer_fops structs to the end of file and removes forward declarations. This also removes some trailing whitespace. Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]> --- drivers/scsi/ch.c | 120 - 1 files changed, 54 insertions(+), 66 deletions(-) diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c index 2b07014..7aad154 100644 --- a/drivers/scsi/ch.c +++ b/drivers/scsi/ch.c @@ -90,16 +90,6 @@ static const char * vendor_labels[CH_TYPES-4] = { #define MAX_RETRIES 1 -static int ch_probe(struct device *); -static int ch_remove(struct device *); -static int ch_open(struct inode * inode, struct file * filp); -static int ch_release(struct inode * inode, struct file * filp); -static long ch_ioctl(struct file *filp, unsigned int cmd, unsigned long arg); -#ifdef CONFIG_COMPAT -static long ch_ioctl_compat(struct file * filp, - unsigned int cmd, unsigned long arg); -#endif - static struct class * ch_sysfs_class; typedef struct { @@ -118,27 +108,6 @@ typedef struct { static DEFINE_IDR(ch_index_idr); static DEFINE_SPINLOCK(ch_index_lock); -static struct scsi_driver ch_template = -{ - .owner = THIS_MODULE, - .gendrv = { - .name = "ch", - .probe = ch_probe, - .remove = ch_remove, - }, -}; - -static const struct file_operations changer_fops = -{ - .owner = THIS_MODULE, - .open = ch_open, - .release= ch_release, - .unlocked_ioctl = ch_ioctl, -#ifdef CONFIG_COMPAT - .compat_ioctl = ch_ioctl_compat, -#endif -}; - static const struct { unsigned char sense; unsigned char asc; @@ -207,7 +176,7 @@ ch_do_scsi(scsi_changer *ch, unsigned char *cmd, { int errno, retries = 0, timeout, result; struct scsi_sense_hdr sshdr; - + timeout = (cmd[0] == INITIALIZE_ELEMENT_STATUS) ? timeout_init : timeout_move; @@ -245,7 +214,7 @@ static int ch_elem_to_typecode(scsi_changer *ch, u_int elem) { int i; - + for (i = 0; i < CH_TYPES; i++) { if (elem >= ch->firsts[i] && elem < ch->firsts[i] + @@ -261,15 +230,15 @@ ch_read_element_status(scsi_changer *ch, u_int elem, char *data) u_char cmd[12]; u_char *buffer; int result; - + buffer = kmalloc(512, GFP_KERNEL | GFP_DMA); if(!buffer) return -ENOMEM; - + retry: memset(cmd,0,sizeof(cmd)); cmd[0] = READ_ELEMENT_STATUS; - cmd[1] = (ch->device->lun << 5) | + cmd[1] = (ch->device->lun << 5) | (ch->voltags ? 0x10 : 0) | ch_elem_to_typecode(ch,elem); cmd[2] = (elem >> 8) & 0xff; @@ -296,7 +265,7 @@ ch_read_element_status(scsi_changer *ch, u_int elem, char *data) return result; } -static int +static int ch_init_elem(scsi_changer *ch) { int err; @@ -322,7 +291,7 @@ ch_readconfig(scsi_changer *ch) buffer = kzalloc(512, GFP_KERNEL | GFP_DMA); if (!buffer) return -ENOMEM; - + memset(cmd,0,sizeof(cmd)); cmd[0] = MODE_SENSE; cmd[1] = ch->device->lun << 5; @@ -365,7 +334,7 @@ ch_readconfig(scsi_changer *ch) } else { vprintk("reading element address assigment page failed!\n"); } - + /* vendor specific element types */ for (i = 0; i < 4; i++) { if (0 == vendor_counts[i]) @@ -443,7 +412,7 @@ static int ch_position(scsi_changer *ch, u_int trans, u_int elem, int rotate) { u_char cmd[10]; - + dprintk("position: 0x%x\n",elem); if (0 == trans) trans = ch->firsts[CHET_MT]; @@ -462,7 +431,7 @@ static int ch_move(scsi_changer *ch, u_int trans, u_int src, u_int dest, int rotate) { u_char cmd[12]; - + dprintk("move: 0x%x => 0x%x\n",src,dest); if (0 == trans) trans = ch->firsts[CHET_MT]; @@ -484,7 +453,7 @@ ch_exchange(scsi_changer *ch, u_int trans, u_int src, u_int dest1, u_int dest2, int rotate1, int rotate2) { u_char cmd[12]; - + dprintk("exchange: 0x%x => 0x%x => 0x%x\n", src,dest1,dest2); if (0 == trans) @@ -501,7 +470,7 @@ ch_exchange(scsi_changer *ch, u_int trans, u_int src, cmd[8] = (dest2 >> 8) & 0xff; cmd[9] = dest2 & 0xff; cmd[10] = (rotate1 ? 1 : 0) | (rotate2 ? 2 : 0); - + return ch_do_scsi(ch, cmd, NULL,0, DMA_NONE); } @@ -539,14 +508,14 @@ ch_set_voltag(scsi_changer *ch, u_int elem, elem, tag); memset(cmd,0,sizeof(cmd));
[PATCH 1/2] ch: fix device minor number management bug
ch_probe uses the total number of ch devices as minor. ch_probe: ch->minor = ch_devcount; ... ch_devcount++; Then ch_remove decreases ch_devcount: ch_remove: ch_devcount--; If you have two ch devices, e.g. sch0 and sch1, and remove sch0, ch_devcount is 1. Then if you add another ch device, ch_probe tries to create sch1. So you get a warning and fail to create sch1: Jan 24 16:01:05 nice kernel: sysfs: duplicate filename 'sch1' can not be created Jan 24 16:01:05 nice kernel: WARNING: at fs/sysfs/dir.c:424 sysfs_add_one() Jan 24 16:01:05 nice kernel: Pid: 2571, comm: iscsid Not tainted 2.6.24-rc7-ga3d2c2e8-dirty #1 Jan 24 16:01:05 nice kernel: Jan 24 16:01:05 nice kernel: Call Trace: Jan 24 16:01:05 nice kernel: [] sysfs_add_one+0x54/0xbd Jan 24 16:01:05 nice kernel: [] create_dir+0x4f/0x87 Jan 24 16:01:05 nice kernel: [] sysfs_create_dir+0x35/0x4a Jan 24 16:01:05 nice kernel: [] kobject_get+0x12/0x17 Jan 24 16:01:05 nice kernel: [] kobject_add+0xf3/0x1a6 Jan 24 16:01:05 nice kernel: [] class_device_add+0xaa/0x39d Jan 24 16:01:05 nice kernel: [] class_device_create+0xcb/0xfa Jan 24 16:01:05 nice kernel: [] printk+0x4e/0x56 Jan 24 16:01:05 nice kernel: [] sysfs_ilookup_test+0x0/0xf Jan 24 16:01:05 nice kernel: [] :ch:ch_probe+0xbe/0x61a (snip) And you can't add new ch device any more. This patch converts ch to use a common minor number management way, idr like sg and bsg do. Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]> --- drivers/scsi/ch.c | 71 +--- 1 files changed, 40 insertions(+), 31 deletions(-) diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c index 765f2fc..2b07014 100644 --- a/drivers/scsi/ch.c +++ b/drivers/scsi/ch.c @@ -21,6 +21,7 @@ #include #include /* here are all the ioctls */ #include +#include #include #include @@ -33,6 +34,7 @@ #define CH_DT_MAX 16 #define CH_TYPES8 +#define CH_MAX_DEVS 128 MODULE_DESCRIPTION("device driver for scsi media changer devices"); MODULE_AUTHOR("Gerd Knorr <[EMAIL PROTECTED]>"); @@ -113,9 +115,8 @@ typedef struct { struct mutexlock; } scsi_changer; -static LIST_HEAD(ch_devlist); -static DEFINE_SPINLOCK(ch_devlist_lock); -static int ch_devcount; +static DEFINE_IDR(ch_index_idr); +static DEFINE_SPINLOCK(ch_index_lock); static struct scsi_driver ch_template = { @@ -598,20 +599,17 @@ ch_release(struct inode *inode, struct file *file) static int ch_open(struct inode *inode, struct file *file) { - scsi_changer *tmp, *ch; + scsi_changer *ch; int minor = iminor(inode); - spin_lock(_devlist_lock); - ch = NULL; - list_for_each_entry(tmp,_devlist,list) { - if (tmp->minor == minor) - ch = tmp; - } + spin_lock(_index_lock); + ch = idr_find(_index_idr, minor); + if (NULL == ch || scsi_device_get(ch->device)) { - spin_unlock(_devlist_lock); + spin_unlock(_index_lock); return -ENXIO; } - spin_unlock(_devlist_lock); + spin_unlock(_index_lock); file->private_data = ch; return 0; @@ -914,6 +912,7 @@ static int ch_probe(struct device *dev) { struct scsi_device *sd = to_scsi_device(dev); struct class_device *class_dev; + int minor, ret = -ENOMEM; scsi_changer *ch; if (sd->type != TYPE_MEDIUM_CHANGER) @@ -923,7 +922,22 @@ static int ch_probe(struct device *dev) if (NULL == ch) return -ENOMEM; - ch->minor = ch_devcount; + if (!idr_pre_get(_index_idr, GFP_KERNEL)) + goto free_ch; + + spin_lock(_index_lock); + ret = idr_get_new(_index_idr, ch, ); + spin_unlock(_index_lock); + + if (ret) + goto free_ch; + + if (minor > CH_MAX_DEVS) { + ret = -ENODEV; + goto remove_idr; + } + + ch->minor = minor; sprintf(ch->name,"ch%d",ch->minor); class_dev = class_device_create(ch_sysfs_class, NULL, @@ -932,8 +946,8 @@ static int ch_probe(struct device *dev) if (IS_ERR(class_dev)) { printk(KERN_WARNING "ch%d: class_device_create failed\n", ch->minor); - kfree(ch); - return PTR_ERR(class_dev); + ret = PTR_ERR(class_dev); + goto remove_idr; } mutex_init(>lock); @@ -942,35 +956,29 @@ static int ch_probe(struct device *dev) if (init) ch_init_elem(ch); + dev_set_drvdata(dev, ch); sdev_printk(KERN_INFO, sd, "Attached scsi changer %s\n", ch->name); - spin_lock(_devlist_lock); - list_add_tail(>list,_devlist); - ch_devcount++; - spin_unlock(_devlist_lock); retur
[PATCH 1/2] ch: fix device minor number management bug
ch_probe uses the total number of ch devices as minor. ch_probe: ch-minor = ch_devcount; ... ch_devcount++; Then ch_remove decreases ch_devcount: ch_remove: ch_devcount--; If you have two ch devices, e.g. sch0 and sch1, and remove sch0, ch_devcount is 1. Then if you add another ch device, ch_probe tries to create sch1. So you get a warning and fail to create sch1: Jan 24 16:01:05 nice kernel: sysfs: duplicate filename 'sch1' can not be created Jan 24 16:01:05 nice kernel: WARNING: at fs/sysfs/dir.c:424 sysfs_add_one() Jan 24 16:01:05 nice kernel: Pid: 2571, comm: iscsid Not tainted 2.6.24-rc7-ga3d2c2e8-dirty #1 Jan 24 16:01:05 nice kernel: Jan 24 16:01:05 nice kernel: Call Trace: Jan 24 16:01:05 nice kernel: [802a22b8] sysfs_add_one+0x54/0xbd Jan 24 16:01:05 nice kernel: [802a283c] create_dir+0x4f/0x87 Jan 24 16:01:05 nice kernel: [802a28a9] sysfs_create_dir+0x35/0x4a Jan 24 16:01:05 nice kernel: [803069a1] kobject_get+0x12/0x17 Jan 24 16:01:05 nice kernel: [80306ece] kobject_add+0xf3/0x1a6 Jan 24 16:01:05 nice kernel: [8034252b] class_device_add+0xaa/0x39d Jan 24 16:01:05 nice kernel: [803428fb] class_device_create+0xcb/0xfa Jan 24 16:01:05 nice kernel: [80229e09] printk+0x4e/0x56 Jan 24 16:01:05 nice kernel: [802a2054] sysfs_ilookup_test+0x0/0xf Jan 24 16:01:05 nice kernel: [88022580] :ch:ch_probe+0xbe/0x61a (snip) And you can't add new ch device any more. This patch converts ch to use a common minor number management way, idr like sg and bsg do. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/ch.c | 71 +--- 1 files changed, 40 insertions(+), 31 deletions(-) diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c index 765f2fc..2b07014 100644 --- a/drivers/scsi/ch.c +++ b/drivers/scsi/ch.c @@ -21,6 +21,7 @@ #include linux/compat.h #include linux/chio.h/* here are all the ioctls */ #include linux/mutex.h +#include linux/idr.h #include scsi/scsi.h #include scsi/scsi_cmnd.h @@ -33,6 +34,7 @@ #define CH_DT_MAX 16 #define CH_TYPES8 +#define CH_MAX_DEVS 128 MODULE_DESCRIPTION(device driver for scsi media changer devices); MODULE_AUTHOR(Gerd Knorr [EMAIL PROTECTED]); @@ -113,9 +115,8 @@ typedef struct { struct mutexlock; } scsi_changer; -static LIST_HEAD(ch_devlist); -static DEFINE_SPINLOCK(ch_devlist_lock); -static int ch_devcount; +static DEFINE_IDR(ch_index_idr); +static DEFINE_SPINLOCK(ch_index_lock); static struct scsi_driver ch_template = { @@ -598,20 +599,17 @@ ch_release(struct inode *inode, struct file *file) static int ch_open(struct inode *inode, struct file *file) { - scsi_changer *tmp, *ch; + scsi_changer *ch; int minor = iminor(inode); - spin_lock(ch_devlist_lock); - ch = NULL; - list_for_each_entry(tmp,ch_devlist,list) { - if (tmp-minor == minor) - ch = tmp; - } + spin_lock(ch_index_lock); + ch = idr_find(ch_index_idr, minor); + if (NULL == ch || scsi_device_get(ch-device)) { - spin_unlock(ch_devlist_lock); + spin_unlock(ch_index_lock); return -ENXIO; } - spin_unlock(ch_devlist_lock); + spin_unlock(ch_index_lock); file-private_data = ch; return 0; @@ -914,6 +912,7 @@ static int ch_probe(struct device *dev) { struct scsi_device *sd = to_scsi_device(dev); struct class_device *class_dev; + int minor, ret = -ENOMEM; scsi_changer *ch; if (sd-type != TYPE_MEDIUM_CHANGER) @@ -923,7 +922,22 @@ static int ch_probe(struct device *dev) if (NULL == ch) return -ENOMEM; - ch-minor = ch_devcount; + if (!idr_pre_get(ch_index_idr, GFP_KERNEL)) + goto free_ch; + + spin_lock(ch_index_lock); + ret = idr_get_new(ch_index_idr, ch, minor); + spin_unlock(ch_index_lock); + + if (ret) + goto free_ch; + + if (minor CH_MAX_DEVS) { + ret = -ENODEV; + goto remove_idr; + } + + ch-minor = minor; sprintf(ch-name,ch%d,ch-minor); class_dev = class_device_create(ch_sysfs_class, NULL, @@ -932,8 +946,8 @@ static int ch_probe(struct device *dev) if (IS_ERR(class_dev)) { printk(KERN_WARNING ch%d: class_device_create failed\n, ch-minor); - kfree(ch); - return PTR_ERR(class_dev); + ret = PTR_ERR(class_dev); + goto remove_idr; } mutex_init(ch-lock); @@ -942,35 +956,29 @@ static int ch_probe(struct device *dev) if (init) ch_init_elem(ch); + dev_set_drvdata(dev, ch); sdev_printk(KERN_INFO, sd, Attached scsi changer %s\n, ch-name
[PATCH 2/2] ch: remove forward declarations
This moves ch_template and changer_fops structs to the end of file and removes forward declarations. This also removes some trailing whitespace. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- drivers/scsi/ch.c | 120 - 1 files changed, 54 insertions(+), 66 deletions(-) diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c index 2b07014..7aad154 100644 --- a/drivers/scsi/ch.c +++ b/drivers/scsi/ch.c @@ -90,16 +90,6 @@ static const char * vendor_labels[CH_TYPES-4] = { #define MAX_RETRIES 1 -static int ch_probe(struct device *); -static int ch_remove(struct device *); -static int ch_open(struct inode * inode, struct file * filp); -static int ch_release(struct inode * inode, struct file * filp); -static long ch_ioctl(struct file *filp, unsigned int cmd, unsigned long arg); -#ifdef CONFIG_COMPAT -static long ch_ioctl_compat(struct file * filp, - unsigned int cmd, unsigned long arg); -#endif - static struct class * ch_sysfs_class; typedef struct { @@ -118,27 +108,6 @@ typedef struct { static DEFINE_IDR(ch_index_idr); static DEFINE_SPINLOCK(ch_index_lock); -static struct scsi_driver ch_template = -{ - .owner = THIS_MODULE, - .gendrv = { - .name = ch, - .probe = ch_probe, - .remove = ch_remove, - }, -}; - -static const struct file_operations changer_fops = -{ - .owner = THIS_MODULE, - .open = ch_open, - .release= ch_release, - .unlocked_ioctl = ch_ioctl, -#ifdef CONFIG_COMPAT - .compat_ioctl = ch_ioctl_compat, -#endif -}; - static const struct { unsigned char sense; unsigned char asc; @@ -207,7 +176,7 @@ ch_do_scsi(scsi_changer *ch, unsigned char *cmd, { int errno, retries = 0, timeout, result; struct scsi_sense_hdr sshdr; - + timeout = (cmd[0] == INITIALIZE_ELEMENT_STATUS) ? timeout_init : timeout_move; @@ -245,7 +214,7 @@ static int ch_elem_to_typecode(scsi_changer *ch, u_int elem) { int i; - + for (i = 0; i CH_TYPES; i++) { if (elem = ch-firsts[i] elem ch-firsts[i] + @@ -261,15 +230,15 @@ ch_read_element_status(scsi_changer *ch, u_int elem, char *data) u_char cmd[12]; u_char *buffer; int result; - + buffer = kmalloc(512, GFP_KERNEL | GFP_DMA); if(!buffer) return -ENOMEM; - + retry: memset(cmd,0,sizeof(cmd)); cmd[0] = READ_ELEMENT_STATUS; - cmd[1] = (ch-device-lun 5) | + cmd[1] = (ch-device-lun 5) | (ch-voltags ? 0x10 : 0) | ch_elem_to_typecode(ch,elem); cmd[2] = (elem 8) 0xff; @@ -296,7 +265,7 @@ ch_read_element_status(scsi_changer *ch, u_int elem, char *data) return result; } -static int +static int ch_init_elem(scsi_changer *ch) { int err; @@ -322,7 +291,7 @@ ch_readconfig(scsi_changer *ch) buffer = kzalloc(512, GFP_KERNEL | GFP_DMA); if (!buffer) return -ENOMEM; - + memset(cmd,0,sizeof(cmd)); cmd[0] = MODE_SENSE; cmd[1] = ch-device-lun 5; @@ -365,7 +334,7 @@ ch_readconfig(scsi_changer *ch) } else { vprintk(reading element address assigment page failed!\n); } - + /* vendor specific element types */ for (i = 0; i 4; i++) { if (0 == vendor_counts[i]) @@ -443,7 +412,7 @@ static int ch_position(scsi_changer *ch, u_int trans, u_int elem, int rotate) { u_char cmd[10]; - + dprintk(position: 0x%x\n,elem); if (0 == trans) trans = ch-firsts[CHET_MT]; @@ -462,7 +431,7 @@ static int ch_move(scsi_changer *ch, u_int trans, u_int src, u_int dest, int rotate) { u_char cmd[12]; - + dprintk(move: 0x%x = 0x%x\n,src,dest); if (0 == trans) trans = ch-firsts[CHET_MT]; @@ -484,7 +453,7 @@ ch_exchange(scsi_changer *ch, u_int trans, u_int src, u_int dest1, u_int dest2, int rotate1, int rotate2) { u_char cmd[12]; - + dprintk(exchange: 0x%x = 0x%x = 0x%x\n, src,dest1,dest2); if (0 == trans) @@ -501,7 +470,7 @@ ch_exchange(scsi_changer *ch, u_int trans, u_int src, cmd[8] = (dest2 8) 0xff; cmd[9] = dest20xff; cmd[10] = (rotate1 ? 1 : 0) | (rotate2 ? 2 : 0); - + return ch_do_scsi(ch, cmd, NULL,0, DMA_NONE); } @@ -539,14 +508,14 @@ ch_set_voltag(scsi_changer *ch, u_int elem, elem, tag); memset(cmd,0,sizeof(cmd)); cmd[0] = SEND_VOLUME_TAG; - cmd[1] = (ch-device-lun 5) | + cmd[1] = (ch-device-lun 5) | ch_elem_to_typecode(ch,elem); cmd[2] = (elem 8) 0xff; cmd[3] = elem
Re: INITIO scsi driver fails to work properly
On Tue, 15 Jan 2008 09:16:06 -0600 James Bottomley <[EMAIL PROTECTED]> wrote: > > On Sun, 2008-01-13 at 14:28 +0200, Filippos Papadopoulos wrote: > > On 1/11/08, James Bottomley <[EMAIL PROTECTED]> wrote: > > > > > > On Fri, 2008-01-11 at 18:44 +0200, Filippos Papadopoulos wrote: > > > > On Jan 11, 2008 5:44 PM, James Bottomley > > > > <[EMAIL PROTECTED]> wrote: > > > > > > > > > > > > I havent reported "initio: I/O port range 0x0 is busy." > > > > > > > > > > Sorry ... we appear to have several reporters of different bugs in > > > > > this > > > > > thread. That message was copied by Chuck Ebbert from a Red Hat > > > > > bugzilla ... I was assuming it was the same problem. > > > > > > > > > > > I applied the patch on 2.6.24-rc6-git9 but unfortunatelly same > > > > > > thing happens. > > > > > > > > > > First off, has this driver ever worked for you in 2.6? Just booting > > > > > SLES9 (2.6.5) or RHEL4 (2.6.9) ... or one of their open equivalents to > > > > > check a really old kernel would be helpful. If you can get it to > > > > > work, > > > > > then we can proceed with a patch reversion regime based on the > > > > > assumption that the problem is a recent commit. > > > > > > > > Yes it works under 2.6.16.13. See the beginning of this thread, i > > > > mention there some things about newer versions. > > > > > > Thanks, actually, I see this: > > > > > > > I tried to install OpenSUSE 10.3 (kernel 2.6.22.5) and the latest > > > > OpenSUSE 11.0 Alpha 0 (kernel 2.6.24-rc4) but although the initio > > > > drivergets loaded during the installation process, yast reports that no > > > > hard > > > > disk is found. > > > > > > Could you try with a vanilla 2.6.22 kernel? The reason for all of this > > > is that 2.6.22 predates Alan's conversion of this driver (which was my > > > 95% candidate for the source of the bug). I want you to try the vanilla > > > kernel just in case the opensuse one contains a backport. > > > > > > Yes you are right. I compiled the vanilla 2.6.22 and initio driver works. > > Tell me if you want to apply any patch to it. > > > That's good news ... at least we know where the issue lies; now the > problem comes: there are two candidate patches for this issue: Alan's > driver update patch and Tomo's accessors patch. Unfortunately, due to > merge conflicts the two are pretty hopelessly intertwined. I think I > already spotted one bug in the accessor conversion, so I'll look at that > again. Alan's also going to acquire an inito board and retest his > conversions. > > I'm afraid it might be a while before we have anything for you to test. Can you try this patch? Thanks, diff --git a/drivers/scsi/initio.c b/drivers/scsi/initio.c index 01bf018..6891d2b 100644 --- a/drivers/scsi/initio.c +++ b/drivers/scsi/initio.c @@ -2609,6 +2609,7 @@ static void initio_build_scb(struct initio_host * host, struct scsi_ctrl_blk * c cblk->bufptr = cpu_to_le32((u32)dma_addr); cmnd->SCp.dma_handle = dma_addr; + cblk->sglen = nseg; cblk->flags |= SCF_SG; /* Turn on SG list flag */ total_len = 0; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: INITIO scsi driver fails to work properly
On Tue, 15 Jan 2008 09:16:06 -0600 James Bottomley [EMAIL PROTECTED] wrote: On Sun, 2008-01-13 at 14:28 +0200, Filippos Papadopoulos wrote: On 1/11/08, James Bottomley [EMAIL PROTECTED] wrote: On Fri, 2008-01-11 at 18:44 +0200, Filippos Papadopoulos wrote: On Jan 11, 2008 5:44 PM, James Bottomley [EMAIL PROTECTED] wrote: I havent reported initio: I/O port range 0x0 is busy. Sorry ... we appear to have several reporters of different bugs in this thread. That message was copied by Chuck Ebbert from a Red Hat bugzilla ... I was assuming it was the same problem. I applied the patch on 2.6.24-rc6-git9 but unfortunatelly same thing happens. First off, has this driver ever worked for you in 2.6? Just booting SLES9 (2.6.5) or RHEL4 (2.6.9) ... or one of their open equivalents to check a really old kernel would be helpful. If you can get it to work, then we can proceed with a patch reversion regime based on the assumption that the problem is a recent commit. Yes it works under 2.6.16.13. See the beginning of this thread, i mention there some things about newer versions. Thanks, actually, I see this: I tried to install OpenSUSE 10.3 (kernel 2.6.22.5) and the latest OpenSUSE 11.0 Alpha 0 (kernel 2.6.24-rc4) but although the initio drivergets loaded during the installation process, yast reports that no hard disk is found. Could you try with a vanilla 2.6.22 kernel? The reason for all of this is that 2.6.22 predates Alan's conversion of this driver (which was my 95% candidate for the source of the bug). I want you to try the vanilla kernel just in case the opensuse one contains a backport. Yes you are right. I compiled the vanilla 2.6.22 and initio driver works. Tell me if you want to apply any patch to it. That's good news ... at least we know where the issue lies; now the problem comes: there are two candidate patches for this issue: Alan's driver update patch and Tomo's accessors patch. Unfortunately, due to merge conflicts the two are pretty hopelessly intertwined. I think I already spotted one bug in the accessor conversion, so I'll look at that again. Alan's also going to acquire an inito board and retest his conversions. I'm afraid it might be a while before we have anything for you to test. Can you try this patch? Thanks, diff --git a/drivers/scsi/initio.c b/drivers/scsi/initio.c index 01bf018..6891d2b 100644 --- a/drivers/scsi/initio.c +++ b/drivers/scsi/initio.c @@ -2609,6 +2609,7 @@ static void initio_build_scb(struct initio_host * host, struct scsi_ctrl_blk * c cblk-bufptr = cpu_to_le32((u32)dma_addr); cmnd-SCp.dma_handle = dma_addr; + cblk-sglen = nseg; cblk-flags |= SCF_SG; /* Turn on SG list flag */ total_len = 0; -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] bsg : Add support for io vectors in bsg
On Thu, 10 Jan 2008 16:46:05 -0500 Pete Wyckoff <[EMAIL PROTECTED]> wrote: > Is there another async I/O mechanism? Userspace builds the CDBs, > just needs some way to drop them in SCSI ML. BSG is almost perfect > for this, but doesn't do iovec, leading to lots of memcpy. syslets? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Driver 'sd' needs updating
CC'ed linux-scsi and James, On Thu, 10 Jan 2008 08:51:50 + Nick Warne <[EMAIL PROTECTED]> wrote: > > Hi everybody - Happy New Year to you all! > > OK, updated to git rc7 yesterday - I now see this in syslog: > >"Driver 'sd' needs updating - please use bus_type methods" > > The warning never appeared in RC6, and all google reveals are other > peoples logs that are posted about other issues. > > Do I need to fix up something here? No, you don't. It's harmless, a side effect of: commit 751bf4d7865e4ced406be93b04c7436d866d3684 Author: James Bottomley <[EMAIL PROTECTED]> Date: Wed Jan 2 11:14:30 2008 -0600 [SCSI] scsi_sysfs: restore prep_fn when ULD is removed It would be better to silence this warning. James, we need to reset prep_fn in each ULD? though it's not nice... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Driver 'sd' needs updating
CC'ed linux-scsi and James, On Thu, 10 Jan 2008 08:51:50 + Nick Warne [EMAIL PROTECTED] wrote: Hi everybody - Happy New Year to you all! OK, updated to git rc7 yesterday - I now see this in syslog: Driver 'sd' needs updating - please use bus_type methods The warning never appeared in RC6, and all google reveals are other peoples logs that are posted about other issues. Do I need to fix up something here? No, you don't. It's harmless, a side effect of: commit 751bf4d7865e4ced406be93b04c7436d866d3684 Author: James Bottomley [EMAIL PROTECTED] Date: Wed Jan 2 11:14:30 2008 -0600 [SCSI] scsi_sysfs: restore prep_fn when ULD is removed It would be better to silence this warning. James, we need to reset prep_fn in each ULD? though it's not nice... -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] bsg : Add support for io vectors in bsg
On Thu, 10 Jan 2008 16:46:05 -0500 Pete Wyckoff [EMAIL PROTECTED] wrote: Is there another async I/O mechanism? Userspace builds the CDBs, just needs some way to drop them in SCSI ML. BSG is almost perfect for this, but doesn't do iovec, leading to lots of memcpy. syslets? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Wed, 9 Jan 2008 10:04:42 +0100 Jarek Poplawski <[EMAIL PROTECTED]> wrote: > On Wed, Jan 09, 2008 at 08:57:53AM +0900, FUJITA Tomonori wrote: > ... > > diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c > > new file mode 100644 > > index 000..495575a > > --- /dev/null > > +++ b/lib/iommu-helper.c > > @@ -0,0 +1,80 @@ > > +/* > > + * IOMMU helper functions for the free area management > > + */ > > + > > +#include > > +#include > > + > > +static unsigned long find_next_zero_area(unsigned long *map, > > +unsigned long size, > > +unsigned long start, > > +unsigned int nr, > > +unsigned long align_mask) > > +{ > > + unsigned long index, end, i; > > +again: > > + index = find_next_zero_bit(map, size, start); > > + > > + /* Align allocation */ > > + index = (index + align_mask) & ~align_mask; > > + > > + end = index + nr; > > + if (end >= size) > > + return -1; > > This '>=' looks doubtful to me, e.g.: > map points to 0s only, size = 64, nr = 64, > we get: index = 0; end = 64; > and: return -1 ?! You are right. I did it only because I didn't want to change the original code (iommu_range_alloc in arch/powerpc/kernel/iommu.c). I thought that there might be a mysterious reason for it so I let it alone since it's tiny loss. Thanks, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Wed, 9 Jan 2008 10:04:42 +0100 Jarek Poplawski [EMAIL PROTECTED] wrote: On Wed, Jan 09, 2008 at 08:57:53AM +0900, FUJITA Tomonori wrote: ... diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c new file mode 100644 index 000..495575a --- /dev/null +++ b/lib/iommu-helper.c @@ -0,0 +1,80 @@ +/* + * IOMMU helper functions for the free area management + */ + +#include linux/module.h +#include linux/bitops.h + +static unsigned long find_next_zero_area(unsigned long *map, +unsigned long size, +unsigned long start, +unsigned int nr, +unsigned long align_mask) +{ + unsigned long index, end, i; +again: + index = find_next_zero_bit(map, size, start); + + /* Align allocation */ + index = (index + align_mask) ~align_mask; + + end = index + nr; + if (end = size) + return -1; This '=' looks doubtful to me, e.g.: map points to 0s only, size = 64, nr = 64, we get: index = 0; end = 64; and: return -1 ?! You are right. I did it only because I didn't want to change the original code (iommu_range_alloc in arch/powerpc/kernel/iommu.c). I thought that there might be a mysterious reason for it so I let it alone since it's tiny loss. Thanks, -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Tue, 8 Jan 2008 16:27:39 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > On Wed, 09 Jan 2008 08:57:53 +0900 > FUJITA Tomonori <[EMAIL PROTECTED]> wrote: > > > Andrew, can you replace > > > > iommu-sg-add-iommu-helper-functions-for-the-free-area-management.patch > > > > with the updated patch: > > > > http://ozlabs.org/pipermail/linuxppc-dev/2007-December/048997.html > > > > For your convenience I've attached the updated patch too. > > Thanks for putting the fix to -mm. > > --- a/lib/iommu-helper.c~a > > +++ a/lib/iommu-helper.c > > @@ -8,15 +8,20 @@ > > static unsigned long find_next_zero_area(unsigned long *map, > > unsigned long size, > > unsigned long start, > > -unsigned int nr) > > +unsigned int nr, > > +unsigned long align_mask) > > { > > unsigned long index, end, i; > > again: > > index = find_next_zero_bit(map, size, start); > > + > > + /* Align allocation */ > > + index = (index + align_mask) & ~align_mask; > > The ALIGN() macro is the approved way of doing this. > > (I don't think ALIGN adds much value really, especially given that you've > commented what's going on, but I guess it does make reviewing and reading a > little easier). Would be better to use __ALIGN_MASK? I can find only one user who directly use __ALIGN_MASK. The POWER IOMMU calculates align_mask by itself so it's easier to pass align_mask as an argument. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] bsg : Add support for io vectors in bsg
On Tue, 8 Jan 2008 17:09:18 -0500 Pete Wyckoff <[EMAIL PROTECTED]> wrote: > [EMAIL PROTECTED] wrote on Sat, 05 Jan 2008 14:01 +0900: > > From: Deepak Colluru <[EMAIL PROTECTED]> > > Subject: [PATCH] bsg : Add support for io vectors in bsg > > Date: Fri, 4 Jan 2008 21:47:34 +0530 (IST) > > > > > From: Deepak Colluru <[EMAIL PROTECTED]> > > > > > > Add support for io vectors in bsg. > > > > > > Signed-off-by: Deepak Colluru <[EMAIL PROTECTED]> > > > --- > > > bsg.c | 52 +--- > > > 1 file changed, 49 insertions(+), 3 deletions(-) > > > > Thanks, but I have to NACK this. > > > > You can find the discussion about bsg io vector support and a similar > > patch in linux-scsi archive. I have no plan to support it since it > > needs the compat hack. > > You may recall this is one of the patches I need to use bsg with OSD > devices. OSDs overload the SCSI buffer model to put mulitple fields > in dataout and datain. Some is user data, but some is more > logically created by a library. Memcpying in userspace to wedge all > the segments into a single buffer is painful, and is required both > on outgoing and incoming data buffers. > > There are two approaches to add iovec to bsg. > > 1. Define a new sg_iovec_v4 that uses constant width types. Both > 32- and 64-bit userspace would hand arrays of this to the kernel. > > struct sg_v4_iovec { > __u64 iov_base; > __u32 iov_len; > __u32 __pad1; > }; > > Old patch here: http://article.gmane.org/gmane.linux.scsi/30461/ As I said before, I don't think that inventing a new "iovec" is a good idea. sgv3 use the common "iovec". In addition, sg_io_v4 can be used by other OSes like sg_io_v3. > 2. Do as Deepak has done, using the existing sg_iovec, but then > also work around the compat issue. Old v3 sg_iovec is: > > typedef struct sg_iovec /* same structure as used by readv() Linux system > */ > { /* call. It defines one scatter-gather element. */ > void __user *iov_base; /* Starting address */ > size_t iov_len; /* Length in bytes */ > } sg_iovec_t; > > Old patch here: http://article.gmane.org/gmane.linux.scsi/30460/ > > I took another look at the compat approach, to see if it is feasible > to keep the compat handling somewhere else, without the use of #ifdef > CONFIG_COMPAT and size-comparison code inside bsg.c. I don't see how. > The use of iovec is within a write operation on a char device. It's > not amenable to a compat_sys_ or a .compat_ioctl approach. > > I'm partial to #1 because the use of architecture-independent fields > matches the rest of struct sg_io_v4. But if you don't want to have > another iovec type in the kernel, could we do #2 but just return > -EINVAL if the need for compat is detected? I.e. change > dout_iovec_count to dout_iovec_length and do the math? If you are ok with removing the write/read interface and just have ioctl, we could can handle comapt stuff like others do. But I think that you (OSD people) really want to keep the write/read interface. Sorry, I think that there is no workaround to support iovec in bsg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Tue, 8 Jan 2008 16:59:48 +0100 Ingo Molnar <[EMAIL PROTECTED]> wrote: > > * FUJITA Tomonori <[EMAIL PROTECTED]> wrote: > > > The patches are available at: > > > > http://www.kernel.org/pub/linux/kernel/people/tomo/iommu/ > > > > Or if you prefer the git tree: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git > > iommu-sg-fixes > > btw., these improvements to the IOMMU code are in -mm and will go into > v2.6.25, right? The changes look robust to me. Thanks, they have been in -mm though the iommu helper fix hasn't yet. Balbir Singh found the bug in 2.6.24-rc6-mm1. I've just check mmotm and found that the IOMMU helper patch doesn't include the fix. Andrew, can you replace iommu-sg-add-iommu-helper-functions-for-the-free-area-management.patch with the updated patch: http://ozlabs.org/pipermail/linuxppc-dev/2007-December/048997.html For your convenience I've attached the updated patch too. Hopefully, they will go into v2.6.25. At least, I hope that the patches (0001-0011) that make the IOMMUs respect segment size limits when merging sg lists will be merged. They are simple and I got ACKs on POWER and PARISC. Thanks, = From: FUJITA Tomonori <[EMAIL PROTECTED]> Subject: [PATCH] add IOMMU helper functions for the free area management This adds IOMMU helper functions for the free area management. These functions take care of LLD's segment boundary limit for IOMMUs. They would be useful for IOMMUs that use bitmap for the free area management. Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]> --- include/linux/iommu-helper.h |7 lib/Makefile |1 + lib/iommu-helper.c | 80 ++ 3 files changed, 88 insertions(+), 0 deletions(-) create mode 100644 include/linux/iommu-helper.h create mode 100644 lib/iommu-helper.c diff --git a/include/linux/iommu-helper.h b/include/linux/iommu-helper.h new file mode 100644 index 000..4dd4c04 --- /dev/null +++ b/include/linux/iommu-helper.h @@ -0,0 +1,7 @@ +extern unsigned long iommu_area_alloc(unsigned long *map, unsigned long size, + unsigned long start, unsigned int nr, + unsigned long shift, + unsigned long boundary_size, + unsigned long align_mask); +extern void iommu_area_free(unsigned long *map, unsigned long start, + unsigned int nr); diff --git a/lib/Makefile b/lib/Makefile index b6793ed..0e7383f 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -64,6 +64,7 @@ obj-$(CONFIG_SMP) += percpu_counter.o obj-$(CONFIG_AUDIT_GENERIC) += audit.o obj-$(CONFIG_SWIOTLB) += swiotlb.o +obj-$(CONFIG_IOMMU_HELPER) += iommu-helper.o obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o lib-$(CONFIG_GENERIC_BUG) += bug.o diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c new file mode 100644 index 000..495575a --- /dev/null +++ b/lib/iommu-helper.c @@ -0,0 +1,80 @@ +/* + * IOMMU helper functions for the free area management + */ + +#include +#include + +static unsigned long find_next_zero_area(unsigned long *map, +unsigned long size, +unsigned long start, +unsigned int nr, +unsigned long align_mask) +{ + unsigned long index, end, i; +again: + index = find_next_zero_bit(map, size, start); + + /* Align allocation */ + index = (index + align_mask) & ~align_mask; + + end = index + nr; + if (end >= size) + return -1; + for (i = index; i < end; i++) { + if (test_bit(i, map)) { + start = i+1; + goto again; + } + } + return index; +} + +static inline void set_bit_area(unsigned long *map, unsigned long i, + int len) +{ + unsigned long end = i + len; + while (i < end) { + __set_bit(i, map); + i++; + } +} + +static inline int is_span_boundary(unsigned int index, unsigned int nr, + unsigned long shift, + unsigned long boundary_size) +{ + shift = (shift + index) & (boundary_size - 1); + return shift + nr > boundary_size; +} + +unsigned long iommu_area_alloc(unsigned long *map, unsigned long size, + unsigned long start, unsigned int nr, + unsigned long shift, unsigned long boundary_size, + unsigned long align_mask) +{ + unsigned long index; +again: + index = find_next_zero_area(map, size, start, nr, align_mask); + if (index != -1) { +
Re: [GIT PATCH] SCSI bug fixes for 2.6.24-rc3
CC'ed Jes, On Tue, 8 Jan 2008 14:15:53 +0300 Evgeniy Dushistov <[EMAIL PROTECTED]> wrote: > On Sun, Nov 25, 2007 at 01:24:25PM +0200, James Bottomley wrote: > > This is a bit of a rash of bug fixes. The qla1280 is actually a bug fix > > (in spite of the title---it's actually correcting an existing problem > > with the qla1280 implementation of accessors that broke the current > > driver). > > > > Recently I build last Linus's git tree, and got: > req_cnt is used uninitialized in this function, > and see bellow > > The patch is available here: > > > > master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6.git > > > > The short changelog is: > ... > > Jes Sorensen (1): > > qla1280: convert to use the data buffer accessors > > > ... > > scsi/qla1280.c | 387 > > + > ... > > /* Calculate number of entries and segments required. */ > > - req_cnt = 1; > > > > Initilization of req_cnt was removed, but in this function > there are places like > req_cnt += (seg_cnt - 4) / 7; > or > req_cnt++; > This is should be so? req_cnt should not be removed. Jes tested the patch but this critical bug appears only with BITS_PER_LONG != 64 && CONFIG_HIGHMEM=n case. So he didn't see this, I think. qla1280_32bit_start_scsi also gives the following warning: drivers/scsi/qla1280.c: In function 'qla1280_32bit_start_scsi': drivers/scsi/qla1280.c:3044: warning: unused variable 'dma_handle' diff --git a/drivers/scsi/qla1280.c b/drivers/scsi/qla1280.c index 146d540..2886407 100644 --- a/drivers/scsi/qla1280.c +++ b/drivers/scsi/qla1280.c @@ -3041,7 +3041,6 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct srb * sp) int cnt; int req_cnt; int seg_cnt; - dma_addr_t dma_handle; u8 dir; ENTER("qla1280_32bit_start_scsi"); @@ -3050,6 +3049,7 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct srb * sp) cmd->cmnd[0]); /* Calculate number of entries and segments required. */ + req_cnt = 1; seg_cnt = scsi_dma_map(cmd); if (seg_cnt) { /* -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PATCH] SCSI bug fixes for 2.6.24-rc3
CC'ed Jes, On Tue, 8 Jan 2008 14:15:53 +0300 Evgeniy Dushistov [EMAIL PROTECTED] wrote: On Sun, Nov 25, 2007 at 01:24:25PM +0200, James Bottomley wrote: This is a bit of a rash of bug fixes. The qla1280 is actually a bug fix (in spite of the title---it's actually correcting an existing problem with the qla1280 implementation of accessors that broke the current driver). Recently I build last Linus's git tree, and got: req_cnt is used uninitialized in this function, and see bellow The patch is available here: master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6.git The short changelog is: ... Jes Sorensen (1): qla1280: convert to use the data buffer accessors ... scsi/qla1280.c | 387 + ... /* Calculate number of entries and segments required. */ - req_cnt = 1; Initilization of req_cnt was removed, but in this function there are places like req_cnt += (seg_cnt - 4) / 7; or req_cnt++; This is should be so? req_cnt should not be removed. Jes tested the patch but this critical bug appears only with BITS_PER_LONG != 64 CONFIG_HIGHMEM=n case. So he didn't see this, I think. qla1280_32bit_start_scsi also gives the following warning: drivers/scsi/qla1280.c: In function 'qla1280_32bit_start_scsi': drivers/scsi/qla1280.c:3044: warning: unused variable 'dma_handle' diff --git a/drivers/scsi/qla1280.c b/drivers/scsi/qla1280.c index 146d540..2886407 100644 --- a/drivers/scsi/qla1280.c +++ b/drivers/scsi/qla1280.c @@ -3041,7 +3041,6 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct srb * sp) int cnt; int req_cnt; int seg_cnt; - dma_addr_t dma_handle; u8 dir; ENTER(qla1280_32bit_start_scsi); @@ -3050,6 +3049,7 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct srb * sp) cmd-cmnd[0]); /* Calculate number of entries and segments required. */ + req_cnt = 1; seg_cnt = scsi_dma_map(cmd); if (seg_cnt) { /* -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Tue, 8 Jan 2008 16:59:48 +0100 Ingo Molnar [EMAIL PROTECTED] wrote: * FUJITA Tomonori [EMAIL PROTECTED] wrote: The patches are available at: http://www.kernel.org/pub/linux/kernel/people/tomo/iommu/ Or if you prefer the git tree: git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git iommu-sg-fixes btw., these improvements to the IOMMU code are in -mm and will go into v2.6.25, right? The changes look robust to me. Thanks, they have been in -mm though the iommu helper fix hasn't yet. Balbir Singh found the bug in 2.6.24-rc6-mm1. I've just check mmotm and found that the IOMMU helper patch doesn't include the fix. Andrew, can you replace iommu-sg-add-iommu-helper-functions-for-the-free-area-management.patch with the updated patch: http://ozlabs.org/pipermail/linuxppc-dev/2007-December/048997.html For your convenience I've attached the updated patch too. Hopefully, they will go into v2.6.25. At least, I hope that the patches (0001-0011) that make the IOMMUs respect segment size limits when merging sg lists will be merged. They are simple and I got ACKs on POWER and PARISC. Thanks, = From: FUJITA Tomonori [EMAIL PROTECTED] Subject: [PATCH] add IOMMU helper functions for the free area management This adds IOMMU helper functions for the free area management. These functions take care of LLD's segment boundary limit for IOMMUs. They would be useful for IOMMUs that use bitmap for the free area management. Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED] --- include/linux/iommu-helper.h |7 lib/Makefile |1 + lib/iommu-helper.c | 80 ++ 3 files changed, 88 insertions(+), 0 deletions(-) create mode 100644 include/linux/iommu-helper.h create mode 100644 lib/iommu-helper.c diff --git a/include/linux/iommu-helper.h b/include/linux/iommu-helper.h new file mode 100644 index 000..4dd4c04 --- /dev/null +++ b/include/linux/iommu-helper.h @@ -0,0 +1,7 @@ +extern unsigned long iommu_area_alloc(unsigned long *map, unsigned long size, + unsigned long start, unsigned int nr, + unsigned long shift, + unsigned long boundary_size, + unsigned long align_mask); +extern void iommu_area_free(unsigned long *map, unsigned long start, + unsigned int nr); diff --git a/lib/Makefile b/lib/Makefile index b6793ed..0e7383f 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -64,6 +64,7 @@ obj-$(CONFIG_SMP) += percpu_counter.o obj-$(CONFIG_AUDIT_GENERIC) += audit.o obj-$(CONFIG_SWIOTLB) += swiotlb.o +obj-$(CONFIG_IOMMU_HELPER) += iommu-helper.o obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o lib-$(CONFIG_GENERIC_BUG) += bug.o diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c new file mode 100644 index 000..495575a --- /dev/null +++ b/lib/iommu-helper.c @@ -0,0 +1,80 @@ +/* + * IOMMU helper functions for the free area management + */ + +#include linux/module.h +#include linux/bitops.h + +static unsigned long find_next_zero_area(unsigned long *map, +unsigned long size, +unsigned long start, +unsigned int nr, +unsigned long align_mask) +{ + unsigned long index, end, i; +again: + index = find_next_zero_bit(map, size, start); + + /* Align allocation */ + index = (index + align_mask) ~align_mask; + + end = index + nr; + if (end = size) + return -1; + for (i = index; i end; i++) { + if (test_bit(i, map)) { + start = i+1; + goto again; + } + } + return index; +} + +static inline void set_bit_area(unsigned long *map, unsigned long i, + int len) +{ + unsigned long end = i + len; + while (i end) { + __set_bit(i, map); + i++; + } +} + +static inline int is_span_boundary(unsigned int index, unsigned int nr, + unsigned long shift, + unsigned long boundary_size) +{ + shift = (shift + index) (boundary_size - 1); + return shift + nr boundary_size; +} + +unsigned long iommu_area_alloc(unsigned long *map, unsigned long size, + unsigned long start, unsigned int nr, + unsigned long shift, unsigned long boundary_size, + unsigned long align_mask) +{ + unsigned long index; +again: + index = find_next_zero_area(map, size, start, nr, align_mask); + if (index != -1) { + if (is_span_boundary(index, nr, shift, boundary_size)) { + /* we could do more
Re: 2.6.24-rc6-mm1
On Tue, 8 Jan 2008 16:27:39 -0800 Andrew Morton [EMAIL PROTECTED] wrote: On Wed, 09 Jan 2008 08:57:53 +0900 FUJITA Tomonori [EMAIL PROTECTED] wrote: Andrew, can you replace iommu-sg-add-iommu-helper-functions-for-the-free-area-management.patch with the updated patch: http://ozlabs.org/pipermail/linuxppc-dev/2007-December/048997.html For your convenience I've attached the updated patch too. generates the incremental Thanks for putting the fix to -mm. --- a/lib/iommu-helper.c~a +++ a/lib/iommu-helper.c @@ -8,15 +8,20 @@ static unsigned long find_next_zero_area(unsigned long *map, unsigned long size, unsigned long start, -unsigned int nr) +unsigned int nr, +unsigned long align_mask) { unsigned long index, end, i; again: index = find_next_zero_bit(map, size, start); + + /* Align allocation */ + index = (index + align_mask) ~align_mask; The ALIGN() macro is the approved way of doing this. (I don't think ALIGN adds much value really, especially given that you've commented what's going on, but I guess it does make reviewing and reading a little easier). Would be better to use __ALIGN_MASK? I can find only one user who directly use __ALIGN_MASK. The POWER IOMMU calculates align_mask by itself so it's easier to pass align_mask as an argument. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] scsi: Use new __dma_buffer to align sense buffer in scsi_cmnd
On Mon, 07 Jan 2008 15:25:36 +0200 Boaz Harrosh <[EMAIL PROTECTED]> wrote: > On Mon, Jan 07 2008 at 8:53 +0200, FUJITA Tomonori <[EMAIL PROTECTED]> wrote: > > On Sun, 23 Dec 2007 13:09:05 +0200 > > Boaz Harrosh <[EMAIL PROTECTED]> wrote: > > > >> On Fri, Dec 21 2007 at 4:30 +0200, Benjamin Herrenschmidt <[EMAIL > >> PROTECTED]> wrote: > >>> The sense buffer ins scsi_cmnd can nowadays be DMA'ed into directly > >>> by some low level drivers (that typically happens with USB mass > >>> storage). > >>> > >>> This is a problem on non cache coherent architectures such as > >>> embedded PowerPCs where the sense buffer can share cache lines with > >>> other structure members, which leads to various forms of corruption. > >>> > >>> This uses the newly defined __dma_buffer annotation to enforce that > >>> on such platforms, the sense_buffer is contained within its own > >>> cache line. This has no effect on cache coherent architectures. > >>> > >>> Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]> > >>> --- > >>> > >>> include/scsi/scsi_cmnd.h |2 +- > >>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>> > >>> --- linux-merge.orig/include/scsi/scsi_cmnd.h 2007-12-21 > >>> 13:07:14.0 +1100 > >>> +++ linux-merge/include/scsi/scsi_cmnd.h 2007-12-21 13:07:29.0 > >>> +1100 > >>> @@ -88,7 +88,7 @@ struct scsi_cmnd { > >>> working on */ > >>> > >>> #define SCSI_SENSE_BUFFERSIZE96 > >>> - unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE]; > >>> + unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE] __dma_buffer; > >>> /* obtained by REQUEST SENSE when > >>>* CHECK CONDITION is received on original > >>>* command (auto-sense) */ > >> This has the potential of leaving a big fat ugly hole in the middle of > >> scsi_cmnd. I would suggest of *just* moving the sense_buffer array to be > >> the *first member* of struct scsi_cmnd. The command itself is already cache > >> aligned, allocated by the proper flags to it's slab. And put a fat comment > >> near it's definition. > >> > >> This is until a proper fix is sent. I have in my Q a proposition for a > >> more prominent solution, which I will send next month. Do to short comings > >> in the sense handling and optimizations, but should definitely cover this > >> problem. > >> > >> The code should have time to be discussed and tested, so it is only 2.6.26 > >> material. For the duration of the 2.6.25 kernel we can live with a reorder > >> of scsi_cmnd members, if it solves such a grave bug for some ARCHs. > >> > >> Boaz > >> > >> [RFD below] > >> My proposed solution will be has follows: > >> > >> 1. Since the scsi protocol mandates an immediate REQUEST_SENSE after an > >> error > >> in effect the Q is frozen until the REQUEST_SENSE command returns. > >> > >> 2. The scsi-ml needs the sense buffer for its normal operation, > >> independent > >> from the ULD's request of the sence_buffer or not at request->sense. > >> But > >> in effect, 90% of all scsi-requests come with ULD's allocated buffer > >> for > >> sense, that is copied to, on command completion. > >> > >> 3. 99% of all commands complete successfully, so if an optimization is > >> proposed for the successful case, sacrificing a few cycles for the > >> error > >> case than thats a good thing. > >> > >> My suggestion is to have a per Q, driver-overridable, sense buffer that > >> is > >> DMAed/written to by drivers. At the end of the REQUEST_SENSE command one > >> of 2 things will be done. Either copy the sense to the ULD's supplied > >> buffer, > >> or if not available, allocate one from a dedicated mem_cache pool. > >> > >> So we are completely saving 92 bytes from scsi_cmnd by replacing the > >> buffer > >> with a pointer. We can always read the sense into a per Q buffer. And 10% > >> of > >> %1 of the times we will need to allocate a sense buffer from a dedicated > >> mem_cache > >> I would say thats a nice opt
Re: [PATCH 2/2] scsi: Use new __dma_buffer to align sense buffer in scsi_cmnd
On Mon, 07 Jan 2008 15:25:36 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote: On Mon, Jan 07 2008 at 8:53 +0200, FUJITA Tomonori [EMAIL PROTECTED] wrote: On Sun, 23 Dec 2007 13:09:05 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote: On Fri, Dec 21 2007 at 4:30 +0200, Benjamin Herrenschmidt [EMAIL PROTECTED] wrote: The sense buffer ins scsi_cmnd can nowadays be DMA'ed into directly by some low level drivers (that typically happens with USB mass storage). This is a problem on non cache coherent architectures such as embedded PowerPCs where the sense buffer can share cache lines with other structure members, which leads to various forms of corruption. This uses the newly defined __dma_buffer annotation to enforce that on such platforms, the sense_buffer is contained within its own cache line. This has no effect on cache coherent architectures. Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED] --- include/scsi/scsi_cmnd.h |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- linux-merge.orig/include/scsi/scsi_cmnd.h 2007-12-21 13:07:14.0 +1100 +++ linux-merge/include/scsi/scsi_cmnd.h 2007-12-21 13:07:29.0 +1100 @@ -88,7 +88,7 @@ struct scsi_cmnd { working on */ #define SCSI_SENSE_BUFFERSIZE96 - unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE]; + unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE] __dma_buffer; /* obtained by REQUEST SENSE when * CHECK CONDITION is received on original * command (auto-sense) */ This has the potential of leaving a big fat ugly hole in the middle of scsi_cmnd. I would suggest of *just* moving the sense_buffer array to be the *first member* of struct scsi_cmnd. The command itself is already cache aligned, allocated by the proper flags to it's slab. And put a fat comment near it's definition. This is until a proper fix is sent. I have in my Q a proposition for a more prominent solution, which I will send next month. Do to short comings in the sense handling and optimizations, but should definitely cover this problem. The code should have time to be discussed and tested, so it is only 2.6.26 material. For the duration of the 2.6.25 kernel we can live with a reorder of scsi_cmnd members, if it solves such a grave bug for some ARCHs. Boaz [RFD below] My proposed solution will be has follows: 1. Since the scsi protocol mandates an immediate REQUEST_SENSE after an error in effect the Q is frozen until the REQUEST_SENSE command returns. 2. The scsi-ml needs the sense buffer for its normal operation, independent from the ULD's request of the sence_buffer or not at request-sense. But in effect, 90% of all scsi-requests come with ULD's allocated buffer for sense, that is copied to, on command completion. 3. 99% of all commands complete successfully, so if an optimization is proposed for the successful case, sacrificing a few cycles for the error case than thats a good thing. My suggestion is to have a per Q, driver-overridable, sense buffer that is DMAed/written to by drivers. At the end of the REQUEST_SENSE command one of 2 things will be done. Either copy the sense to the ULD's supplied buffer, or if not available, allocate one from a dedicated mem_cache pool. So we are completely saving 92 bytes from scsi_cmnd by replacing the buffer with a pointer. We can always read the sense into a per Q buffer. And 10% of %1 of the times we will need to allocate a sense buffer from a dedicated mem_cache I would say thats a nice optimization. The changes to scsi_error/scsi_cmnd and friends, is pretty strait forward. But it depends on a conversion of 4/5 drivers to the new scsi_eh API for REQUEST_SENSE. I have only converted these drivers that interfered with the accessors effort + 1 other places. But there are a few more places that are not converted. Once done. The implementation can easily change with no affect on drivers. I think that removing the sense_buffer array from scsi_cmnd effects lots of LLDs. As I wrote in other mail, many LLDs assume that scsi_cmnd:sense_buffer is always available. Another big task is to take care about auto sense. Have you already had some patches? I've just started to work on this and I'd like to push that fix into 2.6.25. Tomo Hi. Since you ask to push this into 2.6.25, I have ask permission to prioritize this effort, as until now it was on a back burner. There are no short-term solusions and seems that __dma_buffer will not be merged. It would be better to fix this soon though it's a bit hard to fix this before 2.6.25, I think. I have only done 3 drivers up to now. (out of something like 15) I have seen 4 patterns of sense use. 1
Re: [PATCH 2/2] scsi: Use new __dma_buffer to align sense buffer in scsi_cmnd
On Sun, 23 Dec 2007 13:09:05 +0200 Boaz Harrosh <[EMAIL PROTECTED]> wrote: > On Fri, Dec 21 2007 at 4:30 +0200, Benjamin Herrenschmidt <[EMAIL PROTECTED]> > wrote: > > The sense buffer ins scsi_cmnd can nowadays be DMA'ed into directly > > by some low level drivers (that typically happens with USB mass > > storage). > > > > This is a problem on non cache coherent architectures such as > > embedded PowerPCs where the sense buffer can share cache lines with > > other structure members, which leads to various forms of corruption. > > > > This uses the newly defined __dma_buffer annotation to enforce that > > on such platforms, the sense_buffer is contained within its own > > cache line. This has no effect on cache coherent architectures. > > > > Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]> > > --- > > > > include/scsi/scsi_cmnd.h |2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > --- linux-merge.orig/include/scsi/scsi_cmnd.h 2007-12-21 > > 13:07:14.0 +1100 > > +++ linux-merge/include/scsi/scsi_cmnd.h2007-12-21 13:07:29.0 > > +1100 > > @@ -88,7 +88,7 @@ struct scsi_cmnd { > >working on */ > > > > #define SCSI_SENSE_BUFFERSIZE 96 > > - unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE]; > > + unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE] __dma_buffer; > > /* obtained by REQUEST SENSE when > > * CHECK CONDITION is received on original > > * command (auto-sense) */ > > This has the potential of leaving a big fat ugly hole in the middle of > scsi_cmnd. I would suggest of *just* moving the sense_buffer array to be > the *first member* of struct scsi_cmnd. The command itself is already cache > aligned, allocated by the proper flags to it's slab. And put a fat comment > near it's definition. > > This is until a proper fix is sent. I have in my Q a proposition for a > more prominent solution, which I will send next month. Do to short comings > in the sense handling and optimizations, but should definitely cover this > problem. > > The code should have time to be discussed and tested, so it is only 2.6.26 > material. For the duration of the 2.6.25 kernel we can live with a reorder > of scsi_cmnd members, if it solves such a grave bug for some ARCHs. > > Boaz > > [RFD below] > My proposed solution will be has follows: > > 1. Since the scsi protocol mandates an immediate REQUEST_SENSE after an error > in effect the Q is frozen until the REQUEST_SENSE command returns. > > 2. The scsi-ml needs the sense buffer for its normal operation, independent > from the ULD's request of the sence_buffer or not at request->sense. But > in effect, 90% of all scsi-requests come with ULD's allocated buffer for > sense, that is copied to, on command completion. > > 3. 99% of all commands complete successfully, so if an optimization is > proposed for the successful case, sacrificing a few cycles for the error > case than thats a good thing. > > My suggestion is to have a per Q, driver-overridable, sense buffer that is > DMAed/written to by drivers. At the end of the REQUEST_SENSE command one > of 2 things will be done. Either copy the sense to the ULD's supplied buffer, > or if not available, allocate one from a dedicated mem_cache pool. > > So we are completely saving 92 bytes from scsi_cmnd by replacing the buffer > with a pointer. We can always read the sense into a per Q buffer. And 10% of > %1 of the times we will need to allocate a sense buffer from a dedicated > mem_cache > I would say thats a nice optimization. > > The changes to scsi_error/scsi_cmnd and friends, is pretty strait forward. > But > it depends on a conversion of 4/5 drivers to the new scsi_eh API for > REQUEST_SENSE. I have only converted these drivers that interfered with the > accessors > effort + 1 other places. But there are a few more places that are not > converted. > Once done. The implementation can easily change with no affect on drivers. I think that removing the sense_buffer array from scsi_cmnd effects lots of LLDs. As I wrote in other mail, many LLDs assume that scsi_cmnd:sense_buffer is always available. Another big task is to take care about auto sense. Have you already had some patches? I've just started to work on this and I'd like to push that fix into 2.6.25. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Sun, 6 Jan 2008 21:03:42 +0100 "Torsten Kaiser" <[EMAIL PROTECTED]> wrote: > On Jan 6, 2008 2:33 PM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote: > > On Sun, 6 Jan 2008 12:35:35 +0100 > > "Torsten Kaiser" <[EMAIL PROTECTED]> wrote: > > > On Jan 6, 2008 12:23 PM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote: > > > > On Sun, 6 Jan 2008 11:41:10 +0100 > > > > "Torsten Kaiser" <[EMAIL PROTECTED]> wrote: > > > > > I will applie your patch and see if this hunk from > > > > > find_next_zero_area() makes a difference: > > > > > > > > > >end = index + nr; > > > > > - if (end > size) > > > > > + if (end >= size) > > > > > return -1; > > -> that might still have made a difference, but ... > > > > > > - for (i = index + 1; i < end; i++) { > > > > > + for (i = index; i < end; i++) { > > ... as you say below, the test for the index position is only needed > if index is modified after find_next_zero_bit(). > > > > > > if (test_bit(i, map)) { > > > > > > > > The patch should not make a difference for X86_64. > > > > > > Hmm... > > > arch/x86/kernel/pci-gart_64.c: > > > alloc_iommu() calls iommu_area_alloc() > > > lib/iommu-helper.c: > > > iommu_area_alloc() calls find_next_zero_area() > > > -> so the above code should be called even on X86_64 > > > > Oops, I meant that the patch fixes the align allocation (non zero > > align_mask case). X86_64 doesn't use the align allocation. > > > > > > > And the change in the for loop means that 'index' will now be tested, > > > but with the old code it was not. > > > > With the old code, 'index' is tested by find_next_zero_bit. > > > > With the new code and non zero align_mask case, 'index' is not tested > > by find_next_zero_bit. So test_bit needs to start with 'index'. > > > > So If I understand the correctly, this patch should not make a > > difference for x86_64 though I might miss something. > > You did not miss anything. > After 18 packages my system crashed again. > > > > And double using something does fit with the errors I'm seeing... > > > > > > > Can you try the patch to revert my IOMMU changes? > > > > > > > > http://www.mail-archive.com/[EMAIL PROTECTED]/msg12694.html > > > > > > Testing for this bug is a little bit slow, as I'm compiling ~100 > > > packages trying to trigger it. > > > If my current testrun with the patch from > > > http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html > > > crashes, I will revert the hole IOMMU changes with above patch and try > > > again. > > > > Thanks for testing, > > OK, I'm still testing this, but after 95 completed packages I'm rather > certain that reverting the IOMMU changes with this patch fixes my > problem. > I didn't have time to look more into this, so I can't offer any > concrete ideas where the bug is. > > If you send more patches, I'm willing to test them, but it might take > some more time during the next week. Can you try 2.6.24-rc7 + the IOMMU changes? The patches are available at: http://www.kernel.org/pub/linux/kernel/people/tomo/iommu/ Or if you prefer the git tree: git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git iommu-sg-fixes I've looked at the changes to GART but they are straightforward and don't look wrong... Thanks, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Sun, 6 Jan 2008 12:35:35 +0100 "Torsten Kaiser" <[EMAIL PROTECTED]> wrote: > On Jan 6, 2008 12:23 PM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote: > > On Sun, 6 Jan 2008 11:41:10 +0100 > > "Torsten Kaiser" <[EMAIL PROTECTED]> wrote: > > > I will applie your patch and see if this hunk from > > > find_next_zero_area() makes a difference: > > > > > >end = index + nr; > > > - if (end > size) > > > + if (end >= size) > > > return -1; > > > - for (i = index + 1; i < end; i++) { > > > + for (i = index; i < end; i++) { > > > if (test_bit(i, map)) { > > > > The patch should not make a difference for X86_64. > > Hmm... > arch/x86/kernel/pci-gart_64.c: > alloc_iommu() calls iommu_area_alloc() > lib/iommu-helper.c: > iommu_area_alloc() calls find_next_zero_area() > -> so the above code should be called even on X86_64 Oops, I meant that the patch fixes the align allocation (non zero align_mask case). X86_64 doesn't use the align allocation. > And the change in the for loop means that 'index' will now be tested, > but with the old code it was not. With the old code, 'index' is tested by find_next_zero_bit. With the new code and non zero align_mask case, 'index' is not tested by find_next_zero_bit. So test_bit needs to start with 'index'. So If I understand the correctly, this patch should not make a difference for x86_64 though I might miss something. > And double using something does fit with the errors I'm seeing... > > > Can you try the patch to revert my IOMMU changes? > > > > http://www.mail-archive.com/[EMAIL PROTECTED]/msg12694.html > > Testing for this bug is a little bit slow, as I'm compiling ~100 > packages trying to trigger it. > If my current testrun with the patch from > http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html > crashes, I will revert the hole IOMMU changes with above patch and try again. Thanks for testing, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Sun, 6 Jan 2008 11:41:10 +0100 "Torsten Kaiser" <[EMAIL PROTECTED]> wrote: > On Jan 6, 2008 4:28 AM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote: > > On Sat, 5 Jan 2008 17:25:24 -0800 > > Andrew Morton <[EMAIL PROTECTED]> wrote: > > > On Sat, 5 Jan 2008 23:10:17 +0100 "Torsten Kaiser" <[EMAIL PROTECTED]> > > > wrote: > > > > But the cause of my mail is the following question: > > > > Regarding my "iommu-sg-merging-patches are new in -rc3-mm and could be > > > > the cause"-suspicion I looked at these patches and came across these > > > > hunks: > > > > > > > > This is removed from arch/x86/lib/bitstr_64.c: > > > > -/* Find string of zero bits in a bitmap */ > > > > -unsigned long > > > > -find_next_zero_string(unsigned long *bitmap, long start, long nbits, > > > > int len) > > > > -{ > > > > - unsigned long n, end, i; > > > > - > > > > - again: > > > > - n = find_next_zero_bit(bitmap, nbits, start); > > > > - if (n == -1) > > > > - return -1; > > > > - > > > > - /* could test bitsliced, but it's hardly worth it */ > > > > - end = n+len; > > > > - if (end > nbits) > > > > - return -1; > > > > - for (i = n+1; i < end; i++) { > > > > - if (test_bit(i, bitmap)) { > > > > - start = i+1; > > > > - goto again; > > > > - } > > > > - } > > > > - return n; > > > > -} > > > > > > > > This is added to lib/iommu-helper.c: > > > > +static unsigned long find_next_zero_area(unsigned long *map, > > > > +unsigned long size, > > > > +unsigned long start, > > > > +unsigned int nr) > > > > +{ > > > > + unsigned long index, end, i; > > > > +again: > > > > + index = find_next_zero_bit(map, size, start); > > > > + end = index + nr; > > > > + if (end > size) > > > > + return -1; > > > > + for (i = index + 1; i < end; i++) { > > > > + if (test_bit(i, map)) { > > > > + start = i+1; > > > > + goto again; > > > > + } > > > > + } > > > > + return index; > > > > +} > > > > > > > > The old version checks, if find_next_zero_bit returns -1, the new > > > > version doesn't do this. > > > > Is this intended and can find_next_zero_bit never fail? > > > > Hmm... but in the worst case it should only loop forever if I'm > > > > reading this right (index = -1 => for-loop counts from 0 to nr, if any > > > > bit is set it will jump to "again:" and if the next call to > > > > find_next_zero_bit also fails, its an endless loop) > > > > find_next_zero_bit returns -1? > > > > It seems that x86_64 doesn't. > > I'm sorry. I didn't look into find_next_zero_bit, I only noted that > the old version did check for -1 and the new one didn't. > Obviously the old check was superfluous. > > > POWER and SPARC64 IOMMUs use > > find_next_zero_bit too but both doesn't check if find_next_zero_bit > > returns -1. If find_next_zero_bit fails, it returns size. So it > > doesn't leads to an endless loop. > > Yes, this can't happen. It was a wrong assumption on my part. > > > But this patch has other bugs that break POWER IOMMUs. > > > > If you use the IOMMUs on POWER, please try the following patch: > > I'm using CONFIG_GART_IOMMU=y on x86_64. > > > http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html > > I also noted the line "index = (index + align_mask) & ~align_mask;" in > iommu_area_alloc() and didn't understand what this was trying to do > and how this should work, but as arch/x86/kernel/pci-gart_64.c always > uses 0 as align_mask I just ignored it. Yeah, it's for only POWER IOMMUs. It's meaningless for gart and calgary IOMMUs. > I will applie your patch and see if this hunk from > find_next_zero_area() makes a difference: > >end = index + nr; > - if (end > size) > + if (end >= size) > return -1; > - for (i = index + 1; i < end; i++) { > + for (i = index; i < end; i++) { > if (test_bit(i, map)) { The patch should not make a difference for X86_64. Can you try the patch to revert my IOMMU changes? http://www.mail-archive.com/[EMAIL PROTECTED]/msg12694.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Sun, 6 Jan 2008 21:03:42 +0100 Torsten Kaiser [EMAIL PROTECTED] wrote: On Jan 6, 2008 2:33 PM, FUJITA Tomonori [EMAIL PROTECTED] wrote: On Sun, 6 Jan 2008 12:35:35 +0100 Torsten Kaiser [EMAIL PROTECTED] wrote: On Jan 6, 2008 12:23 PM, FUJITA Tomonori [EMAIL PROTECTED] wrote: On Sun, 6 Jan 2008 11:41:10 +0100 Torsten Kaiser [EMAIL PROTECTED] wrote: I will applie your patch and see if this hunk from find_next_zero_area() makes a difference: end = index + nr; - if (end size) + if (end = size) return -1; - that might still have made a difference, but ... - for (i = index + 1; i end; i++) { + for (i = index; i end; i++) { ... as you say below, the test for the index position is only needed if index is modified after find_next_zero_bit(). if (test_bit(i, map)) { The patch should not make a difference for X86_64. Hmm... arch/x86/kernel/pci-gart_64.c: alloc_iommu() calls iommu_area_alloc() lib/iommu-helper.c: iommu_area_alloc() calls find_next_zero_area() - so the above code should be called even on X86_64 Oops, I meant that the patch fixes the align allocation (non zero align_mask case). X86_64 doesn't use the align allocation. And the change in the for loop means that 'index' will now be tested, but with the old code it was not. With the old code, 'index' is tested by find_next_zero_bit. With the new code and non zero align_mask case, 'index' is not tested by find_next_zero_bit. So test_bit needs to start with 'index'. So If I understand the correctly, this patch should not make a difference for x86_64 though I might miss something. You did not miss anything. After 18 packages my system crashed again. And double using something does fit with the errors I'm seeing... Can you try the patch to revert my IOMMU changes? http://www.mail-archive.com/[EMAIL PROTECTED]/msg12694.html Testing for this bug is a little bit slow, as I'm compiling ~100 packages trying to trigger it. If my current testrun with the patch from http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html crashes, I will revert the hole IOMMU changes with above patch and try again. Thanks for testing, OK, I'm still testing this, but after 95 completed packages I'm rather certain that reverting the IOMMU changes with this patch fixes my problem. I didn't have time to look more into this, so I can't offer any concrete ideas where the bug is. If you send more patches, I'm willing to test them, but it might take some more time during the next week. Can you try 2.6.24-rc7 + the IOMMU changes? The patches are available at: http://www.kernel.org/pub/linux/kernel/people/tomo/iommu/ Or if you prefer the git tree: git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git iommu-sg-fixes I've looked at the changes to GART but they are straightforward and don't look wrong... Thanks, -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] scsi: Use new __dma_buffer to align sense buffer in scsi_cmnd
On Sun, 23 Dec 2007 13:09:05 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote: On Fri, Dec 21 2007 at 4:30 +0200, Benjamin Herrenschmidt [EMAIL PROTECTED] wrote: The sense buffer ins scsi_cmnd can nowadays be DMA'ed into directly by some low level drivers (that typically happens with USB mass storage). This is a problem on non cache coherent architectures such as embedded PowerPCs where the sense buffer can share cache lines with other structure members, which leads to various forms of corruption. This uses the newly defined __dma_buffer annotation to enforce that on such platforms, the sense_buffer is contained within its own cache line. This has no effect on cache coherent architectures. Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED] --- include/scsi/scsi_cmnd.h |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- linux-merge.orig/include/scsi/scsi_cmnd.h 2007-12-21 13:07:14.0 +1100 +++ linux-merge/include/scsi/scsi_cmnd.h2007-12-21 13:07:29.0 +1100 @@ -88,7 +88,7 @@ struct scsi_cmnd { working on */ #define SCSI_SENSE_BUFFERSIZE 96 - unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE]; + unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE] __dma_buffer; /* obtained by REQUEST SENSE when * CHECK CONDITION is received on original * command (auto-sense) */ This has the potential of leaving a big fat ugly hole in the middle of scsi_cmnd. I would suggest of *just* moving the sense_buffer array to be the *first member* of struct scsi_cmnd. The command itself is already cache aligned, allocated by the proper flags to it's slab. And put a fat comment near it's definition. This is until a proper fix is sent. I have in my Q a proposition for a more prominent solution, which I will send next month. Do to short comings in the sense handling and optimizations, but should definitely cover this problem. The code should have time to be discussed and tested, so it is only 2.6.26 material. For the duration of the 2.6.25 kernel we can live with a reorder of scsi_cmnd members, if it solves such a grave bug for some ARCHs. Boaz [RFD below] My proposed solution will be has follows: 1. Since the scsi protocol mandates an immediate REQUEST_SENSE after an error in effect the Q is frozen until the REQUEST_SENSE command returns. 2. The scsi-ml needs the sense buffer for its normal operation, independent from the ULD's request of the sence_buffer or not at request-sense. But in effect, 90% of all scsi-requests come with ULD's allocated buffer for sense, that is copied to, on command completion. 3. 99% of all commands complete successfully, so if an optimization is proposed for the successful case, sacrificing a few cycles for the error case than thats a good thing. My suggestion is to have a per Q, driver-overridable, sense buffer that is DMAed/written to by drivers. At the end of the REQUEST_SENSE command one of 2 things will be done. Either copy the sense to the ULD's supplied buffer, or if not available, allocate one from a dedicated mem_cache pool. So we are completely saving 92 bytes from scsi_cmnd by replacing the buffer with a pointer. We can always read the sense into a per Q buffer. And 10% of %1 of the times we will need to allocate a sense buffer from a dedicated mem_cache I would say thats a nice optimization. The changes to scsi_error/scsi_cmnd and friends, is pretty strait forward. But it depends on a conversion of 4/5 drivers to the new scsi_eh API for REQUEST_SENSE. I have only converted these drivers that interfered with the accessors effort + 1 other places. But there are a few more places that are not converted. Once done. The implementation can easily change with no affect on drivers. I think that removing the sense_buffer array from scsi_cmnd effects lots of LLDs. As I wrote in other mail, many LLDs assume that scsi_cmnd:sense_buffer is always available. Another big task is to take care about auto sense. Have you already had some patches? I've just started to work on this and I'd like to push that fix into 2.6.25. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Sun, 6 Jan 2008 11:41:10 +0100 Torsten Kaiser [EMAIL PROTECTED] wrote: On Jan 6, 2008 4:28 AM, FUJITA Tomonori [EMAIL PROTECTED] wrote: On Sat, 5 Jan 2008 17:25:24 -0800 Andrew Morton [EMAIL PROTECTED] wrote: On Sat, 5 Jan 2008 23:10:17 +0100 Torsten Kaiser [EMAIL PROTECTED] wrote: But the cause of my mail is the following question: Regarding my iommu-sg-merging-patches are new in -rc3-mm and could be the cause-suspicion I looked at these patches and came across these hunks: This is removed from arch/x86/lib/bitstr_64.c: -/* Find string of zero bits in a bitmap */ -unsigned long -find_next_zero_string(unsigned long *bitmap, long start, long nbits, int len) -{ - unsigned long n, end, i; - - again: - n = find_next_zero_bit(bitmap, nbits, start); - if (n == -1) - return -1; - - /* could test bitsliced, but it's hardly worth it */ - end = n+len; - if (end nbits) - return -1; - for (i = n+1; i end; i++) { - if (test_bit(i, bitmap)) { - start = i+1; - goto again; - } - } - return n; -} This is added to lib/iommu-helper.c: +static unsigned long find_next_zero_area(unsigned long *map, +unsigned long size, +unsigned long start, +unsigned int nr) +{ + unsigned long index, end, i; +again: + index = find_next_zero_bit(map, size, start); + end = index + nr; + if (end size) + return -1; + for (i = index + 1; i end; i++) { + if (test_bit(i, map)) { + start = i+1; + goto again; + } + } + return index; +} The old version checks, if find_next_zero_bit returns -1, the new version doesn't do this. Is this intended and can find_next_zero_bit never fail? Hmm... but in the worst case it should only loop forever if I'm reading this right (index = -1 = for-loop counts from 0 to nr, if any bit is set it will jump to again: and if the next call to find_next_zero_bit also fails, its an endless loop) find_next_zero_bit returns -1? It seems that x86_64 doesn't. I'm sorry. I didn't look into find_next_zero_bit, I only noted that the old version did check for -1 and the new one didn't. Obviously the old check was superfluous. POWER and SPARC64 IOMMUs use find_next_zero_bit too but both doesn't check if find_next_zero_bit returns -1. If find_next_zero_bit fails, it returns size. So it doesn't leads to an endless loop. Yes, this can't happen. It was a wrong assumption on my part. But this patch has other bugs that break POWER IOMMUs. If you use the IOMMUs on POWER, please try the following patch: I'm using CONFIG_GART_IOMMU=y on x86_64. http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html I also noted the line index = (index + align_mask) ~align_mask; in iommu_area_alloc() and didn't understand what this was trying to do and how this should work, but as arch/x86/kernel/pci-gart_64.c always uses 0 as align_mask I just ignored it. Yeah, it's for only POWER IOMMUs. It's meaningless for gart and calgary IOMMUs. I will applie your patch and see if this hunk from find_next_zero_area() makes a difference: end = index + nr; - if (end size) + if (end = size) return -1; - for (i = index + 1; i end; i++) { + for (i = index; i end; i++) { if (test_bit(i, map)) { The patch should not make a difference for X86_64. Can you try the patch to revert my IOMMU changes? http://www.mail-archive.com/[EMAIL PROTECTED]/msg12694.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Sun, 6 Jan 2008 12:35:35 +0100 Torsten Kaiser [EMAIL PROTECTED] wrote: On Jan 6, 2008 12:23 PM, FUJITA Tomonori [EMAIL PROTECTED] wrote: On Sun, 6 Jan 2008 11:41:10 +0100 Torsten Kaiser [EMAIL PROTECTED] wrote: I will applie your patch and see if this hunk from find_next_zero_area() makes a difference: end = index + nr; - if (end size) + if (end = size) return -1; - for (i = index + 1; i end; i++) { + for (i = index; i end; i++) { if (test_bit(i, map)) { The patch should not make a difference for X86_64. Hmm... arch/x86/kernel/pci-gart_64.c: alloc_iommu() calls iommu_area_alloc() lib/iommu-helper.c: iommu_area_alloc() calls find_next_zero_area() - so the above code should be called even on X86_64 Oops, I meant that the patch fixes the align allocation (non zero align_mask case). X86_64 doesn't use the align allocation. And the change in the for loop means that 'index' will now be tested, but with the old code it was not. With the old code, 'index' is tested by find_next_zero_bit. With the new code and non zero align_mask case, 'index' is not tested by find_next_zero_bit. So test_bit needs to start with 'index'. So If I understand the correctly, this patch should not make a difference for x86_64 though I might miss something. And double using something does fit with the errors I'm seeing... Can you try the patch to revert my IOMMU changes? http://www.mail-archive.com/[EMAIL PROTECTED]/msg12694.html Testing for this bug is a little bit slow, as I'm compiling ~100 packages trying to trigger it. If my current testrun with the patch from http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html crashes, I will revert the hole IOMMU changes with above patch and try again. Thanks for testing, -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Sat, 5 Jan 2008 17:25:24 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > On Sat, 5 Jan 2008 23:10:17 +0100 "Torsten Kaiser" <[EMAIL PROTECTED]> wrote: > > > On Jan 5, 2008 3:52 PM, Torsten Kaiser <[EMAIL PROTECTED]> wrote: > > > On Jan 5, 2008 11:13 AM, Jarek Poplawski <[EMAIL PROTECTED]> wrote: > > > > On Sat, Jan 05, 2008 at 09:01:02AM +0100, Torsten Kaiser wrote: > > > > > On Jan 5, 2008 1:07 AM, Jarek Poplawski <[EMAIL PROTECTED]> wrote: > > > > > > I think it would be easier just to start with this working -rc6 and > > > > > > simply check if we have 'right' suspects, so: git-net.patch and > > > > > > git-nfsd.patch from -mm1-broken-out, as suggested by Herbert (I > > > > > > hope, > > > > > > can compile - otherwise you could try the other way: add the whole > > > > > > -mm > > > > > > and revert these two). Using current gits could complicate this > > > > > > "investigation". > > > > > > > > > > OK, I will try this... > > > > > > still on the todo-list, I had no time to try this yet... > > > > working on it... > > 2.6.24-rc6 + mm-patches up to git.battery (includes git-net and > > git-netdev-all) worked for 110 packages, then I proclaimed it good. > > 2.6.24-rc6 + mm-patches up to (including) git.nfsd is currently > > getting testet (9 packages done...) > > > > But the cause of my mail is the following question: > > Regarding my "iommu-sg-merging-patches are new in -rc3-mm and could be > > the cause"-suspicion I looked at these patches and came across these > > hunks: > > > > This is removed from arch/x86/lib/bitstr_64.c: > > -/* Find string of zero bits in a bitmap */ > > -unsigned long > > -find_next_zero_string(unsigned long *bitmap, long start, long nbits, int > > len) > > -{ > > - unsigned long n, end, i; > > - > > - again: > > - n = find_next_zero_bit(bitmap, nbits, start); > > - if (n == -1) > > - return -1; > > - > > - /* could test bitsliced, but it's hardly worth it */ > > - end = n+len; > > - if (end > nbits) > > - return -1; > > - for (i = n+1; i < end; i++) { > > - if (test_bit(i, bitmap)) { > > - start = i+1; > > - goto again; > > - } > > - } > > - return n; > > -} > > > > This is added to lib/iommu-helper.c: > > +static unsigned long find_next_zero_area(unsigned long *map, > > +unsigned long size, > > +unsigned long start, > > +unsigned int nr) > > +{ > > + unsigned long index, end, i; > > +again: > > + index = find_next_zero_bit(map, size, start); > > + end = index + nr; > > + if (end > size) > > + return -1; > > + for (i = index + 1; i < end; i++) { > > + if (test_bit(i, map)) { > > + start = i+1; > > + goto again; > > + } > > + } > > + return index; > > +} > > > > The old version checks, if find_next_zero_bit returns -1, the new > > version doesn't do this. > > Is this intended and can find_next_zero_bit never fail? > > Hmm... but in the worst case it should only loop forever if I'm > > reading this right (index = -1 => for-loop counts from 0 to nr, if any > > bit is set it will jump to "again:" and if the next call to > > find_next_zero_bit also fails, its an endless loop) find_next_zero_bit returns -1? It seems that x86_64 doesn't. POWER and SPARC64 IOMMUs use find_next_zero_bit too but both doesn't check if find_next_zero_bit returns -1. If find_next_zero_bit fails, it returns size. So it doesn't leads to an endless loop. But this patch has other bugs that break POWER IOMMUs. If you use the IOMMUs on POWER, please try the following patch: http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html > > I susect these are doing different things. > iommu-sg-kill-__clear_bit_string-and-find_next_zero_string.patch says: > > This kills unused __clear_bit_string and find_next_zero_string (they > were used by only gart and calgary IOMMUs). > > iommu-sg-add-iommu-helper-functions-for-the-free-area-management.patch says: > > This adds IOMMU helper functions for the free area management. These > functions take care of LLD's segment boundary limit for IOMMUs. They > would be useful for IOMMUs that use bitmap for the free area management. > > > So even if this can not explain my bug, could somebody check if this > > is a real bug or not? > > Let's cc the author of those changes. > > Thanks for persisting with this bug. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc6-mm1
On Sat, 5 Jan 2008 17:25:24 -0800 Andrew Morton [EMAIL PROTECTED] wrote: On Sat, 5 Jan 2008 23:10:17 +0100 Torsten Kaiser [EMAIL PROTECTED] wrote: On Jan 5, 2008 3:52 PM, Torsten Kaiser [EMAIL PROTECTED] wrote: On Jan 5, 2008 11:13 AM, Jarek Poplawski [EMAIL PROTECTED] wrote: On Sat, Jan 05, 2008 at 09:01:02AM +0100, Torsten Kaiser wrote: On Jan 5, 2008 1:07 AM, Jarek Poplawski [EMAIL PROTECTED] wrote: I think it would be easier just to start with this working -rc6 and simply check if we have 'right' suspects, so: git-net.patch and git-nfsd.patch from -mm1-broken-out, as suggested by Herbert (I hope, can compile - otherwise you could try the other way: add the whole -mm and revert these two). Using current gits could complicate this investigation. OK, I will try this... still on the todo-list, I had no time to try this yet... working on it... 2.6.24-rc6 + mm-patches up to git.battery (includes git-net and git-netdev-all) worked for 110 packages, then I proclaimed it good. 2.6.24-rc6 + mm-patches up to (including) git.nfsd is currently getting testet (9 packages done...) But the cause of my mail is the following question: Regarding my iommu-sg-merging-patches are new in -rc3-mm and could be the cause-suspicion I looked at these patches and came across these hunks: This is removed from arch/x86/lib/bitstr_64.c: -/* Find string of zero bits in a bitmap */ -unsigned long -find_next_zero_string(unsigned long *bitmap, long start, long nbits, int len) -{ - unsigned long n, end, i; - - again: - n = find_next_zero_bit(bitmap, nbits, start); - if (n == -1) - return -1; - - /* could test bitsliced, but it's hardly worth it */ - end = n+len; - if (end nbits) - return -1; - for (i = n+1; i end; i++) { - if (test_bit(i, bitmap)) { - start = i+1; - goto again; - } - } - return n; -} This is added to lib/iommu-helper.c: +static unsigned long find_next_zero_area(unsigned long *map, +unsigned long size, +unsigned long start, +unsigned int nr) +{ + unsigned long index, end, i; +again: + index = find_next_zero_bit(map, size, start); + end = index + nr; + if (end size) + return -1; + for (i = index + 1; i end; i++) { + if (test_bit(i, map)) { + start = i+1; + goto again; + } + } + return index; +} The old version checks, if find_next_zero_bit returns -1, the new version doesn't do this. Is this intended and can find_next_zero_bit never fail? Hmm... but in the worst case it should only loop forever if I'm reading this right (index = -1 = for-loop counts from 0 to nr, if any bit is set it will jump to again: and if the next call to find_next_zero_bit also fails, its an endless loop) find_next_zero_bit returns -1? It seems that x86_64 doesn't. POWER and SPARC64 IOMMUs use find_next_zero_bit too but both doesn't check if find_next_zero_bit returns -1. If find_next_zero_bit fails, it returns size. So it doesn't leads to an endless loop. But this patch has other bugs that break POWER IOMMUs. If you use the IOMMUs on POWER, please try the following patch: http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html I susect these are doing different things. iommu-sg-kill-__clear_bit_string-and-find_next_zero_string.patch says: This kills unused __clear_bit_string and find_next_zero_string (they were used by only gart and calgary IOMMUs). iommu-sg-add-iommu-helper-functions-for-the-free-area-management.patch says: This adds IOMMU helper functions for the free area management. These functions take care of LLD's segment boundary limit for IOMMUs. They would be useful for IOMMUs that use bitmap for the free area management. So even if this can not explain my bug, could somebody check if this is a real bug or not? Let's cc the author of those changes. Thanks for persisting with this bug. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] bsg : Add support for io vectors in bsg
From: Deepak Colluru <[EMAIL PROTECTED]> Subject: [PATCH] bsg : Add support for io vectors in bsg Date: Fri, 4 Jan 2008 21:47:34 +0530 (IST) > From: Deepak Colluru <[EMAIL PROTECTED]> > > Add support for io vectors in bsg. > > Signed-off-by: Deepak Colluru <[EMAIL PROTECTED]> > --- > bsg.c | 52 +--- > 1 file changed, 49 insertions(+), 3 deletions(-) Thanks, but I have to NACK this. You can find the discussion about bsg io vector support and a similar patch in linux-scsi archive. I have no plan to support it since it needs the compat hack. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] bsg : Add support for io vectors in bsg
From: Deepak Colluru [EMAIL PROTECTED] Subject: [PATCH] bsg : Add support for io vectors in bsg Date: Fri, 4 Jan 2008 21:47:34 +0530 (IST) From: Deepak Colluru [EMAIL PROTECTED] Add support for io vectors in bsg. Signed-off-by: Deepak Colluru [EMAIL PROTECTED] --- bsg.c | 52 +--- 1 file changed, 49 insertions(+), 3 deletions(-) Thanks, but I have to NACK this. You can find the discussion about bsg io vector support and a similar patch in linux-scsi archive. I have no plan to support it since it needs the compat hack. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6.24-rc BUGFIX] IB/srp: release transport when removing host
On Thu, 03 Jan 2008 21:39:19 -0500 Dave Dillow <[EMAIL PROTECTED]> wrote: > When removing the ib_srp module, srp_remove_one() does not release the > SRP transport class when it is releasing the SCSI host. This leads to > dangling references to kfree()'d memory, and an eventual oops. > > Signed-off-by: David Dillow <[EMAIL PROTECTED]> Thanks again! Linus has already merged your previous patch: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b0e47c8b79154772a436f25bf7646733e1d6194c So please resend a patch to move srp_remove_host before scsi_remove_host instead of adding srp_remove_host. > --- > On Fri, 2008-01-04 at 09:47 +0900, FUJITA Tomonori wrote: > > I think that this is the root problem and the patch fixes it in the > > right way. Please send this patch to [EMAIL PROTECTED] and a > > patch to move srp_remove_host before scsi_remove_host in > > srp_remove_one to Roland. > > > > Acked-by: FUJITA Tomonori <[EMAIL PROTECTED]> > > Not sure if your Acked-by was for this one as well, so I left it off. Acked-by: FUJITA Tomonori <[EMAIL PROTECTED]> > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c > b/drivers/infiniband/ulp/srp/ib_srp.c > index 950228f..bdb6f85 100644 > --- a/drivers/infiniband/ulp/srp/ib_srp.c > +++ b/drivers/infiniband/ulp/srp/ib_srp.c > @@ -2053,6 +2053,7 @@ static void srp_remove_one(struct ib_device *device) > > list_for_each_entry_safe(target, tmp_target, >>target_list, list) { > + srp_remove_host(target->scsi_host); > scsi_remove_host(target->scsi_host); > srp_disconnect_target(target); > ib_destroy_cm_id(target->cm_id); > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6.24-rc BUGFIX] SRP transport: only remove our own entries
On Thu, 03 Jan 2008 21:34:49 -0500 Dave Dillow <[EMAIL PROTECTED]> wrote: > The SCSI SRP transport class currently iterates over all children > devices of the host that is being removed in srp_remove_host(). However, > not all of those children were created by the SRP transport, and > removing them will cause corruption and an oops when their creator tries > to remove them. > > Signed-off-by: David Dillow <[EMAIL PROTECTED]> > Acked-by: FUJITA Tomonori <[EMAIL PROTECTED]> > --- Thanks! James, please put this patch into scsi-rc-fixes. > On Fri, 2008-01-04 at 09:47 +0900, FUJITA Tomonori wrote: > > On Thu, 03 Jan 2008 15:51:25 -0500 > > I think that this is the root problem and the patch fixes it in the > > right way. Please send this patch to [EMAIL PROTECTED] and a > > patch to move srp_remove_host before scsi_remove_host in > > srp_remove_one to Roland. > > > > Acked-by: FUJITA Tomonori <[EMAIL PROTECTED]> > > diff --git a/drivers/scsi/scsi_transport_srp.c > b/drivers/scsi/scsi_transport_srp.c > index 44a340b..65c584d 100644 > --- a/drivers/scsi/scsi_transport_srp.c > +++ b/drivers/scsi/scsi_transport_srp.c > @@ -265,7 +265,8 @@ EXPORT_SYMBOL_GPL(srp_rport_del); > > static int do_srp_rport_del(struct device *dev, void *data) > { > - srp_rport_del(dev_to_rport(dev)); > + if (scsi_is_srp_rport(dev)) > + srp_rport_del(dev_to_rport(dev)); > return 0; > } > > > - > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5
On Thu, 03 Jan 2008 15:51:25 -0500 David Dillow <[EMAIL PROTECTED]> wrote: > > On Thu, 2008-01-03 at 15:09 -0500, David Dillow wrote: > > As for a better fix, I'm not sure. > > Here's a better way than the strncmp. If this meets everyone's approval, > then I can roll up a proper commit. Thanks! I really apprecate it. I think that this is the root problem and the patch fixes it in the right way. Please send this patch to [EMAIL PROTECTED] and a patch to move srp_remove_host before scsi_remove_host in srp_remove_one to Roland. Acked-by: FUJITA Tomonori <[EMAIL PROTECTED]> > diff --git a/drivers/scsi/scsi_transport_srp.c > b/drivers/scsi/scsi_transport_srp.c > index 44a340b..65c584d 100644 > --- a/drivers/scsi/scsi_transport_srp.c > +++ b/drivers/scsi/scsi_transport_srp.c > @@ -265,7 +265,8 @@ EXPORT_SYMBOL_GPL(srp_rport_del); > > static int do_srp_rport_del(struct device *dev, void *data) > { > - srp_rport_del(dev_to_rport(dev)); > + if (scsi_is_srp_rport(dev)) > + srp_rport_del(dev_to_rport(dev)); > return 0; > } > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: list corruption on ib_srp load in v2.6.24-rc5
On Wed, 02 Jan 2008 09:51:38 -0800 Roland Dreier <[EMAIL PROTECTED]> wrote: > > > Can you try this? > > > > That patched oopsed in scsi_remove_host(), but reversing the order has > > survived over 500 insert/probe/remove cycles. > > > > Tested-by: David Dillow <[EMAIL PROTECTED]> > > --- > > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c > b/drivers/infiniband/ulp/srp/ib_srp.c > > index 950228f..77e8b90 100644 > > --- a/drivers/infiniband/ulp/srp/ib_srp.c > > +++ b/drivers/infiniband/ulp/srp/ib_srp.c > > @@ -2054,6 +2054,7 @@ static void srp_remove_one(struct ib_device *device) > >list_for_each_entry_safe(target, tmp_target, > > >target_list, list) { > >scsi_remove_host(target->scsi_host); > > + srp_remove_host(target->scsi_host); > >srp_disconnect_target(target); > > Where do we stand on this? What is the right place to put the > srp_remove_host? Is there a bug somewhere else? {sas|fc}_remove_host is called before scsi_remove_host. And in srp_remove_work(), we call srp_remove_host and then scsi_remove_host. ibmvscsi also calls them in that order. I thought that I messed up something in srp_transport_class. But I can't figure out what's wrong. The above patch works and is unlikely to lead to critical problems so I'm fine with it for now. > I'd like to get this fixed before 2.6.24 final comes out... Yeah, it should be fixed. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: list corruption on ib_srp load in v2.6.24-rc5
On Wed, 02 Jan 2008 09:51:38 -0800 Roland Dreier [EMAIL PROTECTED] wrote: Can you try this? That patched oopsed in scsi_remove_host(), but reversing the order has survived over 500 insert/probe/remove cycles. Tested-by: David Dillow [EMAIL PROTECTED] --- diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 950228f..77e8b90 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -2054,6 +2054,7 @@ static void srp_remove_one(struct ib_device *device) list_for_each_entry_safe(target, tmp_target, host-target_list, list) { scsi_remove_host(target-scsi_host); + srp_remove_host(target-scsi_host); srp_disconnect_target(target); Where do we stand on this? What is the right place to put the srp_remove_host? Is there a bug somewhere else? {sas|fc}_remove_host is called before scsi_remove_host. And in srp_remove_work(), we call srp_remove_host and then scsi_remove_host. ibmvscsi also calls them in that order. I thought that I messed up something in srp_transport_class. But I can't figure out what's wrong. The above patch works and is unlikely to lead to critical problems so I'm fine with it for now. I'd like to get this fixed before 2.6.24 final comes out... Yeah, it should be fixed. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5
On Thu, 03 Jan 2008 15:51:25 -0500 David Dillow [EMAIL PROTECTED] wrote: On Thu, 2008-01-03 at 15:09 -0500, David Dillow wrote: As for a better fix, I'm not sure. Here's a better way than the strncmp. If this meets everyone's approval, then I can roll up a proper commit. Thanks! I really apprecate it. I think that this is the root problem and the patch fixes it in the right way. Please send this patch to [EMAIL PROTECTED] and a patch to move srp_remove_host before scsi_remove_host in srp_remove_one to Roland. Acked-by: FUJITA Tomonori [EMAIL PROTECTED] diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c index 44a340b..65c584d 100644 --- a/drivers/scsi/scsi_transport_srp.c +++ b/drivers/scsi/scsi_transport_srp.c @@ -265,7 +265,8 @@ EXPORT_SYMBOL_GPL(srp_rport_del); static int do_srp_rport_del(struct device *dev, void *data) { - srp_rport_del(dev_to_rport(dev)); + if (scsi_is_srp_rport(dev)) + srp_rport_del(dev_to_rport(dev)); return 0; } -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6.24-rc BUGFIX] SRP transport: only remove our own entries
On Thu, 03 Jan 2008 21:34:49 -0500 Dave Dillow [EMAIL PROTECTED] wrote: The SCSI SRP transport class currently iterates over all children devices of the host that is being removed in srp_remove_host(). However, not all of those children were created by the SRP transport, and removing them will cause corruption and an oops when their creator tries to remove them. Signed-off-by: David Dillow [EMAIL PROTECTED] Acked-by: FUJITA Tomonori [EMAIL PROTECTED] --- Thanks! James, please put this patch into scsi-rc-fixes. On Fri, 2008-01-04 at 09:47 +0900, FUJITA Tomonori wrote: On Thu, 03 Jan 2008 15:51:25 -0500 I think that this is the root problem and the patch fixes it in the right way. Please send this patch to [EMAIL PROTECTED] and a patch to move srp_remove_host before scsi_remove_host in srp_remove_one to Roland. Acked-by: FUJITA Tomonori [EMAIL PROTECTED] diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c index 44a340b..65c584d 100644 --- a/drivers/scsi/scsi_transport_srp.c +++ b/drivers/scsi/scsi_transport_srp.c @@ -265,7 +265,8 @@ EXPORT_SYMBOL_GPL(srp_rport_del); static int do_srp_rport_del(struct device *dev, void *data) { - srp_rport_del(dev_to_rport(dev)); + if (scsi_is_srp_rport(dev)) + srp_rport_del(dev_to_rport(dev)); return 0; } - To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6.24-rc BUGFIX] IB/srp: release transport when removing host
On Thu, 03 Jan 2008 21:39:19 -0500 Dave Dillow [EMAIL PROTECTED] wrote: When removing the ib_srp module, srp_remove_one() does not release the SRP transport class when it is releasing the SCSI host. This leads to dangling references to kfree()'d memory, and an eventual oops. Signed-off-by: David Dillow [EMAIL PROTECTED] Thanks again! Linus has already merged your previous patch: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b0e47c8b79154772a436f25bf7646733e1d6194c So please resend a patch to move srp_remove_host before scsi_remove_host instead of adding srp_remove_host. --- On Fri, 2008-01-04 at 09:47 +0900, FUJITA Tomonori wrote: I think that this is the root problem and the patch fixes it in the right way. Please send this patch to [EMAIL PROTECTED] and a patch to move srp_remove_host before scsi_remove_host in srp_remove_one to Roland. Acked-by: FUJITA Tomonori [EMAIL PROTECTED] Not sure if your Acked-by was for this one as well, so I left it off. Acked-by: FUJITA Tomonori [EMAIL PROTECTED] diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 950228f..bdb6f85 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -2053,6 +2053,7 @@ static void srp_remove_one(struct ib_device *device) list_for_each_entry_safe(target, tmp_target, host-target_list, list) { + srp_remove_host(target-scsi_host); scsi_remove_host(target-scsi_host); srp_disconnect_target(target); ib_destroy_cm_id(target-cm_id); -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SCSI errors on powerpc with 2.6.24-rc6-mm1
On Mon, 31 Dec 2007 17:57:35 +0530 Balbir Singh <[EMAIL PROTECTED]> wrote: > FUJITA Tomonori wrote: > > > > Oops, it's for -mm. > > > > Hi, > > I just tested this patch and it works fine for me so far. > > Tested-by: Balbir Singh <[EMAIL PROTECTED]> Thanks! I sent an updated patch including this fix several hours ago: http://ozlabs.org/pipermail/linuxppc-dev/2007-December/048997.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SCSI errors on powerpc with 2.6.24-rc6-mm1
On Mon, 31 Dec 2007 17:57:35 +0530 Balbir Singh [EMAIL PROTECTED] wrote: FUJITA Tomonori wrote: Oops, it's for -mm. Hi, I just tested this patch and it works fine for me so far. Tested-by: Balbir Singh [EMAIL PROTECTED] Thanks! I sent an updated patch including this fix several hours ago: http://ozlabs.org/pipermail/linuxppc-dev/2007-December/048997.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SCSI errors on powerpc with 2.6.24-rc6-mm1
On Fri, 28 Dec 2007 13:05:39 +0530 Balbir Singh <[EMAIL PROTECTED]> wrote: > FUJITA Tomonori wrote: > [snip] > > Thanks, > > > > Can you try this? > > > > > > diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c > > index e7d8544..495575a 100644 > > --- a/lib/iommu-helper.c > > +++ b/lib/iommu-helper.c > > @@ -8,15 +8,20 @@ > > static unsigned long find_next_zero_area(unsigned long *map, > > unsigned long size, > > unsigned long start, > > -unsigned int nr) > > +unsigned int nr, > > +unsigned long align_mask) > > { > > unsigned long index, end, i; > > again: > > index = find_next_zero_bit(map, size, start); > > + > > + /* Align allocation */ > > + index = (index + align_mask) & ~align_mask; > > + > > end = index + nr; > > - if (end > size) > > + if (end >= size) > > return -1; > > - for (i = index + 1; i < end; i++) { > > + for (i = index; i < end; i++) { > > if (test_bit(i, map)) { > > start = i+1; > > goto again; > > @@ -50,9 +55,8 @@ unsigned long iommu_area_alloc(unsigned long *map, > > unsigned long size, > > { > > unsigned long index; > > again: > > - index = find_next_zero_area(map, size, start, nr); > > + index = find_next_zero_area(map, size, start, nr, align_mask); > > if (index != -1) { > > - index = (index + align_mask) & ~align_mask; > > if (is_span_boundary(index, nr, shift, boundary_size)) { > > /* we could do more effectively */ > > start = index + 1; > > This on top of -mm? Or on top of the reverted iommu patch. Oops, it's for -mm. Thanks, -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SCSI errors on powerpc with 2.6.24-rc6-mm1
On Thu, 27 Dec 2007 11:15:37 +0530 Balbir Singh <[EMAIL PROTECTED]> wrote: > FUJITA Tomonori wrote: > > On Thu, 27 Dec 2007 10:08:25 +0530 > > Balbir Singh <[EMAIL PROTECTED]> wrote: > > > >> FUJITA Tomonori wrote: > >>> On Mon, 24 Dec 2007 10:18:50 +0530 > >>> Balbir Singh <[EMAIL PROTECTED]> wrote: > >>> > >> [snip] > >> > >>> I might break the IOMMU code. Can you reproduce it easily? If so, > >>> reverting my IOMMU patches (I've attached a patch to revert them) fix > >>> the problem? > >> [snip] > >> > >> > >> Yes, this patch fixes the problem for me. > > > > Thanks, so you can reproduce it easily, right? > > > > Yes, quite easily > > > The problem is that I don't want to revert these changes. I'll see > > how these changes cause the problem shortly. > > I'll try and find some bandwidth to review/test the patches and help you > figure out the right solution. Thanks, Can you try this? diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c index e7d8544..495575a 100644 --- a/lib/iommu-helper.c +++ b/lib/iommu-helper.c @@ -8,15 +8,20 @@ static unsigned long find_next_zero_area(unsigned long *map, unsigned long size, unsigned long start, -unsigned int nr) +unsigned int nr, +unsigned long align_mask) { unsigned long index, end, i; again: index = find_next_zero_bit(map, size, start); + + /* Align allocation */ + index = (index + align_mask) & ~align_mask; + end = index + nr; - if (end > size) + if (end >= size) return -1; - for (i = index + 1; i < end; i++) { + for (i = index; i < end; i++) { if (test_bit(i, map)) { start = i+1; goto again; @@ -50,9 +55,8 @@ unsigned long iommu_area_alloc(unsigned long *map, unsigned long size, { unsigned long index; again: - index = find_next_zero_area(map, size, start, nr); + index = find_next_zero_area(map, size, start, nr, align_mask); if (index != -1) { - index = (index + align_mask) & ~align_mask; if (is_span_boundary(index, nr, shift, boundary_size)) { /* we could do more effectively */ start = index + 1; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SCSI errors on powerpc with 2.6.24-rc6-mm1
On Thu, 27 Dec 2007 11:15:37 +0530 Balbir Singh [EMAIL PROTECTED] wrote: FUJITA Tomonori wrote: On Thu, 27 Dec 2007 10:08:25 +0530 Balbir Singh [EMAIL PROTECTED] wrote: FUJITA Tomonori wrote: On Mon, 24 Dec 2007 10:18:50 +0530 Balbir Singh [EMAIL PROTECTED] wrote: [snip] I might break the IOMMU code. Can you reproduce it easily? If so, reverting my IOMMU patches (I've attached a patch to revert them) fix the problem? [snip] Yes, this patch fixes the problem for me. Thanks, so you can reproduce it easily, right? Yes, quite easily The problem is that I don't want to revert these changes. I'll see how these changes cause the problem shortly. I'll try and find some bandwidth to review/test the patches and help you figure out the right solution. Thanks, Can you try this? diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c index e7d8544..495575a 100644 --- a/lib/iommu-helper.c +++ b/lib/iommu-helper.c @@ -8,15 +8,20 @@ static unsigned long find_next_zero_area(unsigned long *map, unsigned long size, unsigned long start, -unsigned int nr) +unsigned int nr, +unsigned long align_mask) { unsigned long index, end, i; again: index = find_next_zero_bit(map, size, start); + + /* Align allocation */ + index = (index + align_mask) ~align_mask; + end = index + nr; - if (end size) + if (end = size) return -1; - for (i = index + 1; i end; i++) { + for (i = index; i end; i++) { if (test_bit(i, map)) { start = i+1; goto again; @@ -50,9 +55,8 @@ unsigned long iommu_area_alloc(unsigned long *map, unsigned long size, { unsigned long index; again: - index = find_next_zero_area(map, size, start, nr); + index = find_next_zero_area(map, size, start, nr, align_mask); if (index != -1) { - index = (index + align_mask) ~align_mask; if (is_span_boundary(index, nr, shift, boundary_size)) { /* we could do more effectively */ start = index + 1; -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SCSI errors on powerpc with 2.6.24-rc6-mm1
On Fri, 28 Dec 2007 13:05:39 +0530 Balbir Singh [EMAIL PROTECTED] wrote: FUJITA Tomonori wrote: [snip] Thanks, Can you try this? diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c index e7d8544..495575a 100644 --- a/lib/iommu-helper.c +++ b/lib/iommu-helper.c @@ -8,15 +8,20 @@ static unsigned long find_next_zero_area(unsigned long *map, unsigned long size, unsigned long start, -unsigned int nr) +unsigned int nr, +unsigned long align_mask) { unsigned long index, end, i; again: index = find_next_zero_bit(map, size, start); + + /* Align allocation */ + index = (index + align_mask) ~align_mask; + end = index + nr; - if (end size) + if (end = size) return -1; - for (i = index + 1; i end; i++) { + for (i = index; i end; i++) { if (test_bit(i, map)) { start = i+1; goto again; @@ -50,9 +55,8 @@ unsigned long iommu_area_alloc(unsigned long *map, unsigned long size, { unsigned long index; again: - index = find_next_zero_area(map, size, start, nr); + index = find_next_zero_area(map, size, start, nr, align_mask); if (index != -1) { - index = (index + align_mask) ~align_mask; if (is_span_boundary(index, nr, shift, boundary_size)) { /* we could do more effectively */ start = index + 1; This on top of -mm? Or on top of the reverted iommu patch. Oops, it's for -mm. Thanks, -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SCSI errors on powerpc with 2.6.24-rc6-mm1
On Thu, 27 Dec 2007 10:08:25 +0530 Balbir Singh <[EMAIL PROTECTED]> wrote: > FUJITA Tomonori wrote: > > On Mon, 24 Dec 2007 10:18:50 +0530 > > Balbir Singh <[EMAIL PROTECTED]> wrote: > > > [snip] > > > > > I might break the IOMMU code. Can you reproduce it easily? If so, > > reverting my IOMMU patches (I've attached a patch to revert them) fix > > the problem? > > [snip] > > > Yes, this patch fixes the problem for me. Thanks, so you can reproduce it easily, right? The problem is that I don't want to revert these changes. I'll see how these changes cause the problem shortly. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SCSI errors on powerpc with 2.6.24-rc6-mm1
On Mon, 24 Dec 2007 10:18:50 +0530 Balbir Singh <[EMAIL PROTECTED]> wrote: > Hi, > > I've just seen this on my dmesg, this is new, never seen this before on > this box and it happens only with this version of the kernel. > > In this configuration, the page size is set to 64K and I've enabled fake > NUMA nodes on PowerPC. > > tce_buildmulti_pSeriesLP: plpar_tce_put failed. rc=-4 > index = 0x402 > npages = 0x0 > tce[0] val = 0x15ad0001 > Call Trace: > [cffe74f0] [c00491a4] > .tce_buildmulti_pSeriesLP+0x26c/0x2ac (unreliable) > [cffe75c0] [c00295e4] .iommu_map_sg+0x1d4/0x418 > [cffe76d0] [c0028664] .dma_iommu_map_sg+0x3c/0x50 > [cffe7750] [c03b6c30] .scsi_dma_map+0x70/0x94 > [cffe77d0] [c03dedbc] .ipr_queuecommand+0x300/0x500 > [cffe7880] [c03ae964] .scsi_dispatch_cmd+0x21c/0x2b8 > [cffe7920] [c03b67a0] .scsi_request_fn+0x310/0x460 > [cffe79d0] [c024ab90] .blk_run_queue+0x94/0xec > [cffe7a70] [c03b3b08] .scsi_run_queue+0x24c/0x27c > [cffe7b20] [c03b4424] .scsi_next_command+0x48/0x70 > [cffe7bc0] [c03b4b48] .scsi_end_request+0xbc/0xe4 > [cffe7c60] [c03b5294] .scsi_io_completion+0x170/0x3e8 > [cffe7d40] [c03ae0e4] .scsi_finish_command+0xb4/0xd4 > [cffe7dd0] [c03b584c] .scsi_softirq_done+0x114/0x138 > [cffe7e60] [c024af70] .blk_done_softirq+0xa0/0xd0 > [cffe7ef0] [c007a2a0] .__do_softirq+0xa8/0x164 > [cffe7f90] [c0027edc] .call_do_softirq+0x14/0x24 > [c0003e183950] [c000bdcc] .do_softirq+0x74/0xc0 > [c0003e1839e0] [c007a450] .irq_exit+0x5c/0xac > [c0003e183a60] [c000c414] .do_IRQ+0x17c/0x1f4 > [c0003e183b00] [c0004c24] hardware_interrupt_entry+0x24/0x28 > --- Exception: 501 at .ppc64_runlatch_off+0x28/0x60 > LR = .pseries_dedicated_idle_sleep+0xd8/0x1a4 > [c0003e183df0] [c0048494] > .pseries_dedicated_idle_sleep+0x78/0x1a4 (unreliable) > [c0003e183e80] [c001110c] .cpu_idle+0x10c/0x1e8 > [c0003e183f00] [c002b5b0] .start_secondary+0x1b4/0x1d8 > [c0003e183f90] [c00083c4] .start_secondary_prolog+0xc/0x10 I might break the IOMMU code. Can you reproduce it easily? If so, reverting my IOMMU patches (I've attached a patch to revert them) fix the problem? Thanks, diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index ff2a62d..59899b2 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -244,9 +244,6 @@ config IOMMU_VMERGE Most drivers don't have this problem; it is safe to say Y here. -config IOMMU_HELPER - def_bool PPC64 - config HOTPLUG_CPU bool "Support for enabling/disabling CPUs" depends on SMP && HOTPLUG && EXPERIMENTAL && (PPC_PSERIES || PPC_PMAC) diff --git a/arch/powerpc/kernel/dma_64.c b/arch/powerpc/kernel/dma_64.c index 6fcb7cb..1806d96 100644 --- a/arch/powerpc/kernel/dma_64.c +++ b/arch/powerpc/kernel/dma_64.c @@ -31,8 +31,8 @@ static inline unsigned long device_to_mask(struct device *dev) static void *dma_iommu_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t flag) { - return iommu_alloc_coherent(dev, dev->archdata.dma_data, size, - dma_handle, device_to_mask(dev), flag, + return iommu_alloc_coherent(dev->archdata.dma_data, size, dma_handle, + device_to_mask(dev), flag, dev->archdata.numa_node); } @@ -52,7 +52,7 @@ static dma_addr_t dma_iommu_map_single(struct device *dev, void *vaddr, size_t size, enum dma_data_direction direction) { - return iommu_map_single(dev, dev->archdata.dma_data, vaddr, size, + return iommu_map_single(dev->archdata.dma_data, vaddr, size, device_to_mask(dev), direction); } diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c index 18e8860..050e9ac 100644 --- a/arch/powerpc/kernel/iommu.c +++ b/arch/powerpc/kernel/iommu.c @@ -31,7 +31,6 @@ #include #include #include -#include #include #include #include @@ -82,19 +81,17 @@ static int __init setup_iommu(char *str) __setup("protect4gb=", setup_protect4gb); __setup("iommu=", setup_iommu); -static unsigned long iommu_range_alloc(struct device *dev, - struct iommu_table *tbl, +static unsigned long iommu_range_alloc(struct iommu_table *tbl, unsigned long npages, unsigned long *handle, unsigned long mask, unsigned int align_order) { -