from:"FUJITA Tomonori"

Re: [patch] [SCSI] bnx2fc: zero out sense buffer properly

2012-08-18 Thread FUJITA Tomonori

On Sat, 18 Aug 2012 11:46:37 +0300
Dan Carpenter  wrote:

> ->sense_buffer used to be an array but it changed to pointer in
> de25deb180 "[SCSI] use dynamically allocated sense buffer".  This call
> to memset() needs to be updated as well.
> 
> Signed-off-by: Dan Carpenter 
> 
> diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c
> index 73f231c..8d4626c 100644
> --- a/drivers/scsi/bnx2fc/bnx2fc_io.c
> +++ b/drivers/scsi/bnx2fc/bnx2fc_io.c
> @@ -1807,7 +1807,7 @@ static void bnx2fc_parse_fcp_rsp(struct bnx2fc_cmd 
> *io_req,
>   fcp_sns_len = SCSI_SENSE_BUFFERSIZE;
>   }
>  
> - memset(sc_cmd->sense_buffer, 0, sizeof(sc_cmd->sense_buffer));
> + memset(sc_cmd->sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);

I guess that you can remove the line instead. IIRC, scsi-ml does it
for LLDs.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [patch] [SCSI] bnx2fc: zero out sense buffer properly

2012-08-18 Thread FUJITA Tomonori

On Sat, 18 Aug 2012 11:46:37 +0300
Dan Carpenter dan.carpen...@oracle.com wrote:

 -sense_buffer used to be an array but it changed to pointer in
 de25deb180 [SCSI] use dynamically allocated sense buffer.  This call
 to memset() needs to be updated as well.
 
 Signed-off-by: Dan Carpenter dan.carpen...@oracle.com
 
 diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c
 index 73f231c..8d4626c 100644
 --- a/drivers/scsi/bnx2fc/bnx2fc_io.c
 +++ b/drivers/scsi/bnx2fc/bnx2fc_io.c
 @@ -1807,7 +1807,7 @@ static void bnx2fc_parse_fcp_rsp(struct bnx2fc_cmd 
 *io_req,
   fcp_sns_len = SCSI_SENSE_BUFFERSIZE;
   }
  
 - memset(sc_cmd-sense_buffer, 0, sizeof(sc_cmd-sense_buffer));
 + memset(sc_cmd-sense_buffer, 0, SCSI_SENSE_BUFFERSIZE);

I guess that you can remove the line instead. IIRC, scsi-ml does it
for LLDs.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH RFC] swiotlb: Remove SWIOTLB overflow buffer support

2012-07-10 Thread FUJITA Tomonori

On Mon, 9 Jul 2012 16:25:05 -0400
Konrad Rzeszutek Wilk  wrote:

> On Fri, Jul 06, 2012 at 05:06:12PM -0600, Shuah Khan wrote:
>> Remove SWIOTLB overflow buffer support and return DMA_ERROR_CODE
>> (a value of zero) to make it consistent with iommu implementation
>> on Intel, AMD, and swiotlb-xen.
> 
> While this is a good forward step and this needs to be done eventually,
> you should first send out patches for the drivers that don't check
> for the DMA_ERROR_CODE when doing mapping. In other words for the
> drivers that map but don't call dma_mapping_error to check.
> 
> When that is fixed and *all the drivers that don't call dma_mapping_error
> are fixed, then this patch makes sense. 
> 
> So for right now, NACK.

Yeah, I'm not sure we could fix (or remove) *all the drivers though.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH RFC] swiotlb: Remove SWIOTLB overflow buffer support

2012-07-10 Thread FUJITA Tomonori

On Mon, 9 Jul 2012 16:25:05 -0400
Konrad Rzeszutek Wilk konrad.w...@oracle.com wrote:

 On Fri, Jul 06, 2012 at 05:06:12PM -0600, Shuah Khan wrote:
 Remove SWIOTLB overflow buffer support and return DMA_ERROR_CODE
 (a value of zero) to make it consistent with iommu implementation
 on Intel, AMD, and swiotlb-xen.
 
 While this is a good forward step and this needs to be done eventually,
 you should first send out patches for the drivers that don't check
 for the DMA_ERROR_CODE when doing mapping. In other words for the
 drivers that map but don't call dma_mapping_error to check.
 
 When that is fixed and *all the drivers that don't call dma_mapping_error
 are fixed, then this patch makes sense. 
 
 So for right now, NACK.

Yeah, I'm not sure we could fix (or remove) *all the drivers though.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH -mm 4/4] alpha: remove unused DEBUG_FORCEDAC define in IOMMU

2008-02-22 Thread FUJITA Tomonori

This just removes unused DEBUG_FORCEDAC define in the IOMMU code.

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
Cc: Richard Henderson <[EMAIL PROTECTED]>
Cc: Ivan Kokshaysky <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
---
 arch/alpha/kernel/pci_iommu.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index 54540c3..be6fa10 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -31,7 +31,6 @@
 #endif
 
 #define DEBUG_NODIRECT 0
-#define DEBUG_FORCEDAC 0
 
 #define ISA_DMA_MASK   0x00ff
 
-- 
1.5.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH -mm 3/4] alpha: make IOMMU respect the segment boundary limits

2008-02-22 Thread FUJITA Tomonori

This patch makes the IOMMU code not allocate a memory area spanning
LLD's segment boundary.

is_span_boundary() judges whether a memory area spans LLD's segment
boundary. If iommu_arena_find_pages() finds such a area, it tries to
find the next available memory area.

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
Cc: Richard Henderson <[EMAIL PROTECTED]>
Cc: Ivan Kokshaysky <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
---
 arch/alpha/kernel/pci_iommu.c |   40 ++--
 1 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index e54f829..54540c3 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -126,13 +126,34 @@ iommu_arena_new(struct pci_controller *hose, dma_addr_t 
base,
return iommu_arena_new_node(0, hose, base, window_size, align);
 }
 
+static inline int is_span_boundary(unsigned int index, unsigned int nr,
+  unsigned long shift,
+  unsigned long boundary_size)
+{
+   shift = (shift + index) & (boundary_size - 1);
+   return shift + nr > boundary_size;
+}
+
 /* Must be called with the arena lock held */
 static long
-iommu_arena_find_pages(struct pci_iommu_arena *arena, long n, long mask)
+iommu_arena_find_pages(struct device *dev, struct pci_iommu_arena *arena,
+  long n, long mask)
 {
unsigned long *ptes;
long i, p, nent;
int pass = 0;
+   unsigned long base;
+   unsigned long boundary_size;
+
+   BUG_ON(arena->dma_base & ~PAGE_MASK);
+   base = arena->dma_base >> PAGE_SHIFT;
+   if (dev)
+   boundary_size = ALIGN(dma_get_max_seg_size(dev) + 1, PAGE_SIZE)
+   >> PAGE_SHIFT;
+   else
+   boundary_size = ALIGN(1UL << 32, PAGE_SIZE) >> PAGE_SHIFT;
+
+   BUG_ON(!is_power_of_2(boundary_size));
 
/* Search forward for the first mask-aligned sequence of N free ptes */
ptes = arena->ptes;
@@ -142,6 +163,11 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, long 
n, long mask)
 
 again:
while (i < n && p+i < nent) {
+   if (!i && is_span_boundary(p, n, base, boundary_size)) {
+   p = ALIGN(p + 1, mask + 1);
+   goto again;
+   }
+
if (ptes[p+i])
p = ALIGN(p + i + 1, mask + 1), i = 0;
else
@@ -170,7 +196,8 @@ again:
 }
 
 static long
-iommu_arena_alloc(struct pci_iommu_arena *arena, long n, unsigned int align)
+iommu_arena_alloc(struct device *dev, struct pci_iommu_arena *arena, long n,
+ unsigned int align)
 {
unsigned long flags;
unsigned long *ptes;
@@ -181,7 +208,7 @@ iommu_arena_alloc(struct pci_iommu_arena *arena, long n, 
unsigned int align)
/* Search for N empty ptes */
ptes = arena->ptes;
mask = max(align, arena->align_entry) - 1;
-   p = iommu_arena_find_pages(arena, n, mask);
+   p = iommu_arena_find_pages(dev, arena, n, mask);
if (p < 0) {
spin_unlock_irqrestore(>lock, flags);
return -1;
@@ -231,6 +258,7 @@ pci_map_single_1(struct pci_dev *pdev, void *cpu_addr, 
size_t size,
unsigned long paddr;
dma_addr_t ret;
unsigned int align = 0;
+   struct device *dev = pdev ? >dev : NULL;
 
paddr = __pa(cpu_addr);
 
@@ -278,7 +306,7 @@ pci_map_single_1(struct pci_dev *pdev, void *cpu_addr, 
size_t size,
/* Force allocation to 64KB boundary for ISA bridges. */
if (pdev && pdev == isa_bridge)
align = 8;
-   dma_ofs = iommu_arena_alloc(arena, npages, align);
+   dma_ofs = iommu_arena_alloc(dev, arena, npages, align);
if (dma_ofs < 0) {
printk(KERN_WARNING "pci_map_single failed: "
   "could not allocate dma page tables\n");
@@ -565,7 +593,7 @@ sg_fill(struct device *dev, struct scatterlist *leader, 
struct scatterlist *end,
 
paddr &= ~PAGE_MASK;
npages = calc_npages(paddr + size);
-   dma_ofs = iommu_arena_alloc(arena, npages, 0);
+   dma_ofs = iommu_arena_alloc(dev, arena, npages, 0);
if (dma_ofs < 0) {
/* If we attempted a direct map above but failed, die.  */
if (leader->dma_address == 0)
@@ -832,7 +860,7 @@ iommu_reserve(struct pci_iommu_arena *arena, long pg_count, 
long align_mask)
 
/* Search for N empty ptes.  */
ptes = arena->ptes;
-   p = iommu_arena_find_pages(arena, pg_count, align_mask);
+   p = iommu_arena_find_pages(NULL, arena, pg_count, align_mask);
if (p < 0) {
spin_unlock_irqrestore(>lock, flags);

[PATCH -mm 1/4] alpha: convert IOMMU to use ALIGN()

2008-02-22 Thread FUJITA Tomonori

This patch is preparation for modifications to fix the IOMMU segment
boundary problem.

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
Cc: Richard Henderson <[EMAIL PROTECTED]>
Cc: Ivan Kokshaysky <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
---
 arch/alpha/kernel/pci_iommu.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index 26d3789..bbf9990 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -136,11 +136,11 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, 
long n, long mask)
/* Search forward for the first mask-aligned sequence of N free ptes */
ptes = arena->ptes;
nent = arena->size >> PAGE_SHIFT;
-   p = (arena->next_entry + mask) & ~mask;
+   p = ALIGN(arena->next_entry, mask + 1);
i = 0;
while (i < n && p+i < nent) {
if (ptes[p+i])
-   p = (p + i + 1 + mask) & ~mask, i = 0;
+   p = ALIGN(p + i + 1, mask + 1), i = 0;
else
i = i + 1;
}
@@ -153,7 +153,7 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, long 
n, long mask)
p = 0, i = 0;
while (i < n && p+i < nent) {
if (ptes[p+i])
-   p = (p + i + 1 + mask) & ~mask, i = 0;
+   p = ALIGN(p + i + 1, mask + 1), i = 0;
else
i = i + 1;
}
-- 
1.5.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH -mm 0/4] fix iommu segment boundary problems (alpha)

2008-02-22 Thread FUJITA Tomonori


This patchset is another sequel to my patchset to fix iommu segment
boundary problems, IOMMUs allocate memory areas without considering a
low level driver's segment boundary limits:

http://www.mail-archive.com/[EMAIL PROTECTED]/msg11919.html

This patchset fixes the Alpha IOMMU code.

There are four patches in this patchset. The first two patches are
preparation for the third patch, which fixes the IOMMU segment
boundary problem. The fourth patch just a cleanup, which removes an
unused code.

This is against 2.6.25-rc2-mm1.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH -mm 2/4] alpha: IOMMU had better access to the free space bitmap at only one place

2008-02-22 Thread FUJITA Tomonori

iommu_arena_find_pages duplicates the code to access to the bitmap for
free space management. This patch convert the IOMMU code to have only
one place to access the bitmap, in the popular way that other IOMMUs
(e.g. POWER and SPARC) do.

This patch is preparation for modifications to fix the IOMMU segment
boundary problem.

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
Cc: Richard Henderson <[EMAIL PROTECTED]>
Cc: Ivan Kokshaysky <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>
---
 arch/alpha/kernel/pci_iommu.c |   28 +++-
 1 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index bbf9990..e54f829 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -132,12 +132,15 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, 
long n, long mask)
 {
unsigned long *ptes;
long i, p, nent;
+   int pass = 0;
 
/* Search forward for the first mask-aligned sequence of N free ptes */
ptes = arena->ptes;
nent = arena->size >> PAGE_SHIFT;
p = ALIGN(arena->next_entry, mask + 1);
i = 0;
+
+again:
while (i < n && p+i < nent) {
if (ptes[p+i])
p = ALIGN(p + i + 1, mask + 1), i = 0;
@@ -146,19 +149,18 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, 
long n, long mask)
}
 
if (i < n) {
-/* Reached the end.  Flush the TLB and restart the
-   search from the beginning.  */
-   alpha_mv.mv_pci_tbi(arena->hose, 0, -1);
-
-   p = 0, i = 0;
-   while (i < n && p+i < nent) {
-   if (ptes[p+i])
-   p = ALIGN(p + i + 1, mask + 1), i = 0;
-   else
-   i = i + 1;
-   }
-
-   if (i < n)
+   if (pass < 1) {
+   /*
+* Reached the end.  Flush the TLB and restart
+* the search from the beginning.
+   */
+   alpha_mv.mv_pci_tbi(arena->hose, 0, -1);
+
+   pass++;
+   p = 0;
+   i = 0;
+   goto again;
+   } else
return -1;
}
 
-- 
1.5.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH -mm 2/4] alpha: IOMMU had better access to the free space bitmap at only one place

2008-02-22 Thread FUJITA Tomonori

iommu_arena_find_pages duplicates the code to access to the bitmap for
free space management. This patch convert the IOMMU code to have only
one place to access the bitmap, in the popular way that other IOMMUs
(e.g. POWER and SPARC) do.

This patch is preparation for modifications to fix the IOMMU segment
boundary problem.

Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
Cc: Richard Henderson [EMAIL PROTECTED]
Cc: Ivan Kokshaysky [EMAIL PROTECTED]
Cc: Andrew Morton [EMAIL PROTECTED]
---
 arch/alpha/kernel/pci_iommu.c |   28 +++-
 1 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index bbf9990..e54f829 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -132,12 +132,15 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, 
long n, long mask)
 {
unsigned long *ptes;
long i, p, nent;
+   int pass = 0;
 
/* Search forward for the first mask-aligned sequence of N free ptes */
ptes = arena-ptes;
nent = arena-size  PAGE_SHIFT;
p = ALIGN(arena-next_entry, mask + 1);
i = 0;
+
+again:
while (i  n  p+i  nent) {
if (ptes[p+i])
p = ALIGN(p + i + 1, mask + 1), i = 0;
@@ -146,19 +149,18 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, 
long n, long mask)
}
 
if (i  n) {
-/* Reached the end.  Flush the TLB and restart the
-   search from the beginning.  */
-   alpha_mv.mv_pci_tbi(arena-hose, 0, -1);
-
-   p = 0, i = 0;
-   while (i  n  p+i  nent) {
-   if (ptes[p+i])
-   p = ALIGN(p + i + 1, mask + 1), i = 0;
-   else
-   i = i + 1;
-   }
-
-   if (i  n)
+   if (pass  1) {
+   /*
+* Reached the end.  Flush the TLB and restart
+* the search from the beginning.
+   */
+   alpha_mv.mv_pci_tbi(arena-hose, 0, -1);
+
+   pass++;
+   p = 0;
+   i = 0;
+   goto again;
+   } else
return -1;
}
 
-- 
1.5.3.4

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH -mm 0/4] fix iommu segment boundary problems (alpha)

2008-02-22 Thread FUJITA Tomonori


This patchset is another sequel to my patchset to fix iommu segment
boundary problems, IOMMUs allocate memory areas without considering a
low level driver's segment boundary limits:

http://www.mail-archive.com/[EMAIL PROTECTED]/msg11919.html

This patchset fixes the Alpha IOMMU code.

There are four patches in this patchset. The first two patches are
preparation for the third patch, which fixes the IOMMU segment
boundary problem. The fourth patch just a cleanup, which removes an
unused code.

This is against 2.6.25-rc2-mm1.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH -mm 1/4] alpha: convert IOMMU to use ALIGN()

2008-02-22 Thread FUJITA Tomonori

This patch is preparation for modifications to fix the IOMMU segment
boundary problem.

Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
Cc: Richard Henderson [EMAIL PROTECTED]
Cc: Ivan Kokshaysky [EMAIL PROTECTED]
Cc: Andrew Morton [EMAIL PROTECTED]
---
 arch/alpha/kernel/pci_iommu.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index 26d3789..bbf9990 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -136,11 +136,11 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, 
long n, long mask)
/* Search forward for the first mask-aligned sequence of N free ptes */
ptes = arena-ptes;
nent = arena-size  PAGE_SHIFT;
-   p = (arena-next_entry + mask)  ~mask;
+   p = ALIGN(arena-next_entry, mask + 1);
i = 0;
while (i  n  p+i  nent) {
if (ptes[p+i])
-   p = (p + i + 1 + mask)  ~mask, i = 0;
+   p = ALIGN(p + i + 1, mask + 1), i = 0;
else
i = i + 1;
}
@@ -153,7 +153,7 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, long 
n, long mask)
p = 0, i = 0;
while (i  n  p+i  nent) {
if (ptes[p+i])
-   p = (p + i + 1 + mask)  ~mask, i = 0;
+   p = ALIGN(p + i + 1, mask + 1), i = 0;
else
i = i + 1;
}
-- 
1.5.3.4

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH -mm 3/4] alpha: make IOMMU respect the segment boundary limits

2008-02-22 Thread FUJITA Tomonori

This patch makes the IOMMU code not allocate a memory area spanning
LLD's segment boundary.

is_span_boundary() judges whether a memory area spans LLD's segment
boundary. If iommu_arena_find_pages() finds such a area, it tries to
find the next available memory area.

Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
Cc: Richard Henderson [EMAIL PROTECTED]
Cc: Ivan Kokshaysky [EMAIL PROTECTED]
Cc: Andrew Morton [EMAIL PROTECTED]
---
 arch/alpha/kernel/pci_iommu.c |   40 ++--
 1 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index e54f829..54540c3 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -126,13 +126,34 @@ iommu_arena_new(struct pci_controller *hose, dma_addr_t 
base,
return iommu_arena_new_node(0, hose, base, window_size, align);
 }
 
+static inline int is_span_boundary(unsigned int index, unsigned int nr,
+  unsigned long shift,
+  unsigned long boundary_size)
+{
+   shift = (shift + index)  (boundary_size - 1);
+   return shift + nr  boundary_size;
+}
+
 /* Must be called with the arena lock held */
 static long
-iommu_arena_find_pages(struct pci_iommu_arena *arena, long n, long mask)
+iommu_arena_find_pages(struct device *dev, struct pci_iommu_arena *arena,
+  long n, long mask)
 {
unsigned long *ptes;
long i, p, nent;
int pass = 0;
+   unsigned long base;
+   unsigned long boundary_size;
+
+   BUG_ON(arena-dma_base  ~PAGE_MASK);
+   base = arena-dma_base  PAGE_SHIFT;
+   if (dev)
+   boundary_size = ALIGN(dma_get_max_seg_size(dev) + 1, PAGE_SIZE)
+PAGE_SHIFT;
+   else
+   boundary_size = ALIGN(1UL  32, PAGE_SIZE)  PAGE_SHIFT;
+
+   BUG_ON(!is_power_of_2(boundary_size));
 
/* Search forward for the first mask-aligned sequence of N free ptes */
ptes = arena-ptes;
@@ -142,6 +163,11 @@ iommu_arena_find_pages(struct pci_iommu_arena *arena, long 
n, long mask)
 
 again:
while (i  n  p+i  nent) {
+   if (!i  is_span_boundary(p, n, base, boundary_size)) {
+   p = ALIGN(p + 1, mask + 1);
+   goto again;
+   }
+
if (ptes[p+i])
p = ALIGN(p + i + 1, mask + 1), i = 0;
else
@@ -170,7 +196,8 @@ again:
 }
 
 static long
-iommu_arena_alloc(struct pci_iommu_arena *arena, long n, unsigned int align)
+iommu_arena_alloc(struct device *dev, struct pci_iommu_arena *arena, long n,
+ unsigned int align)
 {
unsigned long flags;
unsigned long *ptes;
@@ -181,7 +208,7 @@ iommu_arena_alloc(struct pci_iommu_arena *arena, long n, 
unsigned int align)
/* Search for N empty ptes */
ptes = arena-ptes;
mask = max(align, arena-align_entry) - 1;
-   p = iommu_arena_find_pages(arena, n, mask);
+   p = iommu_arena_find_pages(dev, arena, n, mask);
if (p  0) {
spin_unlock_irqrestore(arena-lock, flags);
return -1;
@@ -231,6 +258,7 @@ pci_map_single_1(struct pci_dev *pdev, void *cpu_addr, 
size_t size,
unsigned long paddr;
dma_addr_t ret;
unsigned int align = 0;
+   struct device *dev = pdev ? pdev-dev : NULL;
 
paddr = __pa(cpu_addr);
 
@@ -278,7 +306,7 @@ pci_map_single_1(struct pci_dev *pdev, void *cpu_addr, 
size_t size,
/* Force allocation to 64KB boundary for ISA bridges. */
if (pdev  pdev == isa_bridge)
align = 8;
-   dma_ofs = iommu_arena_alloc(arena, npages, align);
+   dma_ofs = iommu_arena_alloc(dev, arena, npages, align);
if (dma_ofs  0) {
printk(KERN_WARNING pci_map_single failed: 
   could not allocate dma page tables\n);
@@ -565,7 +593,7 @@ sg_fill(struct device *dev, struct scatterlist *leader, 
struct scatterlist *end,
 
paddr = ~PAGE_MASK;
npages = calc_npages(paddr + size);
-   dma_ofs = iommu_arena_alloc(arena, npages, 0);
+   dma_ofs = iommu_arena_alloc(dev, arena, npages, 0);
if (dma_ofs  0) {
/* If we attempted a direct map above but failed, die.  */
if (leader-dma_address == 0)
@@ -832,7 +860,7 @@ iommu_reserve(struct pci_iommu_arena *arena, long pg_count, 
long align_mask)
 
/* Search for N empty ptes.  */
ptes = arena-ptes;
-   p = iommu_arena_find_pages(arena, pg_count, align_mask);
+   p = iommu_arena_find_pages(NULL, arena, pg_count, align_mask);
if (p  0) {
spin_unlock_irqrestore(arena-lock, flags);
return -1;
-- 
1.5.3.4

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http

[PATCH -mm 4/4] alpha: remove unused DEBUG_FORCEDAC define in IOMMU

2008-02-22 Thread FUJITA Tomonori

This just removes unused DEBUG_FORCEDAC define in the IOMMU code.

Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
Cc: Richard Henderson [EMAIL PROTECTED]
Cc: Ivan Kokshaysky [EMAIL PROTECTED]
Cc: Andrew Morton [EMAIL PROTECTED]
---
 arch/alpha/kernel/pci_iommu.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/alpha/kernel/pci_iommu.c b/arch/alpha/kernel/pci_iommu.c
index 54540c3..be6fa10 100644
--- a/arch/alpha/kernel/pci_iommu.c
+++ b/arch/alpha/kernel/pci_iommu.c
@@ -31,7 +31,6 @@
 #endif
 
 #define DEBUG_NODIRECT 0
-#define DEBUG_FORCEDAC 0
 
 #define ISA_DMA_MASK   0x00ff
 
-- 
1.5.3.4

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: more iommu sg merging fallout

2008-02-19 Thread FUJITA Tomonori

On Sun, 17 Feb 2008 23:41:42 -0800 (PST)
David Miller <[EMAIL PROTECTED]> wrote:

> From: FUJITA Tomonori <[EMAIL PROTECTED]>
> Date: Sat, 16 Feb 2008 15:03:43 +0900
> 
> > [PATCH] sparc64: make IOMMU code respect the segment boundary limits
> > 
> > Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
> 
> Looks good, but I think it will break sound for some ALI chips.
> 
> Please see arch/sparc64/kernel/pci.c:ali_sound_dma_hack()
> and it's caller pci_dma_supported().

Could you explain the problem a little more?

The shift argument is only used as an offset when iommu-helper decides
whether a memory area (index plus npages) spanning LLD's segment
boudnary size or not.

For example, if a device's segment boudary size is 64K, the helper see
the following value is larger than 64K or not:

((the offset + index of the IOMMU table) ((64K / 8K) - 1) + npages
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: more iommu sg merging fallout

2008-02-19 Thread FUJITA Tomonori

On Sun, 17 Feb 2008 23:41:42 -0800 (PST)
David Miller [EMAIL PROTECTED] wrote:

 From: FUJITA Tomonori [EMAIL PROTECTED]
 Date: Sat, 16 Feb 2008 15:03:43 +0900

  [PATCH] sparc64: make IOMMU code respect the segment boundary limits

  Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]

 Looks good, but I think it will break sound for some ALI chips.

 Please see arch/sparc64/kernel/pci.c:ali_sound_dma_hack()
 and it's caller pci_dma_supported().

Could you explain the problem a little more?

The shift argument is only used as an offset when iommu-helper decides
whether a memory area (index plus npages) spanning LLD's segment
boudnary size or not.

For example, if a device's segment boudary size is 64K, the helper see
the following value is larger than 64K or not:

((the offset + index of the IOMMU table) ((64K / 8K) - 1) + npages
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] iommu-helper: segment boundary limit should be a power of 2

2008-02-16 Thread FUJITA Tomonori

The segment boundary limit should be a power of 2 so let's make it
clear.

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
---
 lib/iommu-helper.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
index 495575a..de0eced 100644
--- a/lib/iommu-helper.c
+++ b/lib/iommu-helper.c
@@ -54,6 +54,8 @@ unsigned long iommu_area_alloc(unsigned long *map, unsigned 
long size,
   unsigned long align_mask)
 {
unsigned long index;
+
+   BUG_ON(!is_power_of_2(boundary_size));
 again:
index = find_next_zero_area(map, size, start, nr, align_mask);
if (index != -1) {
-- 
1.5.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] iommu-helper: segment boundary limit should be a power of 2

2008-02-16 Thread FUJITA Tomonori

The segment boundary limit should be a power of 2 so let's make it
clear.

Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
---
 lib/iommu-helper.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
index 495575a..de0eced 100644
--- a/lib/iommu-helper.c
+++ b/lib/iommu-helper.c
@@ -54,6 +54,8 @@ unsigned long iommu_area_alloc(unsigned long *map, unsigned 
long size,
   unsigned long align_mask)
 {
unsigned long index;
+
+   BUG_ON(!is_power_of_2(boundary_size));
 again:
index = find_next_zero_area(map, size, start, nr, align_mask);
if (index != -1) {
-- 
1.5.3.7

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: more iommu sg merging fallout

2008-02-15 Thread FUJITA Tomonori

Sorry for the late response,

On Mon, 11 Feb 2008 21:40:36 -0800 (PST)
David Miller <[EMAIL PROTECTED]> wrote:

> From: FUJITA Tomonori <[EMAIL PROTECTED]>
> Date: Thu, 07 Feb 2008 10:38:01 +0900
> 
> > Great, thanks! SPARC IOMMUs use bitmap for the free area management
> > like POWER IOMMUs so it could use lib/iommu-helper as POWER does.
> 
> Please look at Linus's current tree, I believe I have things
> in a working state, and the DMA mask portions should be easier
> to add now.

Thanks! It looks great.

iommu->page_table_map_base is on IO_PAGE_SIZE boundary? If so, the
following patch works, I think (sorry, it's only compile tested
again).

Thanks,

=
>From 91af83802c30d071f98444846e85310a8d58ab3d Mon Sep 17 00:00:00 2001
From: FUJITA Tomonori <[EMAIL PROTECTED]>
Date: Sat, 16 Feb 2008 14:29:02 +0900
Subject: [PATCH] sparc64: make IOMMU code respect the segment boundary limits

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
---
 arch/sparc64/kernel/iommu.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/sparc64/kernel/iommu.c b/arch/sparc64/kernel/iommu.c
index d3276eb..ffefbf7 100644
--- a/arch/sparc64/kernel/iommu.c
+++ b/arch/sparc64/kernel/iommu.c
@@ -134,7 +134,8 @@ unsigned long iommu_range_alloc(struct device *dev,
else
boundary_size = ALIGN(1UL << 32, 1 << IO_PAGE_SHIFT);
 
-   n = iommu_area_alloc(arena->map, limit, start, npages, 0,
+   n = iommu_area_alloc(arena->map, limit, start, npages,
+iommu->page_table_map_base >> IO_PAGE_SHIFT,
 boundary_size >> IO_PAGE_SHIFT, 0);
if (n == -1) {
if (likely(pass < 1)) {
-- 
1.5.3.4


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: more iommu sg merging fallout

2008-02-15 Thread FUJITA Tomonori

Sorry for the late response,

On Mon, 11 Feb 2008 21:40:36 -0800 (PST)
David Miller [EMAIL PROTECTED] wrote:

 From: FUJITA Tomonori [EMAIL PROTECTED]
 Date: Thu, 07 Feb 2008 10:38:01 +0900
 
  Great, thanks! SPARC IOMMUs use bitmap for the free area management
  like POWER IOMMUs so it could use lib/iommu-helper as POWER does.
 
 Please look at Linus's current tree, I believe I have things
 in a working state, and the DMA mask portions should be easier
 to add now.

Thanks! It looks great.

iommu-page_table_map_base is on IO_PAGE_SIZE boundary? If so, the
following patch works, I think (sorry, it's only compile tested
again).

Thanks,

=
From 91af83802c30d071f98444846e85310a8d58ab3d Mon Sep 17 00:00:00 2001
From: FUJITA Tomonori [EMAIL PROTECTED]
Date: Sat, 16 Feb 2008 14:29:02 +0900
Subject: [PATCH] sparc64: make IOMMU code respect the segment boundary limits

Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
---
 arch/sparc64/kernel/iommu.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/sparc64/kernel/iommu.c b/arch/sparc64/kernel/iommu.c
index d3276eb..ffefbf7 100644
--- a/arch/sparc64/kernel/iommu.c
+++ b/arch/sparc64/kernel/iommu.c
@@ -134,7 +134,8 @@ unsigned long iommu_range_alloc(struct device *dev,
else
boundary_size = ALIGN(1UL  32, 1  IO_PAGE_SHIFT);
 
-   n = iommu_area_alloc(arena-map, limit, start, npages, 0,
+   n = iommu_area_alloc(arena-map, limit, start, npages,
+iommu-page_table_map_base  IO_PAGE_SHIFT,
 boundary_size  IO_PAGE_SHIFT, 0);
if (n == -1) {
if (likely(pass  1)) {
-- 
1.5.3.4


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [2.6.24 REGRESSION] BUG: Soft lockup - with VFS

2008-02-11 Thread FUJITA Tomonori

On Fri, 8 Feb 2008 23:46:19 -0800
Pete Zaitcev <[EMAIL PROTECTED]> wrote:

> On Tue, 5 Feb 2008 14:05:06 -0800, Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > > > http://students.zipernowsky.hu/~oliverp/kernel/regression_2624/
> 
> > I think ub.c is basically abandoned in favour of usb-storage.
> > If so, perhaps we should remove or disble ub.c?
> 
> Looks like it's just Tomo or Jens made a mistake when converting to
> the new s/g API. Nothing to be too concerned about. I know I should've
> reviewed their patch closer, but it seemed too simple...

I guess I can put the blame for this on Jens' commit (45711f1a) ;)

On a serious note, it seems that two scatter lists per request leaded
to this bug. Can the scatter list in struct ub_request be removed?

Thanks,

> -- Pete
> 
> Fix up the conversion to sg_init_table().
> 
> Signed-off-by: Pete Zaitcev <[EMAIL PROTECTED]>
> 
> --- a/drivers/block/ub.c
> +++ b/drivers/block/ub.c
> @@ -657,7 +657,6 @@ static int ub_request_fn_1(struct ub_lun *lun, struct 
> request *rq)
>   if ((cmd = ub_get_cmd(lun)) == NULL)
>   return -1;
>   memset(cmd, 0, sizeof(struct ub_scsi_cmd));
> - sg_init_table(cmd->sgv, UB_MAX_REQ_SG);
>  
>   blkdev_dequeue_request(rq);
>  
> @@ -668,6 +667,7 @@ static int ub_request_fn_1(struct ub_lun *lun, struct 
> request *rq)
>   /*
>* get scatterlist from block layer
>*/
> + sg_init_table(>sgv[0], UB_MAX_REQ_SG);
>   n_elem = blk_rq_map_sg(lun->disk->queue, rq, >sgv[0]);
>   if (n_elem < 0) {
>   /* Impossible, because blk_rq_map_sg should not hit ENOMEM. */
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [2.6.24 REGRESSION] BUG: Soft lockup - with VFS

2008-02-11 Thread FUJITA Tomonori

On Fri, 8 Feb 2008 23:46:19 -0800
Pete Zaitcev [EMAIL PROTECTED] wrote:

 On Tue, 5 Feb 2008 14:05:06 -0800, Andrew Morton [EMAIL PROTECTED] wrote:
 
http://students.zipernowsky.hu/~oliverp/kernel/regression_2624/
 
  I think ub.c is basically abandoned in favour of usb-storage.
  If so, perhaps we should remove or disble ub.c?
 
 Looks like it's just Tomo or Jens made a mistake when converting to
 the new s/g API. Nothing to be too concerned about. I know I should've
 reviewed their patch closer, but it seemed too simple...

I guess I can put the blame for this on Jens' commit (45711f1a) ;)

On a serious note, it seems that two scatter lists per request leaded
to this bug. Can the scatter list in struct ub_request be removed?

Thanks,

 -- Pete
 
 Fix up the conversion to sg_init_table().
 
 Signed-off-by: Pete Zaitcev [EMAIL PROTECTED]
 
 --- a/drivers/block/ub.c
 +++ b/drivers/block/ub.c
 @@ -657,7 +657,6 @@ static int ub_request_fn_1(struct ub_lun *lun, struct 
 request *rq)
   if ((cmd = ub_get_cmd(lun)) == NULL)
   return -1;
   memset(cmd, 0, sizeof(struct ub_scsi_cmd));
 - sg_init_table(cmd-sgv, UB_MAX_REQ_SG);
  
   blkdev_dequeue_request(rq);
  
 @@ -668,6 +667,7 @@ static int ub_request_fn_1(struct ub_lun *lun, struct 
 request *rq)
   /*
* get scatterlist from block layer
*/
 + sg_init_table(urq-sgv[0], UB_MAX_REQ_SG);
   n_elem = blk_rq_map_sg(lun-disk-queue, rq, urq-sgv[0]);
   if (n_elem  0) {
   /* Impossible, because blk_rq_map_sg should not hit ENOMEM. */
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [GIT PATCH] final SCSI updates for 2.6.24 merge window

2008-02-07 Thread FUJITA Tomonori

On Thu, 07 Feb 2008 19:37:07 -0600
James Bottomley <[EMAIL PROTECTED]> wrote:

> 
> On Thu, 2008-02-07 at 18:56 -0600, James Bottomley wrote:
> > Quite a bit of this is fixing things broken previously (the advansys fix
> > is still pending resolution, but I'll send it as an -rc fix when we have
> > it).  There's the final elimination of all drivers that are esp based
> > but don't use the scsi_esp core (that's mostly m68k and alpha).  Plus
> > the usual bunch of driver updates and the addition of a new enclosure
> > services driver and the corresponding ULD.
> 
> OK, the advansys fix came in.  I've added it to the patch.
> 
> James
> 
> 
> 
> >From f983323fea178352ed3b69c70561a13825a3ce59 Mon Sep 17 00:00:00 2001
> From: FUJITA Tomonori <[EMAIL PROTECTED]>
> Date: Fri, 8 Feb 2008 09:50:08 +0900
> Subject: [SCSI] advansys: fix overrun_buf aligned bug

Seems that it was a bit late, Linus pulled scsi-misc before the patch
was added.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Latest git oopses during boot

2008-02-07 Thread FUJITA Tomonori

On Thu, 7 Feb 2008 23:24:00 +0100
"Harald Arnesen" <[EMAIL PROTECTED]> wrote:

> On 2/7/08, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> >
> >
> > On Thu, 7 Feb 2008, Harald Arnesen wrote:
> > >
> > > I'll try applying the patch to a freshly downloaded git-tree.
> >
> > Ok, good.
> >
> > > Shall I try another compiler? I have at least these two:
> > >
> > > gcc version 3.4.6 (Ubuntu 3.4.6-6ubuntu2)
> > > gcc version 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)
> >
> > I would suggest a patch mis-application problem first (or possibly even
> > the patch itself being broken - I simply didn't look very closely at the
> > patch, but it *looked* ok).
> >
> > If it's a compiler bug, it's a pretty big one, and quite frankly, I doubt
> > it. Compiler bugs do happen, but they are pretty rare, and they tend to
> > have more subtle effects than the one you see.
> >
> > However:
> >
> > > in addition to the self-compiled 4.2.3 I used for the tests.
> >
> > 4.2.3? Really? That's pretty damn recent, and so almost totally untested.
> > That does make a compiler bug at least more likely.
> >
> > So yes, if you already have other compilers installed, you should try
> > them. If it really is a compiler bug, it's a really bad one, and you would
> > want to let the gcc people know.
> >
> > Still, I'd double-check that the
> >
> > asc_dvc_varp->overrun_buf = kzalloc(ASC_OVERRUN_BSIZE, GFP_KERNEL);
> >
> > line was added properly first. You should see it way after the point where
> > it did
> >
> > asc_dvc_varp = >dvc_var.asc_dvc_var;
> >
> > to initialize it (and both statements should be inside a
> >
> > if (ASC_NARROW_BOARD(boardp)) {
> >
> > conditional - please check that the source code looks sane too).
> >
> > Linus
> 
> I just re-downloaded an re-patched and re-compiled (with gcc 4.2.3),
> and now the kernel boots. I must have screwed up the previous
> patching.
> 
> It now works, with Fujita's patch applied.

Thanks Harald and Linus,

The bug has been in the advansys driver. 2.6.23 and 2.6.24 works just
because the size of Scsi_Host structure was multiples of 8. After
2.6.24, some patches change Scsi_Host structure and now the size is
not multiples of 8. So we hit this bug.


I'll resend the patch with a proper description.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Latest git oopses during boot

2008-02-07 Thread FUJITA Tomonori

On Thu, 7 Feb 2008 12:14:56 +0100
"Harald Arnesen" <[EMAIL PROTECTED]> wrote:

> On 2/7/08, Andrew Morton <[EMAIL PROTECTED]> wrote:
> >
> > (cc's restored, and expanded a bit)
> 
> Ah, sorry, not used to gmail's web interface. Clicked the wrong button.
> 
> > > Seems to be the advansys driver, so I tried to remove it - and indeed,
> > > the kernel now boots. So I guess it's either that driver or my ancient
> > > Nikon Coolscan II that is the only thing attached to the board.
> >
> > Thanks.  I uploaded the oops picture to
> > http://userweb.kernel.org/~akpm/oops.jpg
> >
> > > Cc to the Matthew Wilcox added.
> >
> > mm...  looks like all Matthew's changes were in 2.6.23.  And 2.6.23 worked
> > OK, yes?
> 
> Both 2.6.23 and 2.6.24 are ok.
> 
> > The only recent changes to drivers/scsi/advansys.c are
> >
> > commit b80ca4f7ee36c26d300c5a8f429e73372d153379
> > Author: FUJITA Tomonori <[EMAIL PROTECTED]>
> > Date:   Sun Jan 13 15:46:13 2008 +0900
> >
> > [SCSI] replace sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE
> >
> > This replaces sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE in
> > several LLDs. It's a preparation for the future changes to remove
> > sense_buffer array in scsi_cmnd structure.
> >
> > Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
> > Signed-off-by: James Bottomley <[EMAIL PROTECTED]>
> >
> > :100644 100644 9dd3952... 492702b... M  drivers/scsi/advansys.c
> >
> > commit 747d016e7e25e216b31022fe2b012508d99fb682
> > Author: Randy Dunlap <[EMAIL PROTECTED]>
> > Date:   Mon Jan 14 00:55:18 2008 -0800
> >
> > advansys: fix section mismatch warning
> >
> > Fix section mismatch warning:
> >
> > WARNING: vmlinux.o(.exit.text+0x152a): Section mismatch: reference to 
> > .init.
> >
> > Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
> > Cc: Matthew Wilcox <[EMAIL PROTECTED]>
> > Cc: James Bottomley <[EMAIL PROTECTED]>
> > Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
> > Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
> >
> > which seem fairly benign.
> >
> >
> > gcc inlining is going to make it rather a lot of work to find out which
> > statement has actually oopsed there.
> -- 

Can you try this?

Thanks,

diff --git a/drivers/scsi/advansys.c b/drivers/scsi/advansys.c
index 374ed02..f5dde12 100644
--- a/drivers/scsi/advansys.c
+++ b/drivers/scsi/advansys.c
@@ -566,7 +566,7 @@ typedef struct asc_dvc_var {
ASC_SCSI_BIT_ID_TYPE unit_not_ready;
ASC_SCSI_BIT_ID_TYPE queue_full_or_busy;
ASC_SCSI_BIT_ID_TYPE start_motor;
-   uchar overrun_buf[ASC_OVERRUN_BSIZE] __aligned(8);
+   uchar *overrun_buf;
dma_addr_t overrun_dma;
uchar scsi_reset_wait;
uchar chip_no;
@@ -13833,6 +13833,12 @@ static int __devinit advansys_board_found(struct 
Scsi_Host *shost,
 */
if (ASC_NARROW_BOARD(boardp)) {
ASC_DBG(2, "AscInitAsc1000Driver()\n");
+
+   asc_dvc_varp->overrun_buf = kzalloc(ASC_OVERRUN_BSIZE, 
GFP_KERNEL);
+   if (!asc_dvc_varp->overrun_buf) {
+   ret = -ENOMEM;
+   goto err_free_wide_mem;
+   }
warn_code = AscInitAsc1000Driver(asc_dvc_varp);
 
if (warn_code || asc_dvc_varp->err_code) {
@@ -13840,8 +13846,10 @@ static int __devinit advansys_board_found(struct 
Scsi_Host *shost,
"warn 0x%x, error 0x%x\n",
asc_dvc_varp->init_state, warn_code,
asc_dvc_varp->err_code);
-   if (asc_dvc_varp->err_code)
+   if (asc_dvc_varp->err_code) {
ret = -ENODEV;
+   kfree(asc_dvc_varp->overrun_buf);
+   }
}
} else {
if (advansys_wide_init_chip(shost))
@@ -13891,9 +13899,11 @@ static int advansys_release(struct Scsi_Host *shost)
free_dma(shost->dma_channel);
}
if (ASC_NARROW_BOARD(board)) {
+   ASC_DVC_VAR *asc_dvc_varp = >dvc_var.asc_dvc_var;
dma_unmap_single(board->dev,
board->dvc_var.asc_dvc_var.overrun_dma,
ASC_OVERRUN_BSIZE, DMA_FROM_DEVICE);
+   kfree(asc_dvc_varp->overrun_buf);
} else {
iounmap(board->ioremap_addr);
advansys_wide_free_mem(board);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Latest git oopses during boot

2008-02-07 Thread FUJITA Tomonori

On Thu, 7 Feb 2008 12:14:56 +0100
Harald Arnesen [EMAIL PROTECTED] wrote:

 On 2/7/08, Andrew Morton [EMAIL PROTECTED] wrote:
 
  (cc's restored, and expanded a bit)
 
 Ah, sorry, not used to gmail's web interface. Clicked the wrong button.
 
   Seems to be the advansys driver, so I tried to remove it - and indeed,
   the kernel now boots. So I guess it's either that driver or my ancient
   Nikon Coolscan II that is the only thing attached to the board.
 
  Thanks.  I uploaded the oops picture to
  http://userweb.kernel.org/~akpm/oops.jpg
 
   Cc to the Matthew Wilcox added.
 
  mm...  looks like all Matthew's changes were in 2.6.23.  And 2.6.23 worked
  OK, yes?
 
 Both 2.6.23 and 2.6.24 are ok.
 
  The only recent changes to drivers/scsi/advansys.c are
 
  commit b80ca4f7ee36c26d300c5a8f429e73372d153379
  Author: FUJITA Tomonori [EMAIL PROTECTED]
  Date:   Sun Jan 13 15:46:13 2008 +0900
 
  [SCSI] replace sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE
 
  This replaces sizeof sense_buffer with SCSI_SENSE_BUFFERSIZE in
  several LLDs. It's a preparation for the future changes to remove
  sense_buffer array in scsi_cmnd structure.
 
  Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
  Signed-off-by: James Bottomley [EMAIL PROTECTED]
 
  :100644 100644 9dd3952... 492702b... M  drivers/scsi/advansys.c
 
  commit 747d016e7e25e216b31022fe2b012508d99fb682
  Author: Randy Dunlap [EMAIL PROTECTED]
  Date:   Mon Jan 14 00:55:18 2008 -0800
 
  advansys: fix section mismatch warning
 
  Fix section mismatch warning:
 
  WARNING: vmlinux.o(.exit.text+0x152a): Section mismatch: reference to 
  .init.
 
  Signed-off-by: Randy Dunlap [EMAIL PROTECTED]
  Cc: Matthew Wilcox [EMAIL PROTECTED]
  Cc: James Bottomley [EMAIL PROTECTED]
  Signed-off-by: Andrew Morton [EMAIL PROTECTED]
  Signed-off-by: Linus Torvalds [EMAIL PROTECTED]
 
  which seem fairly benign.
 
 
  gcc inlining is going to make it rather a lot of work to find out which
  statement has actually oopsed there.
 -- 

Can you try this?

Thanks,

diff --git a/drivers/scsi/advansys.c b/drivers/scsi/advansys.c
index 374ed02..f5dde12 100644
--- a/drivers/scsi/advansys.c
+++ b/drivers/scsi/advansys.c
@@ -566,7 +566,7 @@ typedef struct asc_dvc_var {
ASC_SCSI_BIT_ID_TYPE unit_not_ready;
ASC_SCSI_BIT_ID_TYPE queue_full_or_busy;
ASC_SCSI_BIT_ID_TYPE start_motor;
-   uchar overrun_buf[ASC_OVERRUN_BSIZE] __aligned(8);
+   uchar *overrun_buf;
dma_addr_t overrun_dma;
uchar scsi_reset_wait;
uchar chip_no;
@@ -13833,6 +13833,12 @@ static int __devinit advansys_board_found(struct 
Scsi_Host *shost,
 */
if (ASC_NARROW_BOARD(boardp)) {
ASC_DBG(2, AscInitAsc1000Driver()\n);
+
+   asc_dvc_varp-overrun_buf = kzalloc(ASC_OVERRUN_BSIZE, 
GFP_KERNEL);
+   if (!asc_dvc_varp-overrun_buf) {
+   ret = -ENOMEM;
+   goto err_free_wide_mem;
+   }
warn_code = AscInitAsc1000Driver(asc_dvc_varp);
 
if (warn_code || asc_dvc_varp-err_code) {
@@ -13840,8 +13846,10 @@ static int __devinit advansys_board_found(struct 
Scsi_Host *shost,
warn 0x%x, error 0x%x\n,
asc_dvc_varp-init_state, warn_code,
asc_dvc_varp-err_code);
-   if (asc_dvc_varp-err_code)
+   if (asc_dvc_varp-err_code) {
ret = -ENODEV;
+   kfree(asc_dvc_varp-overrun_buf);
+   }
}
} else {
if (advansys_wide_init_chip(shost))
@@ -13891,9 +13899,11 @@ static int advansys_release(struct Scsi_Host *shost)
free_dma(shost-dma_channel);
}
if (ASC_NARROW_BOARD(board)) {
+   ASC_DVC_VAR *asc_dvc_varp = board-dvc_var.asc_dvc_var;
dma_unmap_single(board-dev,
board-dvc_var.asc_dvc_var.overrun_dma,
ASC_OVERRUN_BSIZE, DMA_FROM_DEVICE);
+   kfree(asc_dvc_varp-overrun_buf);
} else {
iounmap(board-ioremap_addr);
advansys_wide_free_mem(board);
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Latest git oopses during boot

2008-02-07 Thread FUJITA Tomonori

On Thu, 7 Feb 2008 23:24:00 +0100
Harald Arnesen [EMAIL PROTECTED] wrote:

 On 2/7/08, Linus Torvalds [EMAIL PROTECTED] wrote:
 
 
  On Thu, 7 Feb 2008, Harald Arnesen wrote:
  
   I'll try applying the patch to a freshly downloaded git-tree.
 
  Ok, good.
 
   Shall I try another compiler? I have at least these two:
  
   gcc version 3.4.6 (Ubuntu 3.4.6-6ubuntu2)
   gcc version 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)
 
  I would suggest a patch mis-application problem first (or possibly even
  the patch itself being broken - I simply didn't look very closely at the
  patch, but it *looked* ok).
 
  If it's a compiler bug, it's a pretty big one, and quite frankly, I doubt
  it. Compiler bugs do happen, but they are pretty rare, and they tend to
  have more subtle effects than the one you see.
 
  However:
 
   in addition to the self-compiled 4.2.3 I used for the tests.
 
  4.2.3? Really? That's pretty damn recent, and so almost totally untested.
  That does make a compiler bug at least more likely.
 
  So yes, if you already have other compilers installed, you should try
  them. If it really is a compiler bug, it's a really bad one, and you would
  want to let the gcc people know.
 
  Still, I'd double-check that the
 
  asc_dvc_varp-overrun_buf = kzalloc(ASC_OVERRUN_BSIZE, GFP_KERNEL);
 
  line was added properly first. You should see it way after the point where
  it did
 
  asc_dvc_varp = boardp-dvc_var.asc_dvc_var;
 
  to initialize it (and both statements should be inside a
 
  if (ASC_NARROW_BOARD(boardp)) {
 
  conditional - please check that the source code looks sane too).
 
  Linus
 
 I just re-downloaded an re-patched and re-compiled (with gcc 4.2.3),
 and now the kernel boots. I must have screwed up the previous
 patching.
 
 It now works, with Fujita's patch applied.

Thanks Harald and Linus,

The bug has been in the advansys driver. 2.6.23 and 2.6.24 works just
because the size of Scsi_Host structure was multiples of 8. After
2.6.24, some patches change Scsi_Host structure and now the size is
not multiples of 8. So we hit this bug.


I'll resend the patch with a proper description.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [GIT PATCH] final SCSI updates for 2.6.24 merge window

2008-02-07 Thread FUJITA Tomonori

On Thu, 07 Feb 2008 19:37:07 -0600
James Bottomley [EMAIL PROTECTED] wrote:

 
 On Thu, 2008-02-07 at 18:56 -0600, James Bottomley wrote:
  Quite a bit of this is fixing things broken previously (the advansys fix
  is still pending resolution, but I'll send it as an -rc fix when we have
  it).  There's the final elimination of all drivers that are esp based
  but don't use the scsi_esp core (that's mostly m68k and alpha).  Plus
  the usual bunch of driver updates and the addition of a new enclosure
  services driver and the corresponding ULD.
 
 OK, the advansys fix came in.  I've added it to the patch.
 
 James
 
 
 
 From f983323fea178352ed3b69c70561a13825a3ce59 Mon Sep 17 00:00:00 2001
 From: FUJITA Tomonori [EMAIL PROTECTED]
 Date: Fri, 8 Feb 2008 09:50:08 +0900
 Subject: [SCSI] advansys: fix overrun_buf aligned bug

Seems that it was a bit late, Linus pulled scsi-misc before the patch
was added.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: more iommu sg merging fallout

2008-02-06 Thread FUJITA Tomonori

On Wed, 06 Feb 2008 16:01:47 -0800 (PST)
David Miller <[EMAIL PROTECTED]> wrote:

> From: FUJITA Tomonori <[EMAIL PROTECTED]>
> Date: Thu, 07 Feb 2008 08:53:33 +0900
> 
> > On Wed, 06 Feb 2008 15:18:55 -0800 (PST)
> > David Miller <[EMAIL PROTECTED]> wrote:
> > 
> > > I intend to put merging back in, perhaps something similar to
> > > powerpc's merging logic but without the expensive (in my opinion)
> > > IOMMU allocation every loop.  I think it is better to determine the
> > > segment breaks in one pass, allocate that many IOMMU entries in one
> > > allocation, then fill them all in.
> > 
> > I thought about asking you if I can modify the SPARC IOMMUs to do
> > allocation in every loop.
> > 
> > The reason why I need the allocation in every loop is that I also need
> > to fix the problem that IOMMUs allocate memory areas without
> > considering a low level driver's segment boundary limits.
> > 
> > http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-11/msg07616.html
> > http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-12/msg02286.html
> > 
> > As far as I know, all the IOMMUs except for SPARC allocate a free area
> > in every loop but if it's too expensive for SPARC, then we need to
> > find a different way to handle segment boundary limits.
> 
> Ok then what I'll do is adopt some version of powerpc's merging
> allocator into the sparc64 code.

Great, thanks! SPARC IOMMUs use bitmap for the free area management
like POWER IOMMUs so it could use lib/iommu-helper as POWER does.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: more iommu sg merging fallout

2008-02-06 Thread FUJITA Tomonori

On Wed, 06 Feb 2008 15:18:55 -0800 (PST)
David Miller <[EMAIL PROTECTED]> wrote:

> From: FUJITA Tomonori <[EMAIL PROTECTED]>
> Date: Thu, 07 Feb 2008 08:12:36 +0900
> 
> > Really sorry about it.
> 
> I am happy to test patches you send to me in the future :-)

Thanks a lot.


> > PARISC, Alpha, and IA64 IOMMUs use the two-pass algorithm like SPARC
> > but their first pass decides how to merge sg entires (and stores that
> > information in the sg entries), then the second pass simpliy follows
> > it (Hopefully I understand these IOMMUs correctly, or else I break
> > them too).
> 
> For now I've removed all of the merging code from the sparc64 IOMMU
> support so that other users do not get corrupt filesystems.  It
> basically mimicks how the intel-iommu code works, ie. no attempts to
> merge anything.

I've just saw it.


> I intend to put merging back in, perhaps something similar to
> powerpc's merging logic but without the expensive (in my opinion)
> IOMMU allocation every loop.  I think it is better to determine the
> segment breaks in one pass, allocate that many IOMMU entries in one
> allocation, then fill them all in.

I thought about asking you if I can modify the SPARC IOMMUs to do
allocation in every loop.

The reason why I need the allocation in every loop is that I also need
to fix the problem that IOMMUs allocate memory areas without
considering a low level driver's segment boundary limits.

http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-11/msg07616.html
http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-12/msg02286.html

As far as I know, all the IOMMUs except for SPARC allocate a free area
in every loop but if it's too expensive for SPARC, then we need to
find a different way to handle segment boundary limits.


> Ideally, we should have some generic code that does all of this.
> Then you would only need to test one implementation.
> 
> It is definitely doable and increasingly necessary as we have so
> many reimplementations of what is essentially identical core code.

Agreed though it's a very hard task.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Ordering of commits and breaking bisecting

2008-02-06 Thread FUJITA Tomonori

On Wed, 6 Feb 2008 09:38:17 +1100
Stephen Rothwell <[EMAIL PROTECTED]> wrote:

> Hi all,
> 
> Can we be a bit more careful with the ordering of commits, please?
> Commit d22a6966b8029913fac37d078ab2403898d94c63 (iommu sg merging: add
> accessors for segment_boundary_mask in device_dma_parameters())
> introduces the dma_get_seg_boundary accessor which is used in the
> *preceding* commits fb3475e9b6bfa666107512fbd6006c26014f04b8 (for
> powerpc), and 1b39b077789955c8389488d53d075518fdcd582e and
> fde9a1094ddf2892188a8a0eccda527de47cba8e (for x86).
> 
> Thus we have another set of commits that will break bisection.  This one
> was easily avoidable.
> 
> A little more care is needed.

Sorry about it.

I submitted the patches in the proper order but the order seemed to be
reversed in the -mm.

http://www.mail-archive.com/[EMAIL PROTECTED]/msg11919.html

I could have noticed the issue since it has been in the -mm for a long
time, sorry.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: more iommu sg merging fallout

2008-02-06 Thread FUJITA Tomonori

On Tue, 05 Feb 2008 20:41:38 -0800 (PST)
David Miller <[EMAIL PROTECTED]> wrote:

> 
> The sparc64 change:
> 
> commit fde6a3c82d67f592eb587be4d1b0ae6d4321
> Author: FUJITA Tomonori <[EMAIL PROTECTED]>
> Date:   Mon Feb 4 22:28:02 2008 -0800
> 
> iommu sg merging: sparc64: make iommu respect the segment size limits
> 
> This patch makes iommu respect segment size limits when merging sg
>     lists.
> 
> Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
> Cc: Jeff Garzik <[EMAIL PROTECTED]>
> Cc: James Bottomley <[EMAIL PROTECTED]>
> Acked-by: Jens Axboe <[EMAIL PROTECTED]>
> Cc: "David S. Miller" <[EMAIL PROTECTED]>
> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
> Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
> 
> has significant errors and is going to eat people's disks, as it just
> nearly did to mine.

Really sorry about it.


> Typically what you'll see are NULL pointer derefernces in
> dma_4v_map_sg() and dma_4u_map_sg() and then the kernel usually craps
> on your superblock very shortly thereafter.
> 
> The changeset above modified only prepare_sg() but that is only the
> first pass of the SG mapping algorithm of the sparc64 IOMMU layer.
> 
> The second pass that fills in the entries depends upon how the first
> pass does things.  So if you change the first pass decision making you
> have to update the second pass's as well.
>
> That second pass is implemented in fill_sg() (there is a version in
> both arch/sparc64/kernel/iommu.c and arch/sparc64/kernel/pci_sun4v.c),
> which probably needs new logic as was added to prepare_sg() to handle
> dma_get_max_seg_size().

PARISC, Alpha, and IA64 IOMMUs use the two-pass algorithm like SPARC
but their first pass decides how to merge sg entires (and stores that
information in the sg entries), then the second pass simpliy follows
it (Hopefully I understand these IOMMUs correctly, or else I break
them too).

I'll try this work again for 2.6.26. The SPARC IOMMUs are the most
complicated and seems that few people tests the -mm SPARC code. So I
guess that I need to find a SPARC box...

Thanks,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: more iommu sg merging fallout

2008-02-06 Thread FUJITA Tomonori

On Tue, 05 Feb 2008 20:41:38 -0800 (PST)
David Miller [EMAIL PROTECTED] wrote:

 
 The sparc64 change:
 
 commit fde6a3c82d67f592eb587be4d1b0ae6d4321
 Author: FUJITA Tomonori [EMAIL PROTECTED]
 Date:   Mon Feb 4 22:28:02 2008 -0800
 
 iommu sg merging: sparc64: make iommu respect the segment size limits
 
 This patch makes iommu respect segment size limits when merging sg
 lists.
 
 Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
 Cc: Jeff Garzik [EMAIL PROTECTED]
 Cc: James Bottomley [EMAIL PROTECTED]
 Acked-by: Jens Axboe [EMAIL PROTECTED]
 Cc: David S. Miller [EMAIL PROTECTED]
 Signed-off-by: Andrew Morton [EMAIL PROTECTED]
 Signed-off-by: Linus Torvalds [EMAIL PROTECTED]
 
 has significant errors and is going to eat people's disks, as it just
 nearly did to mine.

Really sorry about it.


 Typically what you'll see are NULL pointer derefernces in
 dma_4v_map_sg() and dma_4u_map_sg() and then the kernel usually craps
 on your superblock very shortly thereafter.
 
 The changeset above modified only prepare_sg() but that is only the
 first pass of the SG mapping algorithm of the sparc64 IOMMU layer.
 
 The second pass that fills in the entries depends upon how the first
 pass does things.  So if you change the first pass decision making you
 have to update the second pass's as well.

 That second pass is implemented in fill_sg() (there is a version in
 both arch/sparc64/kernel/iommu.c and arch/sparc64/kernel/pci_sun4v.c),
 which probably needs new logic as was added to prepare_sg() to handle
 dma_get_max_seg_size().

PARISC, Alpha, and IA64 IOMMUs use the two-pass algorithm like SPARC
but their first pass decides how to merge sg entires (and stores that
information in the sg entries), then the second pass simpliy follows
it (Hopefully I understand these IOMMUs correctly, or else I break
them too).

I'll try this work again for 2.6.26. The SPARC IOMMUs are the most
complicated and seems that few people tests the -mm SPARC code. So I
guess that I need to find a SPARC box...

Thanks,
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Ordering of commits and breaking bisecting

2008-02-06 Thread FUJITA Tomonori

On Wed, 6 Feb 2008 09:38:17 +1100
Stephen Rothwell [EMAIL PROTECTED] wrote:

 Hi all,
 
 Can we be a bit more careful with the ordering of commits, please?
 Commit d22a6966b8029913fac37d078ab2403898d94c63 (iommu sg merging: add
 accessors for segment_boundary_mask in device_dma_parameters())
 introduces the dma_get_seg_boundary accessor which is used in the
 *preceding* commits fb3475e9b6bfa666107512fbd6006c26014f04b8 (for
 powerpc), and 1b39b077789955c8389488d53d075518fdcd582e and
 fde9a1094ddf2892188a8a0eccda527de47cba8e (for x86).
 
 Thus we have another set of commits that will break bisection.  This one
 was easily avoidable.
 
 A little more care is needed.

Sorry about it.

I submitted the patches in the proper order but the order seemed to be
reversed in the -mm.

http://www.mail-archive.com/[EMAIL PROTECTED]/msg11919.html

I could have noticed the issue since it has been in the -mm for a long
time, sorry.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: more iommu sg merging fallout

2008-02-06 Thread FUJITA Tomonori

On Wed, 06 Feb 2008 16:01:47 -0800 (PST)
David Miller [EMAIL PROTECTED] wrote:

 From: FUJITA Tomonori [EMAIL PROTECTED]
 Date: Thu, 07 Feb 2008 08:53:33 +0900

  On Wed, 06 Feb 2008 15:18:55 -0800 (PST)
  David Miller [EMAIL PROTECTED] wrote:

   I intend to put merging back in, perhaps something similar to
   powerpc's merging logic but without the expensive (in my opinion)
   IOMMU allocation every loop.  I think it is better to determine the
   segment breaks in one pass, allocate that many IOMMU entries in one
   allocation, then fill them all in.

  I thought about asking you if I can modify the SPARC IOMMUs to do
  allocation in every loop.

  The reason why I need the allocation in every loop is that I also need
  to fix the problem that IOMMUs allocate memory areas without
  considering a low level driver's segment boundary limits.

  http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-11/msg07616.html
  http://linux.derkeiler.com/Mailing-Lists/Kernel/2007-12/msg02286.html

  As far as I know, all the IOMMUs except for SPARC allocate a free area
  in every loop but if it's too expensive for SPARC, then we need to
  find a different way to handle segment boundary limits.

 Ok then what I'll do is adopt some version of powerpc's merging
 allocator into the sparc64 code.

Great, thanks! SPARC IOMMUs use bitmap for the free area management
like POWER IOMMUs so it could use lib/iommu-helper as POWER does.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori

On Tue, 05 Feb 2008 18:09:15 +0100
Matteo Tescione <[EMAIL PROTECTED]> wrote:

> On 5-02-2008 14:38, "FUJITA Tomonori" <[EMAIL PROTECTED]> wrote:
> 
> > On Tue, 05 Feb 2008 08:14:01 +0100
> > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > 
> >> James Bottomley schrieb:
> >> 
> >>> These are both features being independently worked on, are they not?
> >>> Even if they weren't, the combination of the size of SCST in kernel plus
> >>> the problem of having to find a migration path for the current STGT
> >>> users still looks to me to involve the greater amount of work.
> >> 
> >> I don't want to be mean, but does anyone actually use STGT in
> >> production? Seriously?
> >> 
> >> In the latest development version of STGT, it's only possible to stop
> >> the tgtd target daemon using KILL / 9 signal - which also means all
> >> iSCSI initiator connections are corrupted when tgtd target daemon is
> >> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> > 
> > I don't know what "iSCSI initiator connections are corrupted"
> > mean. But if you reboot a server, how can an iSCSI target
> > implementation keep iSCSI tcp connections?
> > 
> > 
> >> Imagine you have to reboot all your NFS clients when you reboot your NFS
> >> server. Not only that - your data is probably corrupted, or at least the
> >> filesystem deserves checking...
> 
> Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat)
> rebooting the primary server doesn't affect my iscsi traffic, SCST correctly
> manages stop/crash, by sending unit attention to clients on reconnect.
> Drbd+heartbeat correctly manages those things too.
> Still from an end-user POV, i was able to reboot/survive a crash only with
> SCST, IETD still has reconnect problems and STGT are even worst.

Please tell us on stgt-devel mailing list if you see problems. We will
try to fix them.

Thanks,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori

On Tue, 05 Feb 2008 17:07:07 +0100
Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:

> FUJITA Tomonori schrieb:
> > On Tue, 05 Feb 2008 08:14:01 +0100
> > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > 
> >> James Bottomley schrieb:
> >>
> >>> These are both features being independently worked on, are they not?
> >>> Even if they weren't, the combination of the size of SCST in kernel plus
> >>> the problem of having to find a migration path for the current STGT
> >>> users still looks to me to involve the greater amount of work.
> >> I don't want to be mean, but does anyone actually use STGT in
> >> production? Seriously?
> >>
> >> In the latest development version of STGT, it's only possible to stop
> >> the tgtd target daemon using KILL / 9 signal - which also means all
> >> iSCSI initiator connections are corrupted when tgtd target daemon is
> >> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> > 
> > I don't know what "iSCSI initiator connections are corrupted"
> > mean. But if you reboot a server, how can an iSCSI target
> > implementation keep iSCSI tcp connections?
> 
> The problem with tgtd is that you can't start it (configured) in an
> "atomic" way.
> Usually, one will start tgtd and it's configuration in a script (I 
> replaced some parameters with "..." to make it shorter and more readable):

Thanks for the details. So the way to stop the daemon is not related
with your problem.

It's easily fixable. Can you start a new thread about this on
stgt-devel mailing list? When we agree on the interface to start the
daemon, I'll implement it.


> tgtd
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...

(snip)

> So the only way to start/restart tgtd reliably is to do hacks which are 
> needed with yet another iSCSI kernel implementation (IET): use iptables.
> 
> iptables 
> tgtd
> sleep 1
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...
> iptables 
> 
> 
> A bit ugly, isn't it?
> Having to tinker with a firewall in order to start a daemon is by no 
> means a sign of a well-tested and mature project.
> 
> That's why I asked how many people use stgt in a production environment 
> - James was worried about a potential migration path for current users.

I don't know how many people use stgt in a production environment but
I'm not sure that this problem prevents many people from using it in a
production environment.

You want to reboot a server running target devices while initiators
connect to it. Rebooting the target server behind the initiators
seldom works. System adminstorators in my workplace reboot storage
devices once a year and tell us to shut down the initiator machines
that use them before that.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori

On Mon, 4 Feb 2008 20:07:01 -0600
"Chris Weiss" <[EMAIL PROTECTED]> wrote:

> On Feb 4, 2008 11:30 AM, Douglas Gilbert <[EMAIL PROTECTED]> wrote:
> > Alan Cox wrote:
> > >> better. So for example, I personally suspect that ATA-over-ethernet is 
> > >> way
> > >> better than some crazy SCSI-over-TCP crap, but I'm biased for simple and
> > >> low-level, and against those crazy SCSI people to begin with.
> > >
> > > Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
> > > would probably trash iSCSI for latency if nothing else.
> >
> > And a variant that doesn't do ATA or IP:
> > http://www.fcoe.com/
> >
> 
> however, and interestingly enough, the open-fcoe software target
> depends on scst (for now anyway)

STGT also supports software FCoE target driver though it's still
experimental stuff.

http://www.mail-archive.com/[EMAIL PROTECTED]/msg12705.html

It works in user space like STGT's iSCSI (and iSER) target driver
(i.e. no kernel/user space interaction).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori

On Tue, 05 Feb 2008 05:43:10 +0100
Matteo Tescione <[EMAIL PROTECTED]> wrote:

> Hi all,
> And sorry for intrusion, i am not a developer but i work everyday with iscsi
> and i found it fantastic.
> Altough Aoe, Fcoe and so on could be better, we have to look in real world
> implementations what is needed *now*, and if we look at vmware world,
> virtual iron, microsoft clustering etc, the answer is iSCSI.
> And now, SCST is the best open-source iSCSI target. So, from an end-user
> point of view, what are the really problems to not integrate scst in the
> mainstream kernel?

Currently, the best open-source iSCSI target implemenation in Linux is
Nicholas's LIO, I guess.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori

On Tue, 05 Feb 2008 08:14:01 +0100
Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:

> James Bottomley schrieb:
> 
> > These are both features being independently worked on, are they not?
> > Even if they weren't, the combination of the size of SCST in kernel plus
> > the problem of having to find a migration path for the current STGT
> > users still looks to me to involve the greater amount of work.
> 
> I don't want to be mean, but does anyone actually use STGT in
> production? Seriously?
> 
> In the latest development version of STGT, it's only possible to stop
> the tgtd target daemon using KILL / 9 signal - which also means all
> iSCSI initiator connections are corrupted when tgtd target daemon is
> started again (kernel upgrade, target daemon upgrade, server reboot etc.).

I don't know what "iSCSI initiator connections are corrupted"
mean. But if you reboot a server, how can an iSCSI target
implementation keep iSCSI tcp connections?


> Imagine you have to reboot all your NFS clients when you reboot your NFS
> server. Not only that - your data is probably corrupted, or at least the
> filesystem deserves checking...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori

On Tue, 05 Feb 2008 05:43:10 +0100
Matteo Tescione [EMAIL PROTECTED] wrote:

 Hi all,
 And sorry for intrusion, i am not a developer but i work everyday with iscsi
 and i found it fantastic.
 Altough Aoe, Fcoe and so on could be better, we have to look in real world
 implementations what is needed *now*, and if we look at vmware world,
 virtual iron, microsoft clustering etc, the answer is iSCSI.
 And now, SCST is the best open-source iSCSI target. So, from an end-user
 point of view, what are the really problems to not integrate scst in the
 mainstream kernel?

Currently, the best open-source iSCSI target implemenation in Linux is
Nicholas's LIO, I guess.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori

On Tue, 05 Feb 2008 08:14:01 +0100
Tomasz Chmielewski [EMAIL PROTECTED] wrote:

 James Bottomley schrieb:
 
  These are both features being independently worked on, are they not?
  Even if they weren't, the combination of the size of SCST in kernel plus
  the problem of having to find a migration path for the current STGT
  users still looks to me to involve the greater amount of work.
 
 I don't want to be mean, but does anyone actually use STGT in
 production? Seriously?
 
 In the latest development version of STGT, it's only possible to stop
 the tgtd target daemon using KILL / 9 signal - which also means all
 iSCSI initiator connections are corrupted when tgtd target daemon is
 started again (kernel upgrade, target daemon upgrade, server reboot etc.).

I don't know what iSCSI initiator connections are corrupted
mean. But if you reboot a server, how can an iSCSI target
implementation keep iSCSI tcp connections?


 Imagine you have to reboot all your NFS clients when you reboot your NFS
 server. Not only that - your data is probably corrupted, or at least the
 filesystem deserves checking...
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori

On Mon, 4 Feb 2008 20:07:01 -0600
Chris Weiss [EMAIL PROTECTED] wrote:

 On Feb 4, 2008 11:30 AM, Douglas Gilbert [EMAIL PROTECTED] wrote:
  Alan Cox wrote:
   better. So for example, I personally suspect that ATA-over-ethernet is 
   way
   better than some crazy SCSI-over-TCP crap, but I'm biased for simple and
   low-level, and against those crazy SCSI people to begin with.
  
   Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
   would probably trash iSCSI for latency if nothing else.
 
  And a variant that doesn't do ATA or IP:
  http://www.fcoe.com/
 
 
 however, and interestingly enough, the open-fcoe software target
 depends on scst (for now anyway)

STGT also supports software FCoE target driver though it's still
experimental stuff.

http://www.mail-archive.com/[EMAIL PROTECTED]/msg12705.html

It works in user space like STGT's iSCSI (and iSER) target driver
(i.e. no kernel/user space interaction).
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori

On Tue, 05 Feb 2008 17:07:07 +0100
Tomasz Chmielewski [EMAIL PROTECTED] wrote:

 FUJITA Tomonori schrieb:
  On Tue, 05 Feb 2008 08:14:01 +0100
  Tomasz Chmielewski [EMAIL PROTECTED] wrote:
  
  James Bottomley schrieb:
 
  These are both features being independently worked on, are they not?
  Even if they weren't, the combination of the size of SCST in kernel plus
  the problem of having to find a migration path for the current STGT
  users still looks to me to involve the greater amount of work.
  I don't want to be mean, but does anyone actually use STGT in
  production? Seriously?
 
  In the latest development version of STGT, it's only possible to stop
  the tgtd target daemon using KILL / 9 signal - which also means all
  iSCSI initiator connections are corrupted when tgtd target daemon is
  started again (kernel upgrade, target daemon upgrade, server reboot etc.).
  
  I don't know what iSCSI initiator connections are corrupted
  mean. But if you reboot a server, how can an iSCSI target
  implementation keep iSCSI tcp connections?
 
 The problem with tgtd is that you can't start it (configured) in an
 atomic way.
 Usually, one will start tgtd and it's configuration in a script (I 
 replaced some parameters with ... to make it shorter and more readable):

Thanks for the details. So the way to stop the daemon is not related
with your problem.

It's easily fixable. Can you start a new thread about this on
stgt-devel mailing list? When we agree on the interface to start the
daemon, I'll implement it.


 tgtd
 tgtadm --op new ...
 tgtadm --lld iscsi --op new ...

(snip)

 So the only way to start/restart tgtd reliably is to do hacks which are 
 needed with yet another iSCSI kernel implementation (IET): use iptables.
 
 iptables block iSCSI traffic
 tgtd
 sleep 1
 tgtadm --op new ...
 tgtadm --lld iscsi --op new ...
 iptables unblock iSCSI traffic
 
 
 A bit ugly, isn't it?
 Having to tinker with a firewall in order to start a daemon is by no 
 means a sign of a well-tested and mature project.
 
 That's why I asked how many people use stgt in a production environment 
 - James was worried about a potential migration path for current users.

I don't know how many people use stgt in a production environment but
I'm not sure that this problem prevents many people from using it in a
production environment.

You want to reboot a server running target devices while initiators
connect to it. Rebooting the target server behind the initiators
seldom works. System adminstorators in my workplace reboot storage
devices once a year and tell us to shut down the initiator machines
that use them before that.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori

On Tue, 05 Feb 2008 18:09:15 +0100
Matteo Tescione [EMAIL PROTECTED] wrote:

 On 5-02-2008 14:38, FUJITA Tomonori [EMAIL PROTECTED] wrote:
 
  On Tue, 05 Feb 2008 08:14:01 +0100
  Tomasz Chmielewski [EMAIL PROTECTED] wrote:
  
  James Bottomley schrieb:
  
  These are both features being independently worked on, are they not?
  Even if they weren't, the combination of the size of SCST in kernel plus
  the problem of having to find a migration path for the current STGT
  users still looks to me to involve the greater amount of work.
  
  I don't want to be mean, but does anyone actually use STGT in
  production? Seriously?
  
  In the latest development version of STGT, it's only possible to stop
  the tgtd target daemon using KILL / 9 signal - which also means all
  iSCSI initiator connections are corrupted when tgtd target daemon is
  started again (kernel upgrade, target daemon upgrade, server reboot etc.).
  
  I don't know what iSCSI initiator connections are corrupted
  mean. But if you reboot a server, how can an iSCSI target
  implementation keep iSCSI tcp connections?
  
  
  Imagine you have to reboot all your NFS clients when you reboot your NFS
  server. Not only that - your data is probably corrupted, or at least the
  filesystem deserves checking...
 
 Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat)
 rebooting the primary server doesn't affect my iscsi traffic, SCST correctly
 manages stop/crash, by sending unit attention to clients on reconnect.
 Drbd+heartbeat correctly manages those things too.
 Still from an end-user POV, i was able to reboot/survive a crash only with
 SCST, IETD still has reconnect problems and STGT are even worst.

Please tell us on stgt-devel mailing list if you see problems. We will
try to fix them.

Thanks,
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Integration of SCST in the mainstream Linux kernel

2008-01-30 Thread FUJITA Tomonori

On Wed, 30 Jan 2008 14:10:47 +0100
"Bart Van Assche" <[EMAIL PROTECTED]> wrote:

> On Jan 30, 2008 11:56 AM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> > On Wed, 30 Jan 2008 09:38:04 +0100
> > "Bart Van Assche" <[EMAIL PROTECTED]> wrote:
> > >
> > > Please specify which parameters you are referring to. As you know I
> >
> > Sorry, I can't say. I don't know much about iSER. But seems that Pete
> > and Robin can get the better I/O performance - line speed ratiwo with
> > STGT.
> 
> Robin Humble was using a DDR InfiniBand network, while my tests were
> performed with an SDR InfiniBand network. Robin's results can't be
> directly compared to my results.

I know that you use different hardware. I used 'ratio' word.


BTW, you said the performance difference of dio READ is 38% but I
think it's 27.3 %, though it's still large.


> Pete Wyckoff's results
> (http://www.osc.edu/~pw/papers/wyckoff-iser-snapi07-talk.pdf) are hard
> to interpret. I have asked Pete which of the numbers in his test can
> be compared with what I measured, but Pete did not reply.
> 
> > The version of OpenIB might matters too. For example, Pete said that
> > STGT reads loses about 100 MB/s for some transfer sizes for some
> > transfer sizes due to the OpenIB version difference or other unclear
> > reasons.
> >
> > http://article.gmane.org/gmane.linux.iscsi.tgt.devel/135
> 
> Pete wrote about a degradation from 600 MB/s to 500 MB/s for reads
> with STGT+iSER. In my tests I measured 589 MB/s for reads (direct
> I/O), which matches with the better result obtained by Pete.

I don't know he used the same benchmark software so I don't think that
we can compare them.

All I tried to say is the OFED version might has big effect on the
performance. So you might need to find the best one.


> Note: the InfiniBand kernel modules I used were those from the
> 2.6.22.9 kernel, not from the OFED distribution.

I'm talking about a target machine (I think that Pete was also talking
about OFED on his target machine). STGT uses OFED libraries, I think.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Integration of SCST in the mainstream Linux kernel

2008-01-30 Thread FUJITA Tomonori

On Wed, 30 Jan 2008 09:38:04 +0100
"Bart Van Assche" <[EMAIL PROTECTED]> wrote:

> On Jan 30, 2008 12:32 AM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> >
> > iSER has parameters to limit the maximum size of RDMA (it needs to
> > repeat RDMA with a poor configuration)?
> 
> Please specify which parameters you are referring to. As you know I

Sorry, I can't say. I don't know much about iSER. But seems that Pete
and Robin can get the better I/O performance - line speed ratio with
STGT.

The version of OpenIB might matters too. For example, Pete said that
STGT reads loses about 100 MB/s for some transfer sizes for some
transfer sizes due to the OpenIB version difference or other unclear
reasons.

http://article.gmane.org/gmane.linux.iscsi.tgt.devel/135

It's fair to say that it takes long time and need lots of knowledge to
get the maximum performance of SAN, I think.

I think that it would be easier to convince James with the detailed
analysis (e.g. where does it take so long, like Pete did), not just
'dd' performance results.

Pushing iSCSI target code into mainline failed four times: IET, SCST,
STGT (doing I/Os in kernel in the past), and PyX's one (*1). iSCSI
target code is huge. You said SCST comprises 14,000 lines, but it's
not iSCSI target code. The SCSI engine code comprises 14,000
lines. You need another 10,000 lines for the iSCSI driver. Note that
SCST's iSCSI driver provides only basic iSCSI features. PyX's iSCSI
target code implemenents more iSCSI features (like MC/S, ERL2, etc)
and comprises about 60,000 lines and it still lacks some features like
iSER, bidi, etc.

I think that it's reasonable to say that we need more than 'dd'
results before pushing about possible more than 60,000 lines to
mainline.

(*1) http://linux-iscsi.org/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Integration of SCST in the mainstream Linux kernel

2008-01-30 Thread FUJITA Tomonori

On Wed, 30 Jan 2008 09:38:04 +0100
Bart Van Assche [EMAIL PROTECTED] wrote:

 On Jan 30, 2008 12:32 AM, FUJITA Tomonori [EMAIL PROTECTED] wrote:
 
  iSER has parameters to limit the maximum size of RDMA (it needs to
  repeat RDMA with a poor configuration)?
 
 Please specify which parameters you are referring to. As you know I

Sorry, I can't say. I don't know much about iSER. But seems that Pete
and Robin can get the better I/O performance - line speed ratio with
STGT.

The version of OpenIB might matters too. For example, Pete said that
STGT reads loses about 100 MB/s for some transfer sizes for some
transfer sizes due to the OpenIB version difference or other unclear
reasons.

http://article.gmane.org/gmane.linux.iscsi.tgt.devel/135

It's fair to say that it takes long time and need lots of knowledge to
get the maximum performance of SAN, I think.

I think that it would be easier to convince James with the detailed
analysis (e.g. where does it take so long, like Pete did), not just
'dd' performance results.

Pushing iSCSI target code into mainline failed four times: IET, SCST,
STGT (doing I/Os in kernel in the past), and PyX's one (*1). iSCSI
target code is huge. You said SCST comprises 14,000 lines, but it's
not iSCSI target code. The SCSI engine code comprises 14,000
lines. You need another 10,000 lines for the iSCSI driver. Note that
SCST's iSCSI driver provides only basic iSCSI features. PyX's iSCSI
target code implemenents more iSCSI features (like MC/S, ERL2, etc)
and comprises about 60,000 lines and it still lacks some features like
iSER, bidi, etc.

I think that it's reasonable to say that we need more than 'dd'
results before pushing about possible more than 60,000 lines to
mainline.

(*1) http://linux-iscsi.org/
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Integration of SCST in the mainstream Linux kernel

2008-01-30 Thread FUJITA Tomonori

On Wed, 30 Jan 2008 14:10:47 +0100
Bart Van Assche [EMAIL PROTECTED] wrote:

 On Jan 30, 2008 11:56 AM, FUJITA Tomonori [EMAIL PROTECTED] wrote:
  On Wed, 30 Jan 2008 09:38:04 +0100
  Bart Van Assche [EMAIL PROTECTED] wrote:
  
   Please specify which parameters you are referring to. As you know I
 
  Sorry, I can't say. I don't know much about iSER. But seems that Pete
  and Robin can get the better I/O performance - line speed ratiwo with
  STGT.
 
 Robin Humble was using a DDR InfiniBand network, while my tests were
 performed with an SDR InfiniBand network. Robin's results can't be
 directly compared to my results.

I know that you use different hardware. I used 'ratio' word.


BTW, you said the performance difference of dio READ is 38% but I
think it's 27.3 %, though it's still large.


 Pete Wyckoff's results
 (http://www.osc.edu/~pw/papers/wyckoff-iser-snapi07-talk.pdf) are hard
 to interpret. I have asked Pete which of the numbers in his test can
 be compared with what I measured, but Pete did not reply.
 
  The version of OpenIB might matters too. For example, Pete said that
  STGT reads loses about 100 MB/s for some transfer sizes for some
  transfer sizes due to the OpenIB version difference or other unclear
  reasons.
 
  http://article.gmane.org/gmane.linux.iscsi.tgt.devel/135
 
 Pete wrote about a degradation from 600 MB/s to 500 MB/s for reads
 with STGT+iSER. In my tests I measured 589 MB/s for reads (direct
 I/O), which matches with the better result obtained by Pete.

I don't know he used the same benchmark software so I don't think that
we can compare them.

All I tried to say is the OFED version might has big effect on the
performance. So you might need to find the best one.


 Note: the InfiniBand kernel modules I used were those from the
 2.6.22.9 kernel, not from the OFED distribution.

I'm talking about a target machine (I think that Pete was also talking
about OFED on his target machine). STGT uses OFED libraries, I think.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Integration of SCST in the mainstream Linux kernel

2008-01-29 Thread FUJITA Tomonori

On Tue, 29 Jan 2008 13:31:52 -0800
Roland Dreier <[EMAIL PROTECTED]> wrote:

>  > .   .   STGT read SCST read.STGT read  
> SCST read.
>  > .   .  performance   performance   . performance   
>  performance   .
>  > .   .  (0.5K, MB/s)  (0.5K, MB/s)  .   (1 MB, 
> MB/s)   (1 MB, MB/s)  .
>  > . iSER (8 Gb/s network) . 250N/A   .   360 
>   N/A   .
>  > . SRP  (8 Gb/s network) . N/A421   .   N/A 
>   683   .
> 
>  > On the comparable figures, which only seem to be IPoIB they're showing a
>  > 13-18% variance, aren't they?  Which isn't an incredible difference.
> 
> Maybe I'm all wet, but I think iSER vs. SRP should be roughly
> comparable.  The exact formatting of various messages etc. is
> different but the data path using RDMA is pretty much identical.  So
> the big difference between STGT iSER and SCST SRP hints at some big
> difference in the efficiency of the two implementations.

iSER has parameters to limit the maximum size of RDMA (it needs to
repeat RDMA with a poor configuration)?


Anyway, here's the results from Robin Humble:

iSER to 7G ramfs, x86_64, centos4.6, 2.6.22 kernels, git tgtd,
initiator end booted with mem=512M, target with 8G ram

 direct i/o dd
  write/read  800/751 MB/s
dd if=/dev/zero of=/dev/sdc bs=1M count=5000 oflag=direct
dd of=/dev/null if=/dev/sdc bs=1M count=5000 iflag=direct

http://www.mail-archive.com/[EMAIL PROTECTED]/msg13502.html

I think that STGT is pretty fast with the fast backing storage. 


I don't think that there is the notable perfornace difference between
kernel-space and user-space SRP (or ISER) implementations about moving
data between hosts. IB is expected to enable user-space applications
to move data between hosts quickly (if not, what can IB provide us?).

I think that the question is how fast user-space applications can do
I/Os ccompared with I/Os in kernel space. STGT is eager for the advent
of good asynchronous I/O and event notification interfances.


One more possible optimization for STGT is zero-copy data
transfer. STGT uses pre-registered buffers and move data between page
cache and thsse buffers, and then does RDMA transfer. If we implement
own caching mechanism to use pre-registered buffers directly with (AIO
and O_DIRECT), then STGT can move data without data copies.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Integration of SCST in the mainstream Linux kernel

2008-01-29 Thread FUJITA Tomonori

On Tue, 29 Jan 2008 13:31:52 -0800
Roland Dreier [EMAIL PROTECTED] wrote:

   .   .   STGT read SCST read.STGT read  
 SCST read.
   .   .  performance   performance   . performance   
  performance   .
   .   .  (0.5K, MB/s)  (0.5K, MB/s)  .   (1 MB, 
 MB/s)   (1 MB, MB/s)  .
   . iSER (8 Gb/s network) . 250N/A   .   360 
   N/A   .
   . SRP  (8 Gb/s network) . N/A421   .   N/A 
   683   .
 
   On the comparable figures, which only seem to be IPoIB they're showing a
   13-18% variance, aren't they?  Which isn't an incredible difference.
 
 Maybe I'm all wet, but I think iSER vs. SRP should be roughly
 comparable.  The exact formatting of various messages etc. is
 different but the data path using RDMA is pretty much identical.  So
 the big difference between STGT iSER and SCST SRP hints at some big
 difference in the efficiency of the two implementations.

iSER has parameters to limit the maximum size of RDMA (it needs to
repeat RDMA with a poor configuration)?


Anyway, here's the results from Robin Humble:

iSER to 7G ramfs, x86_64, centos4.6, 2.6.22 kernels, git tgtd,
initiator end booted with mem=512M, target with 8G ram

 direct i/o dd
  write/read  800/751 MB/s
dd if=/dev/zero of=/dev/sdc bs=1M count=5000 oflag=direct
dd of=/dev/null if=/dev/sdc bs=1M count=5000 iflag=direct

http://www.mail-archive.com/[EMAIL PROTECTED]/msg13502.html

I think that STGT is pretty fast with the fast backing storage. 


I don't think that there is the notable perfornace difference between
kernel-space and user-space SRP (or ISER) implementations about moving
data between hosts. IB is expected to enable user-space applications
to move data between hosts quickly (if not, what can IB provide us?).

I think that the question is how fast user-space applications can do
I/Os ccompared with I/Os in kernel space. STGT is eager for the advent
of good asynchronous I/O and event notification interfances.


One more possible optimization for STGT is zero-copy data
transfer. STGT uses pre-registered buffers and move data between page
cache and thsse buffers, and then does RDMA transfer. If we implement
own caching mechanism to use pre-registered buffers directly with (AIO
and O_DIRECT), then STGT can move data without data copies.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 2/2] ch: remove forward declarations

2008-01-24 Thread FUJITA Tomonori

This moves ch_template and changer_fops structs to the end of file and
removes forward declarations.

This also removes some trailing whitespace.

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
---
 drivers/scsi/ch.c |  120 -
 1 files changed, 54 insertions(+), 66 deletions(-)

diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c
index 2b07014..7aad154 100644
--- a/drivers/scsi/ch.c
+++ b/drivers/scsi/ch.c
@@ -90,16 +90,6 @@ static const char * vendor_labels[CH_TYPES-4] = {
 
 #define MAX_RETRIES   1
 
-static int  ch_probe(struct device *);
-static int  ch_remove(struct device *);
-static int  ch_open(struct inode * inode, struct file * filp);
-static int  ch_release(struct inode * inode, struct file * filp);
-static long ch_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
-#ifdef CONFIG_COMPAT
-static long ch_ioctl_compat(struct file * filp,
-   unsigned int cmd, unsigned long arg);
-#endif
-
 static struct class * ch_sysfs_class;
 
 typedef struct {
@@ -118,27 +108,6 @@ typedef struct {
 static DEFINE_IDR(ch_index_idr);
 static DEFINE_SPINLOCK(ch_index_lock);
 
-static struct scsi_driver ch_template =
-{
-   .owner  = THIS_MODULE,
-   .gendrv = {
-   .name   = "ch",
-   .probe  = ch_probe,
-   .remove = ch_remove,
-   },
-};
-
-static const struct file_operations changer_fops =
-{
-   .owner  = THIS_MODULE,
-   .open   = ch_open,
-   .release= ch_release,
-   .unlocked_ioctl = ch_ioctl,
-#ifdef CONFIG_COMPAT
-   .compat_ioctl   = ch_ioctl_compat,
-#endif
-};
-
 static const struct {
unsigned char  sense;
unsigned char  asc;
@@ -207,7 +176,7 @@ ch_do_scsi(scsi_changer *ch, unsigned char *cmd,
 {
int errno, retries = 0, timeout, result;
struct scsi_sense_hdr sshdr;
-   
+
timeout = (cmd[0] == INITIALIZE_ELEMENT_STATUS)
? timeout_init : timeout_move;
 
@@ -245,7 +214,7 @@ static int
 ch_elem_to_typecode(scsi_changer *ch, u_int elem)
 {
int i;
-   
+
for (i = 0; i < CH_TYPES; i++) {
if (elem >= ch->firsts[i]  &&
elem <  ch->firsts[i] +
@@ -261,15 +230,15 @@ ch_read_element_status(scsi_changer *ch, u_int elem, char 
*data)
u_char  cmd[12];
u_char  *buffer;
int result;
-   
+
buffer = kmalloc(512, GFP_KERNEL | GFP_DMA);
if(!buffer)
return -ENOMEM;
-   
+
  retry:
memset(cmd,0,sizeof(cmd));
cmd[0] = READ_ELEMENT_STATUS;
-   cmd[1] = (ch->device->lun << 5) | 
+   cmd[1] = (ch->device->lun << 5) |
(ch->voltags ? 0x10 : 0) |
ch_elem_to_typecode(ch,elem);
cmd[2] = (elem >> 8) & 0xff;
@@ -296,7 +265,7 @@ ch_read_element_status(scsi_changer *ch, u_int elem, char 
*data)
return result;
 }
 
-static int 
+static int
 ch_init_elem(scsi_changer *ch)
 {
int err;
@@ -322,7 +291,7 @@ ch_readconfig(scsi_changer *ch)
buffer = kzalloc(512, GFP_KERNEL | GFP_DMA);
if (!buffer)
return -ENOMEM;
-   
+
memset(cmd,0,sizeof(cmd));
cmd[0] = MODE_SENSE;
cmd[1] = ch->device->lun << 5;
@@ -365,7 +334,7 @@ ch_readconfig(scsi_changer *ch)
} else {
vprintk("reading element address assigment page failed!\n");
}
-   
+
/* vendor specific element types */
for (i = 0; i < 4; i++) {
if (0 == vendor_counts[i])
@@ -443,7 +412,7 @@ static int
 ch_position(scsi_changer *ch, u_int trans, u_int elem, int rotate)
 {
u_char  cmd[10];
-   
+
dprintk("position: 0x%x\n",elem);
if (0 == trans)
trans = ch->firsts[CHET_MT];
@@ -462,7 +431,7 @@ static int
 ch_move(scsi_changer *ch, u_int trans, u_int src, u_int dest, int rotate)
 {
u_char  cmd[12];
-   
+
dprintk("move: 0x%x => 0x%x\n",src,dest);
if (0 == trans)
trans = ch->firsts[CHET_MT];
@@ -484,7 +453,7 @@ ch_exchange(scsi_changer *ch, u_int trans, u_int src,
u_int dest1, u_int dest2, int rotate1, int rotate2)
 {
u_char  cmd[12];
-   
+
dprintk("exchange: 0x%x => 0x%x => 0x%x\n",
src,dest1,dest2);
if (0 == trans)
@@ -501,7 +470,7 @@ ch_exchange(scsi_changer *ch, u_int trans, u_int src,
cmd[8]  = (dest2 >> 8) & 0xff;
cmd[9]  =  dest2   & 0xff;
cmd[10] = (rotate1 ? 1 : 0) | (rotate2 ? 2 : 0);
-   
+
return ch_do_scsi(ch, cmd, NULL,0, DMA_NONE);
 }
 
@@ -539,14 +508,14 @@ ch_set_voltag(scsi_changer *ch, u_int elem,
elem, tag);
memset(cmd,0,sizeof(cmd));

[PATCH 1/2] ch: fix device minor number management bug

2008-01-24 Thread FUJITA Tomonori

ch_probe uses the total number of ch devices as minor.

ch_probe:
ch->minor = ch_devcount;
...
ch_devcount++;

Then ch_remove decreases ch_devcount:

ch_remove:
ch_devcount--;

If you have two ch devices, e.g. sch0 and sch1, and remove sch0,
ch_devcount is 1. Then if you add another ch device, ch_probe tries to
create sch1. So you get a warning and fail to create sch1:

Jan 24 16:01:05 nice kernel: sysfs: duplicate filename 'sch1' can not be created
Jan 24 16:01:05 nice kernel: WARNING: at fs/sysfs/dir.c:424 sysfs_add_one()
Jan 24 16:01:05 nice kernel: Pid: 2571, comm: iscsid Not tainted 
2.6.24-rc7-ga3d2c2e8-dirty #1
Jan 24 16:01:05 nice kernel:
Jan 24 16:01:05 nice kernel: Call Trace:
Jan 24 16:01:05 nice kernel:  [] sysfs_add_one+0x54/0xbd
Jan 24 16:01:05 nice kernel:  [] create_dir+0x4f/0x87
Jan 24 16:01:05 nice kernel:  [] sysfs_create_dir+0x35/0x4a
Jan 24 16:01:05 nice kernel:  [] kobject_get+0x12/0x17
Jan 24 16:01:05 nice kernel:  [] kobject_add+0xf3/0x1a6
Jan 24 16:01:05 nice kernel:  [] class_device_add+0xaa/0x39d
Jan 24 16:01:05 nice kernel:  [] class_device_create+0xcb/0xfa
Jan 24 16:01:05 nice kernel:  [] printk+0x4e/0x56
Jan 24 16:01:05 nice kernel:  [] sysfs_ilookup_test+0x0/0xf
Jan 24 16:01:05 nice kernel:  [] :ch:ch_probe+0xbe/0x61a

(snip)

And you can't add new ch device any more.

This patch converts ch to use a common minor number management way,
idr like sg and bsg do.

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
---
 drivers/scsi/ch.c |   71 +---
 1 files changed, 40 insertions(+), 31 deletions(-)

diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c
index 765f2fc..2b07014 100644
--- a/drivers/scsi/ch.c
+++ b/drivers/scsi/ch.c
@@ -21,6 +21,7 @@
 #include 
 #include /* here are all the ioctls */
 #include 
+#include 
 
 #include 
 #include 
@@ -33,6 +34,7 @@
 
 #define CH_DT_MAX   16
 #define CH_TYPES8
+#define CH_MAX_DEVS 128
 
 MODULE_DESCRIPTION("device driver for scsi media changer devices");
 MODULE_AUTHOR("Gerd Knorr <[EMAIL PROTECTED]>");
@@ -113,9 +115,8 @@ typedef struct {
struct mutexlock;
 } scsi_changer;
 
-static LIST_HEAD(ch_devlist);
-static DEFINE_SPINLOCK(ch_devlist_lock);
-static int ch_devcount;
+static DEFINE_IDR(ch_index_idr);
+static DEFINE_SPINLOCK(ch_index_lock);
 
 static struct scsi_driver ch_template =
 {
@@ -598,20 +599,17 @@ ch_release(struct inode *inode, struct file *file)
 static int
 ch_open(struct inode *inode, struct file *file)
 {
-   scsi_changer *tmp, *ch;
+   scsi_changer *ch;
int minor = iminor(inode);
 
-   spin_lock(_devlist_lock);
-   ch = NULL;
-   list_for_each_entry(tmp,_devlist,list) {
-   if (tmp->minor == minor)
-   ch = tmp;
-   }
+   spin_lock(_index_lock);
+   ch = idr_find(_index_idr, minor);
+
if (NULL == ch || scsi_device_get(ch->device)) {
-   spin_unlock(_devlist_lock);
+   spin_unlock(_index_lock);
return -ENXIO;
}
-   spin_unlock(_devlist_lock);
+   spin_unlock(_index_lock);
 
file->private_data = ch;
return 0;
@@ -914,6 +912,7 @@ static int ch_probe(struct device *dev)
 {
struct scsi_device *sd = to_scsi_device(dev);
struct class_device *class_dev;
+   int minor, ret = -ENOMEM;
scsi_changer *ch;
 
if (sd->type != TYPE_MEDIUM_CHANGER)
@@ -923,7 +922,22 @@ static int ch_probe(struct device *dev)
if (NULL == ch)
return -ENOMEM;
 
-   ch->minor = ch_devcount;
+   if (!idr_pre_get(_index_idr, GFP_KERNEL))
+   goto free_ch;
+
+   spin_lock(_index_lock);
+   ret = idr_get_new(_index_idr, ch, );
+   spin_unlock(_index_lock);
+
+   if (ret)
+   goto free_ch;
+
+   if (minor > CH_MAX_DEVS) {
+   ret = -ENODEV;
+   goto remove_idr;
+   }
+
+   ch->minor = minor;
sprintf(ch->name,"ch%d",ch->minor);
 
class_dev = class_device_create(ch_sysfs_class, NULL,
@@ -932,8 +946,8 @@ static int ch_probe(struct device *dev)
if (IS_ERR(class_dev)) {
printk(KERN_WARNING "ch%d: class_device_create failed\n",
   ch->minor);
-   kfree(ch);
-   return PTR_ERR(class_dev);
+   ret = PTR_ERR(class_dev);
+   goto remove_idr;
}
 
mutex_init(>lock);
@@ -942,35 +956,29 @@ static int ch_probe(struct device *dev)
if (init)
ch_init_elem(ch);
 
+   dev_set_drvdata(dev, ch);
sdev_printk(KERN_INFO, sd, "Attached scsi changer %s\n", ch->name);
 
-   spin_lock(_devlist_lock);
-   list_add_tail(>list,_devlist);
-   ch_devcount++;
-   spin_unlock(_devlist_lock);
retur

[PATCH 1/2] ch: fix device minor number management bug

2008-01-24 Thread FUJITA Tomonori

ch_probe uses the total number of ch devices as minor.

ch_probe:
ch-minor = ch_devcount;
...
ch_devcount++;

Then ch_remove decreases ch_devcount:

ch_remove:
ch_devcount--;

If you have two ch devices, e.g. sch0 and sch1, and remove sch0,
ch_devcount is 1. Then if you add another ch device, ch_probe tries to
create sch1. So you get a warning and fail to create sch1:

Jan 24 16:01:05 nice kernel: sysfs: duplicate filename 'sch1' can not be created
Jan 24 16:01:05 nice kernel: WARNING: at fs/sysfs/dir.c:424 sysfs_add_one()
Jan 24 16:01:05 nice kernel: Pid: 2571, comm: iscsid Not tainted 
2.6.24-rc7-ga3d2c2e8-dirty #1
Jan 24 16:01:05 nice kernel:
Jan 24 16:01:05 nice kernel: Call Trace:
Jan 24 16:01:05 nice kernel:  [802a22b8] sysfs_add_one+0x54/0xbd
Jan 24 16:01:05 nice kernel:  [802a283c] create_dir+0x4f/0x87
Jan 24 16:01:05 nice kernel:  [802a28a9] sysfs_create_dir+0x35/0x4a
Jan 24 16:01:05 nice kernel:  [803069a1] kobject_get+0x12/0x17
Jan 24 16:01:05 nice kernel:  [80306ece] kobject_add+0xf3/0x1a6
Jan 24 16:01:05 nice kernel:  [8034252b] class_device_add+0xaa/0x39d
Jan 24 16:01:05 nice kernel:  [803428fb] class_device_create+0xcb/0xfa
Jan 24 16:01:05 nice kernel:  [80229e09] printk+0x4e/0x56
Jan 24 16:01:05 nice kernel:  [802a2054] sysfs_ilookup_test+0x0/0xf
Jan 24 16:01:05 nice kernel:  [88022580] :ch:ch_probe+0xbe/0x61a

(snip)

And you can't add new ch device any more.

This patch converts ch to use a common minor number management way,
idr like sg and bsg do.

Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
---
 drivers/scsi/ch.c |   71 +---
 1 files changed, 40 insertions(+), 31 deletions(-)

diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c
index 765f2fc..2b07014 100644
--- a/drivers/scsi/ch.c
+++ b/drivers/scsi/ch.c
@@ -21,6 +21,7 @@
 #include linux/compat.h
 #include linux/chio.h/* here are all the ioctls */
 #include linux/mutex.h
+#include linux/idr.h
 
 #include scsi/scsi.h
 #include scsi/scsi_cmnd.h
@@ -33,6 +34,7 @@
 
 #define CH_DT_MAX   16
 #define CH_TYPES8
+#define CH_MAX_DEVS 128
 
 MODULE_DESCRIPTION(device driver for scsi media changer devices);
 MODULE_AUTHOR(Gerd Knorr [EMAIL PROTECTED]);
@@ -113,9 +115,8 @@ typedef struct {
struct mutexlock;
 } scsi_changer;
 
-static LIST_HEAD(ch_devlist);
-static DEFINE_SPINLOCK(ch_devlist_lock);
-static int ch_devcount;
+static DEFINE_IDR(ch_index_idr);
+static DEFINE_SPINLOCK(ch_index_lock);
 
 static struct scsi_driver ch_template =
 {
@@ -598,20 +599,17 @@ ch_release(struct inode *inode, struct file *file)
 static int
 ch_open(struct inode *inode, struct file *file)
 {
-   scsi_changer *tmp, *ch;
+   scsi_changer *ch;
int minor = iminor(inode);
 
-   spin_lock(ch_devlist_lock);
-   ch = NULL;
-   list_for_each_entry(tmp,ch_devlist,list) {
-   if (tmp-minor == minor)
-   ch = tmp;
-   }
+   spin_lock(ch_index_lock);
+   ch = idr_find(ch_index_idr, minor);
+
if (NULL == ch || scsi_device_get(ch-device)) {
-   spin_unlock(ch_devlist_lock);
+   spin_unlock(ch_index_lock);
return -ENXIO;
}
-   spin_unlock(ch_devlist_lock);
+   spin_unlock(ch_index_lock);
 
file-private_data = ch;
return 0;
@@ -914,6 +912,7 @@ static int ch_probe(struct device *dev)
 {
struct scsi_device *sd = to_scsi_device(dev);
struct class_device *class_dev;
+   int minor, ret = -ENOMEM;
scsi_changer *ch;
 
if (sd-type != TYPE_MEDIUM_CHANGER)
@@ -923,7 +922,22 @@ static int ch_probe(struct device *dev)
if (NULL == ch)
return -ENOMEM;
 
-   ch-minor = ch_devcount;
+   if (!idr_pre_get(ch_index_idr, GFP_KERNEL))
+   goto free_ch;
+
+   spin_lock(ch_index_lock);
+   ret = idr_get_new(ch_index_idr, ch, minor);
+   spin_unlock(ch_index_lock);
+
+   if (ret)
+   goto free_ch;
+
+   if (minor  CH_MAX_DEVS) {
+   ret = -ENODEV;
+   goto remove_idr;
+   }
+
+   ch-minor = minor;
sprintf(ch-name,ch%d,ch-minor);
 
class_dev = class_device_create(ch_sysfs_class, NULL,
@@ -932,8 +946,8 @@ static int ch_probe(struct device *dev)
if (IS_ERR(class_dev)) {
printk(KERN_WARNING ch%d: class_device_create failed\n,
   ch-minor);
-   kfree(ch);
-   return PTR_ERR(class_dev);
+   ret = PTR_ERR(class_dev);
+   goto remove_idr;
}
 
mutex_init(ch-lock);
@@ -942,35 +956,29 @@ static int ch_probe(struct device *dev)
if (init)
ch_init_elem(ch);
 
+   dev_set_drvdata(dev, ch);
sdev_printk(KERN_INFO, sd, Attached scsi changer %s\n, ch-name

[PATCH 2/2] ch: remove forward declarations

2008-01-24 Thread FUJITA Tomonori

This moves ch_template and changer_fops structs to the end of file and
removes forward declarations.

This also removes some trailing whitespace.

Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
---
 drivers/scsi/ch.c |  120 -
 1 files changed, 54 insertions(+), 66 deletions(-)

diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c
index 2b07014..7aad154 100644
--- a/drivers/scsi/ch.c
+++ b/drivers/scsi/ch.c
@@ -90,16 +90,6 @@ static const char * vendor_labels[CH_TYPES-4] = {
 
 #define MAX_RETRIES   1
 
-static int  ch_probe(struct device *);
-static int  ch_remove(struct device *);
-static int  ch_open(struct inode * inode, struct file * filp);
-static int  ch_release(struct inode * inode, struct file * filp);
-static long ch_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
-#ifdef CONFIG_COMPAT
-static long ch_ioctl_compat(struct file * filp,
-   unsigned int cmd, unsigned long arg);
-#endif
-
 static struct class * ch_sysfs_class;
 
 typedef struct {
@@ -118,27 +108,6 @@ typedef struct {
 static DEFINE_IDR(ch_index_idr);
 static DEFINE_SPINLOCK(ch_index_lock);
 
-static struct scsi_driver ch_template =
-{
-   .owner  = THIS_MODULE,
-   .gendrv = {
-   .name   = ch,
-   .probe  = ch_probe,
-   .remove = ch_remove,
-   },
-};
-
-static const struct file_operations changer_fops =
-{
-   .owner  = THIS_MODULE,
-   .open   = ch_open,
-   .release= ch_release,
-   .unlocked_ioctl = ch_ioctl,
-#ifdef CONFIG_COMPAT
-   .compat_ioctl   = ch_ioctl_compat,
-#endif
-};
-
 static const struct {
unsigned char  sense;
unsigned char  asc;
@@ -207,7 +176,7 @@ ch_do_scsi(scsi_changer *ch, unsigned char *cmd,
 {
int errno, retries = 0, timeout, result;
struct scsi_sense_hdr sshdr;
-   
+
timeout = (cmd[0] == INITIALIZE_ELEMENT_STATUS)
? timeout_init : timeout_move;
 
@@ -245,7 +214,7 @@ static int
 ch_elem_to_typecode(scsi_changer *ch, u_int elem)
 {
int i;
-   
+
for (i = 0; i  CH_TYPES; i++) {
if (elem = ch-firsts[i]  
elem   ch-firsts[i] +
@@ -261,15 +230,15 @@ ch_read_element_status(scsi_changer *ch, u_int elem, char 
*data)
u_char  cmd[12];
u_char  *buffer;
int result;
-   
+
buffer = kmalloc(512, GFP_KERNEL | GFP_DMA);
if(!buffer)
return -ENOMEM;
-   
+
  retry:
memset(cmd,0,sizeof(cmd));
cmd[0] = READ_ELEMENT_STATUS;
-   cmd[1] = (ch-device-lun  5) | 
+   cmd[1] = (ch-device-lun  5) |
(ch-voltags ? 0x10 : 0) |
ch_elem_to_typecode(ch,elem);
cmd[2] = (elem  8)  0xff;
@@ -296,7 +265,7 @@ ch_read_element_status(scsi_changer *ch, u_int elem, char 
*data)
return result;
 }
 
-static int 
+static int
 ch_init_elem(scsi_changer *ch)
 {
int err;
@@ -322,7 +291,7 @@ ch_readconfig(scsi_changer *ch)
buffer = kzalloc(512, GFP_KERNEL | GFP_DMA);
if (!buffer)
return -ENOMEM;
-   
+
memset(cmd,0,sizeof(cmd));
cmd[0] = MODE_SENSE;
cmd[1] = ch-device-lun  5;
@@ -365,7 +334,7 @@ ch_readconfig(scsi_changer *ch)
} else {
vprintk(reading element address assigment page failed!\n);
}
-   
+
/* vendor specific element types */
for (i = 0; i  4; i++) {
if (0 == vendor_counts[i])
@@ -443,7 +412,7 @@ static int
 ch_position(scsi_changer *ch, u_int trans, u_int elem, int rotate)
 {
u_char  cmd[10];
-   
+
dprintk(position: 0x%x\n,elem);
if (0 == trans)
trans = ch-firsts[CHET_MT];
@@ -462,7 +431,7 @@ static int
 ch_move(scsi_changer *ch, u_int trans, u_int src, u_int dest, int rotate)
 {
u_char  cmd[12];
-   
+
dprintk(move: 0x%x = 0x%x\n,src,dest);
if (0 == trans)
trans = ch-firsts[CHET_MT];
@@ -484,7 +453,7 @@ ch_exchange(scsi_changer *ch, u_int trans, u_int src,
u_int dest1, u_int dest2, int rotate1, int rotate2)
 {
u_char  cmd[12];
-   
+
dprintk(exchange: 0x%x = 0x%x = 0x%x\n,
src,dest1,dest2);
if (0 == trans)
@@ -501,7 +470,7 @@ ch_exchange(scsi_changer *ch, u_int trans, u_int src,
cmd[8]  = (dest2  8)  0xff;
cmd[9]  =  dest20xff;
cmd[10] = (rotate1 ? 1 : 0) | (rotate2 ? 2 : 0);
-   
+
return ch_do_scsi(ch, cmd, NULL,0, DMA_NONE);
 }
 
@@ -539,14 +508,14 @@ ch_set_voltag(scsi_changer *ch, u_int elem,
elem, tag);
memset(cmd,0,sizeof(cmd));
cmd[0]  = SEND_VOLUME_TAG;
-   cmd[1] = (ch-device-lun  5) | 
+   cmd[1] = (ch-device-lun  5) |
ch_elem_to_typecode(ch,elem);
cmd[2] = (elem  8)  0xff;
cmd[3] = elem

Re: INITIO scsi driver fails to work properly

2008-01-15 Thread FUJITA Tomonori

On Tue, 15 Jan 2008 09:16:06 -0600
James Bottomley <[EMAIL PROTECTED]> wrote:

> 
> On Sun, 2008-01-13 at 14:28 +0200, Filippos Papadopoulos wrote:
> > On 1/11/08, James Bottomley <[EMAIL PROTECTED]> wrote:
> > >
> > > On Fri, 2008-01-11 at 18:44 +0200, Filippos Papadopoulos wrote:
> > > > On Jan 11, 2008 5:44 PM, James Bottomley
> > > > <[EMAIL PROTECTED]> wrote:
> > > > > >
> > > > > > I havent reported "initio: I/O port range 0x0 is busy."
> > > > >
> > > > > Sorry ... we appear to have several reporters of different bugs in 
> > > > > this
> > > > > thread.  That message was copied by Chuck Ebbert from a Red Hat
> > > > > bugzilla ... I was assuming it was the same problem.
> > > > >
> > > > > > I applied the patch on 2.6.24-rc6-git9 but unfortunatelly same 
> > > > > > thing happens.
> > > > >
> > > > > First off, has this driver ever worked for you in 2.6?  Just booting
> > > > > SLES9 (2.6.5) or RHEL4 (2.6.9) ... or one of their open equivalents to
> > > > > check a really old kernel would be helpful.  If you can get it to 
> > > > > work,
> > > > > then we can proceed with a patch reversion regime based on the
> > > > > assumption that the problem is a recent commit.
> > > >
> > > > Yes it works under 2.6.16.13.  See the beginning of this thread, i
> > > > mention there some things about newer versions.
> > >
> > > Thanks, actually, I see this:
> > >
> > > > I tried to install OpenSUSE 10.3 (kernel 2.6.22.5) and the latest
> > > > OpenSUSE 11.0 Alpha 0  (kernel 2.6.24-rc4) but although the initio
> > > > drivergets loaded during the installation process, yast reports that no 
> > > > hard
> > > > disk is found.
> > >
> > > Could you try with a vanilla 2.6.22 kernel?  The reason for all of this
> > > is that 2.6.22 predates Alan's conversion of this driver (which was my
> > > 95% candidate for the source of the bug).  I want you to try the vanilla
> > > kernel just in case the opensuse one contains a backport.
> > 
> > 
> > Yes you are right. I compiled the vanilla 2.6.22 and initio driver works.
> > Tell me if you want to apply any patch to it.
> 
> 
> That's good news ... at least we know where the issue lies; now the
> problem comes: there are two candidate patches for this issue: Alan's
> driver update patch and Tomo's accessors patch.  Unfortunately, due to
> merge conflicts the two are pretty hopelessly intertwined.  I think I
> already spotted one bug in the accessor conversion, so I'll look at that
> again.  Alan's also going to acquire an inito board and retest his
> conversions.
> 
> I'm afraid it might be a while before we have anything for you to test.

Can you try this patch?

Thanks,

diff --git a/drivers/scsi/initio.c b/drivers/scsi/initio.c
index 01bf018..6891d2b 100644
--- a/drivers/scsi/initio.c
+++ b/drivers/scsi/initio.c
@@ -2609,6 +2609,7 @@ static void initio_build_scb(struct initio_host * host, 
struct scsi_ctrl_blk * c
cblk->bufptr = cpu_to_le32((u32)dma_addr);
cmnd->SCp.dma_handle = dma_addr;
 
+   cblk->sglen = nseg;
 
cblk->flags |= SCF_SG;  /* Turn on SG list flag   */
total_len = 0;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: INITIO scsi driver fails to work properly

2008-01-15 Thread FUJITA Tomonori

On Tue, 15 Jan 2008 09:16:06 -0600
James Bottomley [EMAIL PROTECTED] wrote:

 
 On Sun, 2008-01-13 at 14:28 +0200, Filippos Papadopoulos wrote:
  On 1/11/08, James Bottomley [EMAIL PROTECTED] wrote:
  
   On Fri, 2008-01-11 at 18:44 +0200, Filippos Papadopoulos wrote:
On Jan 11, 2008 5:44 PM, James Bottomley
[EMAIL PROTECTED] wrote:
 
  I havent reported initio: I/O port range 0x0 is busy.

 Sorry ... we appear to have several reporters of different bugs in 
 this
 thread.  That message was copied by Chuck Ebbert from a Red Hat
 bugzilla ... I was assuming it was the same problem.

  I applied the patch on 2.6.24-rc6-git9 but unfortunatelly same 
  thing happens.

 First off, has this driver ever worked for you in 2.6?  Just booting
 SLES9 (2.6.5) or RHEL4 (2.6.9) ... or one of their open equivalents to
 check a really old kernel would be helpful.  If you can get it to 
 work,
 then we can proceed with a patch reversion regime based on the
 assumption that the problem is a recent commit.
   
Yes it works under 2.6.16.13.  See the beginning of this thread, i
mention there some things about newer versions.
  
   Thanks, actually, I see this:
  
I tried to install OpenSUSE 10.3 (kernel 2.6.22.5) and the latest
OpenSUSE 11.0 Alpha 0  (kernel 2.6.24-rc4) but although the initio
drivergets loaded during the installation process, yast reports that no 
hard
disk is found.
  
   Could you try with a vanilla 2.6.22 kernel?  The reason for all of this
   is that 2.6.22 predates Alan's conversion of this driver (which was my
   95% candidate for the source of the bug).  I want you to try the vanilla
   kernel just in case the opensuse one contains a backport.
  
  
  Yes you are right. I compiled the vanilla 2.6.22 and initio driver works.
  Tell me if you want to apply any patch to it.
 
 
 That's good news ... at least we know where the issue lies; now the
 problem comes: there are two candidate patches for this issue: Alan's
 driver update patch and Tomo's accessors patch.  Unfortunately, due to
 merge conflicts the two are pretty hopelessly intertwined.  I think I
 already spotted one bug in the accessor conversion, so I'll look at that
 again.  Alan's also going to acquire an inito board and retest his
 conversions.
 
 I'm afraid it might be a while before we have anything for you to test.

Can you try this patch?

Thanks,

diff --git a/drivers/scsi/initio.c b/drivers/scsi/initio.c
index 01bf018..6891d2b 100644
--- a/drivers/scsi/initio.c
+++ b/drivers/scsi/initio.c
@@ -2609,6 +2609,7 @@ static void initio_build_scb(struct initio_host * host, 
struct scsi_ctrl_blk * c
cblk-bufptr = cpu_to_le32((u32)dma_addr);
cmnd-SCp.dma_handle = dma_addr;
 
+   cblk-sglen = nseg;
 
cblk-flags |= SCF_SG;  /* Turn on SG list flag   */
total_len = 0;
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] bsg : Add support for io vectors in bsg

2008-01-10 Thread FUJITA Tomonori

On Thu, 10 Jan 2008 16:46:05 -0500
Pete Wyckoff <[EMAIL PROTECTED]> wrote:

> Is there another async I/O mechanism?  Userspace builds the CDBs,
> just needs some way to drop them in SCSI ML.  BSG is almost perfect
> for this, but doesn't do iovec, leading to lots of memcpy.

syslets?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Driver 'sd' needs updating

2008-01-10 Thread FUJITA Tomonori

CC'ed linux-scsi and James,

On Thu, 10 Jan 2008 08:51:50 +
Nick Warne <[EMAIL PROTECTED]> wrote:

> 
> Hi everybody - Happy New Year to you all!
> 
> OK, updated to git rc7 yesterday - I now see this in syslog:
> 
>"Driver 'sd' needs updating - please use bus_type methods"
> 
> The warning never appeared in RC6, and all google reveals are other
> peoples logs that are posted about other issues.
> 
> Do I need to fix up something here?

No, you don't. It's harmless, a side effect of:

commit 751bf4d7865e4ced406be93b04c7436d866d3684
Author: James Bottomley <[EMAIL PROTECTED]>
Date:   Wed Jan 2 11:14:30 2008 -0600

[SCSI] scsi_sysfs: restore prep_fn when ULD is removed


It would be better to silence this warning.

James, we need to reset prep_fn in each ULD? though it's not nice...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Driver 'sd' needs updating

2008-01-10 Thread FUJITA Tomonori

CC'ed linux-scsi and James,

On Thu, 10 Jan 2008 08:51:50 +
Nick Warne [EMAIL PROTECTED] wrote:

 
 Hi everybody - Happy New Year to you all!
 
 OK, updated to git rc7 yesterday - I now see this in syslog:
 
Driver 'sd' needs updating - please use bus_type methods
 
 The warning never appeared in RC6, and all google reveals are other
 peoples logs that are posted about other issues.
 
 Do I need to fix up something here?

No, you don't. It's harmless, a side effect of:

commit 751bf4d7865e4ced406be93b04c7436d866d3684
Author: James Bottomley [EMAIL PROTECTED]
Date:   Wed Jan 2 11:14:30 2008 -0600

[SCSI] scsi_sysfs: restore prep_fn when ULD is removed


It would be better to silence this warning.

James, we need to reset prep_fn in each ULD? though it's not nice...
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] bsg : Add support for io vectors in bsg

2008-01-10 Thread FUJITA Tomonori

On Thu, 10 Jan 2008 16:46:05 -0500
Pete Wyckoff [EMAIL PROTECTED] wrote:

 Is there another async I/O mechanism?  Userspace builds the CDBs,
 just needs some way to drop them in SCSI ML.  BSG is almost perfect
 for this, but doesn't do iovec, leading to lots of memcpy.

syslets?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-09 Thread FUJITA Tomonori

On Wed, 9 Jan 2008 10:04:42 +0100
Jarek Poplawski <[EMAIL PROTECTED]> wrote:

> On Wed, Jan 09, 2008 at 08:57:53AM +0900, FUJITA Tomonori wrote:
> ...
> > diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
> > new file mode 100644
> > index 000..495575a
> > --- /dev/null
> > +++ b/lib/iommu-helper.c
> > @@ -0,0 +1,80 @@
> > +/*
> > + * IOMMU helper functions for the free area management
> > + */
> > +
> > +#include 
> > +#include 
> > +
> > +static unsigned long find_next_zero_area(unsigned long *map,
> > +unsigned long size,
> > +unsigned long start,
> > +unsigned int nr,
> > +unsigned long align_mask)
> > +{
> > +   unsigned long index, end, i;
> > +again:
> > +   index = find_next_zero_bit(map, size, start);
> > +
> > +   /* Align allocation */
> > +   index = (index + align_mask) & ~align_mask;
> > +
> > +   end = index + nr;
> > +   if (end >= size)
> > +   return -1;
> 
> This '>=' looks doubtful to me, e.g.:
> map points to 0s only,  size = 64, nr = 64,
> we get: index = 0; end = 64;
> and: return -1 ?!

You are right. I did it only because I didn't want to change the
original code (iommu_range_alloc in arch/powerpc/kernel/iommu.c). I
thought that there might be a mysterious reason for it so I let it
alone since it's tiny loss.

Thanks,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-09 Thread FUJITA Tomonori

On Wed, 9 Jan 2008 10:04:42 +0100
Jarek Poplawski [EMAIL PROTECTED] wrote:

 On Wed, Jan 09, 2008 at 08:57:53AM +0900, FUJITA Tomonori wrote:
 ...
  diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
  new file mode 100644
  index 000..495575a
  --- /dev/null
  +++ b/lib/iommu-helper.c
  @@ -0,0 +1,80 @@
  +/*
  + * IOMMU helper functions for the free area management
  + */
  +
  +#include linux/module.h
  +#include linux/bitops.h
  +
  +static unsigned long find_next_zero_area(unsigned long *map,
  +unsigned long size,
  +unsigned long start,
  +unsigned int nr,
  +unsigned long align_mask)
  +{
  +   unsigned long index, end, i;
  +again:
  +   index = find_next_zero_bit(map, size, start);
  +
  +   /* Align allocation */
  +   index = (index + align_mask)  ~align_mask;
  +
  +   end = index + nr;
  +   if (end = size)
  +   return -1;
 
 This '=' looks doubtful to me, e.g.:
 map points to 0s only,  size = 64, nr = 64,
 we get: index = 0; end = 64;
 and: return -1 ?!

You are right. I did it only because I didn't want to change the
original code (iommu_range_alloc in arch/powerpc/kernel/iommu.c). I
thought that there might be a mysterious reason for it so I let it
alone since it's tiny loss.

Thanks,
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-08 Thread FUJITA Tomonori

On Tue, 8 Jan 2008 16:27:39 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Wed, 09 Jan 2008 08:57:53 +0900
> FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> 
> > Andrew, can you replace
> > 
> > iommu-sg-add-iommu-helper-functions-for-the-free-area-management.patch
> > 
> > with the updated patch:
> > 
> > http://ozlabs.org/pipermail/linuxppc-dev/2007-December/048997.html
> > 
> > For your convenience I've attached the updated patch too.
> 
> 

Thanks for putting the fix to -mm.


> > --- a/lib/iommu-helper.c~a
> > +++ a/lib/iommu-helper.c
> > @@ -8,15 +8,20 @@
> >  static unsigned long find_next_zero_area(unsigned long *map,
> >  unsigned long size,
> >  unsigned long start,
> > -unsigned int nr)
> > +unsigned int nr,
> > +unsigned long align_mask)
> >  {
> > unsigned long index, end, i;
> >  again:
> > index = find_next_zero_bit(map, size, start);
> > +
> > +   /* Align allocation */
> > +   index = (index + align_mask) & ~align_mask;
> 
> The ALIGN() macro is the approved way of doing this.
> 
> (I don't think ALIGN adds much value really, especially given that you've
> commented what's going on, but I guess it does make reviewing and reading a
> little easier).

Would be better to use __ALIGN_MASK? I can find only one user who
directly use __ALIGN_MASK. The POWER IOMMU calculates align_mask by
itself so it's easier to pass align_mask as an argument.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] bsg : Add support for io vectors in bsg

2008-01-08 Thread FUJITA Tomonori

On Tue, 8 Jan 2008 17:09:18 -0500
Pete Wyckoff <[EMAIL PROTECTED]> wrote:

> [EMAIL PROTECTED] wrote on Sat, 05 Jan 2008 14:01 +0900:
> > From: Deepak Colluru <[EMAIL PROTECTED]>
> > Subject: [PATCH] bsg : Add support for io vectors in bsg
> > Date: Fri, 4 Jan 2008 21:47:34 +0530 (IST)
> > 
> > > From: Deepak Colluru <[EMAIL PROTECTED]>
> > > 
> > > Add support for io vectors in bsg.
> > > 
> > > Signed-off-by: Deepak Colluru <[EMAIL PROTECTED]>
> > > ---
> > >   bsg.c |   52 +---
> > >   1 file changed, 49 insertions(+), 3 deletions(-)
> > 
> > Thanks, but I have to NACK this.
> > 
> > You can find the discussion about bsg io vector support and a similar
> > patch in linux-scsi archive. I have no plan to support it since it
> > needs the compat hack.
> 
> You may recall this is one of the patches I need to use bsg with OSD
> devices.  OSDs overload the SCSI buffer model to put mulitple fields
> in dataout and datain.  Some is user data, but some is more
> logically created by a library.  Memcpying in userspace to wedge all
> the segments into a single buffer is painful, and is required both
> on outgoing and incoming data buffers.
> 
> There are two approaches to add iovec to bsg.
> 
> 1.  Define a new sg_iovec_v4 that uses constant width types.  Both
> 32- and 64-bit userspace would hand arrays of this to the kernel.
> 
> struct sg_v4_iovec {
>   __u64 iov_base;
>   __u32 iov_len;
>   __u32 __pad1;
> };
> 
> Old patch here:  http://article.gmane.org/gmane.linux.scsi/30461/

As I said before, I don't think that inventing a new "iovec" is a good
idea. sgv3 use the common "iovec". In addition, sg_io_v4 can be used
by other OSes like sg_io_v3.


> 2.  Do as Deepak has done, using the existing sg_iovec, but then
> also work around the compat issue.  Old v3 sg_iovec is:
> 
> typedef struct sg_iovec /* same structure as used by readv() Linux system 
> */
> {   /* call. It defines one scatter-gather element. */
>   void __user *iov_base;  /* Starting address  */
>   size_t iov_len; /* Length in bytes  */
> } sg_iovec_t;
> 
> Old patch here:  http://article.gmane.org/gmane.linux.scsi/30460/
> 
> I took another look at the compat approach, to see if it is feasible
> to keep the compat handling somewhere else, without the use of #ifdef
> CONFIG_COMPAT and size-comparison code inside bsg.c.  I don't see how.
> The use of iovec is within a write operation on a char device.  It's
> not amenable to a compat_sys_ or a .compat_ioctl approach.
> 
> I'm partial to #1 because the use of architecture-independent fields
> matches the rest of struct sg_io_v4.  But if you don't want to have
> another iovec type in the kernel, could we do #2 but just return
> -EINVAL if the need for compat is detected?  I.e. change
> dout_iovec_count to dout_iovec_length and do the math?

If you are ok with removing the write/read interface and just have
ioctl, we could can handle comapt stuff like others do. But I think
that you (OSD people) really want to keep the write/read
interface. Sorry, I think that there is no workaround to support iovec
in bsg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-08 Thread FUJITA Tomonori

On Tue, 8 Jan 2008 16:59:48 +0100
Ingo Molnar <[EMAIL PROTECTED]> wrote:

> 
> * FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> 
> > The patches are available at:
> > 
> > http://www.kernel.org/pub/linux/kernel/people/tomo/iommu/
> > 
> > Or if you prefer the git tree:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git 
> > iommu-sg-fixes
> 
> btw., these improvements to the IOMMU code are in -mm and will go into 
> v2.6.25, right? The changes look robust to me.

Thanks, they have been in -mm though the iommu helper fix hasn't
yet. Balbir Singh found the bug in 2.6.24-rc6-mm1. I've just check
mmotm and found that the IOMMU helper patch doesn't include the fix.

Andrew, can you replace

iommu-sg-add-iommu-helper-functions-for-the-free-area-management.patch

with the updated patch:

http://ozlabs.org/pipermail/linuxppc-dev/2007-December/048997.html

For your convenience I've attached the updated patch too.

Hopefully, they will go into v2.6.25. At least, I hope that the
patches (0001-0011) that make the IOMMUs respect segment size limits
when merging sg lists will be merged. They are simple and I got ACKs
on POWER and PARISC.


Thanks,

=
From: FUJITA Tomonori <[EMAIL PROTECTED]>
Subject: [PATCH] add IOMMU helper functions for the free area management

This adds IOMMU helper functions for the free area management.  These
functions take care of LLD's segment boundary limit for IOMMUs.  They would be
useful for IOMMUs that use bitmap for the free area management.

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
---
 include/linux/iommu-helper.h |7 
 lib/Makefile |1 +
 lib/iommu-helper.c   |   80 ++
 3 files changed, 88 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/iommu-helper.h
 create mode 100644 lib/iommu-helper.c

diff --git a/include/linux/iommu-helper.h b/include/linux/iommu-helper.h
new file mode 100644
index 000..4dd4c04
--- /dev/null
+++ b/include/linux/iommu-helper.h
@@ -0,0 +1,7 @@
+extern unsigned long iommu_area_alloc(unsigned long *map, unsigned long size,
+ unsigned long start, unsigned int nr,
+ unsigned long shift,
+ unsigned long boundary_size,
+ unsigned long align_mask);
+extern void iommu_area_free(unsigned long *map, unsigned long start,
+   unsigned int nr);
diff --git a/lib/Makefile b/lib/Makefile
index b6793ed..0e7383f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -64,6 +64,7 @@ obj-$(CONFIG_SMP) += percpu_counter.o
 obj-$(CONFIG_AUDIT_GENERIC) += audit.o
 
 obj-$(CONFIG_SWIOTLB) += swiotlb.o
+obj-$(CONFIG_IOMMU_HELPER) += iommu-helper.o
 obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o
 
 lib-$(CONFIG_GENERIC_BUG) += bug.o
diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
new file mode 100644
index 000..495575a
--- /dev/null
+++ b/lib/iommu-helper.c
@@ -0,0 +1,80 @@
+/*
+ * IOMMU helper functions for the free area management
+ */
+
+#include 
+#include 
+
+static unsigned long find_next_zero_area(unsigned long *map,
+unsigned long size,
+unsigned long start,
+unsigned int nr,
+unsigned long align_mask)
+{
+   unsigned long index, end, i;
+again:
+   index = find_next_zero_bit(map, size, start);
+
+   /* Align allocation */
+   index = (index + align_mask) & ~align_mask;
+
+   end = index + nr;
+   if (end >= size)
+   return -1;
+   for (i = index; i < end; i++) {
+   if (test_bit(i, map)) {
+   start = i+1;
+   goto again;
+   }
+   }
+   return index;
+}
+
+static inline void set_bit_area(unsigned long *map, unsigned long i,
+   int len)
+{
+   unsigned long end = i + len;
+   while (i < end) {
+   __set_bit(i, map);
+   i++;
+   }
+}
+
+static inline int is_span_boundary(unsigned int index, unsigned int nr,
+  unsigned long shift,
+  unsigned long boundary_size)
+{
+   shift = (shift + index) & (boundary_size - 1);
+   return shift + nr > boundary_size;
+}
+
+unsigned long iommu_area_alloc(unsigned long *map, unsigned long size,
+  unsigned long start, unsigned int nr,
+  unsigned long shift, unsigned long boundary_size,
+  unsigned long align_mask)
+{
+   unsigned long index;
+again:
+   index = find_next_zero_area(map, size, start, nr, align_mask);
+   if (index != -1) {
+

Re: [GIT PATCH] SCSI bug fixes for 2.6.24-rc3

2008-01-08 Thread FUJITA Tomonori

CC'ed Jes,

On Tue, 8 Jan 2008 14:15:53 +0300
Evgeniy Dushistov <[EMAIL PROTECTED]> wrote:

> On Sun, Nov 25, 2007 at 01:24:25PM +0200, James Bottomley wrote:
> > This is a bit of a rash of bug fixes.  The qla1280 is actually a bug fix
> > (in spite of the title---it's actually correcting an existing problem
> > with the qla1280 implementation of accessors that broke the current
> > driver).
> > 
> 
> Recently I build last Linus's git tree, and got:
> req_cnt is used uninitialized in this function,
> and see bellow
> > The patch is available here:
> > 
> > master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6.git
> > 
> > The short changelog is:
> ...
> > Jes Sorensen (1):
> >   qla1280: convert to use the data buffer accessors
> > 
> ...
> >  scsi/qla1280.c |  387 
> > +
> ...
> > /* Calculate number of entries and segments required. */
> > -   req_cnt = 1;
> > 
> 
> Initilization of req_cnt was removed, but in this function
> there are places like 
> req_cnt += (seg_cnt - 4) / 7;
> or
> req_cnt++;
> This is should be so?

req_cnt should not be removed. Jes tested the patch but this critical
bug appears only with BITS_PER_LONG != 64 && CONFIG_HIGHMEM=n case. So
he didn't see this, I think.

qla1280_32bit_start_scsi also gives the following warning:

drivers/scsi/qla1280.c: In function 'qla1280_32bit_start_scsi':
drivers/scsi/qla1280.c:3044: warning: unused variable 'dma_handle'


diff --git a/drivers/scsi/qla1280.c b/drivers/scsi/qla1280.c
index 146d540..2886407 100644
--- a/drivers/scsi/qla1280.c
+++ b/drivers/scsi/qla1280.c
@@ -3041,7 +3041,6 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct 
srb * sp)
int cnt;
int req_cnt;
int seg_cnt;
-   dma_addr_t dma_handle;
u8 dir;
 
ENTER("qla1280_32bit_start_scsi");
@@ -3050,6 +3049,7 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct 
srb * sp)
cmd->cmnd[0]);
 
/* Calculate number of entries and segments required. */
+   req_cnt = 1;
seg_cnt = scsi_dma_map(cmd);
if (seg_cnt) {
/*
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [GIT PATCH] SCSI bug fixes for 2.6.24-rc3

2008-01-08 Thread FUJITA Tomonori

CC'ed Jes,

On Tue, 8 Jan 2008 14:15:53 +0300
Evgeniy Dushistov [EMAIL PROTECTED] wrote:

 On Sun, Nov 25, 2007 at 01:24:25PM +0200, James Bottomley wrote:
  This is a bit of a rash of bug fixes.  The qla1280 is actually a bug fix
  (in spite of the title---it's actually correcting an existing problem
  with the qla1280 implementation of accessors that broke the current
  driver).
  
 
 Recently I build last Linus's git tree, and got:
 req_cnt is used uninitialized in this function,
 and see bellow
  The patch is available here:
  
  master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6.git
  
  The short changelog is:
 ...
  Jes Sorensen (1):
qla1280: convert to use the data buffer accessors
  
 ...
   scsi/qla1280.c |  387 
  +
 ...
  /* Calculate number of entries and segments required. */
  -   req_cnt = 1;
  
 
 Initilization of req_cnt was removed, but in this function
 there are places like 
 req_cnt += (seg_cnt - 4) / 7;
 or
 req_cnt++;
 This is should be so?

req_cnt should not be removed. Jes tested the patch but this critical
bug appears only with BITS_PER_LONG != 64  CONFIG_HIGHMEM=n case. So
he didn't see this, I think.

qla1280_32bit_start_scsi also gives the following warning:

drivers/scsi/qla1280.c: In function 'qla1280_32bit_start_scsi':
drivers/scsi/qla1280.c:3044: warning: unused variable 'dma_handle'


diff --git a/drivers/scsi/qla1280.c b/drivers/scsi/qla1280.c
index 146d540..2886407 100644
--- a/drivers/scsi/qla1280.c
+++ b/drivers/scsi/qla1280.c
@@ -3041,7 +3041,6 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct 
srb * sp)
int cnt;
int req_cnt;
int seg_cnt;
-   dma_addr_t dma_handle;
u8 dir;
 
ENTER(qla1280_32bit_start_scsi);
@@ -3050,6 +3049,7 @@ qla1280_32bit_start_scsi(struct scsi_qla_host *ha, struct 
srb * sp)
cmd-cmnd[0]);
 
/* Calculate number of entries and segments required. */
+   req_cnt = 1;
seg_cnt = scsi_dma_map(cmd);
if (seg_cnt) {
/*
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-08 Thread FUJITA Tomonori

On Tue, 8 Jan 2008 16:59:48 +0100
Ingo Molnar [EMAIL PROTECTED] wrote:

 
 * FUJITA Tomonori [EMAIL PROTECTED] wrote:
 
  The patches are available at:
  
  http://www.kernel.org/pub/linux/kernel/people/tomo/iommu/
  
  Or if you prefer the git tree:
  
  git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git 
  iommu-sg-fixes
 
 btw., these improvements to the IOMMU code are in -mm and will go into 
 v2.6.25, right? The changes look robust to me.

Thanks, they have been in -mm though the iommu helper fix hasn't
yet. Balbir Singh found the bug in 2.6.24-rc6-mm1. I've just check
mmotm and found that the IOMMU helper patch doesn't include the fix.

Andrew, can you replace

iommu-sg-add-iommu-helper-functions-for-the-free-area-management.patch

with the updated patch:

http://ozlabs.org/pipermail/linuxppc-dev/2007-December/048997.html

For your convenience I've attached the updated patch too.

Hopefully, they will go into v2.6.25. At least, I hope that the
patches (0001-0011) that make the IOMMUs respect segment size limits
when merging sg lists will be merged. They are simple and I got ACKs
on POWER and PARISC.


Thanks,

=
From: FUJITA Tomonori [EMAIL PROTECTED]
Subject: [PATCH] add IOMMU helper functions for the free area management

This adds IOMMU helper functions for the free area management.  These
functions take care of LLD's segment boundary limit for IOMMUs.  They would be
useful for IOMMUs that use bitmap for the free area management.

Signed-off-by: FUJITA Tomonori [EMAIL PROTECTED]
---
 include/linux/iommu-helper.h |7 
 lib/Makefile |1 +
 lib/iommu-helper.c   |   80 ++
 3 files changed, 88 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/iommu-helper.h
 create mode 100644 lib/iommu-helper.c

diff --git a/include/linux/iommu-helper.h b/include/linux/iommu-helper.h
new file mode 100644
index 000..4dd4c04
--- /dev/null
+++ b/include/linux/iommu-helper.h
@@ -0,0 +1,7 @@
+extern unsigned long iommu_area_alloc(unsigned long *map, unsigned long size,
+ unsigned long start, unsigned int nr,
+ unsigned long shift,
+ unsigned long boundary_size,
+ unsigned long align_mask);
+extern void iommu_area_free(unsigned long *map, unsigned long start,
+   unsigned int nr);
diff --git a/lib/Makefile b/lib/Makefile
index b6793ed..0e7383f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -64,6 +64,7 @@ obj-$(CONFIG_SMP) += percpu_counter.o
 obj-$(CONFIG_AUDIT_GENERIC) += audit.o
 
 obj-$(CONFIG_SWIOTLB) += swiotlb.o
+obj-$(CONFIG_IOMMU_HELPER) += iommu-helper.o
 obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o
 
 lib-$(CONFIG_GENERIC_BUG) += bug.o
diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
new file mode 100644
index 000..495575a
--- /dev/null
+++ b/lib/iommu-helper.c
@@ -0,0 +1,80 @@
+/*
+ * IOMMU helper functions for the free area management
+ */
+
+#include linux/module.h
+#include linux/bitops.h
+
+static unsigned long find_next_zero_area(unsigned long *map,
+unsigned long size,
+unsigned long start,
+unsigned int nr,
+unsigned long align_mask)
+{
+   unsigned long index, end, i;
+again:
+   index = find_next_zero_bit(map, size, start);
+
+   /* Align allocation */
+   index = (index + align_mask)  ~align_mask;
+
+   end = index + nr;
+   if (end = size)
+   return -1;
+   for (i = index; i  end; i++) {
+   if (test_bit(i, map)) {
+   start = i+1;
+   goto again;
+   }
+   }
+   return index;
+}
+
+static inline void set_bit_area(unsigned long *map, unsigned long i,
+   int len)
+{
+   unsigned long end = i + len;
+   while (i  end) {
+   __set_bit(i, map);
+   i++;
+   }
+}
+
+static inline int is_span_boundary(unsigned int index, unsigned int nr,
+  unsigned long shift,
+  unsigned long boundary_size)
+{
+   shift = (shift + index)  (boundary_size - 1);
+   return shift + nr  boundary_size;
+}
+
+unsigned long iommu_area_alloc(unsigned long *map, unsigned long size,
+  unsigned long start, unsigned int nr,
+  unsigned long shift, unsigned long boundary_size,
+  unsigned long align_mask)
+{
+   unsigned long index;
+again:
+   index = find_next_zero_area(map, size, start, nr, align_mask);
+   if (index != -1) {
+   if (is_span_boundary(index, nr, shift, boundary_size)) {
+   /* we could do more

Re: 2.6.24-rc6-mm1

2008-01-08 Thread FUJITA Tomonori

On Tue, 8 Jan 2008 16:27:39 -0800
Andrew Morton [EMAIL PROTECTED] wrote:

 On Wed, 09 Jan 2008 08:57:53 +0900
 FUJITA Tomonori [EMAIL PROTECTED] wrote:
 
  Andrew, can you replace
  
  iommu-sg-add-iommu-helper-functions-for-the-free-area-management.patch
  
  with the updated patch:
  
  http://ozlabs.org/pipermail/linuxppc-dev/2007-December/048997.html
  
  For your convenience I've attached the updated patch too.
 
 generates the incremental

Thanks for putting the fix to -mm.


  --- a/lib/iommu-helper.c~a
  +++ a/lib/iommu-helper.c
  @@ -8,15 +8,20 @@
   static unsigned long find_next_zero_area(unsigned long *map,
   unsigned long size,
   unsigned long start,
  -unsigned int nr)
  +unsigned int nr,
  +unsigned long align_mask)
   {
  unsigned long index, end, i;
   again:
  index = find_next_zero_bit(map, size, start);
  +
  +   /* Align allocation */
  +   index = (index + align_mask)  ~align_mask;
 
 The ALIGN() macro is the approved way of doing this.
 
 (I don't think ALIGN adds much value really, especially given that you've
 commented what's going on, but I guess it does make reviewing and reading a
 little easier).

Would be better to use __ALIGN_MASK? I can find only one user who
directly use __ALIGN_MASK. The POWER IOMMU calculates align_mask by
itself so it's easier to pass align_mask as an argument.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 2/2] scsi: Use new __dma_buffer to align sense buffer in scsi_cmnd

2008-01-07 Thread FUJITA Tomonori

On Mon, 07 Jan 2008 15:25:36 +0200
Boaz Harrosh <[EMAIL PROTECTED]> wrote:

> On Mon, Jan 07 2008 at 8:53 +0200, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> > On Sun, 23 Dec 2007 13:09:05 +0200
> > Boaz Harrosh <[EMAIL PROTECTED]> wrote:
> > 
> >> On Fri, Dec 21 2007 at 4:30 +0200, Benjamin Herrenschmidt <[EMAIL 
> >> PROTECTED]> wrote:
> >>> The sense buffer ins scsi_cmnd can nowadays be DMA'ed into directly
> >>> by some low level drivers (that typically happens with USB mass
> >>> storage).
> >>>
> >>> This is a problem on non cache coherent architectures such as
> >>> embedded PowerPCs where the sense buffer can share cache lines with
> >>> other structure members, which leads to various forms of corruption.
> >>>
> >>> This uses the newly defined __dma_buffer annotation to enforce that
> >>> on such platforms, the sense_buffer is contained within its own
> >>> cache line. This has no effect on cache coherent architectures.
> >>>
> >>> Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
> >>> ---
> >>>
> >>>  include/scsi/scsi_cmnd.h |2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> --- linux-merge.orig/include/scsi/scsi_cmnd.h 2007-12-21 
> >>> 13:07:14.0 +1100
> >>> +++ linux-merge/include/scsi/scsi_cmnd.h  2007-12-21 13:07:29.0 
> >>> +1100
> >>> @@ -88,7 +88,7 @@ struct scsi_cmnd {
> >>>  working on */
> >>>  
> >>>  #define SCSI_SENSE_BUFFERSIZE96
> >>> - unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE];
> >>> + unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE] __dma_buffer;
> >>>   /* obtained by REQUEST SENSE when
> >>>* CHECK CONDITION is received on original
> >>>* command (auto-sense) */
> >> This has the potential of leaving a big fat ugly hole in the middle of 
> >> scsi_cmnd. I would suggest of *just* moving the sense_buffer array to be
> >> the *first member* of struct scsi_cmnd. The command itself is already cache
> >> aligned, allocated by the proper flags to it's slab. And put a fat comment
> >> near it's definition.
> >>
> >> This is until a proper fix is sent. I have in my Q a proposition for a 
> >> more prominent solution, which I will send next month. Do to short comings
> >> in the sense handling and optimizations, but should definitely cover this
> >> problem.
> >>
> >> The code should have time to be discussed and tested, so it is only 2.6.26
> >> material. For the duration of the 2.6.25 kernel we can live with a reorder
> >> of scsi_cmnd members, if it solves such a grave bug for some ARCHs.
> >>
> >> Boaz
> >> 
> >> [RFD below]
> >> My proposed solution will be has follows:
> >>
> >>  1. Since the scsi protocol mandates an immediate REQUEST_SENSE after an 
> >> error
> >> in effect the Q is frozen until the REQUEST_SENSE command returns.
> >>
> >>  2. The scsi-ml needs the sense buffer for its normal operation, 
> >> independent 
> >> from the ULD's request of the sence_buffer or not at request->sense. 
> >> But
> >> in effect, 90% of all scsi-requests come with ULD's allocated buffer 
> >> for
> >> sense, that is copied to, on command completion.
> >>
> >>  3. 99% of all commands complete successfully, so if an optimization is 
> >> proposed for the successful case, sacrificing a few cycles for the 
> >> error
> >> case than thats a good thing.
> >>
> >>  My suggestion is to have a per Q, driver-overridable, sense buffer that 
> >> is 
> >>  DMAed/written to by drivers. At the end of the REQUEST_SENSE command one
> >>  of 2 things will be done. Either copy the sense to the ULD's supplied 
> >> buffer,
> >>  or if not available, allocate one from a dedicated mem_cache pool.
> >>  
> >>  So we are completely saving 92 bytes from scsi_cmnd by replacing the 
> >> buffer
> >>  with a pointer. We can always read the sense into a per Q buffer. And 10% 
> >> of
> >>  %1 of the times we will need to allocate a sense buffer from a dedicated 
> >> mem_cache
> >>  I would say thats a nice opt

Re: [PATCH 2/2] scsi: Use new __dma_buffer to align sense buffer in scsi_cmnd

2008-01-07 Thread FUJITA Tomonori

On Mon, 07 Jan 2008 15:25:36 +0200
Boaz Harrosh [EMAIL PROTECTED] wrote:

 On Mon, Jan 07 2008 at 8:53 +0200, FUJITA Tomonori [EMAIL PROTECTED] wrote:
  On Sun, 23 Dec 2007 13:09:05 +0200
  Boaz Harrosh [EMAIL PROTECTED] wrote:
  
  On Fri, Dec 21 2007 at 4:30 +0200, Benjamin Herrenschmidt [EMAIL 
  PROTECTED] wrote:
  The sense buffer ins scsi_cmnd can nowadays be DMA'ed into directly
  by some low level drivers (that typically happens with USB mass
  storage).
 
  This is a problem on non cache coherent architectures such as
  embedded PowerPCs where the sense buffer can share cache lines with
  other structure members, which leads to various forms of corruption.
 
  This uses the newly defined __dma_buffer annotation to enforce that
  on such platforms, the sense_buffer is contained within its own
  cache line. This has no effect on cache coherent architectures.
 
  Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
  ---
 
   include/scsi/scsi_cmnd.h |2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
 
  --- linux-merge.orig/include/scsi/scsi_cmnd.h 2007-12-21 
  13:07:14.0 +1100
  +++ linux-merge/include/scsi/scsi_cmnd.h  2007-12-21 13:07:29.0 
  +1100
  @@ -88,7 +88,7 @@ struct scsi_cmnd {
   working on */
   
   #define SCSI_SENSE_BUFFERSIZE96
  - unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE];
  + unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE] __dma_buffer;
/* obtained by REQUEST SENSE when
 * CHECK CONDITION is received on original
 * command (auto-sense) */
  This has the potential of leaving a big fat ugly hole in the middle of 
  scsi_cmnd. I would suggest of *just* moving the sense_buffer array to be
  the *first member* of struct scsi_cmnd. The command itself is already cache
  aligned, allocated by the proper flags to it's slab. And put a fat comment
  near it's definition.
 
  This is until a proper fix is sent. I have in my Q a proposition for a 
  more prominent solution, which I will send next month. Do to short comings
  in the sense handling and optimizations, but should definitely cover this
  problem.
 
  The code should have time to be discussed and tested, so it is only 2.6.26
  material. For the duration of the 2.6.25 kernel we can live with a reorder
  of scsi_cmnd members, if it solves such a grave bug for some ARCHs.
 
  Boaz
  
  [RFD below]
  My proposed solution will be has follows:
 
   1. Since the scsi protocol mandates an immediate REQUEST_SENSE after an 
  error
  in effect the Q is frozen until the REQUEST_SENSE command returns.
 
   2. The scsi-ml needs the sense buffer for its normal operation, 
  independent 
  from the ULD's request of the sence_buffer or not at request-sense. 
  But
  in effect, 90% of all scsi-requests come with ULD's allocated buffer 
  for
  sense, that is copied to, on command completion.
 
   3. 99% of all commands complete successfully, so if an optimization is 
  proposed for the successful case, sacrificing a few cycles for the 
  error
  case than thats a good thing.
 
   My suggestion is to have a per Q, driver-overridable, sense buffer that 
  is 
   DMAed/written to by drivers. At the end of the REQUEST_SENSE command one
   of 2 things will be done. Either copy the sense to the ULD's supplied 
  buffer,
   or if not available, allocate one from a dedicated mem_cache pool.
   
   So we are completely saving 92 bytes from scsi_cmnd by replacing the 
  buffer
   with a pointer. We can always read the sense into a per Q buffer. And 10% 
  of
   %1 of the times we will need to allocate a sense buffer from a dedicated 
  mem_cache
   I would say thats a nice optimization.
 
   The changes to scsi_error/scsi_cmnd and friends, is pretty strait 
  forward. But
   it depends on a conversion of 4/5 drivers to the new scsi_eh API for 
   REQUEST_SENSE. I have only converted these drivers that interfered with 
  the accessors
   effort + 1 other places. But there are a few more places that are not 
  converted.
   Once done. The implementation can easily change with no affect on drivers.
  
  I think that removing the sense_buffer array from scsi_cmnd effects
  lots of LLDs. As I wrote in other mail, many LLDs assume that
  scsi_cmnd:sense_buffer is always available. Another big task is to
  take care about auto sense.
  
  Have you already had some patches? I've just started to work on this
  and I'd like to push that fix into 2.6.25.
 
 Tomo Hi.
 Since you ask to push this into 2.6.25, I have ask permission to
 prioritize this effort, as until now it was on a back burner.

There are no short-term solusions and seems that __dma_buffer will not
be merged. It would be better to fix this soon though it's a bit hard
to fix this before 2.6.25, I think.


 I have only done 3 drivers up to now. (out of something like 15)
 
 I have seen 4 patterns of sense use.
 
 1

Re: [PATCH 2/2] scsi: Use new __dma_buffer to align sense buffer in scsi_cmnd

2008-01-06 Thread FUJITA Tomonori

On Sun, 23 Dec 2007 13:09:05 +0200
Boaz Harrosh <[EMAIL PROTECTED]> wrote:

> On Fri, Dec 21 2007 at 4:30 +0200, Benjamin Herrenschmidt <[EMAIL PROTECTED]> 
> wrote:
> > The sense buffer ins scsi_cmnd can nowadays be DMA'ed into directly
> > by some low level drivers (that typically happens with USB mass
> > storage).
> > 
> > This is a problem on non cache coherent architectures such as
> > embedded PowerPCs where the sense buffer can share cache lines with
> > other structure members, which leads to various forms of corruption.
> > 
> > This uses the newly defined __dma_buffer annotation to enforce that
> > on such platforms, the sense_buffer is contained within its own
> > cache line. This has no effect on cache coherent architectures.
> > 
> > Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
> > ---
> > 
> >  include/scsi/scsi_cmnd.h |2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > --- linux-merge.orig/include/scsi/scsi_cmnd.h   2007-12-21 
> > 13:07:14.0 +1100
> > +++ linux-merge/include/scsi/scsi_cmnd.h2007-12-21 13:07:29.0 
> > +1100
> > @@ -88,7 +88,7 @@ struct scsi_cmnd {
> >working on */
> >  
> >  #define SCSI_SENSE_BUFFERSIZE  96
> > -   unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE];
> > +   unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE] __dma_buffer;
> > /* obtained by REQUEST SENSE when
> >  * CHECK CONDITION is received on original
> >  * command (auto-sense) */
> 
> This has the potential of leaving a big fat ugly hole in the middle of 
> scsi_cmnd. I would suggest of *just* moving the sense_buffer array to be
> the *first member* of struct scsi_cmnd. The command itself is already cache
> aligned, allocated by the proper flags to it's slab. And put a fat comment
> near it's definition.
> 
> This is until a proper fix is sent. I have in my Q a proposition for a 
> more prominent solution, which I will send next month. Do to short comings
> in the sense handling and optimizations, but should definitely cover this
> problem.
> 
> The code should have time to be discussed and tested, so it is only 2.6.26
> material. For the duration of the 2.6.25 kernel we can live with a reorder
> of scsi_cmnd members, if it solves such a grave bug for some ARCHs.
> 
> Boaz
> 
> [RFD below]
> My proposed solution will be has follows:
> 
>  1. Since the scsi protocol mandates an immediate REQUEST_SENSE after an error
> in effect the Q is frozen until the REQUEST_SENSE command returns.
> 
>  2. The scsi-ml needs the sense buffer for its normal operation, independent 
> from the ULD's request of the sence_buffer or not at request->sense. But
> in effect, 90% of all scsi-requests come with ULD's allocated buffer for
> sense, that is copied to, on command completion.
> 
>  3. 99% of all commands complete successfully, so if an optimization is 
> proposed for the successful case, sacrificing a few cycles for the error
> case than thats a good thing.
> 
>  My suggestion is to have a per Q, driver-overridable, sense buffer that is 
>  DMAed/written to by drivers. At the end of the REQUEST_SENSE command one
>  of 2 things will be done. Either copy the sense to the ULD's supplied buffer,
>  or if not available, allocate one from a dedicated mem_cache pool.
>  
>  So we are completely saving 92 bytes from scsi_cmnd by replacing the buffer
>  with a pointer. We can always read the sense into a per Q buffer. And 10% of
>  %1 of the times we will need to allocate a sense buffer from a dedicated 
> mem_cache
>  I would say thats a nice optimization.
> 
>  The changes to scsi_error/scsi_cmnd and friends, is pretty strait forward. 
> But
>  it depends on a conversion of 4/5 drivers to the new scsi_eh API for 
>  REQUEST_SENSE. I have only converted these drivers that interfered with the 
> accessors
>  effort + 1 other places. But there are a few more places that are not 
> converted.
>  Once done. The implementation can easily change with no affect on drivers.

I think that removing the sense_buffer array from scsi_cmnd effects
lots of LLDs. As I wrote in other mail, many LLDs assume that
scsi_cmnd:sense_buffer is always available. Another big task is to
take care about auto sense.

Have you already had some patches? I've just started to work on this
and I'd like to push that fix into 2.6.25.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-06 Thread FUJITA Tomonori

On Sun, 6 Jan 2008 21:03:42 +0100
"Torsten Kaiser" <[EMAIL PROTECTED]> wrote:

> On Jan 6, 2008 2:33 PM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> > On Sun, 6 Jan 2008 12:35:35 +0100
> > "Torsten Kaiser" <[EMAIL PROTECTED]> wrote:
> > > On Jan 6, 2008 12:23 PM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> > > > On Sun, 6 Jan 2008 11:41:10 +0100
> > > > "Torsten Kaiser" <[EMAIL PROTECTED]> wrote:
> > > > > I will applie your patch and see if this hunk from
> > > > > find_next_zero_area() makes a difference:
> > > > >
> > > > >end = index + nr;
> > > > > -   if (end > size)
> > > > > +   if (end >= size)
> > > > > return -1;
> 
> -> that might still have made a difference, but ...
> 
> > > > > -   for (i = index + 1; i < end; i++) {
> > > > > +   for (i = index; i < end; i++) {
> 
> ... as you say below, the test for the index position is only needed
> if index is modified after find_next_zero_bit().
> 
> > > > > if (test_bit(i, map)) {
> > > >
> > > > The patch should not make a difference for X86_64.
> > >
> > > Hmm...
> > > arch/x86/kernel/pci-gart_64.c:
> > > alloc_iommu() calls iommu_area_alloc()
> > > lib/iommu-helper.c:
> > > iommu_area_alloc() calls find_next_zero_area()
> > > -> so the above code should be called even on X86_64
> >
> > Oops, I meant that the patch fixes the align allocation (non zero
> > align_mask case). X86_64 doesn't use the align allocation.
> >
> >
> > > And the change in the for loop means that 'index' will now be tested,
> > > but with the old code it was not.
> >
> > With the old code, 'index' is tested by find_next_zero_bit.
> >
> > With the new code and non zero align_mask case, 'index' is not tested
> > by find_next_zero_bit. So test_bit needs to start with 'index'.
> >
> > So If I understand the correctly, this patch should not make a
> > difference for x86_64 though I might miss something.
> 
> You did not miss anything.
> After 18 packages my system crashed again.
> 
> > > And double using something does fit with the errors I'm seeing...
> > >
> > > > Can you try the patch to revert my IOMMU changes?
> > > >
> > > > http://www.mail-archive.com/[EMAIL PROTECTED]/msg12694.html
> > >
> > > Testing for this bug is a little bit slow, as I'm compiling ~100
> > > packages trying to trigger it.
> > > If my current testrun with the patch from
> > > http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html
> > > crashes, I will revert the hole IOMMU changes with above patch and try 
> > > again.
> >
> > Thanks for testing,
> 
> OK, I'm still testing this, but after 95 completed packages I'm rather
> certain that reverting the IOMMU changes with this patch fixes my
> problem.
> I didn't have time to look more into this, so I can't offer any
> concrete ideas where the bug is.
> 
> If you send more patches, I'm willing to test them, but it might take
> some more time during the next week.

Can you try 2.6.24-rc7 + the IOMMU changes?

The patches are available at:

http://www.kernel.org/pub/linux/kernel/people/tomo/iommu/

Or if you prefer the git tree:

git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git 
iommu-sg-fixes



I've looked at the changes to GART but they are straightforward and
don't look wrong...


Thanks,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-06 Thread FUJITA Tomonori

On Sun, 6 Jan 2008 12:35:35 +0100
"Torsten Kaiser" <[EMAIL PROTECTED]> wrote:

> On Jan 6, 2008 12:23 PM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> > On Sun, 6 Jan 2008 11:41:10 +0100
> > "Torsten Kaiser" <[EMAIL PROTECTED]> wrote:
> > > I will applie your patch and see if this hunk from
> > > find_next_zero_area() makes a difference:
> > >
> > >end = index + nr;
> > > -   if (end > size)
> > > +   if (end >= size)
> > > return -1;
> > > -   for (i = index + 1; i < end; i++) {
> > > +   for (i = index; i < end; i++) {
> > > if (test_bit(i, map)) {
> >
> > The patch should not make a difference for X86_64.
> 
> Hmm...
> arch/x86/kernel/pci-gart_64.c:
> alloc_iommu() calls iommu_area_alloc()
> lib/iommu-helper.c:
> iommu_area_alloc() calls find_next_zero_area()
> -> so the above code should be called even on X86_64

Oops, I meant that the patch fixes the align allocation (non zero
align_mask case). X86_64 doesn't use the align allocation.


> And the change in the for loop means that 'index' will now be tested,
> but with the old code it was not.

With the old code, 'index' is tested by find_next_zero_bit.

With the new code and non zero align_mask case, 'index' is not tested
by find_next_zero_bit. So test_bit needs to start with 'index'.

So If I understand the correctly, this patch should not make a
difference for x86_64 though I might miss something.


> And double using something does fit with the errors I'm seeing...
> 
> > Can you try the patch to revert my IOMMU changes?
> >
> > http://www.mail-archive.com/[EMAIL PROTECTED]/msg12694.html
> 
> Testing for this bug is a little bit slow, as I'm compiling ~100
> packages trying to trigger it.
> If my current testrun with the patch from
> http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html
> crashes, I will revert the hole IOMMU changes with above patch and try again.

Thanks for testing,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-06 Thread FUJITA Tomonori

On Sun, 6 Jan 2008 11:41:10 +0100
"Torsten Kaiser" <[EMAIL PROTECTED]> wrote:

> On Jan 6, 2008 4:28 AM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> > On Sat, 5 Jan 2008 17:25:24 -0800
> > Andrew Morton <[EMAIL PROTECTED]> wrote:
> > > On Sat, 5 Jan 2008 23:10:17 +0100 "Torsten Kaiser" <[EMAIL PROTECTED]> 
> > > wrote:
> > > > But the cause of my mail is the following question:
> > > > Regarding my "iommu-sg-merging-patches are new in -rc3-mm and could be
> > > > the cause"-suspicion I looked at these patches and came across these
> > > > hunks:
> > > >
> > > > This is removed from arch/x86/lib/bitstr_64.c:
> > > > -/* Find string of zero bits in a bitmap */
> > > > -unsigned long
> > > > -find_next_zero_string(unsigned long *bitmap, long start, long nbits, 
> > > > int len)
> > > > -{
> > > > -   unsigned long n, end, i;
> > > > -
> > > > - again:
> > > > -   n = find_next_zero_bit(bitmap, nbits, start);
> > > > -   if (n == -1)
> > > > -   return -1;
> > > > -
> > > > -   /* could test bitsliced, but it's hardly worth it */
> > > > -   end = n+len;
> > > > -   if (end > nbits)
> > > > -   return -1;
> > > > -   for (i = n+1; i < end; i++) {
> > > > -   if (test_bit(i, bitmap)) {
> > > > -   start = i+1;
> > > > -   goto again;
> > > > -   }
> > > > -   }
> > > > -   return n;
> > > > -}
> > > >
> > > > This is added to lib/iommu-helper.c:
> > > > +static unsigned long find_next_zero_area(unsigned long *map,
> > > > +unsigned long size,
> > > > +unsigned long start,
> > > > +unsigned int nr)
> > > > +{
> > > > +   unsigned long index, end, i;
> > > > +again:
> > > > +   index = find_next_zero_bit(map, size, start);
> > > > +   end = index + nr;
> > > > +   if (end > size)
> > > > +   return -1;
> > > > +   for (i = index + 1; i < end; i++) {
> > > > +   if (test_bit(i, map)) {
> > > > +   start = i+1;
> > > > +   goto again;
> > > > +   }
> > > > +   }
> > > > +   return index;
> > > > +}
> > > >
> > > > The old version checks, if find_next_zero_bit returns -1, the new
> > > > version doesn't do this.
> > > > Is this intended and can find_next_zero_bit never fail?
> > > > Hmm... but in the worst case it should only loop forever if I'm
> > > > reading this right (index = -1 => for-loop counts from 0 to nr, if any
> > > > bit is set it will jump to "again:" and if the next call to
> > > > find_next_zero_bit also fails, its an endless loop)
> >
> > find_next_zero_bit returns -1?
> >
> > It seems that x86_64 doesn't.
> 
> I'm sorry. I didn't look into find_next_zero_bit, I only noted that
> the old version did check for -1 and the new one didn't.
> Obviously the old check was superfluous.
> 
> > POWER and SPARC64 IOMMUs use
> > find_next_zero_bit too but both doesn't check if find_next_zero_bit
> > returns -1. If find_next_zero_bit fails, it returns size. So it
> > doesn't leads to an endless loop.
> 
> Yes, this can't happen. It was a wrong assumption on my part.
> 
> > But this patch has other bugs that break POWER IOMMUs.
> >
> > If you use the IOMMUs on POWER, please try the following patch:
> 
> I'm using CONFIG_GART_IOMMU=y on x86_64.
> 
> > http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html
> 
> I also noted the line "index = (index + align_mask) & ~align_mask;" in
> iommu_area_alloc() and didn't understand what this was trying to do
> and how this should work, but as arch/x86/kernel/pci-gart_64.c always
> uses 0 as align_mask I just ignored it.

Yeah, it's for only POWER IOMMUs. It's meaningless for gart and
calgary IOMMUs.


> I will applie your patch and see if this hunk from
> find_next_zero_area() makes a difference:
> 
>end = index + nr;
> -   if (end > size)
> +   if (end >= size)
> return -1;
> -   for (i = index + 1; i < end; i++) {
> +   for (i = index; i < end; i++) {
> if (test_bit(i, map)) {

The patch should not make a difference for X86_64.

Can you try the patch to revert my IOMMU changes?

http://www.mail-archive.com/[EMAIL PROTECTED]/msg12694.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-06 Thread FUJITA Tomonori

On Sun, 6 Jan 2008 21:03:42 +0100
Torsten Kaiser [EMAIL PROTECTED] wrote:

 On Jan 6, 2008 2:33 PM, FUJITA Tomonori [EMAIL PROTECTED] wrote:
  On Sun, 6 Jan 2008 12:35:35 +0100
  Torsten Kaiser [EMAIL PROTECTED] wrote:
   On Jan 6, 2008 12:23 PM, FUJITA Tomonori [EMAIL PROTECTED] wrote:
On Sun, 6 Jan 2008 11:41:10 +0100
Torsten Kaiser [EMAIL PROTECTED] wrote:
 I will applie your patch and see if this hunk from
 find_next_zero_area() makes a difference:

end = index + nr;
 -   if (end  size)
 +   if (end = size)
 return -1;
 
 - that might still have made a difference, but ...
 
 -   for (i = index + 1; i  end; i++) {
 +   for (i = index; i  end; i++) {
 
 ... as you say below, the test for the index position is only needed
 if index is modified after find_next_zero_bit().
 
 if (test_bit(i, map)) {
   
The patch should not make a difference for X86_64.
  
   Hmm...
   arch/x86/kernel/pci-gart_64.c:
   alloc_iommu() calls iommu_area_alloc()
   lib/iommu-helper.c:
   iommu_area_alloc() calls find_next_zero_area()
   - so the above code should be called even on X86_64
 
  Oops, I meant that the patch fixes the align allocation (non zero
  align_mask case). X86_64 doesn't use the align allocation.
 
 
   And the change in the for loop means that 'index' will now be tested,
   but with the old code it was not.
 
  With the old code, 'index' is tested by find_next_zero_bit.
 
  With the new code and non zero align_mask case, 'index' is not tested
  by find_next_zero_bit. So test_bit needs to start with 'index'.
 
  So If I understand the correctly, this patch should not make a
  difference for x86_64 though I might miss something.
 
 You did not miss anything.
 After 18 packages my system crashed again.
 
   And double using something does fit with the errors I'm seeing...
  
Can you try the patch to revert my IOMMU changes?
   
http://www.mail-archive.com/[EMAIL PROTECTED]/msg12694.html
  
   Testing for this bug is a little bit slow, as I'm compiling ~100
   packages trying to trigger it.
   If my current testrun with the patch from
   http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html
   crashes, I will revert the hole IOMMU changes with above patch and try 
   again.
 
  Thanks for testing,
 
 OK, I'm still testing this, but after 95 completed packages I'm rather
 certain that reverting the IOMMU changes with this patch fixes my
 problem.
 I didn't have time to look more into this, so I can't offer any
 concrete ideas where the bug is.
 
 If you send more patches, I'm willing to test them, but it might take
 some more time during the next week.

Can you try 2.6.24-rc7 + the IOMMU changes?

The patches are available at:

http://www.kernel.org/pub/linux/kernel/people/tomo/iommu/

Or if you prefer the git tree:

git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git 
iommu-sg-fixes



I've looked at the changes to GART but they are straightforward and
don't look wrong...


Thanks,
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 2/2] scsi: Use new __dma_buffer to align sense buffer in scsi_cmnd

2008-01-06 Thread FUJITA Tomonori

On Sun, 23 Dec 2007 13:09:05 +0200
Boaz Harrosh [EMAIL PROTECTED] wrote:

 On Fri, Dec 21 2007 at 4:30 +0200, Benjamin Herrenschmidt [EMAIL PROTECTED] 
 wrote:
  The sense buffer ins scsi_cmnd can nowadays be DMA'ed into directly
  by some low level drivers (that typically happens with USB mass
  storage).
  
  This is a problem on non cache coherent architectures such as
  embedded PowerPCs where the sense buffer can share cache lines with
  other structure members, which leads to various forms of corruption.
  
  This uses the newly defined __dma_buffer annotation to enforce that
  on such platforms, the sense_buffer is contained within its own
  cache line. This has no effect on cache coherent architectures.
  
  Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
  ---
  
   include/scsi/scsi_cmnd.h |2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
  
  --- linux-merge.orig/include/scsi/scsi_cmnd.h   2007-12-21 
  13:07:14.0 +1100
  +++ linux-merge/include/scsi/scsi_cmnd.h2007-12-21 13:07:29.0 
  +1100
  @@ -88,7 +88,7 @@ struct scsi_cmnd {
 working on */
   
   #define SCSI_SENSE_BUFFERSIZE  96
  -   unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE];
  +   unsigned char sense_buffer[SCSI_SENSE_BUFFERSIZE] __dma_buffer;
  /* obtained by REQUEST SENSE when
   * CHECK CONDITION is received on original
   * command (auto-sense) */
 
 This has the potential of leaving a big fat ugly hole in the middle of 
 scsi_cmnd. I would suggest of *just* moving the sense_buffer array to be
 the *first member* of struct scsi_cmnd. The command itself is already cache
 aligned, allocated by the proper flags to it's slab. And put a fat comment
 near it's definition.
 
 This is until a proper fix is sent. I have in my Q a proposition for a 
 more prominent solution, which I will send next month. Do to short comings
 in the sense handling and optimizations, but should definitely cover this
 problem.
 
 The code should have time to be discussed and tested, so it is only 2.6.26
 material. For the duration of the 2.6.25 kernel we can live with a reorder
 of scsi_cmnd members, if it solves such a grave bug for some ARCHs.
 
 Boaz
 
 [RFD below]
 My proposed solution will be has follows:
 
  1. Since the scsi protocol mandates an immediate REQUEST_SENSE after an error
 in effect the Q is frozen until the REQUEST_SENSE command returns.
 
  2. The scsi-ml needs the sense buffer for its normal operation, independent 
 from the ULD's request of the sence_buffer or not at request-sense. But
 in effect, 90% of all scsi-requests come with ULD's allocated buffer for
 sense, that is copied to, on command completion.
 
  3. 99% of all commands complete successfully, so if an optimization is 
 proposed for the successful case, sacrificing a few cycles for the error
 case than thats a good thing.
 
  My suggestion is to have a per Q, driver-overridable, sense buffer that is 
  DMAed/written to by drivers. At the end of the REQUEST_SENSE command one
  of 2 things will be done. Either copy the sense to the ULD's supplied buffer,
  or if not available, allocate one from a dedicated mem_cache pool.
  
  So we are completely saving 92 bytes from scsi_cmnd by replacing the buffer
  with a pointer. We can always read the sense into a per Q buffer. And 10% of
  %1 of the times we will need to allocate a sense buffer from a dedicated 
 mem_cache
  I would say thats a nice optimization.
 
  The changes to scsi_error/scsi_cmnd and friends, is pretty strait forward. 
 But
  it depends on a conversion of 4/5 drivers to the new scsi_eh API for 
  REQUEST_SENSE. I have only converted these drivers that interfered with the 
 accessors
  effort + 1 other places. But there are a few more places that are not 
 converted.
  Once done. The implementation can easily change with no affect on drivers.

I think that removing the sense_buffer array from scsi_cmnd effects
lots of LLDs. As I wrote in other mail, many LLDs assume that
scsi_cmnd:sense_buffer is always available. Another big task is to
take care about auto sense.

Have you already had some patches? I've just started to work on this
and I'd like to push that fix into 2.6.25.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-06 Thread FUJITA Tomonori

On Sun, 6 Jan 2008 11:41:10 +0100
Torsten Kaiser [EMAIL PROTECTED] wrote:

 On Jan 6, 2008 4:28 AM, FUJITA Tomonori [EMAIL PROTECTED] wrote:
  On Sat, 5 Jan 2008 17:25:24 -0800
  Andrew Morton [EMAIL PROTECTED] wrote:
   On Sat, 5 Jan 2008 23:10:17 +0100 Torsten Kaiser [EMAIL PROTECTED] 
   wrote:
But the cause of my mail is the following question:
Regarding my iommu-sg-merging-patches are new in -rc3-mm and could be
the cause-suspicion I looked at these patches and came across these
hunks:
   
This is removed from arch/x86/lib/bitstr_64.c:
-/* Find string of zero bits in a bitmap */
-unsigned long
-find_next_zero_string(unsigned long *bitmap, long start, long nbits, 
int len)
-{
-   unsigned long n, end, i;
-
- again:
-   n = find_next_zero_bit(bitmap, nbits, start);
-   if (n == -1)
-   return -1;
-
-   /* could test bitsliced, but it's hardly worth it */
-   end = n+len;
-   if (end  nbits)
-   return -1;
-   for (i = n+1; i  end; i++) {
-   if (test_bit(i, bitmap)) {
-   start = i+1;
-   goto again;
-   }
-   }
-   return n;
-}
   
This is added to lib/iommu-helper.c:
+static unsigned long find_next_zero_area(unsigned long *map,
+unsigned long size,
+unsigned long start,
+unsigned int nr)
+{
+   unsigned long index, end, i;
+again:
+   index = find_next_zero_bit(map, size, start);
+   end = index + nr;
+   if (end  size)
+   return -1;
+   for (i = index + 1; i  end; i++) {
+   if (test_bit(i, map)) {
+   start = i+1;
+   goto again;
+   }
+   }
+   return index;
+}
   
The old version checks, if find_next_zero_bit returns -1, the new
version doesn't do this.
Is this intended and can find_next_zero_bit never fail?
Hmm... but in the worst case it should only loop forever if I'm
reading this right (index = -1 = for-loop counts from 0 to nr, if any
bit is set it will jump to again: and if the next call to
find_next_zero_bit also fails, its an endless loop)
 
  find_next_zero_bit returns -1?
 
  It seems that x86_64 doesn't.
 
 I'm sorry. I didn't look into find_next_zero_bit, I only noted that
 the old version did check for -1 and the new one didn't.
 Obviously the old check was superfluous.
 
  POWER and SPARC64 IOMMUs use
  find_next_zero_bit too but both doesn't check if find_next_zero_bit
  returns -1. If find_next_zero_bit fails, it returns size. So it
  doesn't leads to an endless loop.
 
 Yes, this can't happen. It was a wrong assumption on my part.
 
  But this patch has other bugs that break POWER IOMMUs.
 
  If you use the IOMMUs on POWER, please try the following patch:
 
 I'm using CONFIG_GART_IOMMU=y on x86_64.
 
  http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html
 
 I also noted the line index = (index + align_mask)  ~align_mask; in
 iommu_area_alloc() and didn't understand what this was trying to do
 and how this should work, but as arch/x86/kernel/pci-gart_64.c always
 uses 0 as align_mask I just ignored it.

Yeah, it's for only POWER IOMMUs. It's meaningless for gart and
calgary IOMMUs.


 I will applie your patch and see if this hunk from
 find_next_zero_area() makes a difference:
 
end = index + nr;
 -   if (end  size)
 +   if (end = size)
 return -1;
 -   for (i = index + 1; i  end; i++) {
 +   for (i = index; i  end; i++) {
 if (test_bit(i, map)) {

The patch should not make a difference for X86_64.

Can you try the patch to revert my IOMMU changes?

http://www.mail-archive.com/[EMAIL PROTECTED]/msg12694.html
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-06 Thread FUJITA Tomonori

On Sun, 6 Jan 2008 12:35:35 +0100
Torsten Kaiser [EMAIL PROTECTED] wrote:

 On Jan 6, 2008 12:23 PM, FUJITA Tomonori [EMAIL PROTECTED] wrote:
  On Sun, 6 Jan 2008 11:41:10 +0100
  Torsten Kaiser [EMAIL PROTECTED] wrote:
   I will applie your patch and see if this hunk from
   find_next_zero_area() makes a difference:
  
  end = index + nr;
   -   if (end  size)
   +   if (end = size)
   return -1;
   -   for (i = index + 1; i  end; i++) {
   +   for (i = index; i  end; i++) {
   if (test_bit(i, map)) {
 
  The patch should not make a difference for X86_64.
 
 Hmm...
 arch/x86/kernel/pci-gart_64.c:
 alloc_iommu() calls iommu_area_alloc()
 lib/iommu-helper.c:
 iommu_area_alloc() calls find_next_zero_area()
 - so the above code should be called even on X86_64

Oops, I meant that the patch fixes the align allocation (non zero
align_mask case). X86_64 doesn't use the align allocation.


 And the change in the for loop means that 'index' will now be tested,
 but with the old code it was not.

With the old code, 'index' is tested by find_next_zero_bit.

With the new code and non zero align_mask case, 'index' is not tested
by find_next_zero_bit. So test_bit needs to start with 'index'.

So If I understand the correctly, this patch should not make a
difference for x86_64 though I might miss something.


 And double using something does fit with the errors I'm seeing...
 
  Can you try the patch to revert my IOMMU changes?
 
  http://www.mail-archive.com/[EMAIL PROTECTED]/msg12694.html
 
 Testing for this bug is a little bit slow, as I'm compiling ~100
 packages trying to trigger it.
 If my current testrun with the patch from
 http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html
 crashes, I will revert the hole IOMMU changes with above patch and try again.

Thanks for testing,
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-05 Thread FUJITA Tomonori

On Sat, 5 Jan 2008 17:25:24 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Sat, 5 Jan 2008 23:10:17 +0100 "Torsten Kaiser" <[EMAIL PROTECTED]> wrote:
> 
> > On Jan 5, 2008 3:52 PM, Torsten Kaiser <[EMAIL PROTECTED]> wrote:
> > > On Jan 5, 2008 11:13 AM, Jarek Poplawski <[EMAIL PROTECTED]> wrote:
> > > > On Sat, Jan 05, 2008 at 09:01:02AM +0100, Torsten Kaiser wrote:
> > > > > On Jan 5, 2008 1:07 AM, Jarek Poplawski <[EMAIL PROTECTED]> wrote:
> > > > > > I think it would be easier just to start with this working -rc6 and
> > > > > > simply check if we have 'right' suspects, so: git-net.patch and
> > > > > > git-nfsd.patch from -mm1-broken-out, as suggested by Herbert (I 
> > > > > > hope,
> > > > > > can compile - otherwise you could try the other way: add the whole 
> > > > > > -mm
> > > > > > and revert these two). Using current gits could complicate this
> > > > > > "investigation".
> > > > >
> > > > > OK, I will try this...
> > >
> > > still on the todo-list, I had no time to try this yet...
> > 
> > working on it...
> > 2.6.24-rc6 + mm-patches up to git.battery (includes git-net and
> > git-netdev-all) worked for 110 packages, then I proclaimed it good.
> > 2.6.24-rc6 + mm-patches up to (including) git.nfsd is currently
> > getting testet (9 packages done...)
> > 
> > But the cause of my mail is the following question:
> > Regarding my "iommu-sg-merging-patches are new in -rc3-mm and could be
> > the cause"-suspicion I looked at these patches and came across these
> > hunks:
> > 
> > This is removed from arch/x86/lib/bitstr_64.c:
> > -/* Find string of zero bits in a bitmap */
> > -unsigned long
> > -find_next_zero_string(unsigned long *bitmap, long start, long nbits, int 
> > len)
> > -{
> > -   unsigned long n, end, i;
> > -
> > - again:
> > -   n = find_next_zero_bit(bitmap, nbits, start);
> > -   if (n == -1)
> > -   return -1;
> > -
> > -   /* could test bitsliced, but it's hardly worth it */
> > -   end = n+len;
> > -   if (end > nbits)
> > -   return -1;
> > -   for (i = n+1; i < end; i++) {
> > -   if (test_bit(i, bitmap)) {
> > -   start = i+1;
> > -   goto again;
> > -   }
> > -   }
> > -   return n;
> > -}
> > 
> > This is added to lib/iommu-helper.c:
> > +static unsigned long find_next_zero_area(unsigned long *map,
> > +unsigned long size,
> > +unsigned long start,
> > +unsigned int nr)
> > +{
> > +   unsigned long index, end, i;
> > +again:
> > +   index = find_next_zero_bit(map, size, start);
> > +   end = index + nr;
> > +   if (end > size)
> > +   return -1;
> > +   for (i = index + 1; i < end; i++) {
> > +   if (test_bit(i, map)) {
> > +   start = i+1;
> > +   goto again;
> > +   }
> > +   }
> > +   return index;
> > +}
> > 
> > The old version checks, if find_next_zero_bit returns -1, the new
> > version doesn't do this.
> > Is this intended and can find_next_zero_bit never fail?
> > Hmm... but in the worst case it should only loop forever if I'm
> > reading this right (index = -1 => for-loop counts from 0 to nr, if any
> > bit is set it will jump to "again:" and if the next call to
> > find_next_zero_bit also fails, its an endless loop)

find_next_zero_bit returns -1?

It seems that x86_64 doesn't. POWER and SPARC64 IOMMUs use
find_next_zero_bit too but both doesn't check if find_next_zero_bit
returns -1. If find_next_zero_bit fails, it returns size. So it
doesn't leads to an endless loop.

But this patch has other bugs that break POWER IOMMUs.

If you use the IOMMUs on POWER, please try the following patch:

http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html

> 
> I susect these are doing different things. 
> iommu-sg-kill-__clear_bit_string-and-find_next_zero_string.patch says:
> 
>   This kills unused __clear_bit_string and find_next_zero_string (they
>   were used by only gart and calgary IOMMUs).
> 
> iommu-sg-add-iommu-helper-functions-for-the-free-area-management.patch says:
> 
>   This adds IOMMU helper functions for the free area management.  These
>   functions take care of LLD's segment boundary limit for IOMMUs.  They
>   would be useful for IOMMUs that use bitmap for the free area management.
> 
> > So even if this can not explain my bug, could somebody check if this
> > is a real bug or not?
> 
> Let's cc the author of those changes.
> 
> Thanks for persisting with this bug.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.24-rc6-mm1

2008-01-05 Thread FUJITA Tomonori

On Sat, 5 Jan 2008 17:25:24 -0800
Andrew Morton [EMAIL PROTECTED] wrote:

 On Sat, 5 Jan 2008 23:10:17 +0100 Torsten Kaiser [EMAIL PROTECTED] wrote:
 
  On Jan 5, 2008 3:52 PM, Torsten Kaiser [EMAIL PROTECTED] wrote:
   On Jan 5, 2008 11:13 AM, Jarek Poplawski [EMAIL PROTECTED] wrote:
On Sat, Jan 05, 2008 at 09:01:02AM +0100, Torsten Kaiser wrote:
 On Jan 5, 2008 1:07 AM, Jarek Poplawski [EMAIL PROTECTED] wrote:
  I think it would be easier just to start with this working -rc6 and
  simply check if we have 'right' suspects, so: git-net.patch and
  git-nfsd.patch from -mm1-broken-out, as suggested by Herbert (I 
  hope,
  can compile - otherwise you could try the other way: add the whole 
  -mm
  and revert these two). Using current gits could complicate this
  investigation.

 OK, I will try this...
  
   still on the todo-list, I had no time to try this yet...
  
  working on it...
  2.6.24-rc6 + mm-patches up to git.battery (includes git-net and
  git-netdev-all) worked for 110 packages, then I proclaimed it good.
  2.6.24-rc6 + mm-patches up to (including) git.nfsd is currently
  getting testet (9 packages done...)
  
  But the cause of my mail is the following question:
  Regarding my iommu-sg-merging-patches are new in -rc3-mm and could be
  the cause-suspicion I looked at these patches and came across these
  hunks:
  
  This is removed from arch/x86/lib/bitstr_64.c:
  -/* Find string of zero bits in a bitmap */
  -unsigned long
  -find_next_zero_string(unsigned long *bitmap, long start, long nbits, int 
  len)
  -{
  -   unsigned long n, end, i;
  -
  - again:
  -   n = find_next_zero_bit(bitmap, nbits, start);
  -   if (n == -1)
  -   return -1;
  -
  -   /* could test bitsliced, but it's hardly worth it */
  -   end = n+len;
  -   if (end  nbits)
  -   return -1;
  -   for (i = n+1; i  end; i++) {
  -   if (test_bit(i, bitmap)) {
  -   start = i+1;
  -   goto again;
  -   }
  -   }
  -   return n;
  -}
  
  This is added to lib/iommu-helper.c:
  +static unsigned long find_next_zero_area(unsigned long *map,
  +unsigned long size,
  +unsigned long start,
  +unsigned int nr)
  +{
  +   unsigned long index, end, i;
  +again:
  +   index = find_next_zero_bit(map, size, start);
  +   end = index + nr;
  +   if (end  size)
  +   return -1;
  +   for (i = index + 1; i  end; i++) {
  +   if (test_bit(i, map)) {
  +   start = i+1;
  +   goto again;
  +   }
  +   }
  +   return index;
  +}
  
  The old version checks, if find_next_zero_bit returns -1, the new
  version doesn't do this.
  Is this intended and can find_next_zero_bit never fail?
  Hmm... but in the worst case it should only loop forever if I'm
  reading this right (index = -1 = for-loop counts from 0 to nr, if any
  bit is set it will jump to again: and if the next call to
  find_next_zero_bit also fails, its an endless loop)

find_next_zero_bit returns -1?

It seems that x86_64 doesn't. POWER and SPARC64 IOMMUs use
find_next_zero_bit too but both doesn't check if find_next_zero_bit
returns -1. If find_next_zero_bit fails, it returns size. So it
doesn't leads to an endless loop.

But this patch has other bugs that break POWER IOMMUs.

If you use the IOMMUs on POWER, please try the following patch:

http://www.mail-archive.com/[EMAIL PROTECTED]/msg12702.html

 
 I susect these are doing different things. 
 iommu-sg-kill-__clear_bit_string-and-find_next_zero_string.patch says:
 
   This kills unused __clear_bit_string and find_next_zero_string (they
   were used by only gart and calgary IOMMUs).
 
 iommu-sg-add-iommu-helper-functions-for-the-free-area-management.patch says:
 
   This adds IOMMU helper functions for the free area management.  These
   functions take care of LLD's segment boundary limit for IOMMUs.  They
   would be useful for IOMMUs that use bitmap for the free area management.
 
  So even if this can not explain my bug, could somebody check if this
  is a real bug or not?
 
 Let's cc the author of those changes.
 
 Thanks for persisting with this bug.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] bsg : Add support for io vectors in bsg

2008-01-04 Thread FUJITA Tomonori

From: Deepak Colluru <[EMAIL PROTECTED]>
Subject: [PATCH] bsg : Add support for io vectors in bsg
Date: Fri, 4 Jan 2008 21:47:34 +0530 (IST)

> From: Deepak Colluru <[EMAIL PROTECTED]>
> 
> Add support for io vectors in bsg.
> 
> Signed-off-by: Deepak Colluru <[EMAIL PROTECTED]>
> ---
>   bsg.c |   52 +---
>   1 file changed, 49 insertions(+), 3 deletions(-)

Thanks, but I have to NACK this.

You can find the discussion about bsg io vector support and a similar
patch in linux-scsi archive. I have no plan to support it since it
needs the compat hack.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] bsg : Add support for io vectors in bsg

2008-01-04 Thread FUJITA Tomonori

From: Deepak Colluru [EMAIL PROTECTED]
Subject: [PATCH] bsg : Add support for io vectors in bsg
Date: Fri, 4 Jan 2008 21:47:34 +0530 (IST)

 From: Deepak Colluru [EMAIL PROTECTED]
 
 Add support for io vectors in bsg.
 
 Signed-off-by: Deepak Colluru [EMAIL PROTECTED]
 ---
   bsg.c |   52 +---
   1 file changed, 49 insertions(+), 3 deletions(-)

Thanks, but I have to NACK this.

You can find the discussion about bsg io vector support and a similar
patch in linux-scsi archive. I have no plan to support it since it
needs the compat hack.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [2.6.24-rc BUGFIX] IB/srp: release transport when removing host

2008-01-03 Thread FUJITA Tomonori

On Thu, 03 Jan 2008 21:39:19 -0500
Dave Dillow <[EMAIL PROTECTED]> wrote:

> When removing the ib_srp module, srp_remove_one() does not release the
> SRP transport class when it is releasing the SCSI host. This leads to
> dangling references to kfree()'d memory, and an eventual oops.
> 
> Signed-off-by: David Dillow <[EMAIL PROTECTED]>

Thanks again!

Linus has already merged your previous patch:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b0e47c8b79154772a436f25bf7646733e1d6194c

So please resend a patch to move srp_remove_host before
scsi_remove_host instead of adding srp_remove_host.


> ---
> On Fri, 2008-01-04 at 09:47 +0900, FUJITA Tomonori wrote:
> > I think that this is the root problem and the patch fixes it in the
> > right way. Please send this patch to [EMAIL PROTECTED] and a
> > patch to move srp_remove_host before scsi_remove_host in
> > srp_remove_one to Roland.
> > 
> > Acked-by: FUJITA Tomonori <[EMAIL PROTECTED]>
> 
> Not sure if your Acked-by was for this one as well, so I left it off.

Acked-by: FUJITA Tomonori <[EMAIL PROTECTED]>

> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
> b/drivers/infiniband/ulp/srp/ib_srp.c
> index 950228f..bdb6f85 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -2053,6 +2053,7 @@ static void srp_remove_one(struct ib_device *device)
>  
>   list_for_each_entry_safe(target, tmp_target,
>>target_list, list) {
> + srp_remove_host(target->scsi_host);
>   scsi_remove_host(target->scsi_host);
>   srp_disconnect_target(target);
>   ib_destroy_cm_id(target->cm_id);
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [2.6.24-rc BUGFIX] SRP transport: only remove our own entries

2008-01-03 Thread FUJITA Tomonori

On Thu, 03 Jan 2008 21:34:49 -0500
Dave Dillow <[EMAIL PROTECTED]> wrote:

> The SCSI SRP transport class currently iterates over all children
> devices of the host that is being removed in srp_remove_host(). However,
> not all of those children were created by the SRP transport, and
> removing them will cause corruption and an oops when their creator tries
> to remove them.
> 
> Signed-off-by: David Dillow <[EMAIL PROTECTED]>
> Acked-by: FUJITA Tomonori <[EMAIL PROTECTED]>
> ---

Thanks!

James, please put this patch into scsi-rc-fixes.

> On Fri, 2008-01-04 at 09:47 +0900, FUJITA Tomonori wrote:
> > On Thu, 03 Jan 2008 15:51:25 -0500
> > I think that this is the root problem and the patch fixes it in the
> > right way. Please send this patch to [EMAIL PROTECTED] and a
> > patch to move srp_remove_host before scsi_remove_host in
> > srp_remove_one to Roland.
> > 
> > Acked-by: FUJITA Tomonori <[EMAIL PROTECTED]>
> 
> diff --git a/drivers/scsi/scsi_transport_srp.c 
> b/drivers/scsi/scsi_transport_srp.c
> index 44a340b..65c584d 100644
> --- a/drivers/scsi/scsi_transport_srp.c
> +++ b/drivers/scsi/scsi_transport_srp.c
> @@ -265,7 +265,8 @@ EXPORT_SYMBOL_GPL(srp_rport_del);
>  
>  static int do_srp_rport_del(struct device *dev, void *data)
>  {
> - srp_rport_del(dev_to_rport(dev));
> + if (scsi_is_srp_rport(dev))
> + srp_rport_del(dev_to_rport(dev));
>   return 0;
>  }
>  
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5

2008-01-03 Thread FUJITA Tomonori

On Thu, 03 Jan 2008 15:51:25 -0500
David Dillow <[EMAIL PROTECTED]> wrote:

> 
> On Thu, 2008-01-03 at 15:09 -0500, David Dillow wrote:
> > As for a better fix, I'm not sure.
> 
> Here's a better way than the strncmp. If this meets everyone's approval,
> then I can roll up a proper commit.

Thanks! I really apprecate it.

I think that this is the root problem and the patch fixes it in the
right way. Please send this patch to [EMAIL PROTECTED] and a
patch to move srp_remove_host before scsi_remove_host in
srp_remove_one to Roland.

Acked-by: FUJITA Tomonori <[EMAIL PROTECTED]>

> diff --git a/drivers/scsi/scsi_transport_srp.c 
> b/drivers/scsi/scsi_transport_srp.c
> index 44a340b..65c584d 100644
> --- a/drivers/scsi/scsi_transport_srp.c
> +++ b/drivers/scsi/scsi_transport_srp.c
> @@ -265,7 +265,8 @@ EXPORT_SYMBOL_GPL(srp_rport_del);
>  
>  static int do_srp_rport_del(struct device *dev, void *data)
>  {
> - srp_rport_del(dev_to_rport(dev));
> + if (scsi_is_srp_rport(dev))
> + srp_rport_del(dev_to_rport(dev));
>   return 0;
>  }
>  
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: list corruption on ib_srp load in v2.6.24-rc5

2008-01-03 Thread FUJITA Tomonori

On Wed, 02 Jan 2008 09:51:38 -0800
Roland Dreier <[EMAIL PROTECTED]> wrote:

>  > > Can you try this?
>  > 
>  > That patched oopsed in scsi_remove_host(), but reversing the order has
>  > survived over 500 insert/probe/remove cycles.
>  > 
>  > Tested-by: David Dillow <[EMAIL PROTECTED]>
>  > ---
>  > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
> b/drivers/infiniband/ulp/srp/ib_srp.c
>  > index 950228f..77e8b90 100644
>  > --- a/drivers/infiniband/ulp/srp/ib_srp.c
>  > +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>  > @@ -2054,6 +2054,7 @@ static void srp_remove_one(struct ib_device *device)
>  >list_for_each_entry_safe(target, tmp_target,
>  > >target_list, list) {
>  >scsi_remove_host(target->scsi_host);
>  > +  srp_remove_host(target->scsi_host);
>  >srp_disconnect_target(target);
> 
> Where do we stand on this?  What is the right place to put the
> srp_remove_host?  Is there a bug somewhere else?

{sas|fc}_remove_host is called before scsi_remove_host. And in
srp_remove_work(), we call srp_remove_host and then
scsi_remove_host. ibmvscsi also calls them in that order.

I thought that I messed up something in srp_transport_class. But I
can't figure out what's wrong. The above patch works and is unlikely
to lead to critical problems so I'm fine with it for now.


> I'd like to get this fixed before 2.6.24 final comes out...

Yeah, it should be fixed.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: list corruption on ib_srp load in v2.6.24-rc5

2008-01-03 Thread FUJITA Tomonori

On Wed, 02 Jan 2008 09:51:38 -0800
Roland Dreier [EMAIL PROTECTED] wrote:

Can you try this?
   
   That patched oopsed in scsi_remove_host(), but reversing the order has
   survived over 500 insert/probe/remove cycles.
   
   Tested-by: David Dillow [EMAIL PROTECTED]
   ---
   diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
 b/drivers/infiniband/ulp/srp/ib_srp.c
   index 950228f..77e8b90 100644
   --- a/drivers/infiniband/ulp/srp/ib_srp.c
   +++ b/drivers/infiniband/ulp/srp/ib_srp.c
   @@ -2054,6 +2054,7 @@ static void srp_remove_one(struct ib_device *device)
  list_for_each_entry_safe(target, tmp_target,
   host-target_list, list) {
  scsi_remove_host(target-scsi_host);
   +  srp_remove_host(target-scsi_host);
  srp_disconnect_target(target);
 
 Where do we stand on this?  What is the right place to put the
 srp_remove_host?  Is there a bug somewhere else?

{sas|fc}_remove_host is called before scsi_remove_host. And in
srp_remove_work(), we call srp_remove_host and then
scsi_remove_host. ibmvscsi also calls them in that order.

I thought that I messed up something in srp_transport_class. But I
can't figure out what's wrong. The above patch works and is unlikely
to lead to critical problems so I'm fine with it for now.


 I'd like to get this fixed before 2.6.24 final comes out...

Yeah, it should be fixed.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [ofa-general] Re: list corruption on ib_srp load in v2.6.24-rc5

2008-01-03 Thread FUJITA Tomonori

On Thu, 03 Jan 2008 15:51:25 -0500
David Dillow [EMAIL PROTECTED] wrote:

 
 On Thu, 2008-01-03 at 15:09 -0500, David Dillow wrote:
  As for a better fix, I'm not sure.
 
 Here's a better way than the strncmp. If this meets everyone's approval,
 then I can roll up a proper commit.

Thanks! I really apprecate it.

I think that this is the root problem and the patch fixes it in the
right way. Please send this patch to [EMAIL PROTECTED] and a
patch to move srp_remove_host before scsi_remove_host in
srp_remove_one to Roland.

Acked-by: FUJITA Tomonori [EMAIL PROTECTED]

 diff --git a/drivers/scsi/scsi_transport_srp.c 
 b/drivers/scsi/scsi_transport_srp.c
 index 44a340b..65c584d 100644
 --- a/drivers/scsi/scsi_transport_srp.c
 +++ b/drivers/scsi/scsi_transport_srp.c
 @@ -265,7 +265,8 @@ EXPORT_SYMBOL_GPL(srp_rport_del);
  
  static int do_srp_rport_del(struct device *dev, void *data)
  {
 - srp_rport_del(dev_to_rport(dev));
 + if (scsi_is_srp_rport(dev))
 + srp_rport_del(dev_to_rport(dev));
   return 0;
  }
  
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [2.6.24-rc BUGFIX] SRP transport: only remove our own entries

2008-01-03 Thread FUJITA Tomonori

On Thu, 03 Jan 2008 21:34:49 -0500
Dave Dillow [EMAIL PROTECTED] wrote:

 The SCSI SRP transport class currently iterates over all children
 devices of the host that is being removed in srp_remove_host(). However,
 not all of those children were created by the SRP transport, and
 removing them will cause corruption and an oops when their creator tries
 to remove them.
 
 Signed-off-by: David Dillow [EMAIL PROTECTED]
 Acked-by: FUJITA Tomonori [EMAIL PROTECTED]
 ---

Thanks!

James, please put this patch into scsi-rc-fixes.

 On Fri, 2008-01-04 at 09:47 +0900, FUJITA Tomonori wrote:
  On Thu, 03 Jan 2008 15:51:25 -0500
  I think that this is the root problem and the patch fixes it in the
  right way. Please send this patch to [EMAIL PROTECTED] and a
  patch to move srp_remove_host before scsi_remove_host in
  srp_remove_one to Roland.
  
  Acked-by: FUJITA Tomonori [EMAIL PROTECTED]
 
 diff --git a/drivers/scsi/scsi_transport_srp.c 
 b/drivers/scsi/scsi_transport_srp.c
 index 44a340b..65c584d 100644
 --- a/drivers/scsi/scsi_transport_srp.c
 +++ b/drivers/scsi/scsi_transport_srp.c
 @@ -265,7 +265,8 @@ EXPORT_SYMBOL_GPL(srp_rport_del);
  
  static int do_srp_rport_del(struct device *dev, void *data)
  {
 - srp_rport_del(dev_to_rport(dev));
 + if (scsi_is_srp_rport(dev))
 + srp_rport_del(dev_to_rport(dev));
   return 0;
  }
  
 
 -
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [2.6.24-rc BUGFIX] IB/srp: release transport when removing host

2008-01-03 Thread FUJITA Tomonori

On Thu, 03 Jan 2008 21:39:19 -0500
Dave Dillow [EMAIL PROTECTED] wrote:

 When removing the ib_srp module, srp_remove_one() does not release the
 SRP transport class when it is releasing the SCSI host. This leads to
 dangling references to kfree()'d memory, and an eventual oops.
 
 Signed-off-by: David Dillow [EMAIL PROTECTED]

Thanks again!

Linus has already merged your previous patch:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b0e47c8b79154772a436f25bf7646733e1d6194c

So please resend a patch to move srp_remove_host before
scsi_remove_host instead of adding srp_remove_host.


 ---
 On Fri, 2008-01-04 at 09:47 +0900, FUJITA Tomonori wrote:
  I think that this is the root problem and the patch fixes it in the
  right way. Please send this patch to [EMAIL PROTECTED] and a
  patch to move srp_remove_host before scsi_remove_host in
  srp_remove_one to Roland.
  
  Acked-by: FUJITA Tomonori [EMAIL PROTECTED]
 
 Not sure if your Acked-by was for this one as well, so I left it off.

Acked-by: FUJITA Tomonori [EMAIL PROTECTED]

 diff --git a/drivers/infiniband/ulp/srp/ib_srp.c 
 b/drivers/infiniband/ulp/srp/ib_srp.c
 index 950228f..bdb6f85 100644
 --- a/drivers/infiniband/ulp/srp/ib_srp.c
 +++ b/drivers/infiniband/ulp/srp/ib_srp.c
 @@ -2053,6 +2053,7 @@ static void srp_remove_one(struct ib_device *device)
  
   list_for_each_entry_safe(target, tmp_target,
host-target_list, list) {
 + srp_remove_host(target-scsi_host);
   scsi_remove_host(target-scsi_host);
   srp_disconnect_target(target);
   ib_destroy_cm_id(target-cm_id);
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: SCSI errors on powerpc with 2.6.24-rc6-mm1

2007-12-31 Thread FUJITA Tomonori

On Mon, 31 Dec 2007 17:57:35 +0530
Balbir Singh <[EMAIL PROTECTED]> wrote:

> FUJITA Tomonori wrote:
> > 
> > Oops, it's for -mm.
> > 
> 
> Hi,
> 
> I just tested this patch and it works fine for me so far.
> 
> Tested-by: Balbir Singh <[EMAIL PROTECTED]>

Thanks!

I sent an updated patch including this fix several hours ago:

http://ozlabs.org/pipermail/linuxppc-dev/2007-December/048997.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: SCSI errors on powerpc with 2.6.24-rc6-mm1

2007-12-31 Thread FUJITA Tomonori

On Mon, 31 Dec 2007 17:57:35 +0530
Balbir Singh [EMAIL PROTECTED] wrote:

 FUJITA Tomonori wrote:
  
  Oops, it's for -mm.
  
 
 Hi,
 
 I just tested this patch and it works fine for me so far.
 
 Tested-by: Balbir Singh [EMAIL PROTECTED]

Thanks!

I sent an updated patch including this fix several hours ago:

http://ozlabs.org/pipermail/linuxppc-dev/2007-December/048997.html
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: SCSI errors on powerpc with 2.6.24-rc6-mm1

2007-12-27 Thread FUJITA Tomonori

On Fri, 28 Dec 2007 13:05:39 +0530
Balbir Singh <[EMAIL PROTECTED]> wrote:

> FUJITA Tomonori wrote:
> [snip]
> > Thanks,
> > 
> > Can you try this?
> > 
> > 
> > diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
> > index e7d8544..495575a 100644
> > --- a/lib/iommu-helper.c
> > +++ b/lib/iommu-helper.c
> > @@ -8,15 +8,20 @@
> >  static unsigned long find_next_zero_area(unsigned long *map,
> >  unsigned long size,
> >  unsigned long start,
> > -unsigned int nr)
> > +unsigned int nr,
> > +unsigned long align_mask)
> >  {
> > unsigned long index, end, i;
> >  again:
> > index = find_next_zero_bit(map, size, start);
> > +
> > +   /* Align allocation */
> > +   index = (index + align_mask) & ~align_mask;
> > +
> > end = index + nr;
> > -   if (end > size)
> > +   if (end >= size)
> > return -1;
> > -   for (i = index + 1; i < end; i++) {
> > +   for (i = index; i < end; i++) {
> > if (test_bit(i, map)) {
> > start = i+1;
> > goto again;
> > @@ -50,9 +55,8 @@ unsigned long iommu_area_alloc(unsigned long *map, 
> > unsigned long size,
> >  {
> > unsigned long index;
> >  again:
> > -   index = find_next_zero_area(map, size, start, nr);
> > +   index = find_next_zero_area(map, size, start, nr, align_mask);
> > if (index != -1) {
> > -   index = (index + align_mask) & ~align_mask;
> > if (is_span_boundary(index, nr, shift, boundary_size)) {
> > /* we could do more effectively */
> > start = index + 1;
> 
> This on top of -mm? Or on top of the reverted iommu patch.

Oops, it's for -mm.

Thanks,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: SCSI errors on powerpc with 2.6.24-rc6-mm1

2007-12-27 Thread FUJITA Tomonori

On Thu, 27 Dec 2007 11:15:37 +0530
Balbir Singh <[EMAIL PROTECTED]> wrote:

> FUJITA Tomonori wrote:
> > On Thu, 27 Dec 2007 10:08:25 +0530
> > Balbir Singh <[EMAIL PROTECTED]> wrote:
> > 
> >> FUJITA Tomonori wrote:
> >>> On Mon, 24 Dec 2007 10:18:50 +0530
> >>> Balbir Singh <[EMAIL PROTECTED]> wrote:
> >>>
> >> [snip]
> >>
> >>> I might break the IOMMU code. Can you reproduce it easily? If so,
> >>> reverting my IOMMU patches (I've attached a patch to revert them) fix
> >>> the problem?
> >> [snip]
> >>
> >>
> >> Yes, this patch fixes the problem for me.
> > 
> > Thanks, so you can reproduce it easily, right?
> > 
> 
> Yes, quite easily
> 
> > The problem is that I don't want to revert these changes. I'll see
> > how these changes cause the problem shortly.
> 
> I'll try and find some bandwidth to review/test the patches and help you
> figure out the right solution.

Thanks,

Can you try this?


diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
index e7d8544..495575a 100644
--- a/lib/iommu-helper.c
+++ b/lib/iommu-helper.c
@@ -8,15 +8,20 @@
 static unsigned long find_next_zero_area(unsigned long *map,
 unsigned long size,
 unsigned long start,
-unsigned int nr)
+unsigned int nr,
+unsigned long align_mask)
 {
unsigned long index, end, i;
 again:
index = find_next_zero_bit(map, size, start);
+
+   /* Align allocation */
+   index = (index + align_mask) & ~align_mask;
+
end = index + nr;
-   if (end > size)
+   if (end >= size)
return -1;
-   for (i = index + 1; i < end; i++) {
+   for (i = index; i < end; i++) {
if (test_bit(i, map)) {
start = i+1;
goto again;
@@ -50,9 +55,8 @@ unsigned long iommu_area_alloc(unsigned long *map, unsigned 
long size,
 {
unsigned long index;
 again:
-   index = find_next_zero_area(map, size, start, nr);
+   index = find_next_zero_area(map, size, start, nr, align_mask);
if (index != -1) {
-   index = (index + align_mask) & ~align_mask;
if (is_span_boundary(index, nr, shift, boundary_size)) {
/* we could do more effectively */
start = index + 1;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: SCSI errors on powerpc with 2.6.24-rc6-mm1

2007-12-27 Thread FUJITA Tomonori

On Thu, 27 Dec 2007 11:15:37 +0530
Balbir Singh [EMAIL PROTECTED] wrote:

 FUJITA Tomonori wrote:
  On Thu, 27 Dec 2007 10:08:25 +0530
  Balbir Singh [EMAIL PROTECTED] wrote:
  
  FUJITA Tomonori wrote:
  On Mon, 24 Dec 2007 10:18:50 +0530
  Balbir Singh [EMAIL PROTECTED] wrote:
 
  [snip]
 
  I might break the IOMMU code. Can you reproduce it easily? If so,
  reverting my IOMMU patches (I've attached a patch to revert them) fix
  the problem?
  [snip]
 
 
  Yes, this patch fixes the problem for me.
  
  Thanks, so you can reproduce it easily, right?
  
 
 Yes, quite easily
 
  The problem is that I don't want to revert these changes. I'll see
  how these changes cause the problem shortly.
 
 I'll try and find some bandwidth to review/test the patches and help you
 figure out the right solution.

Thanks,

Can you try this?


diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
index e7d8544..495575a 100644
--- a/lib/iommu-helper.c
+++ b/lib/iommu-helper.c
@@ -8,15 +8,20 @@
 static unsigned long find_next_zero_area(unsigned long *map,
 unsigned long size,
 unsigned long start,
-unsigned int nr)
+unsigned int nr,
+unsigned long align_mask)
 {
unsigned long index, end, i;
 again:
index = find_next_zero_bit(map, size, start);
+
+   /* Align allocation */
+   index = (index + align_mask)  ~align_mask;
+
end = index + nr;
-   if (end  size)
+   if (end = size)
return -1;
-   for (i = index + 1; i  end; i++) {
+   for (i = index; i  end; i++) {
if (test_bit(i, map)) {
start = i+1;
goto again;
@@ -50,9 +55,8 @@ unsigned long iommu_area_alloc(unsigned long *map, unsigned 
long size,
 {
unsigned long index;
 again:
-   index = find_next_zero_area(map, size, start, nr);
+   index = find_next_zero_area(map, size, start, nr, align_mask);
if (index != -1) {
-   index = (index + align_mask)  ~align_mask;
if (is_span_boundary(index, nr, shift, boundary_size)) {
/* we could do more effectively */
start = index + 1;
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: SCSI errors on powerpc with 2.6.24-rc6-mm1

2007-12-27 Thread FUJITA Tomonori

On Fri, 28 Dec 2007 13:05:39 +0530
Balbir Singh [EMAIL PROTECTED] wrote:

 FUJITA Tomonori wrote:
 [snip]
  Thanks,
  
  Can you try this?
  
  
  diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
  index e7d8544..495575a 100644
  --- a/lib/iommu-helper.c
  +++ b/lib/iommu-helper.c
  @@ -8,15 +8,20 @@
   static unsigned long find_next_zero_area(unsigned long *map,
   unsigned long size,
   unsigned long start,
  -unsigned int nr)
  +unsigned int nr,
  +unsigned long align_mask)
   {
  unsigned long index, end, i;
   again:
  index = find_next_zero_bit(map, size, start);
  +
  +   /* Align allocation */
  +   index = (index + align_mask)  ~align_mask;
  +
  end = index + nr;
  -   if (end  size)
  +   if (end = size)
  return -1;
  -   for (i = index + 1; i  end; i++) {
  +   for (i = index; i  end; i++) {
  if (test_bit(i, map)) {
  start = i+1;
  goto again;
  @@ -50,9 +55,8 @@ unsigned long iommu_area_alloc(unsigned long *map, 
  unsigned long size,
   {
  unsigned long index;
   again:
  -   index = find_next_zero_area(map, size, start, nr);
  +   index = find_next_zero_area(map, size, start, nr, align_mask);
  if (index != -1) {
  -   index = (index + align_mask)  ~align_mask;
  if (is_span_boundary(index, nr, shift, boundary_size)) {
  /* we could do more effectively */
  start = index + 1;
 
 This on top of -mm? Or on top of the reverted iommu patch.

Oops, it's for -mm.

Thanks,
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: SCSI errors on powerpc with 2.6.24-rc6-mm1

2007-12-26 Thread FUJITA Tomonori

On Thu, 27 Dec 2007 10:08:25 +0530
Balbir Singh <[EMAIL PROTECTED]> wrote:

> FUJITA Tomonori wrote:
> > On Mon, 24 Dec 2007 10:18:50 +0530
> > Balbir Singh <[EMAIL PROTECTED]> wrote:
> > 
> [snip]
> 
> > 
> > I might break the IOMMU code. Can you reproduce it easily? If so,
> > reverting my IOMMU patches (I've attached a patch to revert them) fix
> > the problem?
> 
> [snip]
> 
> 
> Yes, this patch fixes the problem for me.

Thanks, so you can reproduce it easily, right?

The problem is that I don't want to revert these changes. I'll see
how these changes cause the problem shortly.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: SCSI errors on powerpc with 2.6.24-rc6-mm1

2007-12-26 Thread FUJITA Tomonori

On Mon, 24 Dec 2007 10:18:50 +0530
Balbir Singh <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> I've just seen this on my dmesg, this is new, never seen this before on
> this box and it happens only with this version of the kernel.
> 
> In this configuration, the page size is set to 64K and I've enabled fake
> NUMA nodes on PowerPC.
> 
> tce_buildmulti_pSeriesLP: plpar_tce_put failed. rc=-4
> index   = 0x402
> npages  = 0x0
> tce[0] val = 0x15ad0001
> Call Trace:
> [cffe74f0] [c00491a4]
> .tce_buildmulti_pSeriesLP+0x26c/0x2ac (unreliable)
> [cffe75c0] [c00295e4] .iommu_map_sg+0x1d4/0x418
> [cffe76d0] [c0028664] .dma_iommu_map_sg+0x3c/0x50
> [cffe7750] [c03b6c30] .scsi_dma_map+0x70/0x94
> [cffe77d0] [c03dedbc] .ipr_queuecommand+0x300/0x500
> [cffe7880] [c03ae964] .scsi_dispatch_cmd+0x21c/0x2b8
> [cffe7920] [c03b67a0] .scsi_request_fn+0x310/0x460
> [cffe79d0] [c024ab90] .blk_run_queue+0x94/0xec
> [cffe7a70] [c03b3b08] .scsi_run_queue+0x24c/0x27c
> [cffe7b20] [c03b4424] .scsi_next_command+0x48/0x70
> [cffe7bc0] [c03b4b48] .scsi_end_request+0xbc/0xe4
> [cffe7c60] [c03b5294] .scsi_io_completion+0x170/0x3e8
> [cffe7d40] [c03ae0e4] .scsi_finish_command+0xb4/0xd4
> [cffe7dd0] [c03b584c] .scsi_softirq_done+0x114/0x138
> [cffe7e60] [c024af70] .blk_done_softirq+0xa0/0xd0
> [cffe7ef0] [c007a2a0] .__do_softirq+0xa8/0x164
> [cffe7f90] [c0027edc] .call_do_softirq+0x14/0x24
> [c0003e183950] [c000bdcc] .do_softirq+0x74/0xc0
> [c0003e1839e0] [c007a450] .irq_exit+0x5c/0xac
> [c0003e183a60] [c000c414] .do_IRQ+0x17c/0x1f4
> [c0003e183b00] [c0004c24] hardware_interrupt_entry+0x24/0x28
> --- Exception: 501 at .ppc64_runlatch_off+0x28/0x60
> LR = .pseries_dedicated_idle_sleep+0xd8/0x1a4
> [c0003e183df0] [c0048494]
> .pseries_dedicated_idle_sleep+0x78/0x1a4 (unreliable)
> [c0003e183e80] [c001110c] .cpu_idle+0x10c/0x1e8
> [c0003e183f00] [c002b5b0] .start_secondary+0x1b4/0x1d8
> [c0003e183f90] [c00083c4] .start_secondary_prolog+0xc/0x10

I might break the IOMMU code. Can you reproduce it easily? If so,
reverting my IOMMU patches (I've attached a patch to revert them) fix
the problem?

Thanks,

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index ff2a62d..59899b2 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -244,9 +244,6 @@ config IOMMU_VMERGE
 
  Most drivers don't have this problem; it is safe to say Y here.
 
-config IOMMU_HELPER
-   def_bool PPC64
-
 config HOTPLUG_CPU
bool "Support for enabling/disabling CPUs"
depends on SMP && HOTPLUG && EXPERIMENTAL && (PPC_PSERIES || PPC_PMAC)
diff --git a/arch/powerpc/kernel/dma_64.c b/arch/powerpc/kernel/dma_64.c
index 6fcb7cb..1806d96 100644
--- a/arch/powerpc/kernel/dma_64.c
+++ b/arch/powerpc/kernel/dma_64.c
@@ -31,8 +31,8 @@ static inline unsigned long device_to_mask(struct device *dev)
 static void *dma_iommu_alloc_coherent(struct device *dev, size_t size,
  dma_addr_t *dma_handle, gfp_t flag)
 {
-   return iommu_alloc_coherent(dev, dev->archdata.dma_data, size,
-   dma_handle, device_to_mask(dev), flag,
+   return iommu_alloc_coherent(dev->archdata.dma_data, size, dma_handle,
+   device_to_mask(dev), flag,
dev->archdata.numa_node);
 }
 
@@ -52,7 +52,7 @@ static dma_addr_t dma_iommu_map_single(struct device *dev, 
void *vaddr,
   size_t size,
   enum dma_data_direction direction)
 {
-   return iommu_map_single(dev, dev->archdata.dma_data, vaddr, size,
+   return iommu_map_single(dev->archdata.dma_data, vaddr, size,
device_to_mask(dev), direction);
 }
 
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 18e8860..050e9ac 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -31,7 +31,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -82,19 +81,17 @@ static int __init setup_iommu(char *str)
 __setup("protect4gb=", setup_protect4gb);
 __setup("iommu=", setup_iommu);
 
-static unsigned long iommu_range_alloc(struct device *dev,
-  struct iommu_table *tbl,
+static unsigned long iommu_range_alloc(struct iommu_table *tbl,
unsigned long npages,
unsigned long *handle,
unsigned long mask,
unsigned int align_order)
 { 
-

1 2 3 4 >

1 - 100 of 370 matches

Mail list logo