Re: [Lsf-pc] [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-24 Thread Christoph Lameter
On Fri, 24 Jan 2014, Mel Gorman wrote:

 That'd be okish for 64-bit at least although it would show up as
 degraded performance in some cases when virtually contiguous buffers were
 used. Aside from the higher setup, access and teardown costs of a
 virtually contiguous buffer, the underlying storage would no longer get
 a single buffer as part of the IO request. Would that not offset many of
 the advantages?

It would offset some of that. But the major benefit of large order page
cache was the reduction of the number of operations that the kernel has to
perform. A 64k page contains 16 4k pages. So there is only one kernel
operation required instead of 16. If the page is virtually allocated then
the higher level kernel functions still only operate on one page struct.
The lower levels (bio) then will have to deal with the virtual mappings
and create a scatter gather list. This is some more overhead but not much.
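
To put numbers on the operation count argument, here is a minimal userspace
sketch, assuming 4k base pages. The helper names (pages_per_order,
page_size_for_order, ops_for_io) are made up for illustration and are not
kernel interfaces.

```c
#include <stddef.h>

#define BASE_PAGE_SHIFT 12	/* assumption: 4k base pages */

/* Number of base pages covered by one page of the given order. */
size_t pages_per_order(unsigned int order)
{
	return (size_t)1 << order;
}

/* Size in bytes of one page of the given order. */
size_t page_size_for_order(unsigned int order)
{
	return pages_per_order(order) << BASE_PAGE_SHIFT;
}

/*
 * Per-page operations needed to cover `bytes` of I/O when the page
 * cache works in units of `order`. Rounds up to whole pages.
 */
size_t ops_for_io(size_t bytes, unsigned int order)
{
	size_t unit = page_size_for_order(order);

	return (bytes + unit - 1) / unit;
}
```

For a 64k request, ops_for_io(65536, 0) is 16 and ops_for_io(65536, 4) is 1,
which is the 16-to-1 reduction described above.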

Doing something like this will put more stress on the defragmentation
logic in the kernel. In general I think we need more contiguous physical
memory.

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Lsf-pc] [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-23 Thread Christoph Lameter
On Wed, 22 Jan 2014, Mel Gorman wrote:

 Don't get me wrong, I'm interested in the topic but I severely doubt I'd
 have the capacity to research the background of this in advance. It's also
 unlikely that I'd work on it in the future without throwing out my current
 TODO list. In an ideal world someone will have done the legwork in advance
 of LSF/MM to help drive the topic.

I can give an overview of the history and the challenges of the approaches
if needed.



Re: [Lsf-pc] [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-23 Thread Christoph Lameter
On Wed, 22 Jan 2014, Mel Gorman wrote:

 Large block support was proposed years ago by Christoph Lameter
 (http://lwn.net/Articles/232757/). I think I was just getting started
 in the community at the time so I do not recall any of the details. I do
 believe it motivated an alternative by Nick Piggin called fsblock though
 (http://lwn.net/Articles/321390/). At the very least it would be nice to
 know why neither was ever merged for those of us that were not around
 at the time and who may not have the chance to dive through mailing list
 archives between now and March.

It was rejected first because of the necessity of higher order page
allocations. Nick and I then added ways to virtually map higher order
pages if the page allocator could no longer provide those.

All of this required changes to the basic page cache operations. I added a
way for the mapping to indicate an order for an address range and then
modified the page cache operations to be able to operate on any order
pages.

The patchset that introduced the ability to specify different orders for
the pagecache address ranges was not accepted by Andrew because he thought
there was no chance for the rest of the modifications to become
acceptable.



Re: [Lsf-pc] [LSF/MM TOPIC] really large storage sectors - going beyond 4096 bytes

2014-01-23 Thread Christoph Lameter
On Thu, 23 Jan 2014, James Bottomley wrote:

 If the compound page infrastructure exists today and is usable for this,
 what else do we need to do? ... because if it's a couple of trivial
 changes and a few minor patches to filesystems to take advantage of it,
 we might as well do it anyway.  I was only objecting on the grounds that
 the last time we looked at it, it was major VM surgery.  Can someone
 give a summary of how far we are away from being able to do this with
 the VM system today and what extra work is needed (and how big is this
 piece of work)?

The main problem for me was the page cache. The VM would not be such a
problem. Changing the page cache functions required updates to many
filesystems.




Re: [Ksummit-2012-discuss] SCSI Performance regression [was Re: [PATCH 0/6] tcm_vhost/virtio-scsi WIP code for-3.6]

2012-07-06 Thread Christoph Lameter
On Fri, 6 Jul 2012, James Bottomley wrote:

 What people might pay attention to is evidence that there's a problem in
 3.5-rc6 (without any OFED crap).  If you're not going to bother
 investigating, it has to be in an environment they can reproduce (so
 ordinary hardware, not infiniband) otherwise it gets ignored as an
 esoteric hardware issue.

The OFED stuff in the meantime is part of 3.5-rc6. Infiniband has been
supported for a long time and it's a very important technology given the
problematic nature of ethernet at high network speeds.

OFED crap exists for those running RHEL5/6. The new enterprise distros are
based on the 3.2 kernel which has pretty good Infiniband support
out of the box.



RE: [PATCH] [scsi] Remove __GFP_DMA

2007-05-24 Thread Christoph Lameter
On Thu, 24 May 2007, Salyzyn, Mark wrote:

 So, is the sequence:
 
   p = kmalloc(upsg->sg[i].count, GFP_KERNEL);
   . . .
   addr = pci_map_single(dev->pdev, p, upsg->sg[i].count, data_dir);
 
 Going to ensure that we have a 31 bit (not 32 bit) physical address?

Only if you have less than 2G of memory. So no.
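
To spell out the arithmetic: a 31 bit mask covers physical addresses below
2^31 bytes, which is 2G. A GFP_KERNEL allocation is not restricted to that
range, so the buffer is only guaranteed to fit if the machine has no memory
above 2G. A rough userspace sketch of the check follows. The helper names
(bit_mask, fits_mask) are hypothetical and the physical address is just a
number passed in, not something obtained from the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mask with the low `bits` bits set; `bits` is expected to be 0..64. */
uint64_t bit_mask(unsigned int bits)
{
	return bits >= 64 ? UINT64_MAX : (((uint64_t)1 << bits) - 1);
}

/* True if the whole buffer [phys, phys + len) is addressable under mask. */
bool fits_mask(uint64_t phys, size_t len, uint64_t mask)
{
	uint64_t last;

	if (len == 0)
		return phys <= mask;

	/* Check the last byte, guarding against address wrap-around. */
	last = phys + (uint64_t)len - 1;
	return last >= phys && last <= mask;
}
```

With these helpers, a 4k buffer at 0x7ffff000 passes fits_mask() under
bit_mask(31), while the same buffer at 0x80000000 (the first byte above 2G)
does not.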



RE: [PATCH] [scsi] Remove __GFP_DMA

2007-05-24 Thread Christoph Lameter
On Thu, 24 May 2007, James Bottomley wrote:

  Going to ensure that we have a 31 bit (not 32 bit) physical address?
 
 No, unfortunately.  Implementing kmalloc_mask() and kmalloc_dev() was
 something I said I'd do ... about two years ago.

Tell me more about these ideas. 


RE: [PATCH] [scsi] Remove __GFP_DMA

2007-05-24 Thread Christoph Lameter
On Thu, 24 May 2007, James Bottomley wrote:

 The idea was basically to match an allocation to a device mask.  I was
 going to do a generic implementation (which would probably kmalloc,
 check the physaddr and fall back to GFP_DMA if we were unlucky) but
 allow the architectures to override.

Hmmm... We could actually implement something like it in the slab 
allocators. The mask parameter would lead the allocator to check if the 
objects are in a satisfactory range. If not it could examine its partial 
lists for slabs that satisfy the range. If that does not work then it 
would eventually go to the page allocator to ask for a page in a fitting 
range.

That won't be fast, though. How performance sensitive are the allocations?
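
A rough userspace sketch of that lookup order, purely to illustrate the idea.
Nothing here is slab allocator code: struct fake_slab, pick_slab and the
`fresh` argument (standing in for whatever the page allocator would hand back
for a range-constrained request) are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a slab: physical base address and size. */
struct fake_slab {
	uint64_t phys;
	size_t size;
};

enum source { FROM_ACTIVE, FROM_PARTIAL, FROM_PAGE_ALLOCATOR, FAILED };

/* True if the slab exists and lies entirely under the mask. */
static int slab_fits(const struct fake_slab *s, uint64_t mask)
{
	uint64_t last;

	if (s == NULL || s->size == 0)
		return 0;
	last = s->phys + (uint64_t)s->size - 1;
	return last >= s->phys && last <= mask;
}

/*
 * Try the active slab first, then scan the partial list, then fall
 * back to a fresh page from the (simulated) page allocator.
 */
enum source pick_slab(const struct fake_slab *active,
		      const struct fake_slab *partial, size_t npartial,
		      const struct fake_slab *fresh,
		      uint64_t mask, const struct fake_slab **out)
{
	size_t i;

	*out = NULL;
	if (slab_fits(active, mask)) {
		*out = active;
		return FROM_ACTIVE;
	}
	for (i = 0; i < npartial; i++) {
		if (slab_fits(&partial[i], mask)) {
			*out = &partial[i];
			return FROM_PARTIAL;
		}
	}
	if (slab_fits(fresh, mask)) {
		*out = fresh;
		return FROM_PAGE_ALLOCATOR;
	}
	return FAILED;
}
```

The linear scan of the partial list is where the cost concern comes from: in
the worst case every partial slab is examined before the page allocator is
asked.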



Re: [PATCH] [scsi] Remove __GFP_DMA

2007-05-22 Thread Christoph Lameter
On Mon, 21 May 2007, Bernhard Walle wrote:

 [PATCH] [scsi] Remove __GFP_DMA
 
 After 821de3a27bf33f11ec878562577c586cd5f83c64, it's not necessary to allocate
 a DMA buffer any more in sd.c.
 
 Signed-off-by: Bernhard Walle [EMAIL PROTECTED]

Great, that avoids a DMA kmalloc slab. Any other GFP_DMAs left in the scsi 
layer?

Acked-by: Christoph Lameter [EMAIL PROTECTED]