On Thu, Nov 29, 2012 at 9:51 AM, Chris Mason wrote:
>
> The bigger question is do we have users that expect to be able to set
> the blocksize after mmaping the block device (no writes required)? I
> actually feel a little bad for taking up internet bandwidth asking, but
> it is a change in
On Thu, Nov 29, 2012 at 10:26:56AM -0700, Linus Torvalds wrote:
> On Thu, Nov 29, 2012 at 6:12 AM, Chris Mason wrote:
> >
> > Jumping in based on Linus original patch, which is doing something like
> > this:
> >
> > set_blocksize() {
> > block new calls to writepage, prepare/commit_write
On Thu, Nov 29, 2012 at 6:12 AM, Chris Mason wrote:
>
> Jumping in based on Linus original patch, which is doing something like
> this:
>
> set_blocksize() {
> block new calls to writepage, prepare/commit_write
> set the block size
> unblock
>
> < --- can race in
On Thu, Nov 29, 2012 at 07:12:49AM -0700, Chris Mason wrote:
> On Wed, Nov 28, 2012 at 11:16:21PM -0700, Linus Torvalds wrote:
> > On Wed, Nov 28, 2012 at 6:58 PM, Linus Torvalds
> > wrote:
> > >
> > > But the fact that the code wants to do things like
> > >
> > > block =
On Wed, Nov 28, 2012 at 11:16:21PM -0700, Linus Torvalds wrote:
> On Wed, Nov 28, 2012 at 6:58 PM, Linus Torvalds
> wrote:
> >
> > But the fact that the code wants to do things like
> >
> > block = (sector_t)page->index << (PAGE_CACHE_SHIFT - bbits);
> >
> > seriously seems to be the main
On Thu, Nov 29, 2012 at 2:45 PM, Al Viro wrote:
> On Wed, Nov 28, 2012 at 10:37:27PM -0800, Linus Torvalds wrote:
>> On Wed, Nov 28, 2012 at 10:30 PM, Al Viro wrote:
>> >
>> > Note that sync_blockdev() a few lines prior to that is good only if we
>> > have no other processes doing write(2) (or
On Thu, Nov 29, 2012 at 2:45 PM, Al Viro v...@zeniv.linux.org.uk wrote:
On Wed, Nov 28, 2012 at 10:37:27PM -0800, Linus Torvalds wrote:
On Wed, Nov 28, 2012 at 10:30 PM, Al Viro v...@zeniv.linux.org.uk wrote:
Note that sync_blockdev() a few lines prior to that is good only if we
have no
On Wed, Nov 28, 2012 at 11:16:21PM -0700, Linus Torvalds wrote:
On Wed, Nov 28, 2012 at 6:58 PM, Linus Torvalds
torva...@linux-foundation.org wrote:
But the fact that the code wants to do things like
block = (sector_t)page-index (PAGE_CACHE_SHIFT - bbits);
seriously seems to
On Thu, Nov 29, 2012 at 07:12:49AM -0700, Chris Mason wrote:
On Wed, Nov 28, 2012 at 11:16:21PM -0700, Linus Torvalds wrote:
On Wed, Nov 28, 2012 at 6:58 PM, Linus Torvalds
torva...@linux-foundation.org wrote:
But the fact that the code wants to do things like
block =
On Thu, Nov 29, 2012 at 6:12 AM, Chris Mason chris.ma...@fusionio.com wrote:
Jumping in based on Linus original patch, which is doing something like
this:
set_blocksize() {
block new calls to writepage, prepare/commit_write
set the block size
unblock
---
On Thu, Nov 29, 2012 at 10:26:56AM -0700, Linus Torvalds wrote:
On Thu, Nov 29, 2012 at 6:12 AM, Chris Mason chris.ma...@fusionio.com wrote:
Jumping in based on Linus original patch, which is doing something like
this:
set_blocksize() {
block new calls to writepage,
On Thu, Nov 29, 2012 at 9:51 AM, Chris Mason chris.ma...@fusionio.com wrote:
The bigger question is do we have users that expect to be able to set
the blocksize after mmaping the block device (no writes required)? I
actually feel a little bad for taking up internet bandwidth asking, but
it
On Wed, Nov 28, 2012 at 10:37:27PM -0800, Linus Torvalds wrote:
> On Wed, Nov 28, 2012 at 10:30 PM, Al Viro wrote:
> >
> > Note that sync_blockdev() a few lines prior to that is good only if we
> > have no other processes doing write(2) (or dirtying the mmapped pages,
> > for that matter). The
On Wed, Nov 28, 2012 at 10:30 PM, Al Viro wrote:
>
> Note that sync_blockdev() a few lines prior to that is good only if we
> have no other processes doing write(2) (or dirtying the mmapped pages,
> for that matter). The window isn't too wide, but...
So with Mikulas' patches, the write actually
On Wed, Nov 28, 2012 at 10:25 PM, Al Viro wrote:
>
> Umm... set_blocksize() is calling kill_bdev(), which does
> truncate_inode_pages(mapping, 0). What's going to happen to data in
> the dirty pages? IO in progress is not the only thing to worry about...
Hmm. Yes. I think it works by
On Thu, Nov 29, 2012 at 06:25:19AM +, Al Viro wrote:
> Umm... set_blocksize() is calling kill_bdev(), which does
> truncate_inode_pages(mapping, 0). What's going to happen to data in
> the dirty pages? IO in progress is not the only thing to worry about...
Note that sync_blockdev()
On Wed, Nov 28, 2012 at 10:16:21PM -0800, Linus Torvalds wrote:
> On Wed, Nov 28, 2012 at 6:58 PM, Linus Torvalds
> wrote:
> >
> > But the fact that the code wants to do things like
> >
> > block = (sector_t)page->index << (PAGE_CACHE_SHIFT - bbits);
> >
> > seriously seems to be the main
On Wed, Nov 28, 2012 at 6:58 PM, Linus Torvalds
wrote:
>
> But the fact that the code wants to do things like
>
> block = (sector_t)page->index << (PAGE_CACHE_SHIFT - bbits);
>
> seriously seems to be the main thing that keeps us using
> 'inode->i_blkbits'. Calculating bbits from
On Wed, Nov 28, 2012 at 6:04 PM, Linus Torvalds
wrote:
>
> Interesting. The code *has* the block size (it's in "bh->b_size"), but
> it actually then uses the inode blocksize instead, and verifies the
> two against each other. It could just have used the block size
> directly (and then used the
On Wed, 28 Nov 2012, Linus Torvalds wrote:
> A bigger issue is for things that emulate what blkdev.c does, and
> doesn't do the locking. I see code in md/bitmap.c that seems a bit
> suspicious, for example. That said, it's not *new* breakage, and the
> "lock at mmap/read/write() time" approach
[ Sorry, I was offline for a while driving kids around ]
On Wed, Nov 28, 2012 at 4:38 PM, Mikulas Patocka wrote:
>
> It can happen. Take your patch (the one that moves bd_block_size_semaphore
> into blkdev_readpage, blkdev_writepage and blkdev_write_begin).
Interesting. The code *has* the block
On Wed, 28 Nov 2012, Linus Torvalds wrote:
> > For example, __block_write_full_page and __block_write_begin do
> > if (!page_has_buffers(page)) { create_empty_buffers... }
> > and then they do
> > WARN_ON(bh->b_size != blocksize)
> > err = get_block(inode, block, bh, 1)
On Wed, Nov 28, 2012 at 2:52 PM, Linus Torvalds
wrote:
>
>> For example, __block_write_full_page and __block_write_begin do
>> if (!page_has_buffers(page)) { create_empty_buffers... }
>> and then they do
>> WARN_ON(bh->b_size != blocksize)
>> err = get_block(inode, block,
On Wed, Nov 28, 2012 at 1:29 PM, Mikulas Patocka wrote:
>
> The problem with this approach is that it is very easy to miss points
> where it is assumed that the block size doesn't change - and if you miss a
> point, it results in a hidden bug that has a little possibility of being
> found.
Umm.
On Wed, 28 Nov 2012, Linus Torvalds wrote:
> On Wed, Nov 28, 2012 at 12:32 PM, Linus Torvalds
> wrote:
> >
> > Here is a *COMPLETELY* untested patch. Caveat emptor. It will probably
> > do unspeakable things to your family and pets.
>
> Btw, *if* this approach works, I suspect we could just
On Wed, 28 Nov 2012, Linus Torvalds wrote:
> On Wed, Nov 28, 2012 at 12:03 PM, Linus Torvalds
> wrote:
> >
> > mmap() is in *no* way special. The exact same thing happens for
> > regular read/write. Yet somehow the mmap code is special-cased, while
> > the normal read-write code is not.
>
> I
On Wed, Nov 28, 2012 at 12:32 PM, Linus Torvalds
wrote:
>
> Here is a *COMPLETELY* untested patch. Caveat emptor. It will probably
> do unspeakable things to your family and pets.
Btw, *if* this approach works, I suspect we could just switch the
bd_block_size_semaphore semaphore to be a regular
On Wed, Nov 28, 2012 at 12:13 PM, Linus Torvalds
wrote:
>
> I dunno. Maybe there is some fundamental reason why the above is
> broken, but it seems to be a much simpler approach. Sure, you need to
> guarantee that the people who get the write-lock cannot possibly cause
> IO while holding it, but
On Wed, Nov 28, 2012 at 12:03 PM, Linus Torvalds
wrote:
>
> mmap() is in *no* way special. The exact same thing happens for
> regular read/write. Yet somehow the mmap code is special-cased, while
> the normal read-write code is not.
I just double-checked, because it's been a long time since I
On Wed, Nov 28, 2012 at 11:50 AM, Mikulas Patocka wrote:
>
> mmap_region() doesn't care about the block size. But a lot of
> page-in/page-out code does.
That seems a bogus argument.
mmap() is in *no* way special. The exact same thing happens for
regular read/write. Yet somehow the mmap code is
On Wed, Nov 28, 2012 at 11:43 AM, Al Viro wrote:
> Have a
> private vm_operations - a copy of generic_file_vm_ops with ->open()/->close()
> added to it.
That sounds more reasonable.
However, I suspect the *most* reasonable thing to do is to just remove
the whole damn thing.
On Wed, 28 Nov 2012, Linus Torvalds wrote:
> No, this is crap.
>
> We don't introduce random hooks like this just because the block layer
> has shit-for-brains and cannot be bothered to do things right.
>
> The fact is, the whole locking in the block layer open routine is
> total and utter
On Wed, Nov 28, 2012 at 11:15:12AM -0800, Linus Torvalds wrote:
> No, this is crap.
>
> We don't introduce random hooks like this just because the block layer
> has shit-for-brains and cannot be bothered to do things right.
>
> The fact is, the whole locking in the block layer open routine is
>
No, this is crap.
We don't introduce random hooks like this just because the block layer
has shit-for-brains and cannot be bothered to do things right.
The fact is, the whole locking in the block layer open routine is
total and utter crap. It doesn't lock the right thing, even with your
change
On Wed, 28 Nov 2012, Jens Axboe wrote:
> On 2012-11-28 04:57, Mikulas Patocka wrote:
> >
> > This patch is wrong because you must check if the device is mapped while
> > holding bdev->bd_block_size_semaphore (because
> > bdev->bd_block_size_semaphore prevents new mappings from being created)
On Wed, 28 Nov 2012, Jens Axboe wrote:
On 2012-11-28 04:57, Mikulas Patocka wrote:
This patch is wrong because you must check if the device is mapped while
holding bdev-bd_block_size_semaphore (because
bdev-bd_block_size_semaphore prevents new mappings from being created)
No it
No, this is crap.
We don't introduce random hooks like this just because the block layer
has shit-for-brains and cannot be bothered to do things right.
The fact is, the whole locking in the block layer open routine is
total and utter crap. It doesn't lock the right thing, even with your
change
On Wed, Nov 28, 2012 at 11:15:12AM -0800, Linus Torvalds wrote:
No, this is crap.
We don't introduce random hooks like this just because the block layer
has shit-for-brains and cannot be bothered to do things right.
The fact is, the whole locking in the block layer open routine is
total
On Wed, 28 Nov 2012, Linus Torvalds wrote:
No, this is crap.
We don't introduce random hooks like this just because the block layer
has shit-for-brains and cannot be bothered to do things right.
The fact is, the whole locking in the block layer open routine is
total and utter crap. It
On Wed, Nov 28, 2012 at 11:43 AM, Al Viro v...@zeniv.linux.org.uk wrote:
Have a
private vm_operations - a copy of generic_file_vm_ops with -open()/-close()
added to it.
That sounds more reasonable.
However, I suspect the *most* reasonable thing to do is to just remove
the
On Wed, Nov 28, 2012 at 11:50 AM, Mikulas Patocka mpato...@redhat.com wrote:
mmap_region() doesn't care about the block size. But a lot of
page-in/page-out code does.
That seems a bogus argument.
mmap() is in *no* way special. The exact same thing happens for
regular read/write. Yet somehow
On Wed, Nov 28, 2012 at 12:03 PM, Linus Torvalds
torva...@linux-foundation.org wrote:
mmap() is in *no* way special. The exact same thing happens for
regular read/write. Yet somehow the mmap code is special-cased, while
the normal read-write code is not.
I just double-checked, because it's
On Wed, Nov 28, 2012 at 12:13 PM, Linus Torvalds
torva...@linux-foundation.org wrote:
I dunno. Maybe there is some fundamental reason why the above is
broken, but it seems to be a much simpler approach. Sure, you need to
guarantee that the people who get the write-lock cannot possibly cause
On Wed, Nov 28, 2012 at 12:32 PM, Linus Torvalds
torva...@linux-foundation.org wrote:
Here is a *COMPLETELY* untested patch. Caveat emptor. It will probably
do unspeakable things to your family and pets.
Btw, *if* this approach works, I suspect we could just switch the
bd_block_size_semaphore
On Wed, 28 Nov 2012, Linus Torvalds wrote:
On Wed, Nov 28, 2012 at 12:03 PM, Linus Torvalds
torva...@linux-foundation.org wrote:
mmap() is in *no* way special. The exact same thing happens for
regular read/write. Yet somehow the mmap code is special-cased, while
the normal read-write
On Wed, 28 Nov 2012, Linus Torvalds wrote:
On Wed, Nov 28, 2012 at 12:32 PM, Linus Torvalds
torva...@linux-foundation.org wrote:
Here is a *COMPLETELY* untested patch. Caveat emptor. It will probably
do unspeakable things to your family and pets.
Btw, *if* this approach works, I
On Wed, Nov 28, 2012 at 1:29 PM, Mikulas Patocka mpato...@redhat.com wrote:
The problem with this approach is that it is very easy to miss points
where it is assumed that the block size doesn't change - and if you miss a
point, it results in a hidden bug that has a little possibility of being
On Wed, Nov 28, 2012 at 2:52 PM, Linus Torvalds
torva...@linux-foundation.org wrote:
For example, __block_write_full_page and __block_write_begin do
if (!page_has_buffers(page)) { create_empty_buffers... }
and then they do
WARN_ON(bh-b_size != blocksize)
err =
On Wed, 28 Nov 2012, Linus Torvalds wrote:
For example, __block_write_full_page and __block_write_begin do
if (!page_has_buffers(page)) { create_empty_buffers... }
and then they do
WARN_ON(bh-b_size != blocksize)
err = get_block(inode, block, bh, 1)
Right.
[ Sorry, I was offline for a while driving kids around ]
On Wed, Nov 28, 2012 at 4:38 PM, Mikulas Patocka mpato...@redhat.com wrote:
It can happen. Take your patch (the one that moves bd_block_size_semaphore
into blkdev_readpage, blkdev_writepage and blkdev_write_begin).
Interesting. The code
On Wed, 28 Nov 2012, Linus Torvalds wrote:
A bigger issue is for things that emulate what blkdev.c does, and
doesn't do the locking. I see code in md/bitmap.c that seems a bit
suspicious, for example. That said, it's not *new* breakage, and the
lock at mmap/read/write() time approach
On Wed, Nov 28, 2012 at 6:04 PM, Linus Torvalds
torva...@linux-foundation.org wrote:
Interesting. The code *has* the block size (it's in bh-b_size), but
it actually then uses the inode blocksize instead, and verifies the
two against each other. It could just have used the block size
directly
On Wed, Nov 28, 2012 at 6:58 PM, Linus Torvalds
torva...@linux-foundation.org wrote:
But the fact that the code wants to do things like
block = (sector_t)page-index (PAGE_CACHE_SHIFT - bbits);
seriously seems to be the main thing that keeps us using
'inode-i_blkbits'. Calculating
On Wed, Nov 28, 2012 at 10:16:21PM -0800, Linus Torvalds wrote:
On Wed, Nov 28, 2012 at 6:58 PM, Linus Torvalds
torva...@linux-foundation.org wrote:
But the fact that the code wants to do things like
block = (sector_t)page-index (PAGE_CACHE_SHIFT - bbits);
seriously seems to
On Thu, Nov 29, 2012 at 06:25:19AM +, Al Viro wrote:
Umm... set_blocksize() is calling kill_bdev(), which does
truncate_inode_pages(mapping, 0). What's going to happen to data in
the dirty pages? IO in progress is not the only thing to worry about...
Note that sync_blockdev() a
On Wed, Nov 28, 2012 at 10:25 PM, Al Viro v...@zeniv.linux.org.uk wrote:
Umm... set_blocksize() is calling kill_bdev(), which does
truncate_inode_pages(mapping, 0). What's going to happen to data in
the dirty pages? IO in progress is not the only thing to worry about...
Hmm. Yes. I
On Wed, Nov 28, 2012 at 10:30 PM, Al Viro v...@zeniv.linux.org.uk wrote:
Note that sync_blockdev() a few lines prior to that is good only if we
have no other processes doing write(2) (or dirtying the mmapped pages,
for that matter). The window isn't too wide, but...
So with Mikulas' patches,
On Wed, Nov 28, 2012 at 10:37:27PM -0800, Linus Torvalds wrote:
On Wed, Nov 28, 2012 at 10:30 PM, Al Viro v...@zeniv.linux.org.uk wrote:
Note that sync_blockdev() a few lines prior to that is good only if we
have no other processes doing write(2) (or dirtying the mmapped pages,
for that
58 matches
Mail list logo