Re: [PATCH 1/7] BTRFS: Fix lseek return value for error

2011-09-18 Thread Jeff Liu
Hi Andreas and Andi,

Thanks for your comments.

On 09/18/2011 09:46 AM, Andi Kleen wrote:

 with an additional improvement: if the offset is larger than or equal to
 the file size, return -ENXIO directly:

        if (offset >= inode->i_size) {
                mutex_unlock(&inode->i_mutex);
                return -ENXIO;
        }

 Except that is wrong, because it would then be impossible to write sparse 
 files. 

Per my tryout, without that check, if the offset is >= the source file
size, calling lseek(fd, offset, SEEK_DATA/SEEK_HOLE) against Btrfs will
always return the total file size rather than -ENXIO.  However, our
desired result is -ENXIO in this case.  Am I right?
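
A minimal userspace sketch of that test (my own illustration, not part of
the original mail; it assumes a Linux libc that defines SEEK_DATA and
SEEK_HOLE, and the file name is arbitrary):

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        const char *path = argc > 1 ? argv[1] : "testfile";
        int fd = open(path, O_RDONLY);
        struct stat st;
        off_t pos;

        if (fd < 0 || fstat(fd, &st) < 0) {
                perror(path);
                return 1;
        }
        /* With the proposed check, seeking for data or a hole at (or past)
         * EOF should fail with ENXIO instead of returning the file size. */
        pos = lseek(fd, st.st_size, SEEK_DATA);
        printf("SEEK_DATA at EOF: %s\n",
               pos < 0 ? strerror(errno) : "unexpected success");
        pos = lseek(fd, st.st_size, SEEK_HOLE);
        printf("SEEK_HOLE at EOF: %s\n",
               pos < 0 ? strerror(errno) : "unexpected success");
        close(fd);
        return 0;
}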

 
 And also, i_size must always be read with i_size_read()

Thanks for pointing this out!
Would you please kindly review the revised patch below?

Signed-off-by: Jie Liu <jeff@oracle.com>

---
 fs/btrfs/file.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index e7872e4..40c1ef3 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1813,6 +1813,11 @@ static loff_t btrfs_file_llseek(struct file *file, loff_t offset, int origin)
                 goto out;
         case SEEK_DATA:
         case SEEK_HOLE:
+                if (offset >= i_size_read(inode)) {
+                        mutex_unlock(&inode->i_mutex);
+                        return -ENXIO;
+                }
+
                 ret = find_desired_extent(inode, &offset, origin);
                 if (ret) {
                         mutex_unlock(&inode->i_mutex);
@@ -1821,11 +1826,11 @@ static loff_t btrfs_file_llseek(struct file *file, loff_t offset, int origin)
         }
 
         if (offset < 0 && !(file->f_mode & FMODE_UNSIGNED_OFFSET)) {
-                ret = -EINVAL;
+                offset = -EINVAL;
                 goto out;
         }
         if (offset > inode->i_sb->s_maxbytes) {
-                ret = -EINVAL;
+                offset = -EINVAL;
                 goto out;
         }

-- 
1.7.4.1


Re: [PATCH 1/7] BTRFS: Fix lseek return value for error

2011-09-18 Thread Marco Stornelli

On 18/09/2011 09:29, Jeff Liu wrote:

Hi Andreas and Andi,

Thanks for your comments.

On 09/18/2011 09:46 AM, Andi Kleen wrote:


with an additional improvement: if the offset is larger than or equal to
the file size, return -ENXIO directly:

        if (offset >= inode->i_size) {
                mutex_unlock(&inode->i_mutex);
                return -ENXIO;
        }


Except that is wrong, because it would then be impossible to write sparse files.


Per my tryout, without that check, if the offset is >= the source file
size, calling lseek(fd, offset, SEEK_DATA/SEEK_HOLE) against Btrfs will
always return the total file size rather than -ENXIO.  However, our
desired result is -ENXIO in this case.  Am I right?



Yes, ENXIO should be the operation result.


Re: [PATCH 1/7] BTRFS: Fix lseek return value for error

2011-09-18 Thread Jeff liu

On 2011-9-18, at 4:42 PM, Marco Stornelli wrote:

 On 18/09/2011 09:29, Jeff Liu wrote:
 Hi Andreas and Andi,
 
 Thanks for your comments.
 
 On 09/18/2011 09:46 AM, Andi Kleen wrote:
 
 with an additional improvement: if the offset is larger than or equal to
 the file size, return -ENXIO directly:
 
        if (offset >= inode->i_size) {
                mutex_unlock(&inode->i_mutex);
                return -ENXIO;
        }
 
 Except that is wrong, because it would then be impossible to write sparse 
 files.
 
 Per my tryout, without that check, if the offset is >= the source file
 size, calling lseek(fd, offset, SEEK_DATA/SEEK_HOLE) against Btrfs will
 always return the total file size rather than -ENXIO.  However, our
 desired result is -ENXIO in this case.  Am I right?
 
 
 Yes, ENXIO should be the operation result.

Thanks for your kind confirmation.


-Jeff



Re: [PATCH 1/7] BTRFS: Fix lseek return value for error

2011-09-18 Thread Chris Mason
Excerpts from Jeff liu's message of 2011-09-18 06:33:38 -0400:
 
 On 2011-9-18, at 4:42 PM, Marco Stornelli wrote:
 
  On 18/09/2011 09:29, Jeff Liu wrote:
  Hi Andreas and Andi,
  
  Thanks for your comments.
  
  On 09/18/2011 09:46 AM, Andi Kleen wrote:
  
  with an additional improvement: if the offset is larger than or equal to
  the file size, return -ENXIO directly:
  
        if (offset >= inode->i_size) {
                mutex_unlock(&inode->i_mutex);
                return -ENXIO;
        }
  
  Except that is wrong, because it would then be impossible to write 
  sparse files.
  
  Per my tryout, without that check, if the offset is >= the source file
  size, calling lseek(fd, offset, SEEK_DATA/SEEK_HOLE) against Btrfs will
  always return the total file size rather than -ENXIO.  However, our
  desired result is -ENXIO in this case.  Am I right?
  
  
  Yes, ENXIO should be the operation result.
 
 Thanks for your kind confirmation.

Thanks everyone, I've put Jeff's last version of this in my queue.

-chris


Re: [PATCH] btrfs: fix directory offsets for '.' and '..' entries

2011-09-18 Thread Chris Mason
Excerpts from Tsutomu Itoh's message of 2011-09-11 19:49:22 -0400:
 The same patch has been posted about one month ago.
 
   http://marc.info/?l=linux-btrfsm=131363399500506w=2

Thanks, I've queued up the fujitsu version (just because they sent it
first).

-chris


Re: [PATCH] Btrfs: fix pages truncation in btrfs_ioctl_clone()

2011-09-18 Thread Chris Mason
Excerpts from Sage Weil's message of 2011-09-16 12:39:06 -0400:
 On Fri, 16 Sep 2011, Li Zefan wrote:
  It's a bug in commit f81c9cdc567cd3160ff9e64868d9a1a7ee226480
  (Btrfs: truncate pages from clone ioctl target range)
  
  We should pass the dest range to the truncate function, but not the
  src range.
 
 Sigh... yes.
 
  Also move the function before locking extent state.
 
 Hmm, any reason?  i_mutex protects us from a racing write(2), but what 
 about a racing mmap()?  e.g.
 
 cloner: truncates dest pages
 writer: mmap -> page_mkwrite locks extent, creates new dirty page, unlocks
 cloner: locks extent, clones, unlocks extent

Thanks guys.  The locking order is page lock -> extent lock.  So if we
call truncate_inode_pages with the extent lock held, we're off into an
ABBA deadlock.

If we want to avoid the mmap race, we'll have to look for the dirty
pages with the extent lock held, drop the lock and go back to the
truncate_inode_pages call.
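
A toy sketch of that drop-and-retry pattern, reduced to two userspace
mutexes (my own illustration, not btrfs code; every name in it is made up):

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;   /* "A" */
static pthread_mutex_t extent_lock = PTHREAD_MUTEX_INITIALIZER; /* "B" */
static int dirty_pages = 1;     /* pretend mmap dirtied a page behind us */

static void truncate_range_pages(void)
{
        pthread_mutex_lock(&page_lock);         /* A taken with no B held */
        dirty_pages = 0;
        pthread_mutex_unlock(&page_lock);
}

static void clone_range(void)
{
        for (;;) {
                truncate_range_pages();                 /* uses A only */
                pthread_mutex_lock(&extent_lock);       /* then B */
                if (!dirty_pages)
                        break;          /* nothing raced in, go ahead */
                /* A dirty page appeared: drop B and redo the truncate
                 * rather than taking A while holding B (ABBA). */
                pthread_mutex_unlock(&extent_lock);
        }
        printf("cloning with extent lock held, no dirty pages\n");
        pthread_mutex_unlock(&extent_lock);
}

int main(void)
{
        clone_range();
        return 0;
}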

-chris


Re: btrfs vs data deduplication

2011-09-18 Thread Maciej Marcin Piechotka
On Sat, 2011-07-09 at 08:19 +0200, Paweł Brodacki wrote:
 Hello,
 
 I've stumbled upon this article:
 http://storagemojo.com/2011/06/27/de-dup-too-much-of-good-thing/
 
 Reportedly the Sandforce SF1200 SSD controller does block-level
 data de-duplication internally. This effectively removes the additional
 protection given by writing multiple metadata copies. This technique
 may be used now, or could be used in the future, by manufacturers of
 other drives too.
 
 I would like to ask whether the metadata copies written to a btrfs file
 system with metadata mirroring enabled are identical, or whether there is
 something that makes them unique on disk, therefore preventing their
 de-duplication. I tried googling for the answer, but didn't find
 anything that would answer my question.
 
 If the metadata copies are identical, I'd like to ask whether it would be
 possible to change this without major disruption. I know that changes
 to the on-disk format aren't made lightly, but I'd be grateful for
 any comments.
 
 The increased risk of file system corruption introduced by data
 de-duplication on Sandforce controllers was downplayed in the
 vendor's reply included in the article, but still, what's the point of
 duplicating metadata at the file system level if the storage below can
 remove that redundancy?
 
 Regards,
 Paweł

Hello,

Sorry to add my $0.03. It is possible to work around it by using
encryption. If a mode other than ECB is used, elements that are identical
in the unencrypted data are stored as different data on disk.

The drawbacks:

 - Encryption overhead (you may want to use a non-secure mode as you're
not interested in security)
 - The avalanche effect (the whole [encryption] block gets corrupted
even if one bit of the block is corrupted).
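
A small illustration of why non-ECB encryption defeats block-level dedup
(my own sketch, using OpenSSL's EVP API rather than anything btrfs- or
SandForce-specific; build with -lcrypto). The same plaintext block
encrypted under two different IVs, e.g. derived from the sector number,
comes out as two different ciphertext blocks, so a dedup engine below the
filesystem sees them as unique:

#include <openssl/evp.h>
#include <stdio.h>
#include <string.h>

static void encrypt_block(const unsigned char *key, const unsigned char *iv,
                          const unsigned char *in, unsigned char *out)
{
        EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
        int len = 0;

        EVP_EncryptInit_ex(ctx, EVP_aes_256_cbc(), NULL, key, iv);
        EVP_CIPHER_CTX_set_padding(ctx, 0);     /* exactly one block, no padding */
        EVP_EncryptUpdate(ctx, out, &len, in, 16);
        EVP_EncryptFinal_ex(ctx, out + len, &len);
        EVP_CIPHER_CTX_free(ctx);
}

int main(void)
{
        unsigned char key[32] = { 0 };          /* demo key only */
        unsigned char block[16] = "same plaintext!";
        unsigned char iv0[16] = { 0 }, iv1[16] = { 0 };
        unsigned char c0[16], c1[16];

        iv1[15] = 1;    /* pretend this block sits at sector 1 instead of 0 */
        encrypt_block(key, iv0, block, c0);
        encrypt_block(key, iv1, block, c1);
        printf("ciphertext blocks %s\n",
               memcmp(c0, c1, 16) ? "differ (dedup sees two unique blocks)"
                                  : "match");
        return 0;
}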

Regards




Re: btrfs vs data deduplication

2011-09-18 Thread Chris Samuel
On Mon, 19 Sep 2011, 06:15:51 EST, Hubert Kario hub...@kario.pl wrote:

 You shouldn't depend on single drive, metadata
 raid is there to protect against single bad
 blocks, not disk crash.

I guess the issue here is you no longer even
have that protection with this sort of dedup.

cheers,
Chris
-- 
Chris Samuel - http://www.csamuel.org/


Inefficient storing of ISO images with compress=lzo

2011-09-18 Thread Maciej Marcin Piechotka
I've noticed that:

 - with the x86-64 Fedora 15 DVD install images:
   - du -sh ROOT VOLUME was 36 GB
   - btrfs df | grep -i data showed over 40 GB used
 - without them:
   - du -sh ROOT VOLUME is 34 GB
   - btrfs df | grep -i data showed less than 34 GB used

It seems that ISO files are considered compressible while they may not be
(and the penalty is severe, about 3x).

Regards

 





Re: kernel BUG at fs/btrfs/inode.c:2299

2011-09-18 Thread Maciej Marcin Piechotka
On Tue, 2011-08-30 at 14:27 +0800, Miao Xie wrote:
  
  Unfortunately it results in a freeze of the system and I cannot give
  more details. Sometimes it happens without fcron being involved, but
  then it does not result in a freeze (???).
 
 Could you give me the method to reproduce it?
 
 Thanks
 Miao

Sorry for spamming this thread, but I'm trying to post my findings in
the hope that somebody will understand what's going on.

A recent crash gave some valuable information, IMHO:

 1. I started autocompletion of a path in zsh
 2. At some point zsh hung. In ps the process was listed as
runnable
 3. Any access to the root volume (the one that zsh was trying to readdir)
ended in a hang.
 4. I was able to access the child volume (/home)
 5. After some time the bug is hit. At that point strange things happen
(screen freezes etc.). I guess that there is some strange interaction
between KMS, X and the now-hung composite manager

The next time it happened (also while listing the root directory of volume 0)
I observed the following: I could log out and unmount home, but
volume 0 remained busy and could not be unmounted.

Things to consider:

 - It is not enabled/disabled by any mount option
 - Is it triggered when the parent volume (say volume 0) and a child
volume are both mounted?
 - Which case is failing? (I've tried to add a printk, but I cannot find
the right printk format for printing a u64; see the sketch after this list)
 - Why does it happen only at night?
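
On the printk question above: a sketch of the usual approach (a made-up
helper, not actual btrfs code, and only meaningful inside a kernel tree) is
to print a u64 with %llu and an explicit cast, since u64 may be unsigned
long or unsigned long long depending on the architecture:

#include <linux/kernel.h>
#include <linux/types.h>

/* Hypothetical debugging helper; name and arguments are made up. */
static void dump_u64_range(u64 start, u64 len)
{
        printk(KERN_DEBUG "btrfs debug: start=%llu len=%llu\n",
               (unsigned long long)start, (unsigned long long)len);
}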

Regards



Re: Inefficient storing of ISO images with compress=lzo

2011-09-18 Thread Li Zefan
Maciej Marcin Piechotka wrote:
 I've noticed that:
 
  - with the x86-64 Fedora 15 DVD install images:
    - du -sh ROOT VOLUME was 36 GB
    - btrfs df | grep -i data showed over 40 GB used
  - without them:
    - du -sh ROOT VOLUME is 34 GB
    - btrfs df | grep -i data showed less than 34 GB used
 
 It seems that ISO files are considered compressible while they may not be
 (and the penalty is severe, about 3x).
 

With the compress option specified, btrfs will try to compress the file, at
most 128K at a time, and if the compressed result is not smaller, the file
will be marked as incompressible.

I just tried with Fedora-14-i386-DVD.iso, and the first 896K is compressed,
with a compression ratio of about 71.7%; the remaining data is not compressed.
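
A rough userspace approximation of that heuristic (my own sketch: it uses
zlib instead of the kernel's lzo/zlib code, and it only reports per-128K
chunks, whereas btrfs marks the whole file; build with -lz):

#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>

#define CHUNK_SIZE (128 * 1024)

/* Returns 1 if the chunk compressed to something smaller, 0 otherwise. */
static int chunk_is_compressible(const unsigned char *data, size_t len,
                                 size_t *out_len)
{
        uLongf bound = compressBound(len);
        unsigned char *buf = malloc(bound);
        int ok = 0;

        if (buf && compress(buf, &bound, data, len) == Z_OK && bound < len) {
                *out_len = bound;
                ok = 1;
        }
        free(buf);
        return ok;
}

int main(int argc, char **argv)
{
        FILE *f = fopen(argc > 1 ? argv[1] : "test.iso", "rb");
        static unsigned char chunk[CHUNK_SIZE];
        size_t n, clen = 0, chunk_no = 0;

        if (!f)
                return 1;
        while ((n = fread(chunk, 1, sizeof(chunk), f)) > 0) {
                if (chunk_is_compressible(chunk, n, &clen))
                        printf("chunk %zu: %zu -> %zu bytes\n",
                               chunk_no, n, clen);
                else
                        printf("chunk %zu: not smaller, would stay uncompressed\n",
                               chunk_no);
                chunk_no++;
        }
        fclose(f);
        return 0;
}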

--
Li Zefan


Re: [PATCH] Btrfs: fix pages truncation in btrfs_ioctl_clone()

2011-09-18 Thread Li Zefan
 Also move the function before locking extent state.
 
 Hmm, any reason?  i_mutex protects us from a racing write(2), but what 
 about a racing mmap()?  e.g.
 
 cloner: truncates dest pages
 writer: mmap -> page_mkwrite locks extent, creates new dirty page, unlocks
 cloner: locks extent, clones, unlocks extent
 

(besides Chris' comments)

How can we avoid the race on the dst file by locking the src file's extent state?

--
Li Zefan


Re: Inefficient storing of ISO images with compress=lzo

2011-09-18 Thread Li Zefan
Li Zefan wrote:
 Maciej Marcin Piechotka wrote:
 I've noticed that:

 - with the x86-64 Fedora 15 DVD install images:
   - du -sh ROOT VOLUME was 36 GB
   - btrfs df | grep -i data showed over 40 GB used
 - without them:
   - du -sh ROOT VOLUME is 34 GB
   - btrfs df | grep -i data showed less than 34 GB used

 It seems that ISO files are considered compressible while they may not be
 (and the penalty is severe, about 3x).

 
 With the compress option specified, btrfs will try to compress the file, at
 most 128K at a time, and if the compressed result is not smaller, the file
 will be marked as incompressible.
 
 I just tried with Fedora-14-i386-DVD.iso, and the first 896K is compressed,
 with a compression ratio of about 71.7%; the remaining data is not compressed.
 

Correction: the compression ratio is 38.3%, pretty good :)


Re: [PATCH] Btrfs: fix pages truncation in btrfs_ioctl_clone()

2011-09-18 Thread Sage Weil
On Sun, 18 Sep 2011, Chris Mason wrote:
 Excerpts from Sage Weil's message of 2011-09-16 12:39:06 -0400:
  On Fri, 16 Sep 2011, Li Zefan wrote:
   It's a bug in commit f81c9cdc567cd3160ff9e64868d9a1a7ee226480
   (Btrfs: truncate pages from clone ioctl target range)
   
   We should pass the dest range to the truncate function, but not the
   src range.
  
  Sigh... yes.
  
   Also move the function before locking extent state.
  
  Hmm, any reason?  i_mutex protects us from a racing write(2), but what 
  about a racing mmap()?  e.g.
  
  cloner: truncates dest pages
  writer: mmap -> page_mkwrite locks extent, creates new dirty page, unlocks
  cloner: locks extent, clones, unlocks extent
 
 Thanks guys.  The locking order is page lock -> extent lock.  So if we
 call truncate_inode_pages with the extent lock held, we're off into an
 ABBA deadlock.

Oh right.  This patch makes sense now.
 
 If we want to avoid the mmap race, we'll have to look for the dirty
 pages with the extent lock held, drop the lock and go back to the
 truncate_inode_pages call.

Yeah, given that (as Li points out) we're not locking dst extents at all,
I don't think it's worth worrying about now.  Sorry for the noise!

sage
