Re: [RFC PATCH 4/7] Btrfs: introduce subvol uuids and times

2012-07-05 Thread Zach Brown
On 07/05/2012 11:59 AM, Ilya Dryomov wrote: What if you are on a big-endian machine with a big-endian kernel and userspace? Everything on-disk should be little-endian, so if you are going to write stuff you got from userspace to disk, at some point you have to make sure you are writing out byte

Re: [RFC PATCH 4/7] Btrfs: introduce subvol uuids and times

2012-07-05 Thread Zach Brown
and take endianness into account with le{64,32}_to_cpu and cpu_to_le{64,32} macros. The kernel doesn't support system calls from userspace of a different endianness, no worries there :) - z -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to ma
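
A userspace illustration of the byte-order point in this thread, as a sketch only. The kernel macros named above (cpu_to_le64, le64_to_cpu) are real; the helpers put_le64 and get_le64 below are hypothetical stand-ins that produce little-endian bytes on any host using nothing but standard C.

```c
#include <stdint.h>

/* Hypothetical helper: store a host-order value as 8 little-endian bytes.
 * Shifting extracts the least significant byte first, so the result is
 * the same on big-endian and little-endian hosts. */
static void put_le64(uint8_t out[8], uint64_t v)
{
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(v >> (8 * i));
}

/* Hypothetical helper: read 8 little-endian bytes back into host order. */
static uint64_t get_le64(const uint8_t in[8])
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v |= (uint64_t)in[i] << (8 * i);
    return v;
}
```

Because the helpers never look at the host's byte order, the bytes written "to disk" are identical everywhere, which is the property the on-disk format needs.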

Re: [RFC PATCH 4/7] Btrfs: introduce subvol uuids and times

2012-07-05 Thread Zach Brown
On 07/05/2012 10:14 AM, Alexander Block wrote: On Thu, Jul 5, 2012 at 7:08 PM, Zach Brown wrote: Careful, timespec will be different sizes in 32bit userspace and a 64bit kernel. I'd use btrfs_timespec to get a fixed size timespec and avoid all the compat_timespec noise. (I'd then
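
To make the size concern concrete, here is a minimal sketch of a fixed-size timestamp. The struct name fixed_timespec and the helper to_fixed are hypothetical, the packed attribute assumes GCC or Clang, and byte order is left in host order for brevity; it is not the btrfs definition, only an illustration of why a fixed layout sidesteps the 32-bit versus 64-bit difference.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical fixed layout: 8 + 4 bytes with no padding, so 12 bytes
 * regardless of whether the caller is 32-bit or 64-bit userspace. */
struct fixed_timespec {
    uint64_t sec;
    uint32_t nsec;
} __attribute__((packed));

/* Copy a native struct timespec, whose size varies by ABI, into the
 * fixed layout. */
static struct fixed_timespec to_fixed(const struct timespec *ts)
{
    struct fixed_timespec f;

    f.sec = (uint64_t)ts->tv_sec;
    f.nsec = (uint32_t)ts->tv_nsec;
    return f;
}
```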

Re: [RFC PATCH 4/7] Btrfs: introduce subvol uuids and times

2012-07-05 Thread Zach Brown
+static long btrfs_ioctl_set_received_subvol(struct file *file, + void __user *arg) +{ + struct btrfs_ioctl_received_subvol_args *sa = NULL; + ret = copy_to_user(arg, sa, sizeof(*sa)); +struct btrfs_ioctl_received_subvol_args { + ch

Re: BTRFS fsck apparent errors

2012-07-03 Thread Zach Brown
read-only mode is default and (hopefully) does no writes to the device, this would require the --repair option so what you propose is sort of a sanity check, right? Ah, I didn't realize that it didn't write without --repair. Yeah, making sure that people don't try to combine the repair and re

Re: BTRFS fsck apparent errors

2012-07-03 Thread Zach Brown
On 07/03/2012 08:52 AM, David Sterba wrote: On Tue, Jul 03, 2012 at 04:22:08PM +0100, Hugo Mills wrote: Correct, by default it just checks the filesystem. Just to be sure: the filesystems in question weren't mounted, were they? fsck will refuse to run on a mounted filesystem, though in cas

Re: [PATCH] Btrfs: flush delayed inodes if we're short on space

2012-06-21 Thread Zach Brown
Ugh sorry I just dug this patch out from last week and forgot I had just picked an arbitrary number to make sure it was working. You are correct, what I _meant_ to do (and will do after I respond) was calculate how much we wanted to flush and then divide that by how much the delayed inodes reser
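
The calculation described here, amount to flush divided by the per-item reservation, might look like the sketch below. The function name and both parameters are hypothetical, and rounding up is an assumption; the snippet does not say how the remainder is handled.

```c
#include <stdint.h>

/* Hypothetical: number of delayed items to flush in order to reclaim
 * to_reclaim bytes when each item reserves bytes_per_item bytes.
 * Rounds up so a partial item's worth of space still flushes one item. */
static uint64_t delayed_items_to_flush(uint64_t to_reclaim,
                                       uint64_t bytes_per_item)
{
    if (bytes_per_item == 0)
        return 0; /* nothing sensible to compute; avoid dividing by zero */
    return to_reclaim / bytes_per_item +
           (to_reclaim % bytes_per_item != 0);
}
```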

Re: [PATCH] Btrfs: flush delayed inodes if we're short on space

2012-06-21 Thread Zach Brown
+ case FLUSH_DELAYED_ITEMS_NR: + case FLUSH_DELAYED_ITEMS: + nr = (*state == FLUSH_DELAYED_ITEMS_NR) ? 10 : -1; This 10 seemed awfully magical so I read a bit more. It appears to be an attempt to pop back up into reserve_metadata_bytes() to see if the caller has been

Re: [PATCH 1/4] Btrfs: use radix tree for checksum

2012-06-14 Thread Zach Brown
+BUG_ON(ret); I wonder if we can patch BUG_ON() to break the build if its only argument is "ret". why? Well, I'm mostly joking :). That would be a very silly change to make. But only mostly joking. btrfs does have a real fragility problem from all these incomplete error handling pa

Re: Question, Does BTRFS provide a read speed increase with RAID1

2012-06-14 Thread Zach Brown
I'd like to find a better mirror selection hint that would work well on average and will get back to it someday, unless somebody else wants to continue experimenting here. Well, for some context you can see what the existing kernel raid implementations do: drivers/md/raid1.c:read_b

Re: [PATCH 1/4] Btrfs: use radix tree for checksum

2012-06-13 Thread Zach Brown
int set_state_private(struct extent_io_tree *tree, u64 start, u64 private) { [...] + ret = radix_tree_insert(&tree->csum, (unsigned long)start, + (void *)((unsigned long)private << 1)); Will this fail for 64bit files on 32bit hosts? + BUG_ON(ret
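
The 32-bit concern can be shown without a 32-bit machine. In this sketch uint32_t stands in for unsigned long on a 32-bit host (an assumption made for illustration), and the helper names are hypothetical.

```c
#include <stdint.h>

/* Models the (unsigned long) cast on a host where unsigned long is
 * 32 bits wide: the upper half of the 64-bit offset is discarded. */
static uint32_t index_on_32bit(uint64_t start)
{
    return (uint32_t)start;
}

/* Returns 1 when two different file offsets end up with the same
 * truncated index, which is what would make an insert collide. */
static int offsets_collide(uint64_t a, uint64_t b)
{
    return a != b && index_on_32bit(a) == index_on_32bit(b);
}
```

Two offsets exactly 4 GiB apart collide under this model, so any file larger than 4 GiB could trigger the case being asked about.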

Re: [PATCH] Btrfs: use rcu to protect device->name V2

2012-06-12 Thread Zach Brown
#define device_name_printk(dev, level, fmt, ...) do { \ struct rcu_string *name;\ \ rcu_read_lock();\ name = rcu_dereference(d

Re: [PATCH] Btrfs: use rcu to protect device->name V2

2012-06-11 Thread Zach Brown
- if (state->print_mask & BTRFSIC_PRINT_MASK_SUPERBLOCK_WRITE) + if (state->print_mask & BTRFSIC_PRINT_MASK_SUPERBLOCK_WRITE) { + struct rcu_string *name; + + rcu_read_lock(); + name = rcu_dereference(d

Re: hard links

2012-04-04 Thread Zach Brown
My understanding is that the limit on the number of hardlinks to the same file stored in the same directory, is, because the names of the hardlinks are stored within the same inode. As such the number of hardlinks is naturally limited by the size of the inode (and dependent on the length of the

Re: Fractal Tree Indexing over B-Trees?

2012-03-28 Thread Zach Brown
I imagine there is, but based on what little information they've shown I don't see how it's a hands down win against b-trees. If anything we're talking about having to solve really complex problems in order to get any sort of good performance out of this thing. Oh, absolutely. Tack on COW an

Re: Fractal Tree Indexing over B-Trees?

2012-03-28 Thread Zach Brown
but let's say O(log N/2) where N is the number of elements in the row. So in the situation I describe you are looking at having to do minimum of 29 reads, one for each row, Hmm. Levels are powers of two and are either full or empty. So the total item count tells you which levels are full or e
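
The observation that the item count tells you which levels are full can be sketched with bit tests. This assumes level i holds exactly 2^i items when full, which is an assumption about the structure under discussion; the function names are hypothetical.

```c
#include <stdint.h>

/* Assuming level i holds exactly 2^i items and is either full or empty,
 * level i is full exactly when bit i of the total item count is set. */
static int level_is_full(uint64_t item_count, unsigned level)
{
    if (level >= 64)
        return 0; /* beyond the width of the counter */
    return (int)((item_count >> level) & 1);
}

/* Number of full levels, i.e. how many levels a lookup may have to
 * visit: the count of set bits in item_count. */
static unsigned full_levels(uint64_t item_count)
{
    unsigned n = 0;

    for (; item_count; item_count >>= 1)
        n += (unsigned)(item_count & 1);
    return n;
}
```

Under that assumption a lookup only needs to read the levels whose bit is set, not every level up to the largest.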

Re: getdents - ext4 vs btrfs performance

2012-03-14 Thread Zach Brown
On 03/14/2012 12:48 PM, Ted Ts'o wrote: On Wed, Mar 14, 2012 at 10:17:37AM -0400, Zach Brown wrote: We could do this if we have two b-trees, one indexed by filename and one indexed by inode number, which is what JFS (and I believe btrfs) does. Typically the inode number of the destin

Re: getdents - ext4 vs btrfs performance

2012-03-14 Thread Zach Brown
We could do this if we have two b-trees, one indexed by filename and one indexed by inode number, which is what JFS (and I believe btrfs) does. Typically the inode number of the destination inode isn't used to index entries for a readdir tree because of (wait for it) hard links. You end up ri

Re: [PATCH] xfstests 255: add a seek_data/seek_hole tester

2011-08-26 Thread Zach Brown
> > Hole: a range of the file that contains no data or is made up > > entirely of  NULL (zero) data. Holes include preallocated ranges of > > files that have not had actual data written to them. > No for me. A hole is made up of zero data? It's a strange definition > for me. It's a very natural d

Re: [RFC] big fat transaction ioctl

2009-11-11 Thread Zach Brown
> I like this much more than providing a journal start/stop to userland. > If we can get Christoph to ack the exports we can work on the interface > in general. I'll note, briefly, that it seems dangerous to call right into the sys_ functions instead of going through the architecture's syscall nu

Re: Mass-Hardlinking Oops

2009-10-13 Thread Zach Brown
> This hasn't been at the top of my list for a while, I remember a bunch > of planning sessions where you weren't worried about it ;) Yeah, no doubt. I go back and forth :) - z

Re: Mass-Hardlinking Oops

2009-10-13 Thread Zach Brown
>>> this thread. I get EMLINK when trying to create more than 311 (not 272) >>> links in a directory >> >> what real-world application uses and needs this many hard links? > > I don't think that's a good counterargument for why this is not a bug. I strongly agree. Our ignorance of users operati

Re: btrfs csum failed on git .pack file

2009-09-17 Thread Zach Brown
> 0130 9FA0: E2 3B 43 AA 63 BF 28 B3 87 B7 FD AB DA 74 2D 1C > 0130 9FA0: E2 3B 43 AA 63 BF 28 B3 87 33 FD AB DA 74 2D 1C B7 = 10110111 33 = 00110011 > 06CD DF90: B0 22 6B 46 9F ED 6E 47 73 5E 7E EB DA 5F D6 11 > 06CD DF90: B0 22 6B 46 9F ED 6E 47 73 1E 7E EB DA 5F D6 11 5E = 0100 1E =
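
The byte comparison in this message amounts to XORing the expected and actual bytes and counting the set bits. A small sketch, with a hypothetical function name:

```c
#include <stdint.h>

/* XOR leaves a 1 in every bit position where the two bytes differ;
 * counting those bits gives the number of flipped bits. */
static unsigned flipped_bits(uint8_t expected, uint8_t actual)
{
    uint8_t diff = expected ^ actual;
    unsigned n = 0;

    while (diff) {
        n += diff & 1u;
        diff >>= 1;
    }
    return n;
}
```

Applied to the two pairs quoted above, B7 versus 33 differs in two bit positions (XOR 0x84) and 5E versus 1E in one (XOR 0x40).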

Re: kernel bug in file-item.c

2009-04-29 Thread Zach Brown
> Do you think you're hitting a memtest bug or is the HW really bad? If you can stomach it, you can get a second opinion from the bootable windows memory testing iso: http://oca.microsoft.com/en/windiag.asp - z

Re: [PATCH] Btrfs-progs: btrfs file system size should be bigger then 256m

2008-12-31 Thread Zach Brown
> + if (block_count < 256*1024*1024) { > + fprintf(stderr, "File system size is > too small\n"); > + exit(1); > + } And please, if you could, include both the size that
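
The request in this reply is cut off, so the sketch below assumes it asks for the error message to report both the size that was given and the minimum required. The function name, the MIN_FS_BYTES macro, and the message wording are all hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Same threshold as the quoted patch: 256 MiB. */
#define MIN_FS_BYTES (256ULL * 1024 * 1024)

/* Hypothetical variant of the check: instead of printing a bare
 * "too small", format a message naming both numbers. Returns -1 when
 * the size is rejected and 0 otherwise. */
static int check_fs_size(uint64_t block_count, char *msg, size_t len)
{
    if (block_count < MIN_FS_BYTES) {
        snprintf(msg, len,
                 "File system size %" PRIu64 " bytes is too small, "
                 "minimum is %" PRIu64 " bytes\n",
                 block_count, (uint64_t)MIN_FS_BYTES);
        return -1;
    }
    return 0;
}
```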

Re: Compressed Filesystem

2008-10-28 Thread Zach Brown
> Compression is optional and off by default (mount -o compress to enable > it). When enabled, every file is compressed. Compression is attempted as files are written when the mount option is enabled, right? There isn't a background scrubber that tries to compress files which are already writte

Re: packing structures and numbers

2008-10-03 Thread Zach Brown
Avi Kivity wrote: > I've been reading btrfs's on-disk format, and two things caught my eye > > - attribute((packed)) structures everywhere, often with misaligned > fields. This conserves space, but can be harmful to in-memory > performance on some archs. How harmful? Do you have any profiles th

Re: Btrfs git repos available

2008-09-25 Thread Zach Brown
> Well, after some hints from Linus I've rebased these about 4 times now. > The new changesets are generally cleaner and are setup properly under > fs/btrfs. Can you publish these hints somewhere? - z

Re: New feature Idea

2008-08-14 Thread Zach Brown
> File granularity is not well suited to dedup when files differ by only a > few blocks, but I'd want to see some numbers on how often that happens > before carrying around the disk format needed to do block level dedup. I was imagining that one could easily make a flag to debug-tree which caused

Re: kernel BUG at extent_map.c:275!

2008-07-22 Thread Zach Brown
David Woodhouse wrote: > On Tue, 2008-07-22 at 13:03 -0400, Chris Mason wrote: >> Well, the test is there to make sure the caller is doing the right >> thing. Before we remove it, I'd like to understand why it is failing. > > Because this is a uniprocessor kernel. So spin_lock() and spin_unlock()

Re: [PATCH] COW and checksumming ioctls

2008-06-19 Thread Zach Brown
> +#define BTRFS_IOC_NODATACOW _IO(BTRFS_IOCTL_MAGIC, 13) > +#define BTRFS_IOC_DATACOW _IO(BTRFS_IOCTL_MAGIC, 14) > +#define BTRFS_IOC_NODATASUM _IO(BTRFS_IOCTL_MAGIC, 15) > +#define BTRFS_IOC_DATASUM _IO(BTRFS_IOCTL_MAGIC, 16) Hmm. Do we really want 4 different ioctl commands to turn 2 features
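
The question is truncated, but one common alternative to four commands is a single call carrying a set mask and a clear mask. The sketch below shows only that general technique; the flag names, the struct, and apply_flags are hypothetical and are not taken from the patch or from btrfs.

```c
#include <stdint.h>

/* Hypothetical flag bits, one per feature that can be turned off. */
#define EX_FLAG_NODATACOW (1u << 0)
#define EX_FLAG_NODATASUM (1u << 1)

/* Hypothetical argument for a single "change flags" command. */
struct ex_flags_args {
    uint32_t set;   /* bits to turn on */
    uint32_t clear; /* bits to turn off */
};

/* Apply the request to the current flags: clear first, then set. */
static uint32_t apply_flags(uint32_t current, const struct ex_flags_args *a)
{
    return (current & ~a->clear) | a->set;
}
```

With this shape, adding a third feature later costs one new bit rather than two new command numbers.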

Re: Future Linux filesystems

2008-06-11 Thread Zach Brown
> SSD is still very expensive when compared to traditional hard disks. When measured by GB/$, sure. Many data centers, though, care more about (ops/sec) / ($ * power * heat). SSDs look much more compelling by that metric. - z

Re: [Btrfs-devel] cloning file data

2008-04-25 Thread Zach Brown
> Running debug-tree on a live FS is a very good way to learn about trees that > get left around while snapshot deletion is happening and cache aliasing > caused by the way Btrfs puts metadata into its own address space. > > But, if you're trying to learn the disk format, I'd stick an unmount b

Re: [Btrfs-devel] cloning file data

2008-04-25 Thread Zach Brown
> We've written into the middle of that 100MB extent, and we need to do COW. > One option is to read the whole thing, change 4k and write it all back. > Instead, btrfs does something like this (+/- off by need more coffee errors): > > file pos = 0 -> [ old extent, offset = 0, num_bytes = 400k
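
The split being described, where a write into the middle of a large extent leaves the old data in place and only the file's references change, can be sketched as follows. The struct, its field names, and split_for_cow are hypothetical, and the sketch assumes the write lies entirely inside the old extent; it is not the btrfs on-disk format.

```c
#include <stdint.h>

struct file_extent {
    uint64_t file_pos;  /* where this piece starts in the file */
    uint64_t extent_id; /* which on-disk extent it points into */
    uint64_t offset;    /* offset into that extent */
    uint64_t num_bytes; /* length of this piece */
};

static struct file_extent make_extent(uint64_t pos, uint64_t id,
                                      uint64_t off, uint64_t len)
{
    struct file_extent e;

    e.file_pos = pos;
    e.extent_id = id;
    e.offset = off;
    e.num_bytes = len;
    return e;
}

/* Replace one reference with up to three: the untouched head of the old
 * extent, the newly written extent, and the untouched tail of the old
 * extent. Returns how many entries of out[] were filled. */
static int split_for_cow(const struct file_extent *old, uint64_t write_pos,
                         uint64_t write_len, uint64_t new_extent_id,
                         struct file_extent out[3])
{
    uint64_t head = write_pos - old->file_pos;
    uint64_t tail_start = head + write_len;
    int n = 0;

    if (head > 0)
        out[n++] = make_extent(old->file_pos, old->extent_id,
                               old->offset, head);
    out[n++] = make_extent(write_pos, new_extent_id, 0, write_len);
    if (tail_start < old->num_bytes)
        out[n++] = make_extent(write_pos + write_len, old->extent_id,
                               old->offset + tail_start,
                               old->num_bytes - tail_start);
    return n;
}
```

Only the 4k of new data is written; the head and tail entries keep pointing into the original 100MB extent.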

Re: [Btrfs-devel] transaction ioctls

2008-04-22 Thread Zach Brown
> A misbehaving application could also deliberately hold a transaction open, > effectively locking up the FS, so it may make sense to restrict something > like this to root or something. I suspect it doesn't have to be deliberate. Have you tried this under memory pressure? I wonder if the app
