On 07/05/2012 11:59 AM, Ilya Dryomov wrote:
What if you are on a big-endian machine with a big-endian kernel and
userspace? Everything on-disk should be little-endian, so if you are
going to write stuff you got from userspace to disk, at some point you
have to make sure you are writing out byte
and take endianness into account with le{64,32}_to_cpu and
cpu_to_le{64,32} macros.
The kernel doesn't support system calls from userspace of a different
endianness, no worries there :)
- z
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to ma
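
A minimal userspace sketch of the little-endian on-disk rule discussed in the message above. The helper names put_le64 and get_le64 are hypothetical; kernel code would use cpu_to_le64() and le64_to_cpu() instead. Shifting byte by byte gives the same result on little-endian and big-endian hosts.

```c
#include <assert.h>
#include <stdint.h>

/* Store v into buf as 8 little-endian bytes, regardless of host byte order. */
void put_le64(uint8_t *buf, uint64_t v)
{
	assert(buf != 0);
	for (int i = 0; i < 8; i++)
		buf[i] = (uint8_t)(v >> (8 * i));
}

/* Read 8 little-endian bytes back into a host-order value. */
uint64_t get_le64(const uint8_t *buf)
{
	uint64_t v = 0;

	assert(buf != 0);
	for (int i = 0; i < 8; i++)
		v |= (uint64_t)buf[i] << (8 * i);
	return v;
}
```

The least significant byte always lands in buf[0], so a buffer written on one machine reads back correctly on another with a different byte order.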
On 07/05/2012 10:14 AM, Alexander Block wrote:
On Thu, Jul 5, 2012 at 7:08 PM, Zach Brown wrote:
Careful, timespec will be different sizes in 32bit userspace and a 64bit
kernel. I'd use btrfs_timespec to get a fixed size timespec and avoid
all the compat_timespec noise. (I'd then
+static long btrfs_ioctl_set_received_subvol(struct file *file,
+ void __user *arg)
+{
+ struct btrfs_ioctl_received_subvol_args *sa = NULL;
+ ret = copy_to_user(arg, sa, sizeof(*sa));
+struct btrfs_ioctl_received_subvol_args {
+ ch
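
A sketch of the fixed-size timestamp idea raised above, assuming a GCC or Clang compiler for the packed attribute. The struct name fixed_timespec and the helper to_fixed are hypothetical and are not the actual btrfs_timespec definition; the point is only that the layout no longer depends on the caller's ABI.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical fixed-size timestamp: same layout for 32-bit and 64-bit callers. */
struct fixed_timespec {
	uint64_t sec;   /* seconds, always 64 bits */
	uint32_t nsec;  /* nanoseconds, always 32 bits */
} __attribute__((packed));

/* The size must not depend on the ABI, unlike struct timespec. */
_Static_assert(sizeof(struct fixed_timespec) == 12, "layout must be fixed");

/* Convert the native, ABI-dependent struct timespec to the fixed layout. */
struct fixed_timespec to_fixed(struct timespec ts)
{
	struct fixed_timespec out;

	assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
	out.sec = (uint64_t)ts.tv_sec;
	out.nsec = (uint32_t)ts.tv_nsec;
	return out;
}
```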
read-only mode is default and (hopefully) does no writes to the device,
this would require the --repair option so what you propose is sort of a
sanity check, right?
Ah, I didn't realize that it didn't write without --repair. Yeah,
making sure that people don't try to combine the repair and
re
On 07/03/2012 08:52 AM, David Sterba wrote:
On Tue, Jul 03, 2012 at 04:22:08PM +0100, Hugo Mills wrote:
Correct, by default it just checks the filesystem. Just to be sure:
the filesystems in question weren't mounted, were they?
fsck will refuse to run on a mounted filesystem, though in cas
Ugh sorry I just dug this patch out from last week and forgot I had just
picked an arbitrary number to make sure it was working. You are correct,
what I _meant_ to do (and will do after I respond) was calculate how
much we wanted to flush and then divide that by how much the delayed
inodes reser
+ case FLUSH_DELAYED_ITEMS_NR:
+ case FLUSH_DELAYED_ITEMS:
+ nr = (*state == FLUSH_DELAYED_ITEMS_NR) ? 10 : -1;
This 10 seemed awfully magical so I read a bit more.
It appears to be an attempt to pop back up into reserve_metadata_bytes()
to see if the caller has been
+BUG_ON(ret);
I wonder if we can patch BUG_ON() to break the build if its only
argument is "ret".
why?
Well, I'm mostly joking :). That would be a very silly change to make.
But only mostly joking. btrfs does have a real fragility problem from
all these incomplete error handling pa
I'd like to find a better mirror selection hint that would work well on
average and will get back to it someday, unless somebody else wants to
continue experimenting here.
Well, for some context you can see what the existing kernel raid
implementations do:
drivers/md/raid1.c:read_b
int set_state_private(struct extent_io_tree *tree, u64 start, u64 private)
{
[...]
+ ret = radix_tree_insert(&tree->csum, (unsigned long)start,
+ (void *)((unsigned long)private << 1));
Will this fail for 64bit files on 32bit hosts?
+ BUG_ON(ret
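
A small sketch of the concern behind the question above, assuming a host where unsigned long is 32 bits. That width is simulated here with uint32_t so the effect is visible on any machine; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Simulate (unsigned long)start on a host where unsigned long is 32 bits. */
uint32_t index_on_32bit(uint64_t start)
{
	return (uint32_t)start;  /* the upper 32 bits are silently dropped */
}

/* Two distinct file offsets collide if they differ only above bit 31. */
int offsets_collide(uint64_t a, uint64_t b)
{
	return a != b && index_on_32bit(a) == index_on_32bit(b);
}
```

Under this assumption, offsets 4096 and 4096 + 4 GiB would map to the same key, which is the failure the question is asking about.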
#define device_name_printk(dev, level, fmt, ...) do { \
struct rcu_string *name;\
\
rcu_read_lock();\
name = rcu_dereference(d
- if (state->print_mask & BTRFSIC_PRINT_MASK_SUPERBLOCK_WRITE)
+ if (state->print_mask & BTRFSIC_PRINT_MASK_SUPERBLOCK_WRITE) {
+ struct rcu_string *name;
+
+ rcu_read_lock();
+ name = rcu_dereference(d
My understanding is that the limit on the number of hardlinks to the same
file stored in the same directory exists because the names of the
hardlinks are stored within the same inode. As such the number of hardlinks is
naturally limited by the size of the inode (and dependent on the length
of the
I imagine there is, but based on what little information they've shown
I don't see how it's a hands down win against b-trees. If anything
we're talking about having to solve really complex problems in order
to get any sort of good performance out of this thing.
Oh, absolutely. Tack on COW an
but let's say O(log N/2) where N is the number of elements in the row.
So in the situation I describe you are looking at having to do minimum
of 29 reads, one for each row,
Hmm.
Levels are powers of two and are either full or empty. So the total
item count tells you which levels are full or e
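
A sketch of the point about levels, under the assumption that level k holds exactly 2^k items when full and nothing when empty. In that case the binary representation of the total item count says which levels are occupied. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: level k holds exactly 2^k items when full and 0 when empty.
 * Then bit k of the total item count says whether level k is full.
 */
int level_is_full(uint64_t item_count, unsigned level)
{
	assert(level < 64);
	return (int)((item_count >> level) & 1);
}

/* Number of full levels, i.e. how many levels a lookup may need to visit. */
unsigned full_levels(uint64_t item_count)
{
	unsigned n = 0;

	for (; item_count; item_count >>= 1)
		n += (unsigned)(item_count & 1);
	return n;
}
```

With 5 items (binary 101), levels 0 and 2 are full and level 1 is empty, so a lookup only has to consider two levels.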
On 03/14/2012 12:48 PM, Ted Ts'o wrote:
On Wed, Mar 14, 2012 at 10:17:37AM -0400, Zach Brown wrote:
We could do this if we have two b-trees, one indexed by filename and
one indexed by inode number, which is what JFS (and I believe btrfs)
does.
Typically the inode number of the destination inode isn't used to index
entries for a readdir tree because of (wait for it) hard links. You end
up ri
> > Hole: a range of the file that contains no data or is made up
> > entirely of NULL (zero) data. Holes include preallocated ranges of
> > files that have not had actual data written to them.
> No for me. A hole is made up of zero data? It's a strange definition
> for me.
It's a very natural d
> I like this much more than providing a journal start/stop to userland.
> If we can get Christoph to ack the exports we can work on the interface
> in general.
I'll note, briefly, that it seems dangerous to call right into the sys_
functions instead of going through the architecture's syscall nu
> This hasn't been at the top of my list for a while, I remember a bunch
> of planning sessions where you weren't worried about it ;)
Yeah, no doubt. I go back and forth :)
- z
>>> this thread. I get EMLINK when trying to create more than 311 (not 272)
>>> links in a directory
>>
>> what real-world application uses and needs this many hard links?
>
> I don't think that's a good counterargument for why this is not a bug.
I strongly agree. Our ignorance of users operati
> 0130 9FA0: E2 3B 43 AA 63 BF 28 B3 87 B7 FD AB DA 74 2D 1C
> 0130 9FA0: E2 3B 43 AA 63 BF 28 B3 87 33 FD AB DA 74 2D 1C
B7 = 10110111
33 = 00110011
> 06CD DF90: B0 22 6B 46 9F ED 6E 47 73 5E 7E EB DA 5F D6 11
> 06CD DF90: B0 22 6B 46 9F ED 6E 47 73 1E 7E EB DA 5F D6 11
5E = 0100
1E =
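
The byte comparison above can be done mechanically with XOR: the set bits of expected ^ observed mark the flipped positions. A minimal sketch, with a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* Count the bits that differ between an expected and an observed byte. */
unsigned flipped_bits(uint8_t expected, uint8_t observed)
{
	uint8_t diff = expected ^ observed;  /* 1 bits mark the differences */
	unsigned n = 0;

	for (; diff; diff >>= 1)
		n += diff & 1u;
	return n;
}
```

For the two pairs quoted above, B7 ^ 33 is 0x84 (two bits differ) and 5E ^ 1E is 0x40 (one bit differs).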
> Do you think you're hitting a memtest bug or is the HW really bad?
If you can stomach it, you can get a second opinion from the bootable
windows memory testing iso:
http://oca.microsoft.com/en/windiag.asp
- z
> + if (block_count < 256*1024*1024) {
> + fprintf(stderr, "File system size is too small\n");
> + exit(1);
> + }
And please, if you could, include both the size that
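
A sketch of an error message that names both sizes. The request above is cut off, so this assumes it asks for the requested size and the minimum to be printed together, and that block_count is a byte count. The function name check_fs_size and the MIN_FS_SIZE macro are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define MIN_FS_SIZE (256ULL * 1024 * 1024)  /* the limit from the quoted patch */

/*
 * Return 0 if the size is acceptable. Otherwise write a message naming both
 * the requested size and the minimum into msg and return -1.
 */
int check_fs_size(uint64_t block_count, char *msg, size_t len)
{
	assert(msg != NULL && len > 0);
	if (block_count >= MIN_FS_SIZE) {
		msg[0] = '\0';
		return 0;
	}
	snprintf(msg, len,
		 "File system size %" PRIu64 " is too small, minimum is %llu bytes",
		 block_count, (unsigned long long)MIN_FS_SIZE);
	return -1;
}
```

The caller can then print msg to stderr and exit, as the quoted patch does.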
> Compression is optional and off by default (mount -o compress to enable
> it). When enabled, every file is compressed.
Compression is attempted as files are written when the mount option is
enabled, right?
There isn't a background scrubber that tries to compress files which are
already writte
Avi Kivity wrote:
> I've been reading btrfs's on-disk format, and two things caught my eye
>
> - __attribute__((packed)) structures everywhere, often with misaligned
> fields. This conserves space, but can be harmful to in-memory
> performance on some archs.
How harmful? Do you have any profiles th
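
For context, a sketch of what a packed structure with a misaligned field looks like, assuming GCC or Clang. The struct packed_item and the helper read_offset are hypothetical and are not taken from the btrfs headers. Copying with memcpy is one common way to read such a field without relying on unaligned loads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk record; packed is a GCC/Clang extension. */
struct packed_item {
	uint8_t  type;
	uint64_t offset;  /* lands at byte 1, so it is misaligned */
} __attribute__((packed));

/* Read the misaligned field from a raw buffer without an unaligned load. */
uint64_t read_offset(const uint8_t *raw)
{
	uint64_t v;

	assert(raw != NULL);
	memcpy(&v, raw + offsetof(struct packed_item, offset), sizeof(v));
	return v;
}
```

Whether the misalignment costs anything measurable depends on the architecture, which is what the question above is asking for data on.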
> Well, after some hints from Linus I've rebased these about 4 times now.
> The new changesets are generally cleaner and are set up properly under
> fs/btrfs.
Can you publish these hints somewhere?
- z
> File granularity is not well suited to dedup when files differ by only a
> few blocks, but I'd want to see some numbers on how often that happens
> before carrying around the disk format needed to do block level dedup.
I was imagining that one could easily make a flag to debug-tree which
caused
David Woodhouse wrote:
> On Tue, 2008-07-22 at 13:03 -0400, Chris Mason wrote:
>> Well, the test is there to make sure the caller is doing the right
>> thing. Before we remove it, I'd like to understand why it is failing.
>
> Because this is a uniprocessor kernel. So spin_lock() and spin_unlock()
> +#define BTRFS_IOC_NODATACOW _IO(BTRFS_IOCTL_MAGIC, 13)
> +#define BTRFS_IOC_DATACOW _IO(BTRFS_IOCTL_MAGIC, 14)
> +#define BTRFS_IOC_NODATASUM _IO(BTRFS_IOCTL_MAGIC, 15)
> +#define BTRFS_IOC_DATASUM _IO(BTRFS_IOCTL_MAGIC, 16)
Hmm. Do we really want 4 different ioctl commands to turn 2 features
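
One possible alternative, sketched under the assumption that the question is about collapsing the four on/off commands into a single flags operation. The flag names and apply_flags are hypothetical and are not from the btrfs headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-inode flag bits; not taken from the btrfs headers. */
#define EX_FLAG_NODATACOW (1u << 0)
#define EX_FLAG_NODATASUM (1u << 1)

/* One "set flags" operation replaces four separate on/off commands. */
uint32_t apply_flags(uint32_t current, uint32_t set, uint32_t clear)
{
	assert((set & clear) == 0);  /* a bit cannot be set and cleared at once */
	return (current | set) & ~clear;
}
```

A single command taking set and clear masks would also leave room for later flags without new command numbers.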
> SSD is still very expensive when compared to traditional hard disks.
When measured by GB/$, sure.
Many data centers, though, care more about (ops/sec) / ($ * power *
heat). SSDs look much more compelling by that metric.
- z
> Running debug-tree on a live FS is a very good way to learn about trees that
> get left around while snapshot deletion is happening and cache aliasing
> caused by the way Btrfs puts metadata into its own address space.
>
> But, if you're trying to learn the disk format, I'd stick an unmount b
> We've written into the middle of that 100MB extent, and we need to do COW.
> One option is to read the whole thing, change 4k and write it all back.
> Instead, btrfs does something like this (+/- off by need more coffee errors):
>
> file pos = 0 -> [ old extent, offset = 0, num_bytes = 400k
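
A sketch of the splitting described in the quoted text, assuming a 4k write at file position 400k inside one large extent. The quoted layout is cut off after its first entry, so the second and third pieces here are an assumption about how it continues. The struct extent_ref and cow_split are hypothetical names, not btrfs code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical file extent reference: a window into an on-disk extent. */
struct extent_ref {
	uint64_t file_pos;   /* where this piece starts in the file */
	int      is_new;     /* 0 = points into the old extent, 1 = new extent */
	uint64_t offset;     /* byte offset into the referenced extent */
	uint64_t num_bytes;  /* length of this piece */
};

/*
 * Split one reference covering [0, old_len) around a write of write_len
 * bytes at write_pos. Fills out[0..2] and returns the number of pieces.
 */
int cow_split(uint64_t old_len, uint64_t write_pos, uint64_t write_len,
	      struct extent_ref out[3])
{
	int n = 0;
	uint64_t end = write_pos + write_len;

	assert(write_len > 0 && end <= old_len);
	if (write_pos > 0)
		out[n++] = (struct extent_ref){ 0, 0, 0, write_pos };
	out[n++] = (struct extent_ref){ write_pos, 1, 0, write_len };
	if (end < old_len)
		out[n++] = (struct extent_ref){ end, 0, end, old_len - end };
	return n;
}
```

Only the 4k of new data is written; the two outer pieces keep pointing into the old extent at different offsets, so the 100MB extent is never read back or rewritten.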
> A misbehaving application could also deliberately hold a transaction open,
> effectively locking up the FS, so it may make sense to restrict something
> like this to root or something.
I suspect it doesn't have to be deliberate.
Have you tried this under memory pressure? I wonder if the app