-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kerne
ish this than using /proc/self/mountinfo?
-hpa
On 06/19/2012 07:22 AM, Calvin Walton wrote:
>
> All subvolumes are accessible from the volume mounted when you use -o
> subvolid=0. (Note that 0 is not the real ID of the root volume, it's
> just a shortcut for mounting it.)
>
Could you clarify this bit? Specifically, what is the real ID of th
On 06/19/2012 04:51 PM, Chris Mason wrote:
>
> At mount time, we go through and verify the path names still belong to
> the filesystem you thought they belonged to. The bdev is locked during
> the verification, so it won't be able to go away or change.
>
> This is a long way of saying right we d
On 06/19/2012 04:49 PM, Chris Mason wrote:
> On Mon, Jun 18, 2012 at 06:39:31PM -0600, H. Peter Anvin wrote:
>> I'm trying to figure out an algorithm from taking an arbitrary mounted
>> btrfs directory and break it down into:
>>
>>
>>
>> where, keep in
oot may be a subvolume
because it has different policies, this gets pretty gnarly to get right.
It is also of very high value to get right.
So it is possible I'm approaching this wrong. I would love to have a
discussion about this.
-hpa
h what is installed in the boot block (functionally another
part of the bootloader), or all hell will break loose. I think that
means that relying on the subvolume ID makes more sense. To upgrade the
bootloader, invoke the bootloader installer at the end of the update;
that will repoint *everything*
NFO.
2. Allow returning more than one device at a time. Userspace can
already know the number of devices from BTRFS_IOC_FS_INFO(*), and
it'd be better to just size a buffer and return N items rather
having to iterate over the potentially sparse devid space.
I might write this one up i
th an arbitrary number of devices.
This means that a bootloader can consider a single device in isolation:
if the firmware gives access only to a single device, it can be booted.
Since /boot is usually a very small amount of data, this is a very
reasonable tradeoff.
-hpa
On 06/20/2012 09:34 AM, Goffredo Baroncelli wrote:
>
> At first I thought that having the /boot separate could be a good
> thing. Unfortunately /boot contains both the bootloader code and the
> kernel image. The kernel image should be in sync with the contents of
> /lib/modules/
>
> This is
> the
> kernel under /lib/modules is not so wrong.
No, it is completely, totally and very very seriously wrong.
-hpa
grub.
>
It is both more and less elegant; it means you don't get the same kind
of atomic update for the bootloader itself.
-hpa
Could you have a mode, though, where M = N at all times, so a user doesn't end
up adding a new drive and getting a nasty surprise?
Chris Mason wrote:
>On Wed, Jun 20, 2012 at 06:35:30PM -0600, Marios Titas wrote:
>> On Wed, Jun 20, 2012 at 12:27 PM, H. Peter Anvin
>wrote:
>
On 06/20/2012 10:47 PM, Goffredo Baroncelli wrote:
>
> This leads to have a separately /boot filesystem. In this case I agree
> with you: make sense that the kernel is near the bootloader files.
>
> But if /boot has to be in a separate filesystem, which is the point to
> support btrfs at all ? Do
On 06/21/2012 10:05 AM, Goffredo Baroncelli wrote:
> On 06/21/2012 03:38 PM, H. Peter Anvin wrote:
>>> But if /boot has to be in a separate filesystem, which is the point to
>>> support btrfs at all ? Does make sense to support only a subset of btrfs
>>>
On 06/25/2012 08:21 AM, Chris Mason wrote:
> Yes and no. If you have 2 drives and you add one more, we can make it
> do all new chunks over 3 drives. But, turning the existing double
> mirror chunks into a triple mirror requires a balance.
>
> -chris
So trigger one. This is the exact analogue
On 06/25/2012 03:28 PM, Gareth Pye wrote:
> To me one doesn't have to be triggered, a user expects to have to tell
> the disks to rebuild/resync/balance after adding a disk, they may want
> to wait till they've added all 4 disks and run a few extra commands
> before they run the rebalance.
They do
On 06/25/2012 03:54 PM, Hugo Mills wrote:
> On Mon, Jun 25, 2012 at 10:46:01AM -0700, H. Peter Anvin wrote:
>> On 06/25/2012 08:21 AM, Chris Mason wrote:
>>> Yes and no. If you have 2 drives and you add one more, we can
>>> make it do all new chunks over 3 drives. Bu
On 06/25/2012 04:00 PM, H. Peter Anvin wrote:
I am aware of that, and it is not a problem... the one-device
bootloader can find out *which* disk it is talking to by comparing
uuids, and the btrfs data structures will tell it how to find the data
on that specific disk. It does of course mean
Hi,
I'm looking at using a btrfs with snapshots to implement a generational
backup capacity. However, doing it the naïve way would have the side
effect that for a file that has been partially modified, after
snapshotting the file would be written with *mostly* the same data. How
does btrfs' COW
On 05/25/16 02:29, Hugo Mills wrote:
> On Wed, May 25, 2016 at 01:58:15AM -0700, H. Peter Anvin wrote:
>> Hi,
>>
>> I'm looking at using a btrfs with snapshots to implement a generational
>> backup capacity. However, doing it the naïve way would have the side
On 11/18/2013 02:08 PM, Andrea Mazzoleni wrote:
> Hi,
>
> I want to report that I recently implemented a support for
> arbitrary number of parities that could be useful also for Linux
> RAID and Btrfs, both currently limited to double parity.
>
> In short, to generate the parity I use a Cauchy ma
On 11/18/2013 02:35 PM, Andrea Mazzoleni wrote:
> Hi Peter,
>
> The Cauchy matrix has the mathematical property to always have itself
> and all submatrices not singular. So, we are sure that we can always
> solve the equations to recover the data disks.
>
> Besides the mathematical proof, I've al
It is also possible to quickly multiply by 2^-1, which makes for an
interesting R parity.
Andrea Mazzoleni wrote:
>Hi David,
>
>>> The choice of ZFS to use powers of 4 was likely not optimal,
>>> because to multiply by 4, it has to do two multiplications by 2.
>> I can agree with that. I didn't
On 11/20/2013 10:56 AM, Andrea Mazzoleni wrote:
> Hi,
>
> Yep. At present to multiply for 2^-1 I'm using in C:
>
> static inline uint64_t d2_64(uint64_t v)
> {
> uint64_t mask = v & 0x0101010101010101U;
> mask = (mask << 8) - mask;
> v = (v >> 1) & 0x7f7f7f7f7f7f7f7fU;
>
On 11/20/2013 10:56 AM, Andrea Mazzoleni wrote:
> Hi,
>
> Yep. At present to multiply for 2^-1 I'm using in C:
>
> static inline uint64_t d2_64(uint64_t v)
> {
> uint64_t mask = v & 0x0101010101010101U;
> mask = (mask << 8) - mask;
(mask << 7) I assume...
On 11/20/2013 11:05 AM, Andrea Mazzoleni wrote:
>
> For the first row with j=0, I use xi = 2^-i and y0 = 0, that results in:
>
How can xi = 2^-i if x is supposed to be constant?
That doesn't mean that your approach isn't valid, of course, but it
might not be a Cauchy matrix and thus needs addit
On 11/20/2013 01:04 PM, Andrea Mazzoleni wrote:
> Hi Peter,
>
>>> static inline uint64_t d2_64(uint64_t v)
>>> {
>>> uint64_t mask = v & 0x0101010101010101U;
>>> mask = (mask << 8) - mask;
>>
>> (mask << 7) I assume...
> No. It's "(mask << 8) - mask". We want to expand the bit at p
On 11/20/2013 12:30 PM, James Plank wrote:
> Peter, I think I understand it differently. Concrete example in GF(256) for
> k=6, m=4:
>
> First, create a 3 by 6 cauchy matrix, using x_i = 2^-i, and y_i = 0 for i=0,
> and y_i = 2^i for other i. In this case: x = { 1, 142, 71, 173, 216, 108 }
On 11/21/2013 04:30 PM, Stan Hoeppner wrote:
>
> The rebuild time of a parity array normally has little to do with CPU
> overhead.
Unless you have to fall back to table driven code.
Anyway, this looks like a great concept. Now we just need to implement
it ;)
-hpa
@@ -1389,6 +1392,14 @@ int btrfs_rm_device(struct btrfs_root *root, char *device_path)
}
btrfs_dev_replace_unlock(&root->fs_info->dev_replace);
+ if ((all_avail & (BTRFS_BLOCK_GROUP_RAID5 |
+ BTRFS_BLOCK_GROUP_RAID6) && num_devices <= 3)) {
+
Also, a 2-member raid5 or a 3-member raid6 is effectively a raid1 and can be
treated as such.
Chris Mason wrote:
>On Mon, Feb 04, 2013 at 02:42:24PM -0700, H. Peter Anvin wrote:
>> @@ -1389,6 +1392,14 @@ int btrfs_rm_device(struct btrfs_root *root,
>char
>
It turns out that the primary 64K "Boot Area A" is too small for some
applications and/or some architectures.
When I discussed this with Chris Mason, he pointed out that the area
beyond the superblock is also unused, up until at least the megabyte
point (from my reading of the mkfs code, it is act
On 05/14/2014 05:01 PM, H. Peter Anvin wrote:
> It turns out that the primary 64K "Boot Area A" is too small for some
> applications and/or some architectures.
>
> When I discussed this with Chris Mason, he pointed out that the area
> beyond the superblock is also unus
On 05/20/2014 04:37 PM, Chris Mason wrote:
> On 05/20/2014 07:29 PM, H. Peter Anvin wrote:
>> On 05/14/2014 05:01 PM, H. Peter Anvin wrote:
>>> It turns out that the primary 64K "Boot Area A" is too small for some
>>> applications and/or some architectures.
>
On 05/20/2014 04:37 PM, Chris Mason wrote:
>
> Hi Peter,
>
> We do leave the first 1MB of each device alone. Can we do 256K-1024K
> for the boot loader? We don't have an immediate need for the extra
> space, but I'd like to reserve a little more than the extra 64KB.
>
Incidentally, the curren
Typically they are using 64-bit signed seconds.
On May 31, 2014 11:22:37 AM PDT, Richard Cochran
wrote:
>On Sat, May 31, 2014 at 05:23:02PM +0200, Arnd Bergmann wrote:
>>
>> It's an approximation:
>
>(Approximately never ;)
>
>> with 64-bit timestamps, you can represent close to 300 billion
>>
On 06/02/2014 12:19 PM, Arnd Bergmann wrote:
> On Monday 02 June 2014 13:52:19 Joseph S. Myers wrote:
>> On Fri, 30 May 2014, Arnd Bergmann wrote:
>>
>>> a) is this the right approach in general? The previous discussion
>>>pointed this way, but there may be other opinions.
>>
>> The syscall cha
On 06/02/2014 12:55 PM, Arnd Bergmann wrote:
>>
>> The bit that is really going to hurt is every single ioctl that uses a
>> timespec.
>>
>> Honestly, though, I really don't understand the point with "struct
>> inode_time". It seems like the zeroeth-order thing is to change the
>> kernel internal
On 06/04/2014 12:24 PM, Arnd Bergmann wrote:
>
> For other timekeeping stuff in the kernel, I agree that using some
> 64-bit representation (nanoseconds, 32/32 unsigned seconds/nanoseconds,
> ...) has advantages, that's exactly the point I was making earlier
> against simply extending the internal
~100,000
> kernel functions.
>
> I'll try to annotate the inline asms (there's not _that_ many of them),
> and measure what the size impact is.
The main reason to do #2 over #3 would be for programmer documentation.
There simply should be no reason to ever out-of-line the
es us some way of injecting the actual
weight of the asm statement...
-hpa
ng any "inline" in the current code that
isn't "__always_inline"...
-hpa
one type of branches. As a result, it tends to drastically
overestimate, on purpose.
-hpa
> and what ones really should have been __always_inline.
>
> Not that I feel _that_ strongly about it.
>
Actually, we have that reasonably well down by now. There seem to be a
couple of minor tweaks still necessary, but I think we're 90-95% there
already.
-hpa
onsensical annotations like that?
>
__asm_inline was my suggestion, to distinguish "inline this
unconditionally because gcc screws up in the presence of asm()" versus
"inline this unconditionally because the world ends if it isn't" -- to
tell the human reader, not gcc. I guess
tionally for
performance", as a documentation issue), but those are the four we get.
-hpa
ne and noinline does work.
-hpa
Linus Torvalds wrote:
> So we do have special issues. And exactly _because_ we have special issues
> we should also expect that some compiler defaults simply won't ever really
> be appropriate for us.

That is, of course, true.

However, the Linux kernel (and quite a few other kernels) is a very
Richard Guenther wrote:
> But it's also not inconceivable that gcc adds a -fkernel-inlining or
> similar that changes the parameters if we ask nicely. I suppose
> actually such a parameter would be useful for far more programs
> than the kernel.

I think that the kernel is a perfect target to optimize
Richard Guenther wrote:
On Fri, Jan 9, 2009 at 8:21 PM, Andi Kleen wrote:
GCC 4.3 should be good in compiling the
kernel with default -Os settings.
It's unfortunately not. It doesn't inline a lot of simple asm() inlines
for example.
Reading Ingo's posting with the actual numbers states the op
Linus Torvalds wrote:
> Because then they are _our_ mistakes, not some random compiler version that
> throws a dice!

This does bring up the idea of including a compiler with the kernel
sources again, doesn't it?
-hpa (ducks & runs)
Andi Kleen wrote:
>> Fetch a gigabyte's worth of data for the debuginfo RPM?
>
> The suse 11.0 kernel debuginfo is ~120M.

Still, though, hardly worth doing client-side when it can be done
server-side for all the common distro kernels. For custom kernels, not
so, but there you should already have
ses where removing inline helps is
in .h files, which would require code movement to fix. Hence to see if
it can be automated.
-hpa
ible
for the compiler to do the right thing for at least this class of
functions really does seem like a good idea.
-hpa
ased kernels.. if the fedora/suse guys
> would help to at least have the vmlinux for their released updates
> easily available that would be a huge help without that it's going
> to suck.
>
We could just pick them up automatically from the kernel.org mirrors
with a little
an additional mechanism with additional introduction complexities
>and an ongoing maintenance cost.
>
I think I have a pretty clean idea for how to do this. I'm going to
experiment with it over the next few days.
-hpa
Andi Kleen wrote:
>
> Weird. I wonder where this strange restriction comes from. It indeed
> makes this much less useful than it could be :/
>
Most likely they're collapsing at too early a stage.
-hpa
Andi Kleen wrote:
> On Mon, Jan 12, 2009 at 11:02:17AM -0800, Linus Torvalds wrote:
>> Something at the back of my mind said "aliasing".
>
> $ gcc linus.c -O2 -S ; grep subl linus.s
>         subl    $1624, %esp
> $ gcc linus.c -O2 -S -fno-strict-aliasing; grep subl linus.s
>         subl    $824, %esp

That'
Ingo Molnar wrote:
> Hm, GCC uses __restrict__, right?
>
> I'm wondering whether there's any internal tie-up between alias analysis
> and the __restrict__ keyword - so if we turn off aliasing optimizations
> the __restrict__ keyword's optimizations are turned off as well.

Actually I suspect that "r
Dan Williams wrote:
> On Mon, Jul 13, 2009 at 7:11 AM, David Woodhouse wrote:
>> We'll want to use these in btrfs too.
>>
>> Signed-off-by: David Woodhouse
>
> Do you suspect that btrfs will also want to perform these operations
> asynchronously? I am preparing an updated release of the raid6
> offload patch
Ric Wheeler wrote:
> Worth sharing a pointer to a really neat set of papers that describe
> open source friendly RAID6 and erasure encoding algorithms that were
> presented last year and this at FAST:
>
> http://www.cs.utk.edu/~plank/plank/papers/papers.html

If I remember correctly, James Plank's pap
Ric Wheeler wrote:
I have seen the papers; I'm not sure it really makes that much
difference. One of the things that bugs me about these papers is that he
compares to *his* implementation of my optimizations, but not to my
code. In real life implementations, on commodity hardware, we're limited
Ric Wheeler wrote:
The bottom line is pretty much this: the cost of changing the encoding
would appear to outweigh the benefit. I'm not trying to claim the Linux
RAID-6 implementation is optimal, but it is simple and appears to be
fast enough that the math isn't the bottleneck.
Cost? Think abo
Ric Wheeler wrote:
The main flaw, as I said, is in the phrase "as implemented by the
Jerasure library". He's comparing his own implementations of various
algorithms, not optimized implementations.
The bottom line is pretty much this: the cost of changing the encoding
would appear to outweigh th
David Woodhouse wrote:
> I'm only interested in what we can use directly within btrfs -- and
> ideally I do want something which gives me an _arbitrary_ number of
> redundant blocks, rather than limiting me to 2. But the legacy code is
> good enough for now¹.
>
> When I get round to wanting more, I was thi
ore
than two disks may be interesting, it may very well be worth spending
some time on new codes now rather than later. Part of that
investigation, though, is going to have to be if and how they can be
accelerated.
-hpa
>>
>> http://www.cs.utk.edu/~plank/plank/papers/CS-96-332.html even describes an
>> implementation _very_ similar to the current code, right down to using a
>> table for the logarithm and inverse logarithm calculations.
>>
We don't use a table for logarithm and inverse logarithm calculations.
An
On 07/18/2009 11:52 AM, Alex Elsayed wrote:
> Alex Elsayed wrote:
>
>> H. Peter Anvin wrote:
>>> implementation in Java, called "Jerasure".) Implementability using real
>>> array instruction sets is key to decent performance.
>> Actually, it is ma