Re: [RFC] Btrfs device and pool management (wip)

Goffredo Baroncelli Tue, 01 Dec 2015 10:02:10 -0800

On 2015-11-30 13:43, Qu Wenruo wrote:
> 
> 
> On 11/30/2015 03:59 PM, Anand Jain wrote:
>> (fixed alignment)
>>
>>
[...]
> 
> I'm overall OK with your *current* hot-spare implementation.
> It's quite small and straightforward.
> I just hope for some more easy-to-implement features, like hot-remove instead 
> of replace. (for the degradable case, it would cause less IO).
> And more test-cases.
> 
> And per-filesystem hot-spare devices. A global one has its limitations, like no 
> priority, or choosing a less suitable device.
> (using a TB device to replace a GB device, eating up the pool quite easily)
> It should not be hard to do, maybe add the fsid to the hot-spare device superblock 
> and modify kernel/user-progs a little.
> 
> 
> 
> But if your ultimate goal of *in-kernel* hot-spare is to do such complicated 
> *in-kernel policy*, I would say *NO* right now before things get messed up.
> (Yeah, maybe another "discussion" just like feature auto-align)
> 
> Kernel should provide *mechanism*, not *policy*.
> (Pretty sure most of us have heard it in one form or another).
> 
> In this case, btrfs support for *replace* is a mechanism. (not automatic 
> replace)
> But *when* to replace a bad device is *policy*.
> 
> 
> But if you just want to get to that goal, *not restricted to an in-kernel 
> implementation*, it would be much easier to do.


+1

> 
> 1) Implement an API (maybe sysfs as you suggested) to allow user-space programs 
> to get informed when a btrfs device gets sick (including missing, or the number 
> of IO errors hitting a threshold)

This API should be device related and not specific to btrfs: what if the error 
happens in one partition not used by btrfs, but the same disk has another 
partition that is used by btrfs?
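
To make the concern concrete, here is a rough Python sketch. The device
names, the PARTITIONS table and the function name are all made up for
illustration; a real tool would ask the block layer for this mapping. The
point is only that an error seen on one partition is attributed to the whole
disk, so every partition on that disk gets flagged, btrfs or not.

```python
# Hypothetical layout: partition -> (parent disk, filesystem using it).
PARTITIONS = {
    "sda1": ("sda", "ext4"),
    "sda2": ("sda", "btrfs"),
    "sdb1": ("sdb", "btrfs"),
}


def affected_partitions(failing_partition, partitions=PARTITIONS):
    """Return every partition that shares a disk with the failing one."""
    disk, _fs = partitions[failing_partition]
    return sorted(p for p, (d, _) in partitions.items() if d == disk)
```

With this table, an error on sda1 (ext4) also flags sda2, which is the
partition btrfs actually cares about.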

> 
> 2) Write a user-space program listening on that API
> 
> 3) Trigger an action when a device fails.
>    Maybe replace, maybe remove, or just do nothing, fully *tunable* and
>    much *easier* to implement.
> 
> If we use the above method, the kernel part should be as easy as the following:
> 1) A new API for user-progs to listen on
> 
> 2) (Optional) Tuning interface for that API
>    E.g., threshold of IO errors before informing user space
> 
> 3) Kernel fallback behavior for such errors
>    There is even no need to trigger replace from the kernel; just putting
>    the filesystem into degraded mode will be good enough.
> 
> 4) A user daemon, maybe in btrfs-progs or another project.
>    Easy to debug, easy to implement, and you will be the
>    maintainer/leader/author of the new project!!
> 
> Now all the policy is moved to user-space, and the kernel is kept small and clean.

This is the most important thing: we should work to stabilize the current 
kernel implementation before adding further functionality. BTRFS is 8 years old, 
but it still needs some work to stabilize. I don't think that we should put 
further code in kernel space if we can add it in user space.
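
As a rough illustration of how little the user-space side needs, here is a
Python sketch of the policy part only. It assumes the counters arrive as text
in the "[device].counter  value" form that "btrfs device stats" prints (as I
remember it); the threshold, the action names and both function names are
made up for the example, and nothing here actually touches a device.

```python
import re

# Matches lines such as "[/dev/sdb1].write_io_errs    3"
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<name>\w+)\s+(?P<count>\d+)$")


def parse_device_stats(text):
    """Turn per-device counter text into {device: {counter: value}}."""
    stats = {}
    for line in text.splitlines():
        m = STAT_LINE.match(line.strip())
        if m:
            stats.setdefault(m["dev"], {})[m["name"]] = int(m["count"])
    return stats


def decide(stats, threshold=10, spare=None):
    """Pick an action per device; threshold and action names are illustrative."""
    actions = {}
    for dev, counters in stats.items():
        io_errs = counters.get("read_io_errs", 0) + counters.get("write_io_errs", 0)
        if io_errs < threshold:
            actions[dev] = "nothing"   # below threshold: leave the device alone
        elif spare is not None:
            actions[dev] = "replace"   # a spare is available: replace onto it
        else:
            actions[dev] = "remove"    # no spare: fall back to removing the device
    return actions
```

Changing the policy (a different threshold, preferring remove over replace,
doing nothing at all) is then an edit to decide() in a daemon, not a kernel
patch.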



-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5