> On 8 Aug 2019, at 1:55 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>
> [...]
>>>>
>>>> Fundamentally, logical address space has no relevance in the user context,
>>>> so also I don’t understand your view on how anyone shall use the
>>>> range::start
>>>> even if there is no check?
>>>
>>> range::start == bg_bytenr, range::len = bg_len to trim only a bg.
>>>
>>
>> Thanks for the efforts in explaining.
>>
>> My point is: it should not be one single bg; rather, it should be all
>> bg(s) in the specified range [start, length], so %range.start=0 and
>> %range.length=<U64MAX/total_bytes> should trim all the bg(s).
>
> That's the common usage, but it doesn't mean it's the only usage.
Oh, you are right. The man page doesn't restrict range.start to be within
super_total_bytes; it's only generic/260 that tries to enforce that.
> Above bg range trim is also a valid use case.
>
>> Maybe your next question is: as we relocate the chunks, how would the user
>> ever know the correct range.start to use? I don't have an answer for that,
>> and the same question applies to your proposal of range.start=[0 to U64MAX]
>> as well.
>>
>> So I am asking you again: even if you allow range.start=[0 to U64MAX], how
>> will the user use it? Can you please explain?
>
> There are a lot of tools to show the bg bytenr and usage of each bg.
> It isn't a problem at all.
>
External tools sound better than logic within the kernel to perform such
transformations. Now I get your idea. My bad.
I am withdrawing this patch.
Thanks, Anand
>>
>>
>>> And that bg_bytenr is at 128T, since the fs has gone through several
>>> balances.
>>> But there is only one device, and its size is only 1T.
>>>
>>> Please tell me how to trim that block group only?
>>>
>>
>> Block groups are something internal; users don't have to worry about them.
>> A range of [0 to total_bytes] for start and [0 to U64MAX] for len is fair.
>
> Nope, users sometimes care, especially about the usage of each bg.
>
> Furthermore, we have the vusage/vrange filters for balance, so users are
> not blocked from the whole bg thing.
>
>>
>>>>
>>>> As per the man page it's OK to adjust the range internally, and as the
>>>> length can be up to U64MAX, we can still trim beyond super_total_bytes?
>>>
>>> As I said already, super_total_bytes makes no sense in logical address
>>> space.
>>
>> But super_total_bytes makes sense in userland; on the other hand, the
>> logical address space you are trying to expose to userland does not make
>> sense to me.
>
> Nope, super_total_bytes in fact makes no sense in most cases.
> It doesn't even show the upper limit of usable space. (E.g. for all-RAID1
> profiles, usable space is at most half. Even for all-SINGLE profiles, it
> doesn't account for the 1M reserved space.)
>
> It's useful for detecting device list corruption, but apart from that, it
> really doesn't make much sense.
>
> For the logical address space, as explained, we have tools (not in
> btrfs-progs though) and an interface (the balance vrange filter) to make
> use of it.
>
>>
>>> As super_total_bytes is just the sum of all devices, it's a device layer
>>> thing, nothing to do with logical address space.
>>>
>>> You're mixing logical bytenr with something that is not even a device
>>> physical offset; how can that be correct?
>>>
>>> Let me make it more clear: btrfs has its own logical address space,
>>> unrelated to whatever the device mappings are.
>>> It's always [0, U64_MAX], no matter how many devices there are.
>>>
>>> If btrfs is implemented using dm, it should be more clear.
>>>
>>> (single device btrfs)
>>> |
>>> (dm linear, 0 ~ U64_MAX, virtual devices)<- that's logical address space
>>> | | | |
>>> | | | \- (dm raid1, 1T ~ 1T + 128M, devid1 XXX, devid2 XXX)
>>> | | \------ (dm raid0, 2T ~ 2T + 1G, devid1 XXX, devid2 XXX)
>>> | \---------- (dm raid1, 128G ~ 128G + 128M, devid1 XXX, devid2 XXX)
>>> \-------------- (dm raid0, 1M ~ 1M + 1G, devid1 xxx, devid2 xxx).
>>>
>>> If we're trimming such an fs layout, tell me which offset you should use.
>>>
>>
>> There is no perfect solution; the nearest one I can think of is to map
>> range.start and range.len to the physical disk range, then search and
>> discard free space in that range.
>
> Nope, that's way worse than the current behavior.
> See the above example: how would you pass the devid? The above case uses
> RAID0 and RAID1 on two devices; how do you handle that?
> Furthermore, btrfs can have different device sizes for RAID profiles;
> how do you handle them? Using super_total_bytes would easily exceed
> every device's boundary.
>
> Yes, the current behavior is not the perfect solution either, but you're
> attacking from the wrong direction.
> In fact, for allocated bgs, the current behavior is the best solution:
> you can choose to trim any range, and you have tools like Hans'
> python-btrfs.
>
> The not-so-perfect part is the unallocated range.
> IIRC things like thin-provisioned LVM choose not to trim the unallocated
> part, while btrfs chooses to trim all of it.
>
> If you're arguing about how btrfs handles unallocated space, I have no
> words to defend it. But for the logical address part? I have nothing
> more to add.
>
> Thanks,
> Qu
>
>> This idea may be OK for raid/linear profiles, but again, as btrfs can
>> relocate chunks, it's not perfect.
>>
>> Thanks, Anand
>>
>>
>>> Thanks,
>>> Qu
>>>
>>>>
>>>> Thanks, Anand
>>>>
>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>>>
>>>>>> The change log is also vague to me; it doesn't explain why you are
>>>>>> re-adding that check.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>>
>>>>>>> /*
>>>>>>> * NOTE: Don't truncate the range using super->total_bytes.
>>>>>>> Bytenr of
>>>>>>> --
>>>>>>> 2.21.0 (Apple Git-120)