Goffredo Baroncelli wrote:
> On Tuesday, 14 December, 2010, Li Zefan wrote:
>> Goffredo Baroncelli wrote:
>>> Hi Li,
>>>
>>> On Monday, 13 December, 2010, Li Zefan wrote:
>>>> The keys returned by tree search ioctl should be restricted to:
>>>>
>>>>    key.objectid = [min_objectid, max_objectid] &&
>>>>    key.offset   = [min_offset, max_offset] &&
>>>>    key.type     = [min_type, max_type]
>>>>
>>>> But actually it returns those keys:
>>>>
>>>>    [(min_objectid, min_type, min_offset),
>>>>            (max_objectid, max_type, max_offset)].
>>>>
>>> I have to admit that I had need several minutes to understand what you 
> wrote 
>>> :). Then I came to conclusion that the tree search ioctl is basically 
> wrong.
>>> IMHO, the error in this API is to use the lower bound of the acceptance 
>>> criteria (the min_objectid, min_offset, min_type fields) also as starting 
>>> point for the search.
>>>
>>> Let me explain with an example.
>>>
>>> Suppose to want to search all the keys in the range
>>>
>>>     key.objectid = 10..20
>>>     key.offset   = 100..200
>>>     key.type     = 2..5
>>>
>>>
>>> Suppose to set sk->nr_items to 1 for simplicity, and the keys available 
> which 
>>> fit in the range are
>>>
>>> 1)  [15,150,3]
>>> 2)  [16,160,4]
>>> 3)  [17,180,3]
>>>
>>> All these key satisfy the "acceptance criteria", but because we have to 
>>> restart the search from the last key found, the code should resemble
>>>
>>>     sk = &args.key
>>>
>>>     sk->min_objectid=10; sk->max_objectid=20
>>>     sk->min_offset=100; sk->max_offset=200
>>>     sk->min_type=2; sk->max_type=5
>>>     sk->nr_items = 1;
>>>
>>>     while(1){
>>>             ioctl(fd,  BTRFS_IOC_TREE_SEARCH, &args);
>>>             if( !sk->nr_items )
>>>                     break
>>>
>>>             for(off = 0, i=0 ; i < sk->nr_items ; i   ){
>>>                     sh = (struct btrfs_ioctl_search_header *)(args.buf  
>>>                                                                  off);
>>>
>>>                     [...]
>>>                     sk->min_objectid = sh->objectid;
>>>                     sk->min_offset = sh->offset;
>>>                     sk->min_type = sh->type;
>>>             }
>>>
>>>             <increase the sk->min_* key of 1>
>>>                             
>>>     }
>>>
>>> But in this case, the code after found the key #2, sets the minimum 
> acceptance 
>>> criteria to [16,160,4], which exclude the key #3 because min_type is too 
> high.
>>> Ideally, we should add three new field to the search key structure:
>>>
>>>     sk->start_objectid
>>>     sk->start_offset
>>>     sk->start_type
>>>
>>> And after every iteration the code (even the kernel code) should set these 
>>> fields as "last key found  1", leaving the min_* fields as they are.
>>>
>>> My analysis is correct or I miss  something ?
>>>
>> After looking more deeply, I found the ioctl was changed in this way
>> on purpose, to support "btrfs subvolume find-new" specifically.
>>
>> See this commit:
>>
>> commit abc6e1341bda974e2d0eddb75f57a20ac18e9b33
>> Author: Chris Mason <[email protected]>
>> Date:   Thu Mar 18 12:10:08 2010 -0400
>>
>>     Btrfs: fix key checks and advance in the search ioctl
>>
>>     The search ioctl was working well for finding tree roots, but using it 
> for
>>     generic searches requires a few changes to how the keys are advanced.
>>     This treats the search control min fields for objectid, type and offset
>>     more like a key, where we drop the offset to zero once we bump the type,
>>     etc.
>>
>>     The downside of this is that we are changing the min_type and min_offset
>>     fields during the search, and so the ioctl caller needs extra checks to 
> make
>>     the keys in the result are the ones it wanted.
>>
>>     This also changes key_in_sk to use btrfs_comp_cpu_keys, just to make
>>     things more readable.
>>
>> So I think we can just fix the btrfs tool. Though adding sk->start_xxx 
> should
>> also be able to meet the needs for "btrfs subvolume find-new".
> 
> Sorry, but I have to disagree. This API seems to me simply bugged. The 
> example 
> above (which is quite generic) highlights this fact. But I can provide a more 
> real case: suppose to use the BTRFS_IOC_TREE_SEARCH ioctl to find the new 
> files. We are interested to the following items:
> 
> - BTRFS_EXTENT_DATA_KEY                       (type = 1)
> - BTRFS_INODE_ITEM_KEY                        (type = 24)
> - BTRFS_XATTR_ITEM_KEY                        (type = 108)
> 
> Acceptance criteria:
> 
>       min_type = 1
>       max_type = 108
>       min_offset = 0
>       max_offset = ~0
>       min_objectid = 0
>       max_objectid = ~0
>       min_transid = <the base generation number>
> 
> Pay attention that we aren't interested in the offset.
> 
> Suppose to have the following sequence keys  [objectid, type, offset]:
> 
> [...]
> 1)    [300, BTRFS_EXTENT_DATA_KEY, xx]
> 2)    [300, BTRFS_INODE_ITEM_KEY, xx]
> 3)    [300, BTRFS_XATTR_ITEM_KEY, xx]
> 4)    [301, BTRFS_EXTENT_DATA_KEY, xx]
> 5)    [301, BTRFS_INODE_ITEM_KEY, xx]
> 7)    [30200, BTRFS_EXTENT_DATA_KEY, xx]
> 8)    [30200, BTRFS_INODE_ITEM_KEY, xx]
> 9)    [30200, BTRFS_XATTR_ITEM_KEY, xx]
> [...]
> 
> 
> Suppose that the buffer is filled between the item 2 and 3. We should restart 
> the search, but how set the min_* key ? Try the following hypothesis
> 
> h1) objectid++, type = 0 -> In the next search the key 3 would be skipped
> h2) objectid asis, type ++, -> in the next search the key 4 would be skipped
> h3) objectid asis, type = 0 -> in the next search the key 1,2,3 would be 

h4) objectid asis, type asis, offset++ -> we should get the correct result.

because the current ioctl uses min_{x,y,z} and max_{x,y,z} as start_key and
end_key, and it returns all keys that falls in [start_key, end_key].

So this btrfs-progs patch should fix missing subvolumes in the output of
"subvolume list":

diff --git a/btrfs-list.c b/btrfs-list.c
index 93766a8..1b9ea45 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -620,7 +620,10 @@ int list_subvols(int fd)
                /* this iteration is done, step forward one root for the next
                 * ioctl
                 */
-               if (sk->min_objectid < (u64)-1) {
+               if (sk->min_type < BTRFS_ROOT_BACKREF_KEY) {
+                       sk->min_type = BTRFS_ROOT_BACKREF_KEY;
+                       sk->min_offset = 0;
+               } else  if (sk->min_objectid < (u64)-1) {
                        sk->min_objectid++;
                        sk->min_type = BTRFS_ROOT_BACKREF_KEY;
                        sk->min_offset = 0;

> returned a second time...
> 
> Pay attention that every inode may have more key type BTRFS_XATTR_ITEM_KEY or 
> type BTRFS_EXTENT_DATA_KEY, so it is not possible to know in advance when the 
> buffer is filled.
> 
> Only as  theoretical exercise, we can improve the search logic in userspace 
> so 
> when an item is returned, in the next search we set the minimum type as 
> previous type+1, and the *maximum* objectid as the latest ofound bject id. 
> When we are sure that there are not more key with this objectid  we can reuse 
> the old max_objectid and min_type... But to me it seems very fragile.
> 
> Chris what do you think ? Otherwise I missed something this seems a severe 
> bug 
> in the api ?
> 
> In another email I will propose a patch which may address this problem.
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to