Re: [vim/vim] Add the matchfuzzy() function (#6932)

Bram Moolenaar Sun, 13 Sep 2020 03:45:10 -0700


> >> On Sat, Sep 12, 2020 at 1:09 PM Prabir Shrestha <
> >> [email protected]> wrote:
> >>
> >>> +1 for adding fuzzy match.
> >>>
> >>> A bit late to this, but I have a few questions.
> >>>
> >>> How do we use matchfuzzy() on a list of dictionaries instead of a list
> >>> of strings? The primary reason for asking is that LSP (Language Server
> >>> Protocol) makes heavy use of user_data. Can you add an example for
> >>> sorting a list whose items are dicts, such as [
> >>> { 'word': 'foo', 'user_data': 1}, { 'word': 'foobar', 'user_data': 2}, {
> >>> 'word': 'bar', 'user_data': 3} ]? Should we have a 3rd optional parameter
> >>> that takes a callback?
> >>>
> >>> let completeitems = [
> >>>  \ { 'word': 'foo', 'user_data': 1},
> >>>  \ { 'word': 'foobar', 'user_data': 2},
> >>>  \ { 'word': 'bar', 'user_data': 3}
> >>>  \ ]
> >>> let filteredlist = matchfuzzy(completeitems, 'foo', {item->item['word']})
> >>>
> >>>
> >> That is a good idea. We can add a callback argument that will be called
> >> for each item if the list item is a dictionary.
> >>
> >
> > You can try the attached diff, which implements the above, and see how
> > this works with a very large list of dictionaries.
> >
> >
> I used the functions below to measure the time it takes to fuzzy search a
> million entries in a List of strings and in a List of Dicts:
> 
> ==========================================
> func MeasureList()
>     let l = ['abcdef']->repeat(1000000)
>     let start = reltime()
>     let m = l->matchfuzzy('bcd')
>     let secs = start->reltime()->reltimefloat()
>     echomsg "Elapsed Seconds = " .. secs->string()
> endfunc
> 
> func MeasureDict()
>     let l = []
>     for i in range(1, 1000000)
>         call add(l, {'text' : 'abcdef', 'idx' : i})
>     endfor
> 
>     let start = reltime()
>     let m = l->matchfuzzy('bcd', {v -> v.text})
>     let secs = start->reltime()->reltimefloat()
>     echomsg "Elapsed Seconds = " .. secs->string()
> endfunc
> ==========================================
> 
> For the List of strings, it took around a second to search all the entries,
> and for the List of Dicts, it took around 5.5 seconds.


A list is always going to be more efficient than a dict.  It uses less
memory and doesn't require hashing for the lookup.

Also, when using a dict, won't the key be the same for every item and
rather meaningless?  I can only imagine this being useful if you already
have a list of dicts.  How often does that happen?

Using a list of lists, where the inner list has the text to fuzzy-match
on as the first item and the rest of the list can be anything, would be
the most efficient.  Looking up the first item of a list is fast as
well.  Only creating this list of lists might take extra time, if you
have an already existing structure.

[['abcd', oneData], ['bcde', twoData]]->matchfuzzy('cd')
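
Until something like that is supported directly, the same effect can be
sketched in script on top of matchfuzzy() on a plain List of strings.
MatchFuzzyPairs() below is a hypothetical helper, not part of Vim, and it
assumes the texts are unique, since a Dict is used to map each matched text
back to its pair:

```vim
" Hypothetical helper, not part of Vim: fuzzy-match a List of [text, data]
" pairs on the first item of each pair.
" Assumption: the texts are unique, because a Dict maps them back to pairs.
func MatchFuzzyPairs(pairs, pat)
    let byText = {}
    for pair in a:pairs
        let byText[pair[0]] = pair
    endfor
    " Match on the plain strings only, then look up the original pairs.
    let texts = map(copy(a:pairs), {_, p -> p[0]})
    return map(matchfuzzy(texts, a:pat), {_, t -> byText[t]})
endfunc

echo MatchFuzzyPairs([['abcd', 'oneData'], ['wxyz', 'twoData']], 'cd')
```

Only the pair whose text contains "c" followed by "d" is returned here.
Building the Dict and the List of texts is the extra cost mentioned above
for an already existing structure.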

-- 
Kisses may last for as much as, but no more than, five minutes.
                [real standing law in Iowa, United States of America]

 /// Bram Moolenaar -- [email protected] -- http://www.Moolenaar.net   \\\
///        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\  an exciting new programming language -- http://www.Zimbu.org        ///
 \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///
