[Python-Dev] rlcompleter -- auto-complete dictionary keys (+ tests)

Valery Khamenya Sun, 07 Nov 2010 05:51:08 -0800

Hi,

A) I have missed auto-completion of dictionary keys in the Python console
a lot. This patch seems to do the job.


B) There are no rlcompleter tests in trunk for some reason, so I've taken
the 2.7.x test_rlcompleter.py and extended it.

C) The patched rlcompleter itself works OK for unicode dictionary keys as
well. All tests pass. HOWEVER, readline's completion mechanism seems to be
confused by unicode strings -- see the comments in
Completer.dict_key_matches(). So perhaps some changes should be applied to
the readline code too.
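
For readers who want to try the idea outside the patch, here is a minimal sketch of dictionary-key completion written for Python 3. It is not the attached code: the helper name dict_key_completions and its regular expression are hypothetical, it only handles a plain name before the "[" (looked up in the namespace rather than evaluated), and it ignores keys that themselves contain quotes.

```python
import re

# Hypothetical helper, not part of rlcompleter: a simplified Python 3
# take on completing dictionary keys for text such as "d['sp".
_KEY_RE = re.compile(r"(\w+)\[(?:(\d+)|(['\"])([^'\"\]]*))?$")


def dict_key_completions(text, namespace):
    """Return candidate completions for NAME[... typed so far."""
    m = _KEY_RE.match(text)
    if not m:
        return []
    name, digits, quote, key_start = m.groups()
    obj = namespace.get(name)
    if not isinstance(obj, dict):
        return []
    matches = []
    for key in obj:
        if digits is not None:
            # Digits typed: offer integer keys that start with them.
            if isinstance(key, int) and str(key).startswith(digits):
                matches.append("%s[%d]" % (name, key))
        elif quote is not None:
            # Opening quote typed: offer string keys, reusing that quote.
            if isinstance(key, str) and key.startswith(key_start):
                matches.append("%s[%s%s%s]" % (name, quote, key, quote))
        else:
            # Nothing typed after "[": offer every key via repr().
            matches.append("%s[%r]" % (name, key))
    return matches


ns = {"d": {"spam": 1, 1984: 42, "spin": 2}}
print(dict_key_completions("d['sp", ns))
```

Looking the name up instead of calling eval() sidesteps the arbitrary-code concern mentioned in the module docstring, at the cost of not supporting dotted expressions such as a.b[.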

Attached:

1. rlcompleter.py (as for trunk)

2. test_rlcompleter (as for trunk)

3. rlcompleter_trunk_to_new.diff (created as: diff rlcompleter_trunk.py
rlcompleter.py >rlcompleter_trunk_to_new.diff)

P.S. Thanks to kerio & ssbr on icq for the advice.

best regards
--
Valery A.Khamenya

# -*- coding: utf-8 -*-

"""Word completion for GNU readline 2.0.

This requires the latest extension to the readline module. The completer
completes keywords, built-ins and globals in a selectable namespace (which
defaults to __main__); when completing NAME.NAME..., it evaluates (!) the
expression up to the last dot and completes its attributes.

It's very cool to do "import sys" type "sys.", hit the
completion key (twice), and see the list of names defined by the
sys module!

Tip: to use the tab key as the completion key, call

    readline.parse_and_bind("tab: complete")

Notes:

- Exceptions raised by the completer function are *ignored* (and
generally cause the completion to fail).  This is a feature -- since
readline sets the tty device in raw (or cbreak) mode, printing a
traceback wouldn't work well without some complicated hoopla to save,
reset and restore the tty state.

- The evaluation of the NAME.NAME... form may cause arbitrary
application defined code to be executed if an object with a
__getattr__ hook is found.  Since it is the responsibility of the
application (or the user) to enable this feature, I consider this an
acceptable risk.  More complicated expressions (e.g. function calls or
indexing operations) are *not* evaluated.

- GNU readline is also used by the built-in functions input() and
raw_input(), and thus these also benefit/suffer from the completer
features.  Clearly an interactive application can benefit by
specifying its own completer function and using raw_input() for all
its input.

- When the original stdin is not a tty device, GNU readline is never
used, and this module (and the readline module) are silently inactive.

"""

import __builtin__
import __main__

__all__ = ["Completer"]

class Completer:
    def __init__(self, namespace=None):
        """Create a new completer for the command line.

        Completer([namespace]) -> completer instance.

        If unspecified, the default namespace where completions are performed
        is __main__ (technically, __main__.__dict__). Namespaces should be
        given as dictionaries.

        Completer instances should be used as the completion mechanism of
        readline via the set_completer() call:

        readline.set_completer(Completer(my_namespace).complete)
        """

        if namespace and not isinstance(namespace, dict):
            raise TypeError, 'namespace must be a dictionary'

        # Don't bind to namespace quite yet, but flag whether the user wants a
        # specific namespace or to use __main__.__dict__. This will allow us
        # to bind to __main__.__dict__ at completion time, not now.
        if namespace is None:
            self.use_main_ns = 1
        else:
            self.use_main_ns = 0
            self.namespace = namespace

    def complete(self, text, state):
        """Return the next possible completion for 'text'.

        This is called successively with state == 0, 1, 2, ... until it
        returns None.  The completion should begin with 'text'.
       
        """
        if self.use_main_ns:
            self.namespace = __main__.__dict__

        if state == 0:
            if "." in text:
                self.matches = self.attr_matches(text)
            elif "[" in text:
                self.matches = self.dict_key_matches(text)
            else:
                self.matches = self.global_matches(text)
        try:
            return self.matches[state]
        except IndexError:
            return None

    def _callable_postfix(self, val, word):
        if hasattr(val, '__call__'):
            word = word + "("
        return word

    def global_matches(self, text):
        """Compute matches when text is a simple name.

        Return a list of all keywords, built-in functions and names currently
        defined in self.namespace that match.

        """
        import keyword
        matches = []
        n = len(text)
        for word in keyword.kwlist:
            if word[:n] == text:
                matches.append(word)
        for nspace in [__builtin__.__dict__, self.namespace]:
            for word, val in nspace.items():
                if word[:n] == text and word != "__builtins__":
                    matches.append(self._callable_postfix(val, word))
        return matches

    def attr_matches(self, text):
        """Compute matches when text contains a dot.

        Assuming the text is of the form NAME.NAME....[NAME], and is
        evaluatable in self.namespace, it will be evaluated and its attributes
        (as revealed by dir()) are used as possible completions.  (For class
        instances, class members are also considered.)

        WARNING: this can still invoke arbitrary C code, if an object
        with a __getattr__ hook is evaluated.

        """
        import re
        m = re.match(r"(\w+(\.\w+)*)\.(\w*)", text)
        if not m:
            return []
        expr, attr = m.group(1, 3)
        try:
            thisobject = eval(expr, self.namespace)
        except Exception:
            return []

        # get the content of the object, except __builtins__
        words = dir(thisobject)
        if "__builtins__" in words:
            words.remove("__builtins__")

        if hasattr(thisobject, '__class__'):
            words.append('__class__')
            words.extend(get_class_members(thisobject.__class__))
            
        if hasattr(thisobject, '__class__'):
            words.append('__class__')
            words.extend(get_class_members(thisobject.__class__))

        matches = []
        n = len(attr)
        for word in words:
            if word[:n] == attr and hasattr(thisobject, word):
                val = getattr(thisobject, word)
                word = self._callable_postfix(val, "%s.%s" % (expr, word))
                matches.append(word)
        return matches

    def dict_key_matches(self, text):
        """Compute matches when text contains a [.

        readline's completer mechanism has problems with "text"
        when a completion contains unicode data.
        The first unicode occurrence makes the match value look like: u'd[...'.
        This seems to break the requirement that all matches start with "text".
        readline's iterator then stops, so the remaining completions aren't seen.

        As a workaround, we put all unicode keys at the end of the completion list,
        so that at least all ASCII keys are displayed properly.
        Maybe one day the upstream readline completer code will be more unicode-friendly.

        All rlcompleter unicode tests pass.

        The evaluation of the part before the '[' could be enhanced.

        """
        
        import re
        m = re.match(r"(\w+(\.\w+)*)\[(([0-9]+)|((u?)([\'\"])([^\'\"\]]*))|(.+))?", text)
        
        if not m:
            return []
        expr, int_num, literal_prefix, quote, key_start, not_valid = m.group(1, 4, 6, 7, 8, 9)        
        
        try:
            thisobject = eval(expr, self.namespace)
        except Exception:
            return []
        
        matches = []
        
        if int_num:
            num_keys = (k for k in thisobject.keys() if isinstance(k, (int, long)))
            for k in num_keys: 
                if ('%d' % k).startswith(int_num):
                    matches.append("%s[%d]" % (expr, k))
        elif key_start is not None:
            ## see comments in documentation to this function above.
            char_keys = [k for k in thisobject.keys() if isinstance(k, str)]
            char_keys += [k for k in thisobject.keys() if isinstance(k, unicode)]
            for k in char_keys:
                if k.startswith(key_start):
                    lit_pref = literal_prefix
                    if isinstance(k, unicode):
                        # we force unicode prefix
                        lit_pref = 'u'
                    matches.append("%s[%s%s%s%s]" % (expr, lit_pref, quote, k, quote))
        elif not_valid:
            ## any actions on invalid input?
            pass
        else:
            ## see comments in documentation to this function above.
            non_unicode_keys = [k for k in thisobject.keys() if not isinstance(k, unicode)]
            unicode_keys = [k for k in thisobject.keys() if isinstance(k, unicode)]
            for k in non_unicode_keys + unicode_keys:
                if isinstance(k, str):
                    matches.append("%s[\'%s\']" % (expr, k))
                elif isinstance(k, unicode):
                    matches.append("%s[u\'%s\']" % (expr, k))
                else:
                    matches.append("%s[%s]" % (expr, k))
        return matches

def get_class_members(klass):
    ret = dir(klass)
    if hasattr(klass, '__bases__'):
        for base in klass.__bases__:
            ret = ret + get_class_members(base)
    return ret

try:
    import readline
except ImportError:
    pass
else:
    readline.set_completer(Completer().complete)
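
The complete(text, state) protocol described in the module above (readline calls the function with state = 0, 1, 2, ... until it returns None) can be exercised without a terminal. The sketch below assumes Python 3 and its standard-library rlcompleter rather than the patched Python 2 module; the helper name all_completions is hypothetical.

```python
import rlcompleter


def all_completions(complete, text):
    """Call complete(text, state) with state = 0, 1, 2, ... until None."""
    results = []
    state = 0
    while True:
        match = complete(text, state)
        if match is None:
            return results
        results.append(match)
        state += 1


# Standard-library Completer bound to an explicit namespace dictionary,
# so nothing from __main__ leaks into the results.
completer = rlcompleter.Completer({"spam": 1, "spin": 2})
print(sorted(all_completions(completer.complete, "sp")))
```

Driving the completer this way is also how a test can check completions without readline being installed.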

# -*- coding: utf-8 -*-
from test import test_support as support
import unittest
import rlcompleter
import __builtin__ as builtins

# See the comments for Completer.dict_key_matches() in rlcompleter.py
# regarding unicode.

class CompleteMe(object):
    """ Trivial class used in testing rlcompleter.Completer. """
    spam = 1


# Trivial dicts used in testing rlcompleter.Completer

NoUnicodeDictCompleteMe = { 'spam' : 1,
                           1984 : 42 } 

UnicodeDictCompleteMe = { 'spam' : 1,
                         1984 : 42,
                         u'öh, вау!' : 'yep world is not ascii only'} 


class TestRlcompleter(unittest.TestCase):
    def setUp(self):
        self.stdcompleter = rlcompleter.Completer()
        self.completer = rlcompleter.Completer(dict(spam=int,
                                        egg=str,
                                        CompleteMe=CompleteMe,
                                        UnicodeDictCompleteMe=UnicodeDictCompleteMe,
                                        NoUnicodeDictCompleteMe=NoUnicodeDictCompleteMe))

        # forces stdcompleter to bind builtins namespace
        self.stdcompleter.complete('', 0)

    def test_namespace(self):
        class A(dict):
            pass
        class B(list):
            pass

        self.assertTrue(self.stdcompleter.use_main_ns)
        self.assertFalse(self.completer.use_main_ns)
        self.assertFalse(rlcompleter.Completer(A()).use_main_ns)
        self.assertRaises(TypeError, rlcompleter.Completer, B((1,)))

    def test_global_matches(self):
        # test with builtins namespace
#        self.assertEqual(self.stdcompleter.global_matches('di'),
#                         [x+'(' for x in dir(builtins) if x.startswith('di')])
        self.assertEqual(self.stdcompleter.global_matches('st'),
                         [x + '(' for x in dir(builtins) if x.startswith('st')])
        self.assertEqual(self.stdcompleter.global_matches('akaksajadhak'), [])

        # test with a customized namespace
        self.assertEqual(self.completer.global_matches('CompleteM'),
                         ['CompleteMe('])
        self.assertEqual(self.completer.global_matches('eg'),
                         ['egg('])
        # XXX: see issue5256
        self.assertEqual(self.completer.global_matches('CompleteM'),
                         ['CompleteMe('])


    def test_attr_matches(self):
        # test with builtins namespace
        self.assertEqual(self.stdcompleter.attr_matches('str.s'),
                         ['str.%s(' % x for x in dir(str) if x.startswith('s')])

        self.assertEqual(self.stdcompleter.attr_matches('tuple.foospamegg'), [])

        # test with a customized namespace
        self.assertEqual(self.completer.attr_matches('CompleteMe.sp'),
                         ['CompleteMe.spam'])
        self.assertEqual(self.completer.attr_matches('Completeme.egg'), [])

        CompleteMe.me = CompleteMe
        self.assertEqual(self.completer.attr_matches('CompleteMe.me.me.sp'),
                         ['CompleteMe.me.me.spam'])
        self.assertEqual(self.completer.attr_matches('egg.s'),
                         ['egg.%s(' % x for x in dir(str)
                          if x.startswith('s')])

    def test_dict_key_matches_without_unicode(self):
        self.assertEqual(self.completer.dict_key_matches('NoUnicodeDictCompleteMe['),
                         ['NoUnicodeDictCompleteMe[1984]',
                          'NoUnicodeDictCompleteMe[\'spam\']'])
        
        self.assertEqual(self.completer.dict_key_matches('NoUnicodeDictCompleteMe[1'),
                         ['NoUnicodeDictCompleteMe[1984]'])
        
        self.assertEqual(self.completer.dict_key_matches('NoUnicodeDictCompleteMe[2'),
                         [])

        self.assertEqual(self.completer.dict_key_matches('NoUnicodeDictCompleteMe[s'),
                         [])

        self.assertEqual(self.completer.dict_key_matches('NoUnicodeDictCompleteMe[\''),
                         ['NoUnicodeDictCompleteMe[\'spam\']'])

        self.assertEqual(self.completer.dict_key_matches('NoUnicodeDictCompleteMe[\'s'),
                         ['NoUnicodeDictCompleteMe[\'spam\']'])

        self.assertEqual(self.completer.dict_key_matches('NoUnicodeDictCompleteMe[\"s'),
                         ['NoUnicodeDictCompleteMe[\"spam\"]'])


    def test_dict_key_matches_with_unicode(self):
        self.assertEqual(self.completer.dict_key_matches('UnicodeDictCompleteMe['),
                         ['UnicodeDictCompleteMe[1984]',
                          'UnicodeDictCompleteMe[\'spam\']',
                          u'UnicodeDictCompleteMe[u\'öh, вау!\']' ])
        
        self.assertEqual(self.completer.dict_key_matches('UnicodeDictCompleteMe[1'),
                         ['UnicodeDictCompleteMe[1984]'])
        
        self.assertEqual(self.completer.dict_key_matches('UnicodeDictCompleteMe[2'),
                         [])

        self.assertEqual(self.completer.dict_key_matches('UnicodeDictCompleteMe[s'),
                         [])

        self.assertEqual(self.completer.dict_key_matches('UnicodeDictCompleteMe[\''),
                         ['UnicodeDictCompleteMe[\'spam\']',
                          u'UnicodeDictCompleteMe[u\'öh, вау!\']'])

        self.assertEqual(self.completer.dict_key_matches('UnicodeDictCompleteMe[\'s'),
                         ['UnicodeDictCompleteMe[\'spam\']'])

        self.assertEqual(self.completer.dict_key_matches('UnicodeDictCompleteMe[\"s'),
                         ['UnicodeDictCompleteMe[\"spam\"]'])

        self.assertEqual(self.completer.dict_key_matches(u'UnicodeDictCompleteMe[u\'ö'),
                         [u'UnicodeDictCompleteMe[u\'öh, вау!\']'])

        self.assertEqual(self.completer.dict_key_matches('UnicodeDictCompleteMe[u\''),
                         ["UnicodeDictCompleteMe[u'spam']",  # str key offered with a u prefix: a question of requirements!
                          u'UnicodeDictCompleteMe[u\'öh, вау!\']' ])

def test_main():
    support.run_unittest(TestRlcompleter)


if __name__ == '__main__':
    test_main()
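
To see why inputs such as 'NoUnicodeDictCompleteMe[s' yield no completions while "['s" does, it helps to look at how the regular expression in Completer.dict_key_matches() splits the text. The pattern below is copied from the attached module; the wrapper split_key_text and the sample name d are only for illustration. It runs unchanged on Python 2 and Python 3.

```python
import re

# Pattern copied from Completer.dict_key_matches() in the attached module.
KEY_RE = re.compile(
    r"(\w+(\.\w+)*)\[(([0-9]+)|((u?)([\'\"])([^\'\"\]]*))|(.+))?")


def split_key_text(text):
    """Return (expr, int_num, literal_prefix, quote, key_start, not_valid)."""
    m = KEY_RE.match(text)
    return m.group(1, 4, 6, 7, 8, 9) if m else None


for sample in ["d[", "d[19", "d['sp", "d[u'", "d[s"]:
    print("%-8r -> %r" % (sample, split_key_text(sample)))
```

Only one of int_num, key_start, and not_valid is set for any given input, which is what the if/elif chain in dict_key_matches() relies on; an unquoted letter after the bracket lands in not_valid, so the method returns an empty list.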

0a1,2
> # -*- coding: utf-8 -*-
> 
48c50
<     def __init__(self, namespace = None):
---
>     def __init__(self, namespace=None):
64c66
<             raise TypeError,'namespace must be a dictionary'
---
>             raise TypeError, 'namespace must be a dictionary'
80c82
< 
---
>        
87a90,91
>             elif "[" in text:
>                 self.matches = self.dict_key_matches(text)
148a153,157
>             
>         if hasattr(thisobject, '__class__'):
>             words.append('__class__')
>             words.extend(get_class_members(thisobject.__class__))
> 
157a167,231
>     def dict_key_matches(self, text):
>         """Compute matches when text contains a [.
> 
>         readline's completer mechanism has problems with "text"
>         when a completion contains unicode data.
>         The first unicode occurrence makes the match value look like: u'd[...'.
>         This seems to break the requirement that all matches start with "text".
>         readline's iterator then stops, so the remaining completions aren't seen.
>
>         As a workaround, we put all unicode keys at the end of the completion list,
>         so that at least all ASCII keys are displayed properly.
>         Maybe one day the upstream readline completer code will be more unicode-friendly.
>
>         All rlcompleter unicode tests pass.
>
>         The evaluation of the part before the '[' could be enhanced.
> 
>         """
>         
>         import re
>         m = re.match(r"(\w+(\.\w+)*)\[(([0-9]+)|((u?)([\'\"])([^\'\"\]]*))|(.+))?", text)
>         
>         if not m:
>             return []
>         expr, int_num, literal_prefix, quote, key_start, not_valid = m.group(1, 4, 6, 7, 8, 9)        
>         
>         try:
>             thisobject = eval(expr, self.namespace)
>         except Exception:
>             return []
>         
>         matches = []
>         
>         if int_num:
>             num_keys = (k for k in thisobject.keys() if isinstance(k, (int, long)))
>             for k in num_keys: 
>                 if ('%d' % k).startswith(int_num):
>                     matches.append("%s[%d]" % (expr, k))
>         elif key_start is not None:
>             ## see comments in documentation to this function above.
>             char_keys = [k for k in thisobject.keys() if isinstance(k, str)]
>             char_keys += [k for k in thisobject.keys() if isinstance(k, unicode)]
>             for k in char_keys:
>                 if k.startswith(key_start):
>                     lit_pref = literal_prefix
>                     if isinstance(k, unicode):
>                         # we force unicode prefix
>                         lit_pref = 'u'
>                     matches.append("%s[%s%s%s%s]" % (expr, lit_pref, quote, k, quote))
>         elif not_valid:
>             ## any actions on invalid input?
>             pass
>         else:
>             ## see comments in documentation to this function above.
>             non_unicode_keys = [k for k in thisobject.keys() if not isinstance(k, unicode)]
>             unicode_keys = [k for k in thisobject.keys() if isinstance(k, unicode)]
>             for k in non_unicode_keys + unicode_keys:
>                 if isinstance(k, str):
>                     matches.append("%s[\'%s\']" % (expr, k))
>                 elif isinstance(k, unicode):
>                     matches.append("%s[u\'%s\']" % (expr, k))
>                 else:
>                     matches.append("%s[%s]" % (expr, k))
>         return matches
> 
160c234
<     if hasattr(klass,'__bases__'):
---
>     if hasattr(klass, '__bases__'):
170a245,246
> 
>

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
