[issue34417] imp.find_module reacts badly to iterator

2018-08-21 Thread Phillip M. Feldman


Phillip M. Feldman added the comment:

My apologies for the tone of my remark.  I am grateful to you and others
who donate their time to develop the code.

I'm attaching the wrapper code that I created to work around the problem.

Phillip

import imp
import itertools
import os

def expander(paths='./*'):
   r"""
   OVERVIEW

   This function is a generator, i.e., creates an iterator that recursively
   searches a list of folders in an incremental fashion.  This approach is
   advantageous when the folder tree(s) to be searched are large and the
   item of interest is likely to be found early in the process.

   INPUTS

   `paths` must be either (a) a list of folder paths (each of which is a
   string) or (b) a single string containing one or more folder paths
   separated by the OS-specific path delimiter.

   Each path in `paths` must be either (a) an existing folder or (b) an
   existing folder followed by '/*' or '\*'.  In case (a), the folder string
   is copied from the input (`paths`) to the output result verbatim.  In
   case (b), the folder string is replaced by an expanded list that includes
   not only the base (the portion of the path that remains after the '/*' or
   '\*' has been removed), but all subfolders as well.

   RETURN VALUES

   The returned value is an iterator.

   Invoking the `next` method of the iterator produces one folder path at a
   time.
   """

   if isinstance(paths, basestring):
      paths= paths.split(os.pathsep)

   elif not isinstance(paths, list):
      raise TypeError("`paths` must be either a string or a list of strings.")

   found= set()

   for path in paths:
      if path.endswith('/*') or path.endswith('\\*'):

         # A recursive search of subfolders is required.  `os.walk` yields
         # the base folder first and then each subfolder, one at a time:
         for item in os.walk(path[:-2]):
            folder= os.path.abspath(item[0])

            if folder not in found:
               found.add(folder)
               yield folder

      else:

         # No recursive search is required:
         if path not in found:
            found.add(path)
            yield path

   # end for path in paths

def find_module(module_name, in_folders=None):
   """
   This function finds a module and returns the fully-qualified file name.
   Folders from `in_folders`, if specified, are searched first, followed by
   folders in the global `import_path` list.

   If any folder name in `in_folders` or `import_path` ends with an
   asterisk, indicating that a recursive search is required,
   `files.expander` is invoked to create iterators that return one folder at
   a time, and `imp.find_module` is invoked separately for each of these
   folders.

   EXPLICIT INPUTS

   `module_name` is the unqualified name of the module to be found.

   `in_folders` is an optional list of additional folders to be searched
   before the folders in `import_path` are searched.

   IMPLICIT INPUTS

   `import_path` is obtained from the global namespace.

   RETURN VALUES

   If `find_module` is able to find the requested module, it returns the
   same three return values (`f`, `filename`, and `description`) that
   `imp.find_module` would return.
   """

   # Avoid a mutable default argument:
   if in_folders is None:
      in_folders= []
   elif isinstance(in_folders, basestring):
      in_folders= [in_folders]
   elif not isinstance(in_folders, list):
      raise TypeError("If specified, `in_folders` must be either a string or a "
        "list of strings.  (A string is wrapped to produce a length-1 list).")

   if any(item.endswith('*') for item in in_folders) or \
      any(item.endswith('*') for item in import_path):

      error= None

      for folder in itertools.chain(
        expander(in_folders), expander(import_path)):
         try:
            # Search one folder at a time so that the walk stops at the
            # first folder containing the module:
            return imp.find_module(module_name, [folder])
         except ImportError as ex:
            error= ex

      # The module was not found in any folder:
      raise error or ImportError("No module named %s" % module_name)

   else:
      return imp.find_module(module_name, in_folders + import_path)
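
For readers on Python 3, where `imp` is deprecated, the same incremental idea can be sketched with `importlib`. This is only an illustration and not part of the attached wrapper: the helper names `iter_folders` and `find_spec_lazily` are hypothetical, and the sketch hands `importlib.machinery.PathFinder.find_spec` a one-element list per folder so that the folder tree is explored only as far as needed.

```python
import importlib.machinery
import os


def iter_folders(root):
    """Lazily yield `root` and then each of its subfolders (hypothetical helper)."""
    for base, _subdirs, _files in os.walk(root):
        yield os.path.abspath(base)


def find_spec_lazily(module_name, folders):
    """Return the first ModuleSpec found for `module_name`, or None.

    `folders` may be any iterable, including a generator.  Each folder is
    wrapped in a one-element list before it is passed to PathFinder, so the
    iterable is consumed only up to the first folder that holds the module.
    """
    for folder in folders:
        spec = importlib.machinery.PathFinder.find_spec(module_name, [folder])
        if spec is not None:
            return spec
    return None
```

A call such as `find_spec_lazily("mymodule", iter_folders("/some/root"))` (both arguments are placeholders) returns a spec whose `origin` attribute holds the file path, or `None` if no folder contains the module.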

On Tue, Aug 21, 2018 at 10:32 AM Brett Cannon wrote:

>
> Brett Cannon added the comment:
>
> Saying "the available functionality is massively inefficient" is
> unnecessarily hostile towards those of us who actually wrote and maintain
> that code. Without diving into the code, chances are that requirement is
> there so that the C code can use macros to access the list as efficiently
> as possible.
>
> Now if you want to propose specific changes to importlib's code for it to
> work with iterables instead of just lists then we would be happy to review
> the pull request.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 

[issue34417] imp.find_module reacts badly to iterator

2018-08-21 Thread Brett Cannon


Brett Cannon added the comment:

Saying "the available functionality is massively inefficient" is unnecessarily 
hostile towards those of us who actually wrote and maintain that code. Without 
diving into the code, chances are that requirement is there so that the C code 
can use macros to access the list as efficiently as possible.

Now if you want to propose specific changes to importlib's code for it to work 
with iterables instead of just lists then we would be happy to review the pull 
request.




[issue34417] imp.find_module reacts badly to iterator

2018-08-20 Thread Phillip M. Feldman


Phillip M. Feldman added the comment:

It appears that the `importlib` package has the same issue: One can't
provide an iterator for the path.  When searching a large folder tree for
an item that is likely to be found early in the search process (i.e., at a
high level in the folder tree), the available functionality is massively
inefficient.  So, I wrote my own wrapper for `imp.find_module` to do this
job, and will eventually modify this code to use `importlib` instead of
`imp`.
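
The efficiency argument can be illustrated with a small, self-contained sketch. It is not the wrapper itself: the helper names and the temporary folder layout are made up for the demonstration, which simply counts how many folders `os.walk` has produced when the search stops at the first hit, compared with expanding the whole tree into a list first.

```python
import os
import tempfile


def walk_folders(root, visited):
    """Yield folders under `root` lazily, recording each one in `visited`."""
    for base, _subdirs, _files in os.walk(root):
        visited.append(base)
        yield base


def first_folder_containing(filename, folders):
    """Return the first folder holding `filename`, stopping the search there."""
    for folder in folders:
        if os.path.isfile(os.path.join(folder, filename)):
            return folder
    return None


with tempfile.TemporaryDirectory() as root:
    # Build a small tree: the target sits at the top level, next to
    # 20 unrelated subfolders.
    for i in range(20):
        os.makedirs(os.path.join(root, "sub%02d" % i))
    open(os.path.join(root, "target.py"), "w").close()

    # Incremental search: the generator is abandoned after the first hit.
    lazy_visited = []
    hit = first_folder_containing("target.py", walk_folders(root, lazy_visited))

    # Eager search: the whole tree is expanded before anything is examined.
    eager_visited = []
    all_folders = list(walk_folders(root, eager_visited))

    print(len(lazy_visited), len(eager_visited))
```

With the target at the top of the tree, the incremental search yields a single folder, while building the list up front yields all 21.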

On Fri, Aug 17, 2018 at 9:05 AM Eric Snow wrote:

>
> Eric Snow added the comment:
>
> There are several issues at hand here, Phillip.  I'll enumerate them below.
>
> Thanks for taking the time to let us know about this.  However, I'm
> closing this issue since realistically the behavior of imp.find_module()
> isn't going to change, particularly in Python 2.7.  Even though the issue
> is closed, feel free to reply, particularly about how you are using
> imp.find_module() (we may be able to point you toward how to use importlib
> instead).
>
> Also, I've changed this issue's type to "enhancement".  imp.find_module()
> is working as designed, so what you are looking for is a feature request.
> Consequently there's a much higher bar for justifying a change.  Here are
> reasons why the requested change doesn't reach that bar:
>
> 1. Python 2.7 is closed to new features.
>
> So imp.find_module() is not going to change.
>
> 2. Python 2.7 is nearing EOL.
>
> We highly recommend that everyone move to Python 3 as soon as possible.
> Hopefully you are in a position to do so.  If you're stuck on Python 2.7
> then you miss the advantages of importlib, along with a ton of other
> benefits.
>
> If you are not going to be able to migrate before 2020 then send an email
> to python-l...@python.org asking for recommendations on what to do.
>
> 3. Starting in Python 3.4, using the imp module is discouraged/deprecated.
>
>   "Deprecated since version 3.4: The imp package is pending deprecation in
> favor of importlib." [1]
>
> The importlib package should have everything you need.  What are you using
> imp.find_module() for?  We should be able to demonstrate the equivalent
> using importlib.
>
> 4. The import machinery is designed around using a list (the builtin type,
> not the concept) for the "module search path".
>
> * imp.find_module(): "the list of directory names given by sys.path is
> searched" [2]
> * imp.find_module(): "Otherwise, path must be a list of directory names"
> [2]
> * importlib.find_loader() (deprecated): "optionally within the specified
> path" (which defaults to sys.path) [3]
> * importlib.util.find_spec(): doesn't even have a "path" parameter [4]
> * ModuleSpec.submodule_search_locations: "List of strings for where to
> find submodules" [5]
> * sys.path: "A list of strings that specifies the search path for modules.
> ... Only strings and bytes should be added to sys.path; all other data
> types are ignored during import." [6]
>
>
> [1] https://docs.python.org/3/library/imp.html#module-imp
> [2] https://docs.python.org/3/library/imp.html#imp.find_module
> [3] https://docs.python.org/3/library/importlib.html#importlib.find_loader
> [4] https://docs.python.org/3/library/importlib.html#importlib.util.find_spec
> [5] https://docs.python.org/3/library/importlib.html#importlib.machinery.ModuleSpec.submodule_search_locations
> [6] https://docs.python.org/3/library/sys.html#sys.path
>
> --
> nosy: +brett.cannon, eric.snow
> resolution:  -> wont fix
> stage:  -> resolved
> status: open -> closed
> type: behavior -> enhancement




[issue34417] imp.find_module reacts badly to iterator

2018-08-17 Thread Eric Snow


Eric Snow added the comment:

There are several issues at hand here, Phillip.  I'll enumerate them below.

Thanks for taking the time to let us know about this.  However, I'm closing 
this issue since realistically the behavior of imp.find_module() isn't going to 
change, particularly in Python 2.7.  Even though the issue is closed, feel free 
to reply, particularly about how you are using imp.find_module() (we may be 
able to point you toward how to use importlib instead).

Also, I've changed this issue's type to "enhancement".  imp.find_module() is 
working as designed, so what you are looking for is a feature request.  
Consequently there's a much higher bar for justifying a change.  Here are 
reasons why the requested change doesn't reach that bar:

1. Python 2.7 is closed to new features.

So imp.find_module() is not going to change.

2. Python 2.7 is nearing EOL.

We highly recommend that everyone move to Python 3 as soon as possible.  
Hopefully you are in a position to do so.  If you're stuck on Python 2.7 then 
you miss the advantages of importlib, along with a ton of other benefits.

If you are not going to be able to migrate before 2020 then send an email to 
python-l...@python.org asking for recommendations on what to do.

3. Starting in Python 3.4, using the imp module is discouraged/deprecated.

  "Deprecated since version 3.4: The imp package is pending deprecation in 
favor of importlib." [1]

The importlib package should have everything you need.  What are you using 
imp.find_module() for?  We should be able to demonstrate the equivalent using 
importlib.

4. The import machinery is designed around using a list (the builtin type, not 
the concept) for the "module search path".

* imp.find_module(): "the list of directory names given by sys.path is 
searched" [2]
* imp.find_module(): "Otherwise, path must be a list of directory names" [2]
* importlib.find_loader() (deprecated): "optionally within the specified path" 
(which defaults to sys.path) [3]
* importlib.util.find_spec(): doesn't even have a "path" parameter [4]
* ModuleSpec.submodule_search_locations: "List of strings for where to find 
submodules" [5]
* sys.path: "A list of strings that specifies the search path for modules. ... 
Only strings and bytes should be added to sys.path; all other data types are 
ignored during import." [6]


[1] https://docs.python.org/3/library/imp.html#module-imp
[2] https://docs.python.org/3/library/imp.html#imp.find_module
[3] https://docs.python.org/3/library/importlib.html#importlib.find_loader
[4] https://docs.python.org/3/library/importlib.html#importlib.util.find_spec
[5] https://docs.python.org/3/library/importlib.html#importlib.machinery.ModuleSpec.submodule_search_locations
[6] https://docs.python.org/3/library/sys.html#sys.path
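
As a rough illustration of the `importlib` equivalents mentioned above, the sketch below locates a module on `sys.path`, locates it again in an explicit list of directories, and then loads it from the resulting spec. The module name `colorsys` and the variable `search_dirs` are arbitrary choices for the demonstration, and the sketch assumes the standard library is installed as ordinary files on disk.

```python
import importlib.machinery
import importlib.util
import os

# 1. Search sys.path, as imp.find_module(name) did when no path was given.
#    "colorsys" is just an arbitrary pure-Python stdlib module for the demo.
default_spec = importlib.util.find_spec("colorsys")

# 2. Search an explicit list of directories, as imp.find_module(name, path)
#    did.  `search_dirs` is a hypothetical list built from the result of step 1.
search_dirs = [os.path.dirname(default_spec.origin)]
explicit_spec = importlib.machinery.PathFinder.find_spec("colorsys", search_dirs)

# 3. Load the module from the spec, the step that imp.load_module() used
#    to perform.
module = importlib.util.module_from_spec(explicit_spec)
explicit_spec.loader.exec_module(module)

print(explicit_spec.origin)
```

Note that the `path` argument in step 2 is still a list, in line with point 4: an iterator would have to be expanded, or consumed one folder at a time, by the caller.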

--
nosy: +brett.cannon, eric.snow
resolution:  -> wont fix
stage:  -> resolved
status: open -> closed
type: behavior -> enhancement




[issue34417] imp.find_module reacts badly to iterator

2018-08-16 Thread Phillip M. Feldman


New submission from Phillip M. Feldman:

`imp.find_module` goes down in flames if one tries to pass an iterator rather 
than a list of folders.  Firstly, the message that it produces is somewhat 
misleading:

   RuntimeError: sys.path must be a list of directory names

Secondly, it would be helpful if one could pass an iterator. I'm thinking in 
particular of the situation where one wants to import something from a large 
folder tree, and the module in question is likely to be found early in the 
search process, so that it is more efficient to explore the folder tree 
incrementally.

--
components: Library (Lib)
messages: 323623
nosy: phillip.m.feld...@gmail.com
priority: normal
severity: normal
status: open
title: imp.find_module reacts badly to iterator
type: behavior
versions: Python 2.7
