Steven D'Aprano wrote:

> On Tue, 19 Apr 2016 09:44 am, Sayth Renshaw wrote:
>
>> Hi
>>
>> Why would it be that my files are not being found in this script?
>
> You are calling the script with:
>
>     python jqxml.py samples *.xml
>
> This does not do what you think it does: under Linux shells, the glob
> *.xml will be expanded by the shell. Fortunately, in your case, you have
> no files in the current directory matching the glob *.xml, so it is not
> expanded and the arguments your script receives are:
>
>     "python jqxml.py"  # not used
>     "samples"          # dir
>     "*.xml"            # mask
>
> You then call:
>
>     fileResult = filter(lambda x: x.endswith(mask), files)
>
> which looks for file names which end with a literal string (asterisk, dot,
> x, m, l) in that order. You have no files that match that string.
>
> At the shell prompt, enter this:
>
>     touch samples/junk\*.xml
>
> and run the script again, and you should see that it now matches one file.
>
> Instead, what you should do is:
>
> (1) Use the glob module:
>
>     https://docs.python.org/2/library/glob.html
>     https://docs.python.org/3/library/glob.html
>
>     https://pymotw.com/2/glob/
>     https://pymotw.com/3/glob/
>
> (2) When calling the script, avoid the shell expanding wildcards by
> escaping them or quoting them:
>
>     python jqxml.py samples "*.xml"
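
To make (1) and (2) concrete, here is a minimal sketch, assuming the script
keeps its two-argument interface and is called with a quoted mask, as in
python jqxml.py samples "*.xml". The names find_files and main are made up
for this example; glob.glob and os.path.join are the standard library calls.

```python
import glob
import os
import sys


def find_files(directory, mask):
    """Return the sorted paths in `directory` that match the wildcard `mask`."""
    # glob does the wildcard matching that str.endswith() cannot do
    return sorted(glob.glob(os.path.join(directory, mask)))


def main(argv):
    # Assumption: exactly two arguments, the directory and the quoted mask
    if len(argv) != 2:
        print("usage: python jqxml.py DIR MASK")
        return
    for path in find_files(argv[0], argv[1]):
        print(path)


if __name__ == "__main__":
    main(sys.argv[1:])
```

If the mask is left unquoted and the shell does expand it, the script sees
more than two arguments and only prints the usage line, which at least makes
the mistake visible.
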
(3) *Use* the expansion mechanism provided by the shell instead of fighting
it:

    $ python jqxml.py samples/*.xml

This requires that you change your script:

    from pyquery import PyQuery as pq
    import pandas as pd
    import sys

    fileResult = sys.argv[1:]

    if not fileResult:
        print("no files specified")
        sys.exit(1)

    for file in fileResult:
        print(file)

    for items in fileResult:
        try:
            d = pq(filename=items)
        except FileNotFoundError as e:
            print(e)
            continue
        res = d('nomination')
        # you could move the attrs definition before the loop
        attrs = ('id', 'horse')
        # probably a bug: you are overwriting data on every iteration
        data = [[res.eq(i).attr(x) for x in attrs] for i in range(len(res))]

I think this is the most natural approach if you are willing to accept the
quirk that the script tries to process the file 'samples/*.xml' if the
samples directory doesn't contain any files with the .xml suffix. Common
shell tools work that way:

    $ ls samples/*.xml
    samples/1.xml  samples/2.xml  samples/3.xml
    $ ls samples/*.XML
    ls: cannot access samples/*.XML: No such file or directory

Unrelated: instead of working with sys.argv directly you could use argparse
which is part of the standard library. The code to get at least one file is

    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("files", nargs="+")
    args = parser.parse_args()
    print(args.files)

Note that this doesn't fix the shell expansion oddity.

-- 
https://mail.python.org/mailman/listinfo/python-list
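
On the shell expansion oddity: one possible way to soften it is to let the
script expand any argument the shell left untouched. This is only a sketch,
not something the original script does. The helper names expand_args and
main are invented for illustration, and it assumes that an argument which
does not exist as a file should be retried as a glob pattern.

```python
import argparse
import glob
import os


def expand_args(names):
    """Split names into (files to process, names that match nothing)."""
    found, missing = [], []
    for name in names:
        if os.path.exists(name):
            found.append(name)
            continue
        # The shell passed the pattern through unchanged (nothing matched,
        # or it was quoted); try to expand it ourselves.
        matches = sorted(glob.glob(name))
        if matches:
            found.extend(matches)
        else:
            missing.append(name)
    return found, missing


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("files", nargs="+")
    args = parser.parse_args(argv)  # argv=None means "use sys.argv[1:]"
    found, missing = expand_args(args.files)
    for name in missing:
        print("no match for", name)
    for name in found:
        print(name)
    # Exit status: 0 if there is at least one file to process, else 1
    return 0 if found else 1
```

At the bottom of the script you would call sys.exit(main()) after importing
sys. With that in place, both samples/*.xml and "samples/*.xml" give the same
list of files, and a pattern without matches is reported instead of being
handed to pyquery as a file name.
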