On Tue, Aug 11, 2015 at 08:23:00PM -0700, Clayton Kirkwood wrote: > Question 2: > My current code: > See "Look here" below.
There's an art to picking good variable names, neither too short nor too long, just descriptive enough without being too verbose. Or to put it another way... The existence of a systematic application of knowledge or skill in effecting a desired result in such a way as to be in accordance with the actual state of reality is a factual statement when applied to the technique of making a agreeable selection between hypothetical candidate appellations for variables. *wink* I'm afraid that reading your code gives me a headache. You have master_directory_file_list (which is a dict, not a list) and directory_file_list; also target_filename_size (a dict, not a size), current_directory_path, target_directory, current_file_list, file_list, and file_file_path_list, although I admit that my head is spinning at the moment and that last one might be inaccurate :-) I don't know how you can keep track of all those so-very-similar variable names, especially since some of them lie about what they are! (Claiming to be a list when they are actually dicts, etc.) Better names will help. It is rare that stating what *type* a variable is will be helpful. E.g. you normally wouldn't say "count_int" or "width_float", rather "count" and "width". You need to find a balance between variable names which are too generic and non-descriptive, and those which are too verbose and detailed. That will come with experience, and from reading other people's code to see what works and what doesn't work. It will also help if you can break your code up into small, self-contained functions which can be digested by the reader in isolation. For example, if I were writing your code, I might do something like this: def name_is_seen_before(filename, already_seen): """Returns whether or not filename has already been seen. If the filename has not been seen before, it is added to the set of already seen filenames, and False is returned; otherwise True is returned. """ if filename in already_seen: return True already_seen.add(filename) return False Then use it something like this: already_seen = set() duplicates = set() for name in bunch_of_names: if name_is_seen_before(name, already_seen): duplicates.add(name) Obviously you have to get the bunch_of_names somehow first. To do that, you can also simplify the process of iterating over files with a filter that discards the files you don't care about: def is_media_file(filename): extensions = ['jpg', 'gif', 'png'] # etc. base, ext = os.path.splitext(filename) return ext[1:].lower() in extensions def filter_media(filenames): media_files = [] for name in filenames: if is_media_file(name): media_files.append(name) return media_files The advantage of this is that you can write these small functions independently of the rest of your loop, you can test them easily and satisfy yourself that they work correctly, and then forget all the internal details of how they work when you go to use them: for dir_path, directories, filenames in os.walk(main_dir): filenames = filter_media(filenames) # ... continue processing ... And you don't need to care that variable names are duplicated in different functions, because each function is self-contained. There's no confusion between variable "name" in one function and "name" in another, so there's no need to try to come up with distinctive names for them both. The point being, the *details* of how the media files are filtered are not important. You can "look inside" the filter_media() function when you care about the details, and when you don't, you can just treat it as a black-box that takes a list of file names and returns only those which are media files. Try redesigning your code to be a bit more hierarchical in this way, and see if that makes it easier for you to solve the problem. -- Steve _______________________________________________ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor