On 28/08/2011 00:18, r...@rdo.python.org wrote:
Thank you so much. The code worked perfectly.

This is what I tried using Emile's code. The only time it picked the
wrong name from the list was when the file was named like this:

Data Mark Stone.doc

How can I fix this? Hope I am not asking too much?

Have you tried the alternative word orders, "Mark Stone" as well as
"Stone, Mark", picking whichever name has the best ratio for either?

import os
from difflib import SequenceMatcher as SM

path = r'D:\Files '
txt_names = []


with open(r'D:/python/log1.txt') as f:
    for txt_name in f.readlines():
        txt_names.append(txt_name.strip())

def ignore(x):
    return x in ' ,.'

for filename in os.listdir(path):
    ratios = [SM(ignore,filename,txt_name).ratio() for txt_name in txt_names]
    best = max(ratios)
    owner = txt_names[ratios.index(best)]
    print filename,":",owner
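
One way to try both word orders, as suggested above, is sketched here.
It is only an illustration under a few assumptions: name_variants and
best_owner are hypothetical helper names, the short txt_names list
stands in for the contents of log1.txt, and print is written in its
function form so the snippet also runs on Python 3.

```python
from difflib import SequenceMatcher as SM


def ignore(x):
    # Same junk test as in the loop above: spaces, commas and periods.
    return x in ' ,.'


def name_variants(txt_name):
    # Hypothetical helper: "Stone, Mark" -> ["Stone, Mark", "Mark Stone"].
    last, sep, first = txt_name.partition(',')
    variants = [txt_name]
    if sep:
        variants.append(first.strip() + ' ' + last.strip())
    return variants


def best_owner(filename, txt_names):
    # Score each listed name by the better of its two word orders and
    # return the name exactly as it appears in the list.
    def score(txt_name):
        return max(SM(ignore, filename, variant).ratio()
                   for variant in name_variants(txt_name))
    return max(txt_names, key=score)


# Stand-in for the names read from log1.txt.
txt_names = ['Adler, Jack', 'Smith, John', 'Smith, Sally', 'Stone, Mark']

for filename in ['Data Mark Stone.doc',
                 'Random Data - Adler Jack - expenses.xls']:
    print(filename, ':', best_owner(filename, txt_names))
```

Because best_owner returns the original "Last, First" entry, the result
can still be used directly as the directory name.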





On Sat, 27 Aug 2011 14:08:17 -0700, Emile van Sebille<em...@fenx.com>
wrote:

On 8/27/2011 1:15 PM r...@rdo.python.org said...

Hello Emile ,

Thank you for the code below. I have not encountered SequenceMatcher
before and will have to take a closer look at it.

My question: would it work for a text file of about 25k names and a
directory with, say, 100 files inside?

Sure.

Emile
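
With about 25k names and 100 files the loop makes roughly 2.5 million
comparisons, so it may be worth reducing the cost of each one. The
sketch below is one possible approach, not something proposed in this
thread. It relies on two documented difflib behaviours: SequenceMatcher
caches its analysis of the second sequence, and quick_ratio() and
real_quick_ratio() are cheap upper bounds on ratio(). The name
best_match is hypothetical. Note the assumption that the filename is
passed as the second sequence, which is the reverse of the
SM(ignore,filename,username) call used elsewhere in this thread, so the
ratios can differ slightly from that code.

```python
from difflib import SequenceMatcher


def ignore(x):
    # Junk test from the thread: spaces, commas and periods.
    return x in ' ,.'


def best_match(filename, names):
    # Hypothetical helper. The filename is analysed once via set_seq2();
    # each candidate name is then supplied with set_seq1().
    sm = SequenceMatcher(ignore)
    sm.set_seq2(filename)
    best_name, best_ratio = None, 0.0
    for name in names:
        sm.set_seq1(name)
        # Both quick ratios are upper bounds on ratio(), so a name that
        # cannot beat the current best is skipped without the full work.
        if sm.real_quick_ratio() <= best_ratio:
            continue
        if sm.quick_ratio() <= best_ratio:
            continue
        ratio = sm.ratio()
        if ratio > best_ratio:
            best_name, best_ratio = name, ratio
    # best_name stays None if no name shares anything with the filename.
    return best_name, best_ratio
```

Calling best_match(filename, txt_names) once per file returns the
winning name together with its ratio, which also makes it easy to
reject weak matches below some chosen threshold.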



Thank you once again.


On Sat, 27 Aug 2011 11:06:22 -0700, Emile van Sebille<em...@fenx.com>
wrote:

On 8/27/2011 10:03 AM r...@rdo.python.org said...
Hello,

What would be the best way to accomplish this task?

I'd do something like:


usernames = """Adler, Jack
Smith, John
Smith, Sally
Stone, Mark""".split('\n')

filenames = """Smith, John - 02-15-75 - business files.doc
Random Data - Adler Jack - expenses.xls
More Data Mark Stone files list.doc""".split('\n')

from difflib import SequenceMatcher as SM


def ignore(x):
    return x in ' ,.'


for filename in filenames:
    ratios = [SM(ignore,filename,username).ratio() for username in usernames]
    best = max(ratios)
    owner = usernames[ratios.index(best)]
    print filename,":",owner


Emile



I have many files in separate directories. Each file name
contains a person's name, but never in the same spot.
I need to find that name, which is listed in a large
text file in the following format: last name, comma,
first name. Last names can be duplicated.

Adler, Jack
Smith, John
Smith, Sally
Stone, Mark
etc.


The file names don't necessarily follow any standard
format.

Smith, John - 02-15-75 - business files.doc
Random Data - Adler Jack - expenses.xls
More Data Mark Stone files list.doc
etc

I need some way to pull the name from the file name, find it in the
text list, then create a directory based on the name on the list
(e.g. "Smith, John") and move all files named with that client's name
into that directory.
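
The last step described above (create a directory named after the
matched list entry and move the file into it) could be sketched as
follows. This is a minimal illustration, assuming the owner has already
been determined by some matching step. move_to_owner_dir, src_dir and
dest_root are hypothetical names, not taken from the thread.

```python
import os
import shutil


def move_to_owner_dir(src_dir, filename, owner, dest_root):
    # Hypothetical helper: build dest_root/<owner>, e.g. ".../Smith, John".
    target_dir = os.path.join(dest_root, owner)
    # Create the directory only if it does not exist yet, so several
    # files belonging to the same person end up in one place.
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    shutil.move(os.path.join(src_dir, filename),
                os.path.join(target_dir, filename))
    return target_dir
```

It would be called once per file, with owner set to the best-matching
name from the list.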



--
http://mail.python.org/mailman/listinfo/python-list
