On 03/10/17 09:48, renukesh nk wrote: > requirement: > i have a directory , that contains multiple sub directories, each sub > directory has multiple text and log files, my script fetches the required > lines from all the sub directories and stores it in one text file. >
I'm not too sure what you mean by writing the "required lines" but I've added some general comments below > but i want it to store separate text file for each sub directory ,after > fetching the contents. can anyone please help me where to edit my script. It is better if you copy the script into the body of the email, attachments often get stripped by the server so we can't see them. In this case you got lucky... Comments below: > import zipfile,fnmatch,os > import os, shutil, re, glob you import os twice, its not a bug but it is a potential future maintenance gotcha. > from os.path import isfile, join you use os.path.xxx so don't need this import > from os import walk You use os.walk so don't need this import. > root_src_dir = r'E:\\New folder' > root_dst_dir = "E:\\destination" dst_dir should be a raw string too. And why double quotes here and single quotes above? It's good to be consistent. > for src_dir, dirs, files in os.walk(root_src_dir): > dst_dir = src_dir.replace(root_src_dir, root_dst_dir, 1) > if not os.path.exists(dst_dir): > os.makedirs(dst_dir) > for file_ in files: > src_file = os.path.join(src_dir, file_) > dst_file = os.path.join(dst_dir, file_) > if os.path.exists(dst_file): > os.remove(dst_file) > shutil.copy(src_file, dst_dir) This just copies everything so why not use the shutil.copytree() function instead? > rootPath = r"E:\\destination" Rather than duplicate the string use root_dst_dir > pattern = '*.zip' > for root, dirs, files in os.walk(rootPath): This is where you do stuff at a per directory level so this is probably where you want to create your logfile and write to it. But i'm still not sure what you are looking to write... > for filename in fnmatch.filter(files, pattern): > print(os.path.join(root, filename)) > zipfile.ZipFile(os.path.join(root, > filename)).extractall(os.path.join(root, > os.path.splitext(filename)[0])) This chunk looks at every zip file and extracts the contents into the same folder(what if there are duplicate filenames?) But you don't do anything with those files? > os.chdir(rootPath) > for file in glob.glob(pattern): > f = open((file.rsplit(".", 1)[0]) + "ANA.txt", "w") > > f.close() Then you change to the destination root and iterate over any zip files at the root level, opening a new file and immediately closing it again? At this point I'm confused? It would be good to put a comment before each block explaining the intended purpose of the code - what it is expected to achieve and why you are doing it. > ##here folder output > mypath =r"E:\\destination" Again this could use root_dst_dir to avoid string duplication (and potential error) > newpath = os.path.expanduser("E:\\destination") expanduser() replaces an initial tilde(~) with the homedir but since you don't have an initial tilde its completely redundant here. > filenam = "1.txt" > f = [] > path = '' > path1 = [] These look suspiciously like bad variable names. What is their purpose? Is 'path1' really a list? Usually lists suggest a plural name, such as paths. > for (dirpath, dirnames, filenames) in walk(mypath): > if isfile(join(mypath, dirpath)): This should never be true since mypath is a directory and dirpath is, by definition, the folder that walk() is currently reading. But joining them makes no sense since dirpath should already include mypath. > path1.extend(join(mypath, dirpath)) > if os.path.isdir(join(mypath, dirpath)): Likewise because you join() them the result will probably not exist so the tst will fail. > for f in filenames: > path1.append(str(join(mypath, dirpath, f))) If this does get called it will probably have broken paths since you use join again. If you just want the files use the list of files that os.walk gave you - it has already separated the contents into dirs and files as part of its return value. > print(path1) > > newf = open(os.path.join(newpath, filenam ), "w+") Why use 'w+'? You only need that if you are going to be reading and writing the file - usually involving use of tell()/seek() etc. You don't do any of that so keep things simple and just use 'w'. > myarray = {"ERROR", "error"} Its not an array its a set > for element in myarray: > elementstring = ''.join(element) This is a very complex way of writing elementstring = 'Error' Although it does have a frisson of interest because the actual set element chosen is theoretically arbitrary since sets are unordered. It's not clear what you are trying to achieve but I suspect this is not it. > for f in path1: > openfile = open(os.path.join(path, f), "r") Since path is an empty string this is effectively saying openfile = open(f, "r") > for line in openfile: > if elementstring in line: > newf.write(line) > newf.close() > openfile.close() You should be closing openfile in the loop above where you open it. But better still you should be using the with open((f,'r') as openfile: idiom which will auto close the file for you. I'm still not clear on what you are trying to do(*) but it looks like you are making it much more difficult than I suspect it needs to be. It will also be pretty slow since you walk the directory tree multiple times when one copytree() followed by one walk() should, I think, suffice. (*)I think it's to copy the initial file structure then search all files in the copy for lines including the strings 'Error' or 'error' and write those lines to a logfile, one per folder. But I'm not really certain. -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos _______________________________________________ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor