I work in 1st line support and Python is one of my hobbies. We get quite a few requests for XML from our website and it's a long, drawn-out process, so I thought I'd try to create a system that deals with it, for fun.
I've been tidying up the archived XML and thinking about the best way to approach this, since dealing with big quantities of XML takes a long time: we have 5-6 years' worth of 26,000+ XML files per year, each 5-20 KB. The archived stuff is zipped, but which is better: 26,000 files in one big zip file, 26,000 files in one big zip file but in folders for months and days, or zip files inside zip files?

I created an app in wxPython to search the unzipped XML files by modified date; it just opens them up and uses something like l.find('>%s<' % fiveDigitNumber) != -1. Is this quicker than parsing the XML?

Generally the requests are less than 3 months old, which got me thinking: should I create a script that finds all the file names and corresponding web numbers of the old XML and bungs them into a DB table, one for each year, and another script that archives the XML after each day and, after 3 months, zips it up, bungs the info into the table, etc.? A few rough sketches of what I mean are below.

Sorry for the ramble, I just want other people's opinions on the matter. =)
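On the zip layout: one big archive with folder-style member names can be searched without unpacking anything, since the zipfile module reads the archive's central directory once and can pull out individual members. A minimal sketch, where "archive_2010.zip", the member paths, and the five-digit number are all made-up names for illustration:

    import zipfile

    # Read one day's XML straight out of a single big archive without
    # extracting everything.
    with zipfile.ZipFile("archive_2010.zip") as zf:
        # namelist() only reads the zip's central directory, so
        # filtering 26,000 member names by prefix is cheap.
        members = [n for n in zf.namelist() if n.startswith("2010/03/15/")]
        for name in members:
            data = zf.read(name)      # bytes of this one member only
            if b">12345<" in data:    # same five-digit probe as the app
                print("hit:", name)

Zip files inside zip files lose this property: each inner archive has to be opened (e.g. via io.BytesIO) before its contents can even be listed, so flat-with-folders looks like the simplest of the three.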
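On find() versus parsing: a raw substring scan is almost always faster than building a parse tree, but it will also match the number inside a comment, an attribute, or an unrelated element. A rough comparison of the two approaches, with "webnumber" as a guessed tag name:

    import xml.etree.ElementTree as ET

    NEEDLE = "12345"  # stand-in for the five-digit web number

    def found_by_scan(path):
        # Plain substring probe, as in the wxPython app: no XML
        # overhead, but no context either.
        with open(path) as f:
            return (">%s<" % NEEDLE) in f.read()

    def found_by_parse(path, tag="webnumber"):
        # Full parse: slower, but only matches the element you mean.
        tree = ET.parse(path)
        return any(el.text == NEEDLE for el in tree.iter(tag))

If the number only ever appears in one element, the scan alone is fine; otherwise scan first and parse only the files the scan flags, to confirm.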
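On the DB idea: a sketch of the indexing script, assuming the web number can be pulled out of each file with a regex, and keeping the one-table-per-year layout described above (a single table with a year column would work just as well):

    import os
    import re
    import sqlite3

    # ">(\d{5})<" is an assumed pattern for the five-digit web number.
    NUMBER_RE = re.compile(rb">(\d{5})<")

    def index_year(xml_dir, year, db_path="xml_index.db"):
        con = sqlite3.connect(db_path)
        con.execute("CREATE TABLE IF NOT EXISTS files_%d "
                    "(web_number TEXT, filename TEXT)" % year)
        for name in os.listdir(xml_dir):
            if not name.endswith(".xml"):
                continue
            with open(os.path.join(xml_dir, name), "rb") as f:
                m = NUMBER_RE.search(f.read())
            if m:
                con.execute("INSERT INTO files_%d VALUES (?, ?)" % year,
                            (m.group(1).decode(), name))
        con.commit()
        con.close()

With the index in place, a request only touches the zip archive when the lookup says the file exists, so the daily script just has to keep the table and the zips in step.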