BTW, reading over the patch, it looks like I hit a tab-expansion issue. Sorry, 5am blues :-) New one below.
MJM

Index: MimeDel.py
===================================================================
RCS file: /cvsroot/mailman/mailman/Mailman/Handlers/MimeDel.py,v
retrieving revision 2.1
diff -u -r2.1 MimeDel.py
--- MimeDel.py 18 Apr 2002 20:46:53 -0000 2.1
+++ MimeDel.py 14 Aug 2002 09:19:29 -0000
@@ -33,7 +33,9 @@
 from Mailman import Errors
 from Mailman.Logging.Syslog import syslog
 from Mailman.Version import VERSION
-
+from Mailman.Handlers.Scrubber import save_attachment
+from time import strftime
+from Mailman.i18n import _
 
 
 def process(mlist, msg, msgdata):
@@ -41,6 +43,7 @@
     if not mlist.filter_content or not mlist.filter_mime_types:
         return
     # We also don't care about our own digests or plaintext
+    make_attachment(mlist, msg)
     ctype = msg.get_type('text/plain')
     mtype = msg.get_main_type('text')
     if msgdata.get('isdigest') or ctype == 'text/plain':
@@ -54,7 +57,7 @@
     if msg.is_multipart():
         # Recursively filter out any subparts that match the filter list
         prelen = len(msg.get_payload())
-        filter_parts(msg, filtertypes)
+        filter_parts(mlist, msg, filtertypes)
         # If the outer message is now an emtpy multipart (and it wasn't
         # before!) then, again it gets discarded.
         postlen = len(msg.get_payload())
@@ -96,7 +99,7 @@
 
 
 
-def filter_parts(msg, filtertypes):
+def filter_parts(mlist, msg, filtertypes):
     # Look at all the message's subparts, and recursively filter
     if not msg.is_multipart():
         return 1
@@ -104,9 +107,12 @@
     prelen = len(payload)
     newpayload = []
     for subpart in payload:
-        keep = filter_parts(subpart, filtertypes)
+        keep = filter_parts(mlist, subpart, filtertypes)
         if not keep:
             continue
+        if make_attachment(mlist, subpart):
+            newpayload.append(subpart)
+            continue
         ctype = subpart.get_type('text/plain')
         mtype = subpart.get_main_type('text')
         if ctype in filtertypes or mtype in filtertypes:
@@ -164,3 +170,32 @@
         subpart.set_type('text/plain')
         changedp = 1
     return changedp
+
+
+
+def make_attachment(mlist, subpart):
+    #should be set from mlist, work in progress
+    #BTW this will act real stupid with mulipart, it need the real object not the house keeping
+    attach_filter = ['image/bmp', 'image/jpeg', 'image/tiff', 'image/gif', 'image/png', 'image/pjpeg', 'image/x-png', 'image/x-wmf']
+    ctype = subpart.get_type('text/plain')
+    mtype = subpart.get_main_type('text')
+    if ctype in attach_filter or mtype in attach_filter:
+        cctype = subpart.get_type()
+        #size is off, just could not stand to call decode to correct, might just take off 20% and be done
+        size = len(subpart.get_payload())
+        desc = subpart.get('content-description', (_('not available')))
+        filename = subpart.get_filename(_('not available'))
+        url = save_attachment(mlist, subpart, strftime("attch/%Y%m/%d"))
+        del subpart['content-type']
+        del subpart['content-transfer-encoding']
+        del subpart['content-disposition']
+        del subpart['content-description']
+        subpart.add_header('Content-Type', 'text/plain', charset='us-ascii')
+        subpart.add_header('Content-Transfer-Encoding', '7bit')
+        subpart.set_payload(_("""\
+Name: %(filename)s Type: %(cctype)s Size: %(size)d bytes Desc: %(desc)s
+Url: %(url)s
+"""))
+        return 1
+    else:
+        return 0

----- Original Message -----
From: "Michael Meltzer" <[EMAIL PROTECTED]>
To: "Barry A. Warsaw" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Wednesday, August 14, 2002 5:02 AM
Subject: Re: [Mailman-Developers] Scrubber.py confusion, 2.1b3


> save_attachment is looking good, "Cool". My only gripe is that the URLs
> are getting very long; 80-column wrap will be an ongoing issue and most
> likely unsolvable. I am not married to the path layout I used. I did
> have a problem with using the fully qualified date: after 3 years there
> would be over 1000 files in one directory.
>
> I am not sure about white vs. black list. The white list is nice because
> I know what types will pass through, but it will have the problem of
> playing catch-up with new types, a hassle factor for the admins, and
> questions from new users. The black list is nice, but will I wake up one
> morning and read about the "latest hole" that is being exploited? Could
> ruin a whole day ;-) Pondering it, I suspect a white list with a good
> set of defaults should work. I kind of like the "get the extension from
> the MIME type" approach, but it broke down as soon as I tried to attach
> a "Word" document: it came up as application/octet-stream with only the
> extension as a clue. I like the method but I do not think it will last;
> we will end back up at lists (or maybe a real open-source anti-virus :-)
>
> MJM
> PS. I am sure I will get the pointy hat award for the patch below :-)
> I also have it running on the test server at
> http://www.michaelmeltzer.com/mailman/listinfo/meltzer-list
> It is open (at least for a few days :-), if anyone wants to pass some
> traffic through it and see the output. Just do not flood it out.
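
For what it's worth, the white-list idea discussed above could look roughly like the sketch below. It is written against the modern Python 3 email package rather than the 2.1-era API used in the patch, and ALLOWED_TYPES and is_allowed are made-up names for illustration, not anything that exists in Mailman. It also shows the application/octet-stream case falling through the list.

```python
from email.message import EmailMessage

# Hypothetical default white list (an assumption for this sketch).
ALLOWED_TYPES = frozenset(['text/plain', 'image/jpeg', 'image/png', 'image/gif'])

def is_allowed(part, allowed=ALLOWED_TYPES):
    """Return True if the part's full type or its main type is white-listed."""
    ctype = part.get_content_type()      # e.g. 'image/jpeg'
    mtype = part.get_content_maintype()  # e.g. 'image'
    return ctype in allowed or mtype in allowed

# Build a small multipart message: a text body plus two attachments.
msg = EmailMessage()
msg.set_content('hello')
msg.add_attachment(b'\xff\xd8\xff', maintype='image', subtype='jpeg',
                   filename='beach.jpg')
msg.add_attachment(b'not really a doc', maintype='application',
                   subtype='octet-stream', filename='report.doc')

# One (content type, verdict) pair per direct subpart.
verdicts = [(p.get_content_type(), is_allowed(p)) for p in msg.iter_parts()]
```

The octet-stream part is rejected even though its filename looks harmless, which is the "only the extension as a clue" problem: a type-based list cannot tell what such a part really is.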
>
>
> Index: MimeDel.py
> ===================================================================
> RCS file: /cvsroot/mailman/mailman/Mailman/Handlers/MimeDel.py,v
> retrieving revision 2.1
> diff -u -r2.1 MimeDel.py
> --- MimeDel.py 18 Apr 2002 20:46:53 -0000 2.1
> +++ MimeDel.py 14 Aug 2002 08:21:58 -0000
> @@ -33,7 +33,9 @@
>  from Mailman import Errors
>  from Mailman.Logging.Syslog import syslog
>  from Mailman.Version import VERSION
> -
> +from Mailman.Handlers.Scrubber import save_attachment
> +from time import strftime
> +from Mailman.i18n import _
>  
>  
>  def process(mlist, msg, msgdata):
> @@ -41,6 +43,7 @@
>      if not mlist.filter_content or not mlist.filter_mime_types:
>          return
>      # We also don't care about our own digests or plaintext
> +    make_attachment(mlist, msg)
>      ctype = msg.get_type('text/plain')
>      mtype = msg.get_main_type('text')
>      if msgdata.get('isdigest') or ctype == 'text/plain':
> @@ -54,7 +57,7 @@
>      if msg.is_multipart():
>          # Recursively filter out any subparts that match the filter list
>          prelen = len(msg.get_payload())
> -        filter_parts(msg, filtertypes)
> +        filter_parts(mlist, msg, filtertypes)
>          # If the outer message is now an emtpy multipart (and it wasn't
>          # before!) then, again it gets discarded.
>          postlen = len(msg.get_payload())
> @@ -96,7 +99,7 @@
>  
>  
>  
> -def filter_parts(msg, filtertypes):
> +def filter_parts(mlist, msg, filtertypes):
>      # Look at all the message's subparts, and recursively filter
>      if not msg.is_multipart():
>          return 1
> @@ -104,9 +107,12 @@
>      prelen = len(payload)
>      newpayload = []
>      for subpart in payload:
> -        keep = filter_parts(subpart, filtertypes)
> +        keep = filter_parts(mlist, subpart, filtertypes)
>          if not keep:
>              continue
> +        if make_attachment(mlist, subpart):
> +            newpayload.append(subpart)
> +            continue
>          ctype = subpart.get_type('text/plain')
>          mtype = subpart.get_main_type('text')
>          if ctype in filtertypes or mtype in filtertypes:
> @@ -164,3 +170,32 @@
>          subpart.set_type('text/plain')
>          changedp = 1
>      return changedp
> +
> +
> +
> +def make_attachment(mlist, subpart):
> +    #should be set from mlist, work in progress
> +    #BTW this will act real stupid with mulipart, it need the real object not the house keeping
> +    attach_filter = ['image/bmp', 'image/jpeg', 'image/tiff', 'image/gif', 'image/png', 'image/pjpeg', 'image/x-png', 'image/x-wmf']
> +    ctype = subpart.get_type('text/plain')
> +    mtype = subpart.get_main_type('text')
> +    if ctype in attach_filter or mtype in attach_filter:
> +        cctype = subpart.get_type()
> +        #size is off, just could not stand to call decode to correct, might just take off 20% and be done
> +        size = len(subpart.get_payload())
> +        desc = subpart.get('content-description', (_('not available')))
> +        filename = subpart.get_filename(_('not available'))
> +        url = save_attachment(mlist, subpart, strftime("attch/%Y%m/%d"))
> +        del subpart['content-type']
> +        del subpart['content-transfer-encoding']
> +        del subpart['content-disposition']
> +        del subpart['content-description']
> +        subpart.add_header('Content-Type', 'text/plain', charset='us-ascii')
> +        subpart.add_header('Content-Transfer-Encoding', '7bit')
> +        subpart.set_payload(_("""\
> +Name: %(filename)s Type: %(cctype)s Size: %(size)d bytes Desc: %(desc)s
> +Url: %(url)s
> +"""))
> +        return 1
> +    else:
> +        return 0
>
>
> ----- Original Message -----
> From: "Barry A. Warsaw" <[EMAIL PROTECTED]>
> To: "Michael Meltzer" <[EMAIL PROTECTED]>
> Cc: <[EMAIL PROTECTED]>
> Sent: Tuesday, August 13, 2002 11:38 AM
> Subject: Re: [Mailman-Developers] Scrubber.py confusion, 2.1b3
>
>
> >
> > >>>>> "MM" == Michael Meltzer <[EMAIL PROTECTED]> writes:
> >
> >     MM> Actually I am "reusing" the code from Scrubber.py in MimeDel.py
> >     MM> to turn attachments into links :-) I hardwired it for image
> >     MM> types but it is generic enough. Some sample output from my
> >     MM> "staging":
> >
> >     MM> Name: beach.jpg Type: image/jpeg Size: 18853 bytes Desc:
> >     MM> not_available Url:
> >     MM> http://www.michaelmeltzer.com/pipermail/meltzer-list/attachments/200208/12/beach.jpg-0005.jpe
> >
> > Cool. I'm using a slightly different naming algorithm for the path.
> >
> >     MM> It turned out to be a 4 line hack to filter_parts, 1 line at
> >     MM> the top and 10 lines to reformat the payload; the rest came
> >     MM> from save_attachment, very handy :-)
> >
> > Can you try to update it to current cvs? If it's really a 4 line
> > hack, you've got to post it. :) I tried to write the Scrubber.py
> > updates with you in mind, by factoring out some other functionality
> > you might need.
> >
> >     MM> I have to admit the environment is nice to work in.
> >
> > :)
> >
> >     MM> I am not sure my code is up to patch quality :-) The next step
> >     MM> would be a modification to the content filter page for the
> >     MM> types it should react to.
> >
> >     MM> I would also suggest (Scrubber.py needs this too) that the
> >     MM> filter pages list the extensions that it is allowed to write. Or
> >     MM> the converse, the extensions it should not write;
> >     MM> http://office.microsoft.com/Assistance/2000/Out2ksecFAQ.aspx would
> >     MM> be my start :-), save the masses someday :-)
> >
> > I've been thinking about this. I vaguely remember that someone did a
> > patch to support pass-or-block semantics to the filter, but I can't
> > put my finger on it now. I want to link Dan Mick's name to that, but
> > does this ring a bell with anyone?
> >
> >     MM> The issue with the directory is the number of files, not a
> >     MM> name clash
> >
> > Yep, I know.
> >
> >     MM> , `ls -d archives/private/listname/attachments/* |
> >     MM> wc -l` > 1000 I think system performance will be
> >     MM> affected. Above 10,000 I know it would (it would also be a
> >     MM> problem for the http server on access). I can understand
> >     MM> keeping the attachments from each email in their own directory,
> >     MM> but this way the "files version control" :-) groups them
> >     MM> together for access (assuming least recency theory) and makes
> >     MM> cleaning out for space/inodes simple. It was just strftime
> >     MM> welded on.
> >
> > I'm not sure I followed all that, but the current Scrubber.py does add
> > the date directory to the path, so I think we're good here.
> >
> > -Barry
> >
>
> _______________________________________________
> Mailman-Developers mailing list
> [EMAIL PROTECTED]
> http://mail.python.org/mailman-21/listinfo/mailman-developers
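
The date-bucketed directory layout discussed in this thread can be sketched as follows. This is only an illustration in modern Python 3: attachment_dir is a hypothetical helper, not the path code in Scrubber.py, and the YYYYMM/DD split simply mirrors the strftime pattern used in the patch.

```python
import posixpath
import time

def attachment_dir(base, when=None):
    """Return a per-day directory such as base/200208/12.

    Splitting month and day keeps each directory small: a month
    directory holds at most 31 day directories, instead of a single
    directory gaining one entry per day indefinitely.
    """
    if when is None:
        when = time.gmtime()
    return posixpath.join(base,
                          time.strftime('%Y%m', when),
                          time.strftime('%d', when))

# Example: a message archived on 12 August 2002.
stamp = time.strptime('2002-08-12', '%Y-%m-%d')
example = attachment_dir('attachments', stamp)
```

With this layout, cleaning out old attachments for space or inodes amounts to removing whole month directories.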