Re: [Dspam-devel] A memory leak and the reading of uninitialized bytes in 3.10.1... patch attached

Ladar Levison Thu, 01 Sep 2011 23:16:06 -0700

On 8/28/2011 6:12 AM, Stevan Bajic wrote:

Should be fixed in GIT

Further testing found another code path that leads to a memory leak. I didn't realize these lines were inside a for loop the first time around; and without the loop a free call wouldn't be needed.
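
For readers following along, here is a minimal sketch of the general pattern: a pointer that is assigned on each pass of a loop and then reset to NULL leaks one block per pass unless it is freed first. This is not the actual decode.c logic. The struct, the tracking table and every function name below are hypothetical stand-ins made up for the illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LIVE 16

/* Blocks handed out by tracked_strdup() and not yet freed. */
static char *live[MAX_LIVE];

/* strdup() replacement that records the block in the live table. */
static char *tracked_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    for (size_t i = 0; i < MAX_LIVE; i++) {
        if (live[i] == NULL) {
            live[i] = malloc(n);
            if (live[i] != NULL)
                memcpy(live[i], s, n);
            return live[i];
        }
    }
    return NULL; /* table full */
}

/* free() replacement that clears the table entry; NULL is a no-op. */
static void tracked_free(char *p)
{
    if (p == NULL)
        return;
    for (size_t i = 0; i < MAX_LIVE; i++) {
        if (live[i] == p) {
            live[i] = NULL;
            break;
        }
    }
    free(p);
}

/* Release whatever is still live and report how many blocks had leaked. */
static size_t reap_leaks(void)
{
    size_t leaked = 0;
    for (size_t i = 0; i < MAX_LIVE; i++) {
        if (live[i] != NULL) {
            free(live[i]);
            live[i] = NULL;
            leaked++;
        }
    }
    return leaked;
}

/* Hypothetical stand-in for a header record, not the libdspam struct. */
struct demo_header {
    char *original_data;
};

/* Assign and reset the field once per value; returns the leaked block count. */
static size_t reset_in_loop(const char **values, size_t count, int free_first)
{
    struct demo_header header = { NULL };
    for (size_t i = 0; i < count; i++) {
        header.original_data = tracked_strdup(values[i]);
        assert(header.original_data != NULL);
        if (free_first)
            tracked_free(header.original_data);
        header.original_data = NULL; /* without the free above, the block is lost */
    }
    return reap_leaks();
}
```

Calling reset_in_loop() with free_first set to 0 reports one leaked block per value; with free_first set to 1 it reports none.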

Most people use the Apache SpamAssassin corpora or the TREC corpora for testing.

I was wondering if there was an actively maintained corpus available. Something like compressionratings.com -- but for email classification. Something I could test my configuration/build against to see where it ranks.

You could use dspam_train and use dspam_stats to set/reset the snapshot.

Yeah! But I'll need a script or tool to automate the testing process. I use the library API so I'm not a good person to write such a test script.


I don't understand what you mean by this. Are you trying to get a
certain score/result that you can compare with the other DSPAM
users/developers?

Exactly! If I run my build against an identical corpus I should get identical results! If the results vary, I know it's time to look for bugs. The goal is to catch a string function or memory allocator that behaves differently. I can always decide the deviation is small enough to ignore... but only if I have results for comparison.
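
To make the comparison idea concrete, here is a rough sketch of how a run could be checked against stored baseline results. It assumes each run produces one verdict per corpus message, in the same corpus order. The struct, the function names and the label strings are all invented for the illustration and are not part of DSPAM.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result record: one verdict per corpus message. */
struct verdict {
    const char *message_id;
    const char *label; /* illustrative labels, e.g. "spam" or "innocent" */
};

/*
 * Count the messages whose label differs from the baseline run.
 * Both arrays must list the same messages in the same order;
 * returns -1 if the message ids do not line up.
 */
static long count_deviations(const struct verdict *baseline,
                             const struct verdict *current, size_t n)
{
    long differing = 0;

    assert(baseline != NULL && current != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(baseline[i].message_id, current[i].message_id) != 0)
            return -1;
        if (strcmp(baseline[i].label, current[i].label) != 0)
            differing++;
    }
    return differing;
}

/* Returns 1 if the fraction of differing verdicts is within max_fraction. */
static int run_is_acceptable(long differing, size_t n, double max_fraction)
{
    if (differing < 0 || n == 0)
        return 0;
    return (double)differing / (double)n <= max_fraction;
}
```

A max_fraction of 0.0 demands identical results; anything larger is the "small enough to ignore" threshold, which is a judgment call for whoever runs the test.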

I don't know how others benchmark their setup (and if they even do
benchmark their setup)? I myself have developed over the years my own
testing and training method. I don't use stock DSPAM methods at all. I
guess other DSPAM users/admins have established their own test and
training procedures as well.

I was hoping to find the tools/scripts/notes to test my build/implementation. It would be nice if those tools became part of 'make check'. If the testing bits take up too much space, just distribute them as a separate tarball. The libxml2 project uses that strategy. Each release tarball is paired with its own test tarball. Check out ftp://xmlsoft.org/libxml2/ for the files.

This is difficult since the backend is configurable with ./configure but
it is most likely not initialized and a 'make check' would require to
have a properly configured backend (with all the schema and access
already setup), which is not available on a fresh/new setup during
compile time.

A good start might be to compile the command line utilities and/or test programs using the file system storage driver. If the check variants are compiled in response to 'make check' and stored inside the test folder they shouldn't cause any problems. Then it's just a matter of automating the test process. And if the results are stored under the build tree they could be easily purged with 'make clean'. ClamAV ships with a test corpus and 'make check' will test the corpus against the command line tools. It checks whether a reasonable amount of memory was needed; that the program finished quickly and most importantly that it generates the expected classification.
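
As a rough illustration of that kind of check, the sketch below runs a classifier over a tiny corpus and fails if any label is wrong or if the run exceeds a CPU time budget. It does not measure memory use. Everything here is hypothetical: the struct, the callback type and the toy classifier exist only to exercise the harness, and a real check would wrap the command line tools or the library instead.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical corpus entry: message text plus the label the check expects. */
struct corpus_case {
    const char *message;
    const char *expected;
};

/* Classifier under test, supplied by the caller. */
typedef const char *(*classify_fn)(const char *message);

/*
 * Run every case through the classifier.
 * Returns 0 on success, 1 if a label is wrong or the CPU time
 * budget (in seconds) is exceeded.
 */
static int run_check(classify_fn classify, const struct corpus_case *cases,
                     size_t n, double max_cpu_seconds)
{
    clock_t start = clock();

    assert(classify != NULL);
    for (size_t i = 0; i < n; i++) {
        const char *got = classify(cases[i].message);
        if (got == NULL || strcmp(got, cases[i].expected) != 0)
            return 1;
    }

    double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
    return elapsed <= max_cpu_seconds ? 0 : 1;
}

/* Toy classifier used only to exercise the harness. */
static const char *toy_classify(const char *message)
{
    return strstr(message, "viagra") != NULL ? "spam" : "innocent";
}
```

A test program built around run_check() would return its result as the exit status, so a non-zero exit makes the check fail.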

This strategy could be used to test libdspam and could allow limited testing of the command line utilities. IMO that's the most important chunk of code.

When time allows, adding logic to test different storage configurations should be possible. Just write the check script with the assumption that a valid test database is available. If the dspam user can't connect to the localhost using the password 'bajic' then 'make check' simply fails.

If you wanted to get a little more complicated, try executing the RDBMS binary against a localized config file. Then initialize your blank database schema and listen for connections via a file socket or named pipe. Since the database files are stored inside the build tree, they can be purged and recreated each time 'make check' is called. Check out the MySQL tarball and run "./configure; make && make check" for the details.

P.S. If anyone else decides to test DSPAM using Valgrind, the current release (3.6.1) will complain about glibc str functions reading dirty memory via aligned reads. The issue is fixed in the valgrind code repository -- for those willing/able to compile a 3.7.0 snapshot.

--- decode.c
+++ decode.c 
@@ -491,9 +493,13 @@
           free(header->concatenated_data);
           header->concatenated_data = decoded;
         }
-        else if (was_null) {
-          header->original_data = NULL;
-        }
+        else if (was_null && header->original_data) {
+                                       free(header->original_data);
+                                       header->original_data = NULL;
+                               }
+                               else if (was_null) {
+                                       header->original_data = NULL;
+                               }
       }
     }


