Hi,

I have one main goal in my improvements to the PostgreSQL DSPAM driver: bring its performance up to the level of the MySQL driver. In my comparison of the two drivers, the MySQL driver uses features such as multi-row inserts and updates and "ON DUPLICATE KEY UPDATE" to greatly reduce the number of network round trips between the DSPAM application and the database backend. In my simple test case, MySQL needed about 20 INSERTs to update the tokens, while the PostgreSQL driver needed about 20 times as many statements. This means that, everything else being equal and even assuming a very fast database, the PostgreSQL backend will take many times longer to perform the same updates simply because of the inherent network latency.

While addressing that situation, I am also adding the ability to pass the query parameters out-of-band as binary data. The DSPAM driver then does not have to convert the token data into ASCII, and the PostgreSQL database does not have to convert the ASCII token data back into binary format for inserts or updates. In addition to reducing the CPU overhead, this also reduces the amount of data actually sent over the network by a factor of 2.
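As a rough illustration of the two effects described above (fewer statements and smaller parameters), here is a minimal Python sketch. It is not DSPAM or libpq code: the function names, the sample token value, and the workload of 400 rows with 20 rows per statement are hypothetical and chosen only to mirror the 20-fold ratio from my test. It also assumes a token fits in a signed 64-bit integer, which PostgreSQL sends in binary format as 8 bytes in network byte order.

```python
import struct


def text_param(token: int) -> bytes:
    """Encode a token as decimal ASCII, as a text-format parameter would carry it."""
    return str(token).encode("ascii")


def binary_param(token: int) -> bytes:
    """Encode a token as a signed 64-bit integer in network byte order (8 bytes)."""
    return struct.pack("!q", token)


def statements_needed(n_rows: int, rows_per_statement: int) -> int:
    """Count the statements (round trips) needed to send n_rows rows."""
    if rows_per_statement < 1:
        raise ValueError("rows_per_statement must be at least 1")
    return -(-n_rows // rows_per_statement)  # ceiling division


if __name__ == "__main__":
    token = 1234567890123456789  # hypothetical token value
    print(len(text_param(token)), "bytes as text")
    print(len(binary_param(token)), "bytes as binary")
    # Hypothetical workload: 400 rows, one per statement vs. 20 per statement.
    print(statements_needed(400, 1), "statements, one row each")
    print(statements_needed(400, 20), "statements, 20 rows each")
```

The sketch only counts bytes and statements; the real saving also depends on protocol overhead and on how many digits the tokens actually have.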
I would like to go ahead and remove the very old support for the legacy pre-3.4 non-BIGINT dspam_token_data schema. dspam-3.4.0 was released on 11 March 2005. I think it is time, and it will make the code easier to manage going forward.

Regards,
Ken

On Fri, May 21, 2010 at 06:24:47PM +0200, imposit.com - Webmaster wrote:
> Hello,
>
> Just my personal opinion about
> "if you think adding a minimum version requirement for the database backend
> is acceptable...."
>
> I think it is the wrong end to start from. I would start from what the
> needed features require.
> I mean, it makes no real sense to compromise on things only to lower the
> requirements and save someone 2 GB of RAM or so.
>
> Of course there is nothing wrong with saving resources and optimizing things,
> no doubt, but I think the features should come first and the resources
> second.
> Otherwise we will be stuck forever with a half-baked service which needs no
> resources but is also only half useful.
>
> I mean, even if you use a rented virtual host you get 8 GB of RAM and one or
> two cores of a 64-bit Xeon/Core i7 CPU for almost nothing.
> Sure, there are older systems in production ... time to upgrade :-)
>
> br
> rm
>
> Imposit.com
> Phone: +43 (1) 9971636-30
> Fax: +43 (1) 9971636-90
>
> E-mail: mailto:webmas...@imposit.com
> Web: www.imposit.com
>
> Registered Office: Wienerstrasse 130,
> 8680 Mürzzuschlag, Austria, Company number FN310087K
>
> -----Original Message-----
> From: Kenneth Marshall [mailto:k...@rice.edu]
> Sent: Friday, 21 May 2010 16:16
> To: dspam-user@lists.sourceforge.net
> Subject: [Dspam-user] DSPAM driver for PostgreSQL
>
> > Hi DSPAM community,
> >
> > For myself, I really appreciate having the conversations about problems
> > and solutions with DSPAM being carried out on the mailing list. It allows
> > me to save a personal copy of key messages and provides a historical
> > archive to search for answers.
> > IRC conversations are only known to those
> > who happened to be on-line at the time and then there is no record.
> >
> > Steve and Ion, I would like to thank you for all the hard work you have put
> > into the DSPAM project. We are going to be upgrading our DSPAM infrastructure
> > over the summer to version 3.9.0+ and hopefully switch from a MySQL backend
> > to a PostgreSQL backend. As a start I have built 3.9.0 on a Solaris 8 box
> > against PostgreSQL 8.4.2. My current project is to allow for an option to
> > have the PostgreSQL driver use out-of-band binary transmission of query
> > parameters. This will have two benefits. First, less CPU will be used since
> > the atoi() calls will not be needed to send the data. Second, the size of
> > the transmitted data will be reduced by 70% or more in the worst case of
> > looking up the tokens. This is more important when the DSPAM DB is on a
> > separate machine from the actual engine.
> >
> > The first cut will be using the libpqtypes.h library to do the work.
> > Then I will do some comparisons between the binary transmission and
> > the current non-binary transmission of the query parameters. Is this
> > something that you would be interested in having folded back in to
> > the DSPAM codebase?
> >
> A big YES from me.
>
> > Regards,
> > Ken
> >
>
> Hi DSPAM community/developers,
>
> I have been working on the update to the DSPAM PostgreSQL driver to
> support binary out-of-band parameter transmission as well as other
> performance optimizations. My goal is to make it a viable backend alternative
> for a high performance DSPAM installation. For code clarity reasons, I
> would like to suggest the following changes to the non-binary parameter
> version as well:
>
> 1. Remove the libdspam support for NUMERIC(20) and only include the
>    BIGINT support.
>
>    I sent in the original patch when PostgreSQL 7.x was the current
>    release.
>    In our testing, using the NUMERIC() option made the
>    performance so far below that of the MySQL driver that it would
>    not be considered, even for very small DSPAM installations.
>
> 2. Announce a minimum PostgreSQL version requirement of 8.1 for
>    non-Windows and 8.2 for Windows. Or, if there is consensus, require
>    version 8.2 or higher for all.
>
>    Version 8.1 was the first release of PostgreSQL to support the
>    GREATEST() SQL function. Currently the code is using a more
>    confusing alternative CASE... statement to do the same thing.
>    This would allow all the drivers to use much clearer and more
>    similar code in the dspam_token_data UPDATE paths. Another
>    plug for making 8.2 the minimum version, other than the lack
>    of Windows support for 8.2, is that they added support for
>    multiple-row VALUES clauses, like MySQL and SQLite. This
>    would allow for some more reconvergence of the drivers.
>
>    The release date for PostgreSQL 8.1 is 8 November 2005 and
>    for PostgreSQL 8.2 is 5 December 2006, which is still almost
>    4 years ago.
>
> I think that these changes would improve the performance of the
> PostgreSQL driver to an enterprise level while bringing the codebase
> closer to the other drivers, which will improve manageability and
> make consistent and correct changes to all drivers easier.
>
> Would anyone using DSPAM with the PostgreSQL driver/backend send
> me an E-mail with your version of DSPAM and PostgreSQL, the
> size of your DSPAM installation in users, and whether you think adding
> a minimum version requirement for the database backend is acceptable.
> I will tally the responses and send an update next Friday.
>
> Cheers,
> Ken
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Dspam-user mailing list
> Dspam-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dspam-user
>
> !DSPAM:1005,4bf694d458271081412768!