That's not even mentioning the metaprogramming and higher-order programming techniques that we use extensively in SpamAssassin -- those are basically *just not possible* in C/C++. ;)
--j.

Matt Kettler writes:
> Giampaolo Tomassoni wrote:
> > From: Matt Kettler [mailto:[EMAIL PROTECTED]
> >
> >> That said, I agree, trying to implement SA in C++ would be a NIGHTMARE.
> >>
> >> C++ is NOT an optimal language for apps that are string-parsing intensive.
> >>
> >
> > I don't agree in this: I think there are good ways to handle strings in C++
> > which are good enough for the purposes of SA and the security constraints
> > which would need to be enforced.
> >
> I did not say there were no secure string handling methods. I said C++
> was not an optimal language for string parsing. Sure you can use STL's
> string library and gain some security. However writing string parsing
> in C++ is a pain in the tail and results in a lot of very long and
> hard-to-maintain code. Writing string parsing in perl is easy and
> results in very compact easy-to-maintain code.
>
> I know. I write C/C++ for a living. String parsing in C++ sucks. Period.
>
> Let's see here.. let's find the last , in a string and extract all the
> characters after it as a new string..
>
> c++: Urgh.. Make a loop, compare each character, storing the most recent
> match, then do an ugly substring call using that index and length-index.
> perl: an easy-to-write regex will do this. There are probably better
> ways I don't know of.
>
> The perl code is slower, but the C++ code is hard to write and hard to
> maintain. I'm sure there's another way to do the perl code that's faster
> and comparable to C++ here. However, I've yet to see anyone do this
> operation repeatedly in C++ without ever making an off-by-one error
> somewhere.
>
> >
> >> Drawbacks to C/C++:
> >> - regex is not language native, added by PCRE library.
> >>
> >
> > Which is opensource as well, so it may be used. A lot of things are not
> > language-native in C/C++. That's because C/C++ is designed. It can't be
> > regarded as a language limit, however: you can easily use external
> > libraries for all the "natively unsupported" features.
> >
> True, but regexes in perl are NATIVE. You can use them ANYWHERE. Even as
> a parameter to a function call. To do regexes in C++ you have to make an
> external call to a library. Have you ever used PCRE? It's a pain. You
> have to call multiple functions, one to set up the regex, and another to
> do the match. That's not so bad for the rules, but do you know how many
> little regexes are scattered around the SA code that would have to be
> broken out? Urgh.
>
> >
> >> - Too many folks write C/C++ badly, failing to watch their memory.
> >>
> >
> > That's a problem which may afflict even perl or python programs and
> > programmers. You're right: under C++ writing bad code often results in
> > sharper effects. But of course if you want to squeeze more performances you
> > need to trade off something. In the C/C++ case, ease of coding would be
> > traded a bit off in spite of higher performances.
> >
> >> This is substantially more likely in anything involving string handling,
> >> which is everything SA does.
> >>
> >
> >> - C/C++ does not have many of the very nice libraries that perl has
> >> for DNS, SPF, IP::Country, Base64, etc, etc.
> >>
> >
> > Well, DNS and Base64 are base services which are provided anyway. They came
> > in a different "shape", but still present.
> >
> As is SPF. But I would not call any of these libraries "nice".
> > SPF and IP::Country would need to be somehow rewritten, of course. These
> > fall under the "plugin problem". It wouldn't be probably easy to replicate
> > the (good) behaviour of these perl modules, but I don't even think it
> > wouldn't be possible or even not worth to try it.
> >
> > Worse, most of the Mail:: modules would need to be somehow rewritten or
> > otherwise implemented.
> >
> > Of course, a SA "recode" in C/C++ wouldn't come gratis.
> >
> >> -Again, the development team is perl programmers, unless you've got
> >> a set of equivalent spam experts, or can prove the existing devs all
> >> know your proposed language, even suggesting ANY port to ANY other
> >> language is inane. You may as well suggest changing the spoken language
> >> of the documentation to something other than English. Thus far, all the
> >> writers speak English. Many know other spoken languages besides
> >> English, but I doubt you'd find another one that they ALL speak.
> >>
> >
> > I agree with you that this would be a great problem, but it is not going to
> > be the main problem, isn't it?
> >
> I would suggest it would be.
> > Most programmers in this list seem to be very versatile about programming
> > languages. Also, if you know perl, the next language you know is often
> > C/C++. That's just because C/C++ is often the first serious language you
> > learn.
> >
> Yes, but many of the SA team do not have a programming background. They
> have a sysadmin background and learned perl to support CGI's and
> maintenance scripting. From there, they expanded.
>
> And what motivation do they have to move to a language that would result
> in substantially more work in order to maintain the code and a MASSIVE
> effort to rewrite? (I'm thinking about 10 man-years for the rewrite.
> That's 10 man-years in which SA falls behind because there's no new
> feature development while theo, justin, etc rewrite all the simple stuff
> like mime parsers.)
>
> > In summary. It would be complex and difficult. It would probably even be
> > painful, but I think there would be performance improvements in porting SA
> > in C/C++. The question is another: would it be something worth to do?
> >
> > If a C/C++ SA achieves a /2 mem*cpu factor (ie: it takes half the CPU or
> > half the memory it took in perl) I don't think it is worth the effort to
> > recode it.
> >
> I'd say it would make a /4 memory factor and a /1.5 cpu factor, if your
> rewrites are good. If not well tuned, could be a *2 cpu factor. Coupled
> with a *20 effort factor.
> > But if we are speaking of a /10 mem*cpu factor, well, it could easily be
> > interesting, isn't it?
> >
> No. I think it would be patently stupid because of the massive effort
> involved and loss of mind-power. But if you like, by all means, go for
> it, prove us all wrong..
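
For reference, the "everything after the last comma" operation used as an example in this thread can be sketched in C++ in two ways. None of this code is from the thread. The function names `after_last_comma` and `after_last_comma_regex` are made up for illustration, and returning the whole string when no comma is present is an assumption, since the thread does not say what should happen in that case. The first version uses `std::string::rfind` with `substr` instead of a hand-written loop. The second uses `std::regex` (standardized in C++11) in place of PCRE, and it shows the same two steps described above: build the regex object, then run the match.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the thread): return everything after the
// last ',' in s. Assumption: if s contains no comma, return s unchanged.
std::string after_last_comma(const std::string& s) {
    const std::string::size_type pos = s.rfind(',');
    if (pos == std::string::npos) {
        return s;  // no comma found
    }
    // pos + 1 never exceeds s.size(), so substr cannot throw here; when the
    // comma is the final character the result is the empty string.
    return s.substr(pos + 1);
}

// Same operation via std::regex, as a stand-in for a library such as PCRE.
// Assumes single-line input: '.' does not match line terminators, so a
// string with a newline before the last comma falls through to the
// "return s" branch.
std::string after_last_comma_regex(const std::string& s) {
    // Step 1: set up (compile) the regex. Done once, on first call.
    static const std::regex tail_re("(?:.*,)?([^,]*)");
    // Step 2: run the match and pull out the captured tail.
    std::smatch m;
    if (std::regex_match(s, m, tail_re)) {
        return m[1].str();
    }
    return s;
}
```

With these definitions, `after_last_comma("a,b,c")` and `after_last_comma_regex("a,b,c")` both give `"c"`, and a string with a trailing comma gives an empty result. The `rfind` version keeps the index arithmetic down to a single `pos + 1`, which is where the off-by-one risk mentioned in the thread would otherwise sit.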