That's not even mentioning the metaprogramming and higher-order
programming techniques that we use extensively in SpamAssassin -- those
are basically *just not possible* in C/C++. ;)

--j.

Matt Kettler writes:
> Giampaolo Tomassoni wrote:
> > From: Matt Kettler [mailto:[EMAIL PROTECTED]
> >   
> >
> >> That said, I agree, trying to implement SA in C++ would be a NIGHTMARE.
> >>
> >> C++ is NOT an optimal language for apps that are string-parsing intensive.
> >>     
> >
> > I don't agree with this: I think there are ways to handle strings in C++ 
> > that are good enough for the purposes of SA and for the security 
> > constraints that would need to be enforced.
> >   
> I did not say there were no secure string handling methods. I said C++
> was not an optimal language for string parsing. Sure you can use STL's
> string library and gain some security.  However writing string parsing
> in C++ is a pain in the tail and results in a lot of very long and
> hard-to-maintain code. Writing string parsing in perl is easy and
> results in very compact easy-to-maintain code.
> 
> I know. I write C/C++ for a living. String parsing in C++ sucks. Period.
> 
> Let's see here.. let's find the last , in a string and extract all the
> characters after it as a new string..
> 
> c++: Urgh.. Make a loop, compare each character, storing the most recent
> match, then do an ugly substring call using that index and length-index.
> perl: an easy-to-write regex will do this. There are probably better
> ways I don't know of.
> 
> The perl code is slower, but the C++ code is hard to write and hard to
> maintain. I'm sure there's another way to do the perl code that's faster
> and comparable to C++ here. However, I've yet to see anyone do this
> operation repeatedly in C++ without ever making an off-by-one error
> somewhere.
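
The "last comma" task described above can be sketched in C++ without a hand-written loop, since std::string provides a reverse search (rfind). This is only a minimal sketch: the function name after_last_comma and the choice to return the whole string when no comma is present are assumptions for illustration, not anything from the SA code base.

```cpp
#include <string>

// Return everything after the last comma in `s`.
// Assumed convention: if `s` contains no comma, return `s` unchanged.
std::string after_last_comma(const std::string& s) {
    // rfind scans from the end and yields the index of the last ','.
    const std::string::size_type pos = s.rfind(',');
    if (pos == std::string::npos) {
        return s;  // no comma found
    }
    // substr(pos + 1) runs to the end of the string, so no length
    // arithmetic is needed; pos + 1 == s.size() yields "".
    return s.substr(pos + 1);
}
```

Calling after_last_comma("a,b,c") yields "c". Letting substr default its length argument avoids the length-minus-index arithmetic where off-by-one errors usually creep in.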
> 
> >
> >   
> >> Drawbacks to C/C++:
> >>     - regex is not language native, added by PCRE library.
> >>     
> >
> > Which is open source as well, so it may be used. A lot of things are not 
> > language-native in C/C++. That's by design in C/C++. It can't be 
> > regarded as a limitation of the language, however: you can easily use 
> > external libraries for all the "natively unsupported" features.
> >   
> True, but regexes in perl are NATIVE. You can use them ANYWHERE. Even as
> a parameter to a function call. To do regexes in C++ you have to make an
> external call to a library. Have you ever used PCRE? It's a pain. You
> have to call multiple functions, one to set up the regex, and another to
> do the match. That's not so bad for the rules, but do you know how many
> little regexes are scattered around the SA code that would have to be
> broken out? Urgh.
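
The two-step shape described above (one call to set up the regex, another to do the match) can be illustrated as follows. Note the substitution: this sketch uses std::regex from the C++11 standard library rather than PCRE, so it shows the general compile-then-match pattern and not PCRE's actual API. The pattern and the names kAfterLastComma and match_after_last_comma are hypothetical.

```cpp
#include <regex>
#include <string>

// Step 1: compile the pattern once, up front.
// Hypothetical pattern: a comma, then a captured run of non-comma
// characters extending to the end of the input.
static const std::regex kAfterLastComma(",([^,]*)$");

// Step 2: apply the compiled pattern to a subject string.
// Returns the captured tail, or an empty string when nothing matches.
std::string match_after_last_comma(const std::string& subject) {
    std::smatch m;
    if (std::regex_search(subject, m, kAfterLastComma)) {
        return m[1].str();  // first capture group
    }
    return std::string();
}
```

Even in this small case the pattern object, the match object, and the capture extraction are three separate pieces, which is the overhead the paragraph above complains about when many small regexes are scattered through a code base.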
> 
> >
> >   
> >>     - Too many folks write C/C++ badly, failing to watch their memory.
> >>     
> >
> > That's a problem which may afflict even perl or python programs and 
> > programmers. You're right: under C++, writing bad code often has 
> > sharper effects. But of course if you want to squeeze out more performance 
> > you need to trade off something. In the C/C++ case, some ease of coding 
> > would be traded off for higher performance.
> >
> >
> >   
> >> This is substantially more likely in anything involving string handling,
> >> which is everything SA does.
> >>     
> >
> >
> >
> >   
> >>     - C/C++ does not have many of the very nice libraries that perl has
> >> for DNS, SPF, IP:Country, Base64, etc, etc.
> >>     
> >
> > Well, DNS and Base64 are base services which are provided anyway. They come 
> > in a different "shape", but are still present.
> >   
> As is SPF. But I would not call any of these libraries "nice".
> > SPF and IP::Country would need to be somehow rewritten, of course. These 
> > fall under the "plugin problem". It probably wouldn't be easy to replicate 
> > the (good) behaviour of these perl modules, but I don't think it would be 
> > impossible, or not worth trying.
> >
> > Worse, most of the Mail:: modules would need to be somehow rewritten or 
> > otherwise implemented.
> >
> > Of course, an SA "recode" in C/C++ wouldn't come gratis.
> >
> >
> >   
> >>     -Again, the development team is perl programmers, unless you've got
> >> a set of equivalent spam experts, or can prove the existing devs all
> >> know your proposed language, even suggesting ANY port to ANY other
> >> language is inane. You may as well suggest changing the spoken language
> >> of the documentation to something other than English. Thus far, all the
> >> writers speak English. Many know other spoken languages besides
> >> English,  but I doubt you'd find another one that they ALL speak.
> >>     
> >
> > I agree with you that this would be a great problem, but it is not going to 
> > be the main problem, is it?
> >   
> I would suggest it would be.
> > Most programmers on this list seem to be very versatile about programming 
> > languages. Also, if you know perl, the next language you know is often 
> > C/C++. That's just because C/C++ is often the first serious language you 
> > learn.
> >   
> Yes, but many of the SA team do not have a programming background. They
> have a sysadmin background and learned perl to support CGIs and
> maintenance scripting. From there, they expanded.
> 
> And what motivation do they have to move to a language that would result
> in substantially more work in order to maintain the code and a MASSIVE
> effort to rewrite? (I'm thinking about 10 man-years for the rewrite.
> That's 10 man-years in which SA falls behind because there's no new
> feature development while Theo, Justin, etc rewrite all the simple stuff
> like MIME parsers.)
> 
> > In summary: it would be complex and difficult. It would probably even be 
> > painful, but I think there would be performance improvements in porting SA 
> > to C/C++. The real question is: would it be worth doing?
> >
> > If a C/C++ SA achieves a /2 mem*cpu factor (ie: it takes half the CPU or 
> > half the memory it took in perl) I don't think it is worth the effort to 
> > recode it.
> >   
> I'd say it would make a /4 memory factor and a /1.5 cpu factor, if your
> rewrites are good. If not well tuned, could be a *2 cpu factor. Coupled
> with a *20 effort factor.
> > But if we are speaking of a /10 mem*cpu factor, well, it could easily be 
> > interesting, couldn't it?
> >   
> No. I think it would be patently stupid because of the massive effort
> involved and loss of mind-power. But if you like, by all means, go for
> it, prove us all wrong..
