Re: Another non-standard search/sort field :)

ellie timoney Mon, 14 Sep 2015 20:12:09 -0700

Apparently atof and strtod will respect the current locale, and expect
the decimal separator to be a ',' in places where that's a thing --
which is bad if the string you're parsing is always dot-separated,
regardless of locale, because you'll end up parsing it wrong depending
on where you run the software.


This function avoids the locale problem (i.e is "locale-safe") by
explicitly looking for a '.'

Seems to be a thing that everyone and their dog has implemented at some
point or another, lots of google results of {atof locale} and {strtod
locale}.  The proposed solutions generally come down to: write your own
parser, borrow a parser from somewhere else, or explicitly set the
locale before parsing (and maybe change it back after)

On Tue, Sep 15, 2015, at 12:40 PM, Bron Gondwana wrote:
> What's the problem with atof?
> 
> (or just skipping the dot and making it an integer?)
> 
> Bron.
> 
> On Tue, Sep 15, 2015, at 11:32, ellie timoney wrote:
> > This patch contains code from libspamc.c (SpamAssassin?)
> > 
> > > +/* Stolen from libspamc.c, and I'm not sure about license compatibility 
> > > */
> > > +static float _locale_safe_string_to_float(char *buf, int siz)
> > > +{
> > 
> > Here it is in their repo: 
> > http://svn.apache.org/viewvc/spamassassin/trunk/spamc/libspamc.c?view=markup#l929
> > 
> > SpamAssassin uses the Apache license: 
> > http://svn.apache.org/viewvc/spamassassin/trunk/LICENSE?view=markup
> > 
> > I'm not sure about license compatibility here either.  Thoughts, anyone?
> > 
> > On Mon, Sep 14, 2015, at 10:49 AM, ellie timoney wrote:
> > > Hi Vladislav,
> > > 
> > > Thanks for the patch.  I'll try to get it merged this week.
> > > 
> > > ellie
> > > 
> > > On Sat, Sep 12, 2015, at 10:06 AM, Bron Gondwana wrote:
> > > > On Fri, Sep 11, 2015, at 23:19, Vladislav Bogdanov wrote:
> > > > > 11.09.2015 14:49, Bron Gondwana wrote:
> > > > > > On Fri, Sep 11, 2015, at 20:00, Vladislav Bogdanov wrote:
> > > > > >> 11.09.2015 12:01, Bron Gondwana wrote:
> > > > > >>> On Fri, Sep 11, 2015, at 16:48, Vladislav Bogdanov wrote:
> > > > > >>>> 11.09.2015 03:52, Bron Gondwana wrote:
> > > > > >>>>> sort: spamscore search: spamabove / spambelow
> > > > > >>>>>
> > > > > >>>>> These use the X-Spam-score header which is a floating point 
> > > > > >>>>> number
> > > > > >>>>> with a single decimal place usually, i.e. 5.0, 17.3.  spamabove 
> > > > > >>>>> is GE
> > > > > >>>>> and spambelow is LT.
> > > > > >>>>>
> > > > > >>>>> I'm going to push this back, because it doesn't clash with 
> > > > > >>>>> anything.
> > > > > >>>>> It's kinda nice to be able to sort by spamscore to quickly put 
> > > > > >>>>> the
> > > > > >>>>> focus on the most likely to be be wrongly classified messages, 
> > > > > >>>>> and
> > > > > >>>>> we're going to support that in our interface at some stage.
> > > > > >>>>>
> > > > > >>>>> Bron.
> > > > > >>>>>
> > > > > >>>>
> > > > > >>>> Ah, I have a nice patch for spamtest extension against 2.4.17.
> > > > > >>>>
> > > > > >>>> It connects to spamd itself from lmtpd, checks the message and 
> > > > > >>>> sets
> > > > > >>>> additional headers. Sieve integration is done too.
> > > > > >>>>
> > > > > >>>> Need to send it here.
> > > > > >>>
> > > > > >>> That would be great.
> > > > > >>
> > > > > >> Attached. Just found that it inconsistently uses tabs/spaces. I 
> > > > > >> hope
> > > > > >> that is not an issue at least for initial review.
> > > > > >
> > > > > > Initial impressions:
> > > > > >
> > > > > > I would rename 's' to 'fd'.  Everyone knows what 'fd' does, s could 
> > > > > > be anything,
> > > > > > and being an int, it's totally un-typesafe.
> > > > > 
> > > > > It was initially written against newly-released 2.3.8. That time 
> > > > > socket 
> > > > > fd was usually named 's' or 'sock'. 's' is shorter ;) Did that change?
> > > > 
> > > > Fair enough :)  I haven't dealt with that area of the code quite so 
> > > > much,
> > > > more the index and database filehandles.
> > > > 
> > > > > Yep, that never resulted in lost messages for last N+1 years in quite 
> > > > > busy setups (not as fastmail, but anyways...).
> > > > > 
> > > > > So, yes, that "works for me" and such change would be really cosmetic.
> > > > > 
> > > > > I just ported it to 2.4.17 recently without even looking much at the 
> > > > > code (but yes, that is tested on newer setups right from the patch 
> > > > > date).
> > > > > 
> > > > > I can send previous revision (against 2.3.13) as well.
> > > > 
> > > > No, that's fine.  We can work with this.
> > > > 
> > > > > > I've got a sneaking feeling that your entire spamtest_parse_hosts 
> > > > > > could be
> > > > > > turned into a tight little piece of code based on strarray_split() 
> > > > > > - but it looks fine.
> > > > > 
> > > > > No opinion. It was not available in 2.3.x. IMHO spamtest_parse_hosts 
> > > > > is 
> > > > > very straight-forward.
> > > > > 
> > > > > And, that _may_ conflict with the line in TODO list (which I probably 
> > > > > will never do anyways because I have no idea how to make that fair 
> > > > > enough without initial lookups):
> > > > > * Make load-balancing work if hostname that resolve to multiple A 
> > > > > records is used in "spamtest_spamd_hosts".
> > > > 
> > > > One interesting possibility (more for spam than virus checking) is to
> > > > send requests
> > > > for the same user to the same host always, which would require some form
> > > > of
> > > > consistent hashing.  We do that with nginx at FastMail for web requests:
> > > > 
> > > > upstream internalbackend {
> > > >   server web1.nyi.internal:8080;
> > > >   server web2.nyi.internal:8080;
> > > >   server web3.nyi.internal:8080 down;
> > > >   server web4.nyi.internal:8080;
> > > >   server web5.nyi.internal:8080;
> > > >   server web6.nyi.internal:8080;
> > > > }
> > > > 
> > > > So it knows web3 is down and hashes elsewhere, but when it comes back 
> > > > up,
> > > > the
> > > > same users will move back.
> > > > 
> > > > We do the same for postfix lmtp delivery with some magic code in a thing
> > > > called lmtpforwardd,
> > > > which just listens on localhost and forwards to the correct server based
> > > > on the list of up spam
> > > > scan machines.
> > > > 
> > > > > > All the code looks like it works (which is not a surprise, because 
> > > > > > it's been used).
> > > > > > My main concerns would be around signal safety in the file IO 
> > > > > > syscalls.
> > > > > 
> > > > > Feel free to convert them to prot ones, but I do not feel it is 
> > > > > strictly 
> > > > > required.
> > > > 
> > > > Sure thing :)
> > > > 
> > > > Bron.
> > > > 
> > > > -- 
> > > >   Bron Gondwana
> > > >   br...@fastmail.fm
> 
> 
> -- 
>   Bron Gondwana
>   br...@fastmail.fm

Re: Another non-standard search/sort field :)

Reply via email to