OK, I've just tested the patch and the new query. Works great, query takes 0.0027 seconds, now (instead of ~8 seconds). Thank you Mark for the quick response and the patch.
I guess these are useful changes for everyone, so good idea to include it in next releases. Marco On Fri, Sep 24, 2010 at 10:39 AM, Marco Fretz <[email protected]> wrote: > Hi Mark, > > Wow, that's exactly what I need, Perfect! Thank you very much. I'll > try the patch and change my query. Feedback follows. > > > Marco > > On Fri, Sep 17, 2010 at 5:29 PM, Mark Martinec > <[email protected]> wrote: >> Marco, >> >>> We build a clustered mail system with postfix, amavisd, clamd, SA, >>> dovecot, mysql, etc. 2 Servers are handling incoming mail (MX records >>> pointing to us) and 2 are handling outgoing mail (relaying). mysql >>> database is stored on 2 master-master replicated servers. >>> >>> so far, everything is working well. but we have problems with the >>> sql_lookup from amavisd-new that we need to lookup per recipient >>> whitelisting. >>> >>> $sql_select_policy = 'select amavis_policy.*,amavis_recipients.id as id '. >>> 'FROM (hostings,domains,amavis_recipients,products,amavis_policy) LEFT >>> JOIN (users,accounts) '. >>> 'ON accounts.hosting_id=hostings.id AND users.account_id=accounts.id '. >>> 'WHERE (amavis_recipients.hosting_id = hostings.id OR >>> amavis_recipients.account_id = accounts.id) '. >>> 'AND domains.hosting_id=hostings.id '. >>> 'AND hostings.product_id = products.id '. >>> 'AND hostings.amavis_policy_id = amavis_policy.id '. >>> 'AND ( (is_mailbox=1 AND concat(username,\'@\',domainname) IN (%k)) '. >>> ' OR ((is_transport=1 OR is_mxbackup=1) AND concat(\'@\',domainname) IN >>> (%k))'. >>> ')'; >>> [...] >>> The DB layout is like this: one account can have multiple usernames >>> (aliases), one hosting can have multiple domains, one hosting can have >>> multiple accounts. >>> that means, that we don't have a email address field. so we have to >>> merge (concat) domainname and username field to check the %k strings. >>> >>> we also need per recipient email address and domain lookup, not just >>> email address. because we also do something like email proxy, where we >>> don't know the usernames, just the domains for a customer. >>> >>> the query takes about 8 seconds with 3500 usernames and 400 domains. >>> >>> is there a way to change the query like this (separate variables in >>> the query for domain and username)? >>> >>> select amavis_policy.*,amavis_recipients.id as id FROM >>> (hostings,domains,amavis_recipients,products,amavis_policy) LEFT JOIN >>> (users,accounts) ON accounts.hosting_id=hostings.id AND >>> users.account_id=accounts.id WHERE (amavis_recipients.hosting_id = >>> hostings.id OR amavis_recipients.account_id = accounts.id) AND >>> domains.hosting_id=hostings.id AND hostings.product_id = products.id >>> AND hostings.amavis_policy_id = amavis_policy.id AND ( >>> (is_mailbox=1 AND username='<<<<username>>>>' and >>> domainname='<<<<domain>>>>') OR ((is_transport=1 OR is_mxbackup=1) AND >>> domainname='<<<<domain>>>>')); >>> >>> this query is executed in ~0.04 sec... what would be okay... >> >> Adding a few more placeholders to existing %k and %a would be >> a small change. I can add the following: >> >> %l -> full unmodified localpart, >> %u -> lowercased username (a localpart without extension) >> %e -> lowercased address extension (including a delimiter), >> %d -> lowercased domain, >> >> if that will help in your situation. >> >> For example, given the: [email protected] >> >> %a [email protected] >> %l User+Foo >> %u user >> %e +foo >> %d sub.example.com >> >> %k [email protected] >> [email protected] >> [email protected] >> user+foo >> user >> @sub.example.com >> @.sub.example.com >> @.example.com >> @.com >> @. >> >> >> >> Below is a patch for 2.6.4. It will go into in 2.7.0-pre8. >> >> --- amavisd.orig 2009-06-25 14:39:01.000000000 +0200 >> +++ amavisd 2010-09-17 17:13:34.139201079 +0200 >> @@ -1138,9 +1138,12 @@ >> # for a recipient (e.g. a full address, domain only, catchall), %a will be >> # replaced by an exact recipient address (same as the first entry in %k, >> - # suitable for pattern matching). Use ORDER, if there is a chance that >> - # multiple records will match - the first match wins (i.e. the first >> - # returned record). If field names are not unique (e.g. 'id'), the later >> - # field overwrites the earlier in a hash returned by lookup, which is why >> - # we use 'users.*, policy.*, users.id', i.e. the id is repeated at the >> end. >> + # suitable for pattern matching), %l by a full unmodified localpart, %u by >> + # a lowercased username (a localpart without extension), %e by lowercased >> + # addr extension (which includes a delimiter), and %d for lowercased >> domain. >> + # Use ORDER if there is a chance that multiple records will match - the >> + # first match wins (i.e. the first returned record). If field names are >> + # not unique (e.g. 'id'), the later field overwrites the earlier in a hash >> + # returned by lookup, which is why we use 'users.*, policy.*, users.id', >> + # i.e. the id is repeated at the end. >> # This is a legacy variable for upwards compatibility, now only referenced >> # by the program through a %sql_clause entry 'sel_policy' - newer config >> @@ -14730,5 +14733,6 @@ >> # $clause_name is a key into %sql_clause of the currently selected >> # policy bank; one level of indirection is allowed in %sql_clause result, >> - # the resulting SQL clause may include %k or %a, to be expanded >> + # the resulting SQL clause may include %k, %a, %l, %u, %e, %d >> placeholders, >> + # to be expanded >> bless { conn_h => $conn_h, incarnation => 0, clause_name => $clause_name >> }, >> $class; >> @@ -14831,14 +14835,22 @@ >> local($1); my(@pos_args); my($sel_taint) = substr($sel,0,0); # taintedness >> my($datatype) = $sql_allow_8bit_address ? SQL_VARBINARY : SQL_VARCHAR; >> - $sel =~ s{ ( %k | %a | \? ) } # substitute %k for keys, %a for exact mail >> - # address, and ? for each extra arg >> + >> + # substitute %k for a list of keys, %a for unmodified full mail address, >> + # %l for full unmodified localpart, %u for lowercased username (a >> localpart >> + # without extension), %e for lowercased extension, %d for lowercased >> domain, >> + # and ? for each extra argument >> + $sel =~ s{ ( %[kalued] | \? ) } >> { push(@pos_args, $1 eq '%k' ? map { [$_, $datatype] } @$keys_ref >> - : $1 eq '%a' ? [$keys_ref->[0], $datatype] >> - # same as first >> in %k >> + : $1 eq '%a' ? [$rhs_ref->[0], $datatype] #full >> addr >> + : $1 eq '%l' ? [$rhs_ref->[1], $datatype] >> #localpart >> + : $1 eq '%u' ? [$rhs_ref->[2], $datatype] >> #username >> + : $1 eq '%e' ? [$rhs_ref->[3], $datatype] >> #extension >> + : $1 eq '%d' ? [$rhs_ref->[4], $datatype] #domain >> : shift @extras_tmp), >> $1 eq '%k' ? join(',', ('?') x $n) : '?' }gxe; >> $sel = untaint($sel) . $sel_taint; # keep original clause taintedness >> - ll(4) && do_log(4,"lookup_sql \"%s\", query args: %s", >> - $addr, join(', ', map{"\"$_\""} @pos_args) ); >> + ll(4) && do_log(4,"lookup_sql \"%s\", query args: %s", $addr, >> + join(', ', map { !ref $_ ? '"'.$_.'"' : '['.join(',',@$_).']'} >> + �...@pos_args) ); >> ll(4) && do_log(4,"lookup_sql select: %s", $sel); >> my($a_ref,$found); my($match) = {}; my($conn_h) = $self->{conn_h}; >> >> >> >> Mark >> > ------------------------------------------------------------------------------ Nokia and AT&T present the 2010 Calling All Innovators-North America contest Create new apps & games for the Nokia N8 for consumers in U.S. and Canada $10 million total in prizes - $4M cash, 500 devices, nearly $6M in marketing Develop with Nokia Qt SDK, Web Runtime, or Java and Publish to Ovi Store http://p.sf.net/sfu/nokia-dev2dev _______________________________________________ AMaViS-user mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/amavis-user Please visit http://www.ijs.si/software/amavisd/ regularly For administrativa requests please send email to rainer at openantivirus dot org
