Wouldn’t that be SQL “IN” instead of LIKE?

First off, see if you can do this in memory. If it’s just a “rate sheet” list 
of prefixes + billing, it’s probably not that much data. I’d guess maybe 64-128 
bytes per record (in Ruby it’s probably more, maybe 256 bytes?) inside a 
hashtable. Even so, we’re only talking about 1.5 to 3GB of RAM. Stick it in a 
nice performing dictionary of some sort and you’re set.

Just as a quick test, on one core of a Core2, it takes about 150ms to do 100K 
lookups against a 1M item hashtable (F#) – that’s just a 1 line loop I tried 
out with no optimization, doing about 650K/sec. For comparison on the same 
machine, looping on a simple “SELECT @@VERSION” command only achieved 5000 
executions/second on a single thread – that’s just the SQL overhead (to a local 
SQL instance!).

If the memory usage is too excessive, consider caching only subsets of the 
prefixes. Keep track of which leading prefix (1 or 2 digits) are the most used, 
and keep those in memory. Have a reasonably fast DB to fall back to.

If you’re sure the storage requirements or the change frequency truly 
eliminates keeping it in your own memory, here are some suggestions for working 
on DB performance:

A little perf gain is to pre-calculate possible lengths, so you’re performing 
less lookups. Just have a table based on the first 2 or 3 digits, giving you 
the valid prefix lengths. That can easily cut down the number of lookups from 
10-12 to, say, 3-4. Compare the perf of temp tables, CTEs, and dynamic SQL “IN”.

But I would be hesitant to do this from a higher level ORM or via dynamic SQL 
queries:

-          You’ll have a much harder time doing DB-specific optimizations/hints

-          Dynamic SQL needs to be parsed each time (possibly minor issue)

-          If the ad hoc queries are not properly parameterized, query plans 
might not get reused very well

-          You can easily send many times more data to the server (your entire 
SQL, versus just the sproc name + parameters)

Really look into doing programmability on the server. There’s all sorts of 
things you can do to minimize what you need to send to the database, and let 
the DB engine figure out the most efficient way. You might want to use a 
product that’s a bit more advanced than MySQL.

And finally, test, test, test. You should have a nice benchmark with several 
million rows and be able to execute, say, 10-50K lookups (on multiple threads) 
after each change. Modify your query, execute it, _inspect the execution plan_. 
Check your indexes – make sure they are covering the query in the right way. 
(For example, if the table is (ID, Prefix, Price), you want an index of 
(Prefix, Price).) Review possible hints (for example, SQL Server usually needs 
a hint to choose a indexed view). Make sure your transaction level is set 
appropriately (can you do the query without locking?). Can you batch up several 
numbers in one command, saving on the per-command overhead?

Also consider it in light of the rest of your application. If everything is in 
the same database, then you can probably do the entire rate call/save 
cdr/update balances in a single command and transaction.

For us, our initial draft of routing inside SQL Server performed well under 100 
calls/sec – not usable. After a few days of playing with things, it went over 
1000 calls/sec, which is comfortable enough for a $500 server. This is matching 
100 gateways and dialplans with several thousand entries per dialplan, across 4 
million+ routes, doing QBR/LCR in the process.

-Michael


From: [email protected] 
[mailto:[email protected]] On Behalf Of Muhammad 
Shahzad
Sent: Wednesday, October 14, 2009 1:55 AM
To: [email protected]
Subject: Re: [Freeswitch-users] Some help with my post-paid billing project

I fully agree that direct matching is much faster then pattern matching in SQL.

One of my clients had same problem, he had around 12 million number prefixes in 
a table and during each call an AGI script use to query that table to find 
longest prefix match, but this use to take like 3-5 seconds even with indexed 
columns. So after a lots of R & D we come up with following logic,

1. We break the destination number length-wise, e.g. suppose number is 
923344224088 then length chunks would be,

923344224088, 92334422408, 9233442240, 923344224, 92334422, 9233442, 923344, 
92334, 9233, 923, 92, 9

2. Then use SQL LIKE function in WHERE clause (you can also use SQL OR function 
if your DBMS doesn't support SQL LIKE function), and pass all these chunks to 
it, e.g.

WHERE prefix LIKE (923344224088, 92334422408, 9233442240, 923344224, 92334422, 
9233442, 923344, 92334, 9233, 923, 92, 9)

3. Lastly we ORDER the result by prefix length, e.g.

ORDER BY LENGTH(prefix) DESC LIMIT 1

4. The complete query will be,

SELECT * FROM prefixes
WHERE prefix LIKE (923344224088, 92334422408, 9233442240, 923344224, 92334422, 
9233442, 923344, 92334, 9233, 923, 92, 9)
ORDER BY LENGTH(prefix) DESC LIMIT 1

Now the query takes less then 150 ms to execute. :-)

Here is an STL method that can generate this query, i am sure you can convert 
it to any programming language of your choice easily.

=========================================================
std::string GetQuery(std::string destination) {
        std::string query = "SELECT * FROM prefixes WHERE prefix LIKE ('" + 
destination;

        for(int i=1; i<destination.length(); i++) {
                query += "','" + destination.substr(0, (i * -1));

        }

        query += "') ORDER BY LENGTH(prefix) DESC LIMIT 1";
        return query;
}
=========================================================

I am pretty sure this query is 100% ANSI SQL compatible 
(http://en.wikipedia.org/wiki/SQL).

Thank you.
On Wed, Oct 14, 2009 at 10:15 AM, Michael Giagnocavo 
<[email protected]<mailto:[email protected]>> wrote:
In our testing with SQL Server, we found that executing several queries for 
direct matches yielded far better performance than one query trying to check 
prefixes. (The column was also part of the clustered index, but AFAIK MySQL 
doesn’t support defining your own clustered indexes; you get the PK always.)

-Michael

From: 
[email protected]<mailto:[email protected]>
 
[mailto:[email protected]<mailto:[email protected]>]
 On Behalf Of Diego Viola
Sent: Tuesday, October 13, 2009 7:54 PM
To: 
[email protected]<mailto:[email protected]>
Subject: Re: [Freeswitch-users] Some help with my post-paid billing project

Wrong question.

Is there a way to compare numbers with prefixes without using the prefix module?

Diego
On Wed, Oct 14, 2009 at 1:36 AM, Diego Viola 
<[email protected]<mailto:[email protected]>> wrote:
I'm using MySQL now but I will try PostgreSQL with the prefix module, is there 
a way to do that without the prefix module and with regular SQL?

Any examples?

Diego

On Tue, Oct 13, 2009 at 10:45 PM, Even André Fiskvik 
<[email protected]<mailto:[email protected]>> wrote:
What database are you using?
You could do this with regular SQL, but it would by a costly operation,
for PostgreSQL we're using the prefix module: 
http://pgfoundry.org/projects/prefix/

You can then match the closest prefix by using something like
"WHERE myprefix_col @> caller_destination_number ORDER BY LENGTH
(myprefix_col::text) LIMIT 1;"


Best regards,
Even André


On 13. okt. 2009, at 23.53, Diego Viola wrote:

> Hello,
>
> I'm trying to write a post-paid billing script, I have the CDR on my
> database and also a "rates" table, the CDR contains fields like
> caller_destination_number, variable_duration, etc. and the rates
> table contains: destination, prefix, rate (cost).
>
> The problem is that I can't just strip the destination number to
> take the prefix from it because I have to deal with destination
> numbers from different countries and they all have different prefix
> lengths... so I need to find another way to take the prefix from the
> destination number.
>
> Any ideas how to do this?
>
> Thanks,
>
> Diego
>
> _______________________________________________
> FreeSWITCH-users mailing list
> [email protected]<mailto:[email protected]>
> http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
> UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-
> users
> http://www.freeswitch.org


_______________________________________________
FreeSWITCH-users mailing list
[email protected]<mailto:[email protected]>
http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-users
http://www.freeswitch.org



_______________________________________________
FreeSWITCH-users mailing list
[email protected]<mailto:[email protected]>
http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-users
http://www.freeswitch.org



--
________________________________________________________
|                                                                               
                 |
| FATAL ERROR                                                           --- O X 
|
|_______________________________________________________|
|                        You have moved the mouse.                            |
| Windows must be restarted for the changes to take effect.   |
|                                        <OK>                                   
           |
####################################/


Muhammad Shahzad
-----------------------------------
CISCO Rich Media Communication Specialist (CRMCS)
CISCO Certified Network Associate (CCNA)
Cell: +92 334 422 40 88
MSN: [email protected]<mailto:[email protected]>
Email: [email protected]<mailto:[email protected]>
_______________________________________________
FreeSWITCH-users mailing list
[email protected]
http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-users
http://www.freeswitch.org

Reply via email to