Re: [Zope] Finding a match in a large dataset - btrees?
On Mon, Dec 05, 2005 at 04:57:03PM +1300, Cameron Beattie wrote: I have a large set of data (that will be stored in MySQL) that I wish to match to and am wondering what the best method is. Assume the following data in table LOCATION_MATCH: LOCATION_IDLOCATION_PATTERNPARENT_ID 106 11410 13211 12911 14113 15213 The string 6438 should return 11, 6421 14, 6422 15 and 6499 12. I stopped reading here because I couldn't make any sense of the example. How does 6438 return 11? -- Paul Winkler http://www.slinkp.com ___ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
Re: [Zope] Finding a match in a large dataset - btrees?
On 12/5/05, Cameron Beattie [EMAIL PROTECTED] wrote: I have a large set of data (that will be stored in MySQL) that I wish to match to and am wondering what the best method is. Assume the following data in table LOCATION_MATCH: LOCATION_IDLOCATION_PATTERNPARENT_ID 106 11410 13211 12911 14113 15213 The string 6438 should return 11, 6421 14, 6422 15 and 6499 12. Eh, what? OK, I understand what is happening, but why? It seems to me that this is an implementation of a btree, but why? SQL databases have indexes, why not just use them? :) I've read a bit about btrees on the zope wiki and wonder if that's the best way. Probably, but only if you actually let them do the indexing. :-) -- Lennart Regebro, Nuxeo http://www.nuxeo.com/ CPS Content Management http://www.cps-project.org/ ___ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
Re: [Zope] Finding a match in a large dataset - btrees?
I cannot make any sense out of your example data. Since you already have the data in a MySQL table, I would think the best way to find the match would be to make a MySQL query. On Mon, 5 Dec 2005, Cameron Beattie wrote: I have a large set of data (that will be stored in MySQL) that I wish to match to and am wondering what the best method is. Assume the following data in table LOCATION_MATCH: LOCATION_IDLOCATION_PATTERNPARENT_ID 106 11410 13211 12911 14113 15213 The string 6438 should return 11, 6421 14, 6422 15 and 6499 12. I've read a bit about btrees on the zope wiki and wonder if that's the best way. However I am struggling with the basics: 1. How do I get the data from MySQL into a btree in Zope? Something like: from BTrees.IIBTree import * t = IIBTree() t.update(context.select_from_LOCATION_MATCH) # errr, no 2. How do I find the matching node i.e. when I want to know that 6422 relates to location_id 15? Any help or pointers to further documentation would be appreciated. Regards Cameron ___ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev ) -- ___ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
Re: [Zope] Finding a match in a large dataset - btrees?
Thanks for the many replies. I apologise for the original message which was obviously very unclear - I will try to correct that. Basically I want to match a telephone number to a rate table. Assume the following rate table: Telephone codeRate 6491 64212 64222.5 64 1.5 The following table sets out the appropriate rate to return for a given telephone number Dialled numberReturned rate 649123 1 6431231.5 6422 64221 2.5 Things to note: If there is no match the right-most digit is removed until there is a match e.g. 643123 matches to 64 The length of the dialled number is not consistent e.g. 649123 vs 64221 The length of the telephone code is not consistent e.g. 64 vs 6422 I want to do this frequently and at low cost i.e. ideally in memory. Perhaps the best way is to write a procedure in MySQL however I am interested in any python-based alternatives. Regards Cameron ___ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
Re: [Zope] Finding a match in a large dataset - btrees?
--On 6. Dezember 2005 14:55:08 +1300 Cameron Beattie [EMAIL PROTECTED] wrote: I want to do this frequently and at low cost i.e. ideally in memory. Perhaps the best way is to write a procedure in MySQL however I am interested in any python-based alternatives. I pointed you already to the BTrees documentation. -aj pgprF4vRE4Ede.pgp Description: PGP signature ___ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
Re: [Zope] Finding a match in a large dataset - btrees?
--On 6. Dezember 2005 14:55:08 +1300 Cameron Beattie [EMAIL PROTECTED] wrote: I want to do this frequently and at low cost i.e. ideally in memory. Perhaps the best way is to write a procedure in MySQL however I am interested in any python-based alternatives. Your problem is more an algorithmic problem than q question of the data representation. Store the mapping as some BTree (possibly IIBTree using normalized integers for the values). Then you have to iterate over all prefixes of the dialed number starting with the longest prefix to the smallest prefix. Then perform a lookup of the prefix in your BTree... this is should not take more than five lines of code and requires a maximum of #(length_of_dialed_number) comparisons. If you want it more complicated construct a tree and you'll have a logarithmic number of comparisons. I doubt that it is worth the effort since the first solution should be fast enough unless your dialed number is hundreds of digits long :-) -aj pgp2ARghXrI2q.pgp Description: PGP signature ___ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
Re: [Zope] Finding a match in a large dataset - btrees?
Am Dienstag, den 06.12.2005, 14:55 +1300 schrieb Cameron Beattie: Thanks for the many replies. I apologise for the original message which was obviously very unclear - I will try to correct that. ... I want to do this frequently and at low cost i.e. ideally in memory. Perhaps the best way is to write a procedure in MySQL however I am interested in any python-based alternatives. *hint* use postgres instead where you can write stored functions in python too :-) (With some little modifications on the zope database adaptor you could even return a dictionary or btree object as result of your query then :-) SCNR Tino ___ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )
Re: [Zope] Finding a match in a large dataset - btrees?
--On 5. Dezember 2005 16:57:03 +1300 Cameron Beattie [EMAIL PROTECTED] wr I've read a bit about btrees on the zope wiki and wonder if that's the best way. However I am struggling with the basics: 1. How do I get the data from MySQL into a btree in Zope? Something like: from BTrees.IIBTree import * t = IIBTree() t.update(context.select_from_LOCATION_MATCH) # errr, no Reading helps: http://www.zope.org/Members/ajung/ZopeHomeOfAndreasJung/BTrees/FrontPage 2. How do I find the matching node i.e. when I want to know that 6422 relates to location_id 15? I still have no idea what your example should tell me. A BTree basically implements the same API as a Python dictionary. If you can implement your solution in pure Python then you can just switch to BTrees. But I have no idea about your example especially BTrees implement a 1:1 relationship (when using an IIBTree). -aj pgpJEXUnC9XUz.pgp Description: PGP signature ___ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )