Re: [Zope] Design Question
Hi Tim!

Tim Cook wrote:
> Anyway, I pickled the dictionary and it's just over 1.3MB, so I thought I'd use an external method to read the pickled object, pass it the paragraph, test for the correct code(s), and then return a list. Is this the most effective way to use Python/Zope for this situation?

I guess this is effective, but not efficient, whatever language you use ;-)

> Adding to my own post: I did play around with shelving but did not see much of an increase in speed. The real estate requirement went up to 8MB+.
>
> Thinking out loud again... I think I'll take the original comma-delimited file, get all the words, and remove the common ones (the, of, and, etc.). Then find every line where each particular word appears.

That uses substring matching, I guess, which is not efficient in this case.

> Store those codes in a dictionary with the word as the key. (Basically, I'm turning the file around backwards, I guess?) Then for every keyword hit from the paragraph I'll have a list of codes, and I can count the number of positive hits on each code.

Here you use the main strength of dictionaries: quick lookups. How to search for the codes with the most hits is another story :-)

> Hmm, sounds like a search engine to me. Anybody got one written in Python <g> that's faster than a Zope Catalog?

I guess the Catalog doesn't support OR searches, which would do what you need.

Regards,

Maik Röder

--
Uzopia - Digging la vida Zopa - http://uzopia.editthispage.com

___
Zope maillist - [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope
** No cross posts or HTML encoding! **
(Related lists -
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope-dev )
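
The "turn the file around" idea in this message is an inverted index, and it could be sketched roughly as follows. This is a minimal illustration, not code from the thread: the "code,description" row layout, the stop-word list, the sample rows, and the names `tokenize`, `build_index` and `rank_codes` are all assumptions. It splits each description into whole words once, up front, so no substring matching is needed at lookup time, and it counts hits per code with `collections.Counter`.

```python
import csv
import io
import re
from collections import Counter, defaultdict

# Assumed stop-word list; the thread only names "the", "of" and "and".
STOP_WORDS = {"the", "of", "and", "a", "an", "in", "to", "with"}


def tokenize(text):
    """Lower-case the text and return its words minus the stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOP_WORDS]


def build_index(lines):
    """Map each word to the set of codes whose description contains it.

    Assumes each row looks like: code,description
    """
    index = defaultdict(set)
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        code = row[0].strip()
        description = " ".join(row[1:])  # tolerate commas in the description
        for word in tokenize(description):
            index[word].add(code)
    return index


def rank_codes(index, paragraph, limit=5):
    """Count keyword hits per code and return the best-scoring codes."""
    hits = Counter()
    for word in set(tokenize(paragraph)):
        for code in index.get(word, ()):
            hits[code] += 1
    return hits.most_common(limit)


# Made-up sample data, for illustration only.
SAMPLE = io.StringIO(
    "A01,Fracture of the left arm\n"
    "A02,Fracture of the right leg\n"
    "B10,Infection of the left ear\n"
)
index = build_index(SAMPLE)
ranked = rank_codes(index, "Patient has a fracture in the left arm")
print(ranked[0])  # ('A01', 3)
```

Building the index is a one-time cost per code file; after that, each paragraph costs one dictionary lookup per distinct keyword, and `most_common` answers the "codes with the most hits" question. The index could still be pickled and loaded by an external method, as in the original plan.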
Re: [Zope] Design Question
Maik Roeder wrote:
> I guess the Catalog doesn't support OR searches, which would do what you need.

Actually, the ZCatalog solution works. But having 15,212 instances of a ZClass in one BTree folder was pretty slow. I ASS-U-MEd that an external Python method might provide a better solution?

In the Real World(tm), there won't be but maybe one-half as many codes in an installation. But I like 'worst case' testing. <g>

-- Tim Cook --
Cook Information Systems | Office: (901) 884-4126 8am-5pm CDT
* Specializing in Open Source Business Systems *
FreePM Project Coordinator http://www.freepm.org
OSHCA Founding Supporter http://www.oshca.org
Re: [Zope] Design Question
Hi Tim!

Tim Cook wrote:
> > I guess the Catalog doesn't support OR searches, which would do what you need.
>
> Actually, the ZCatalog solution works. But having 15,212 instances of a ZClass in one BTree folder was pretty slow. I ASS-U-MEd that an external Python method might provide a better solution?

I think the ZCatalog itself can handle this many objects, and handing the problem off to an external method doesn't change the problem.

> In the Real World(tm), there won't be but maybe one-half as many codes in an installation. But I like 'worst case' testing. <g>

Then why don't you store the objects in different folders? For example, I have just implemented a way to store objects in year/month/day folders automatically, to handle incoming news on a news site. It's almost what KMNetNews does, but automatically.

Regards,

Maik Röder

--
Uzopia - Digging la vida Zopa - http://uzopia.editthispage.com
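
The year/month/day bucketing mentioned in this message can be sketched in plain Python. This only computes the folder path for a given date; creating the folders inside Zope is left out, and the function name `dated_path` and the `news` root are hypothetical, not taken from the thread.

```python
from datetime import date


def dated_path(root, day):
    """Return a year/month/day folder path under root for the given date."""
    return "/".join([
        root.rstrip("/"),
        "%04d" % day.year,
        "%02d" % day.month,
        "%02d" % day.day,
    ])


print(dated_path("news", date(2000, 6, 14)))  # news/2000/06/14
```

Zero-padding the month and day keeps the folders in chronological order when they are listed alphabetically, and spreading objects across dated folders keeps any single folder small.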