Re: [Zope] Design Question
Tim Cook wrote:
> Maik Roeder wrote:
>
>> I guess the Catalog doesn't support OR searches, which would
>> do what you need.

It supports keyword indexes, which will give you what you want.

> Actually the ZCatalog solution works. But having 15,212 instances
> of a ZClass in one BTree folder was pretty slow. I ASS-U-MEd that
> an external python method might provide a better solution?

Don't store them as ZClasses and also in a ZCatalog. Just store them in the ZCatalog.

You can set the ZCatalog up to store your key and the list of values as metadata. Then also set up a field index on your key and a keyword index on your values.

Write an external method that puts your keys and values into the ZCatalog. You won't be storing these anywhere else in the ZODB; they will exist only as data within the catalog. You'll access them only as Catalog Brains.

--
Steve Alexander
Software Engineer
Cat-Box limited
http://www.cat-box.net

___
Zope maillist - [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope
** No cross posts or HTML encoding! **
(Related lists -
http://lists.zope.org/mailman/listinfo/zope-announce
http://lists.zope.org/mailman/listinfo/zope-dev )
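The idea Steve describes (records that live only in the catalog, looked up by key and searched by keyword) can be modelled in a few lines of plain Python. This is only a sketch of the concept, not the ZCatalog API: the class `TinyCatalog`, its method names, and the second record "5678" are hypothetical and exist purely for illustration.

```python
from collections import defaultdict


class TinyCatalog:
    """Toy stand-in for a catalog. NOT the ZCatalog API, just the same idea."""

    def __init__(self):
        # key -> list of values; plays the role of the stored metadata
        self.metadata = {}
        # lower-cased value -> set of keys; plays the role of a keyword index
        self.keyword_index = defaultdict(set)

    def catalog_record(self, key, values):
        """Store one record and index every value as a keyword."""
        self.metadata[key] = list(values)
        for value in values:
            self.keyword_index[value.lower()].add(key)

    def search_any(self, keywords):
        """Return the keys that match at least one keyword (an OR search)."""
        hits = set()
        for word in keywords:
            hits |= self.keyword_index.get(word.lower(), set())
        return sorted(hits)


catalog = TinyCatalog()
catalog.catalog_record("1234", ["This", "list", "describes", "this", "code"])
catalog.catalog_record("5678", ["Another", "code", "description"])  # made-up record

print(catalog.search_any(["describes", "description"]))  # both records match
print(catalog.search_any(["list"]))                      # only "1234" matches
```

Because each keyword maps straight to the set of matching keys, a query costs one dictionary lookup per keyword instead of a scan over all 15,000+ records.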
Re: [Zope] Design Question
Hi Tim!

Tim Cook wrote:
> > I guess the Catalog doesn't support OR searches, which would
> > do what you need.
>
> Actually the ZCatalog solution works. But having 15,212 instances
> of a ZClass in one BTree folder was pretty slow. I ASS-U-MEd that
> an external python method might provide a better solution?

I think the ZCatalog itself can handle this many objects, and handing the problem off to an external method doesn't change the problem.

> In the Real World(tm), there won't be but maybe one-half as many
> codes in an installation. But I like 'worst case' testing.

Then why don't you store the objects in different folders? For example, I have just implemented a way to store objects in year/month/day folders automatically, to handle incoming news on a news site. It's almost what KMNetNews does, but automatically.

Regards,

Maik Röder

--
Uzopia - Digging la vida Zopa - http://uzopia.editthispage.com
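As a rough illustration of the year/month/day layout Maik mentions, the folder path for an item can be derived from its date. This is a minimal sketch in plain Python; the function name `dated_folder_path` is hypothetical, and creating the actual Zope folders is left out.

```python
from datetime import date


def dated_folder_path(day):
    """Return the 'year/month/day' folder path for a date, zero-padded."""
    return day.strftime("%Y/%m/%d")


print(dated_folder_path(date(2001, 3, 7)))  # 2001/03/07
```

Spreading objects over dated folders keeps each folder small, which is the point of the suggestion.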
Re: [Zope] Design Question
Maik Roeder wrote:
>
> I guess the Catalog doesn't support OR searches, which would
> do what you need.
>

Actually the ZCatalog solution works. But having 15,212 instances of a ZClass in one BTree folder was pretty slow. I ASS-U-MEd that an external python method might provide a better solution?

In the Real World(tm), there won't be but maybe one-half as many codes in an installation. But I like 'worst case' testing.

--
Tim Cook
--
Cook Information Systems | Office: (901) 884-4126 8am-5pm CDT
* Specializing in Open Source Business Systems *
FreePM Project Coordinator http://www.freepm.org
OSHCA Founding Supporter http://www.oshca.org
Re: [Zope] Design Question
Hi Tim!

Tim Cook wrote:
>
> Tim Cook wrote:
> >
> > Anyway I pickled the dictionary and it's just over 1.3MB so I
> > thought I'd use an external method to read the pickled object,
> > pass it the paragraph and test for the correct code(s), then
> > return a list. Is this the most effective way to use Python/Zope
> > for this situation?

I guess this is effective, but not efficient, whatever language you use ;-)

> Adding to my own post:
> I did play around with shelving but did not see that there was
> much of an increase in speed. The real estate requirement went up
> to 8MB+.
>
> Thinking out loud again...
> I think I'll take the original comma delimited file.
> Get all words and remove the common ones (the, of, and, etc)
> Find every line where each particular word appears.

That uses substring matching, I guess, which is not efficient in this case.

> Store those codes in a dictionary with the word as the key.
> (Basically, I'm turning the file around backwards I guess?)
> Then for every keyword hit from the paragraph I'll have a list of
> codes that I can count the number of positive hits on each code.

Here you take advantage of dictionaries, which provide quick lookups. How to search for the codes with the most hits is another story :-)

> Hmm, sounds like a search engine to me. Anybody got one written
> in Python that's faster than a Zope Catalog?

I guess the Catalog doesn't support OR searches, which would do what you need.

Regards,

Maik Röder

--
Uzopia - Digging la vida Zopa - http://uzopia.editthispage.com
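On the open question of finding the codes with the most hits: once each matched word has produced a list of codes, one possible approach is to tally them with `collections.Counter`. This is a sketch under assumptions, not anything proposed in the thread; `rank_codes` and the sample hit lists are made up for illustration.

```python
from collections import Counter


def rank_codes(hit_lists, top_n=3):
    """Count how often each code appears across the per-word hit lists
    and return the top_n (code, hits) pairs, most hits first."""
    counts = Counter()
    for codes in hit_lists:
        counts.update(codes)
    return counts.most_common(top_n)


# One list of codes per keyword that matched (made-up sample data).
hit_lists = [["1234", "5678"], ["1234"], ["9012", "1234", "5678"]]
print(rank_codes(hit_lists, top_n=2))  # [('1234', 3), ('5678', 2)]
```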
Re: [Zope] Design Question
Tim Cook wrote:
>
> Anyway I pickled the dictionary and it's just over 1.3MB so I
> thought I'd use an external method to read the pickled object,
> pass it the paragraph and test for the correct code(s), then
> return a list. Is this the most effective way to use Python/Zope
> for this situation?
>

Adding to my own post:

I did play around with shelving but did not see much of an increase in speed. The real estate requirement went up to 8MB+.

Thinking out loud again...

I think I'll take the original comma delimited file.
Get all words and remove the common ones (the, of, and, etc).
Find every line where each particular word appears.
Store those codes in a dictionary with the word as the key.
(Basically, I'm turning the file around backwards I guess?)
Then for every keyword hit from the paragraph I'll have a list of codes, so I can count the number of positive hits on each code.

Hmm, sounds like a search engine to me. Anybody got one written in Python that's faster than a Zope Catalog?

--
Tim Cook
--
Cook Information Systems | Office: (901) 884-4126 8am-5pm CDT
* Specializing in Open Source Business Systems *
FreePM Project Coordinator http://www.freepm.org
OSHCA Founding Supporter http://www.oshca.org
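The plan outlined above amounts to building an inverted index (word to codes) and counting hits per code. Here is a minimal sketch of it in plain Python. The stop-word list, the function names, and the second sample line are assumptions for illustration only; words are split out once at index time, so no substring search over the file is needed at query time.

```python
import re
from collections import defaultdict

# Illustrative stop words only; a real list would be longer.
STOP_WORDS = {"the", "of", "and", "this", "a", "an", "to", "in"}


def tokenize(text):
    """Lower-case the text, split it into words, and drop the common ones."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOP_WORDS]


def build_index(lines):
    """Turn 'code, description' lines into a word -> set-of-codes dictionary."""
    index = defaultdict(set)
    for line in lines:
        code, _, description = line.partition(",")  # split on first comma only
        for word in tokenize(description):
            index[word].add(code.strip())
    return index


def count_hits(index, paragraph):
    """Count, for each code, how many distinct paragraph words match it."""
    hits = defaultdict(int)
    for word in set(tokenize(paragraph)):
        for code in index.get(word, ()):
            hits[code] += 1
    return dict(hits)


lines = [
    "1234, This list describes this code",
    "5678, Another code for a different list",  # made-up second entry
]
index = build_index(lines)
print(count_hits(index, "The list describes something"))
```

In this example code "1234" gets two hits ("list" and "describes") and "5678" gets one ("list"), so "1234" would be the better candidate.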
[Zope] Design Question
From a performance standpoint I'd like to know how to best implement this. Speed is important:

I have a comma delimited file that I imported into a (Python) dictionary. This may not be the best way to do this though. Maybe just nested lists would be better? The comma delimited file will only need to be updated annually, so it's pretty static.

Original line from file:
1234, This list describes this code

My decidedly backwards dictionary:
{1234: ['This','list', 'describes', 'this', 'code']}

I have a set of codes 2 - 6 characters long. A string of varying length (usually 3 - 10 words) describes each code.

I will need to take a paragraph of text and determine the most appropriate code based on key words from the paragraph that match up to code descriptions. I was going to do this in DTML but 15,000+ codes (even using a BTree folder) is painfully slow.

Anyway I pickled the dictionary and it's just over 1.3MB so I thought I'd use an external method to read the pickled object, pass it the paragraph and test for the correct code(s), then return a list. Is this the most effective way to use Python/Zope for this situation?

Thanks,

--
Tim Cook
--
Cook Information Systems | Office: (901) 884-4126 8am-5pm CDT
* Specializing in Open Source Business Systems *
FreePM Project Coordinator http://www.freepm.org
OSHCA Founding Supporter http://www.oshca.org
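For reference, the setup described in this message (one dictionary entry per line of the file, then a pickle of the whole dictionary) might look roughly like the sketch below. It is an illustration under assumptions: `parse_line` is a hypothetical helper, the codes are kept as strings because they are described as 2 - 6 characters long and may not all be numeric, and the pickle is kept in memory here rather than written to a file.

```python
import pickle


def parse_line(line):
    """Split 'code, description' on the first comma only."""
    code, _, description = line.partition(",")
    return code.strip(), description.split()


lines = ["1234, This list describes this code"]
codes = dict(parse_line(line) for line in lines)

# Round-trip through pickle, as an external method reading the data would.
blob = pickle.dumps(codes)
restored = pickle.loads(blob)
print(restored)  # {'1234': ['This', 'list', 'describes', 'this', 'code']}
```

With this code-to-words layout, matching a paragraph means visiting every code's word list, which is why the lookup time grows with the number of codes.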