Re: [Zope] Design Question

2000-09-19 Thread Steve Alexander

Tim Cook wrote:

> Maik Roeder wrote:
> 
> 
>> I guess the Catalog doesn't support OR searches, which would
>> do what you need.

It supports keyword indexes, which will give you what you want.


> 
> Actually the ZCatalog solution works. But having 15,212 instances
> of a ZClass in one BTree folder was pretty slow. I ASS-U-MEd that
> an external python method might provide a better solution?

Don't store them both as ZClass instances and in a ZCatalog. Just store
them in the ZCatalog.

You can set the ZCatalog up to store as metadata your key, and the list 
of values. Then, also set up a field index on your key, and a keyword 
index for your values. Write an external method that puts your keys and 
values into the ZCatalog. You won't be storing these anywhere else in 
the ZODB; they will exist only as data within the catalog. You'll access 
them only as Catalog Brains.
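
To make the idea concrete, here is a minimal plain-Python model of what a
field index on the key plus a keyword index on the values gives you. It is
not the ZCatalog API: the class TinyCatalog and its methods add and
search_any are hypothetical names made up for illustration, and the second
record is invented sample data. It only shows why a keyword lookup behaves
like an OR search over the words.

```python
from collections import defaultdict


class TinyCatalog:
    """Plain-Python stand-in for a catalog (hypothetical, NOT the ZCatalog API)."""

    def __init__(self):
        self.metadata = {}                     # record id -> (key, values)
        self.field_index = {}                  # key -> record id
        self.keyword_index = defaultdict(set)  # value -> set of record ids

    def add(self, key, values):
        """Store one key and its list of values; nothing lives elsewhere."""
        rid = len(self.metadata)
        self.metadata[rid] = (key, list(values))
        self.field_index[key] = rid
        for value in values:
            self.keyword_index[value.lower()].add(rid)
        return rid

    def search_any(self, words):
        """OR search: records whose values contain at least one of the words."""
        rids = set()
        for word in words:
            rids |= self.keyword_index.get(word.lower(), set())
        return [self.metadata[rid] for rid in sorted(rids)]


catalog = TinyCatalog()
catalog.add("1234", ["This", "list", "describes", "this", "code"])
catalog.add("5678", ["Another", "code", "entirely"])  # invented sample record
hits = catalog.search_any(["describes", "entirely"])
```

The tuples returned by search_any play the role the stored metadata plays
on a Catalog Brain: you get the key and the value list back without
loading any other object.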

--
Steve Alexander
Software Engineer
Cat-Box limited
http://www.cat-box.net


___
Zope maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope-dev )




Re: [Zope] Design Question

2000-09-18 Thread Maik Roeder

Hi Tim !

Tim Cook wrote:
> > I guess the Catalog doesn't support OR searches, which would
> > do what you need.
> 
> Actually the ZCatalog solution works. But having 15,212 instances
> of a ZClass in one BTree folder was pretty slow. I ASS-U-MEd that
> an external python method might provide a better solution?

I think the ZCatalog itself can handle this many objects, and handing
the problem off to an external method doesn't change the problem.
 
> In the Real World(tm), there won't be but maybe one-half as many
> codes in an installation. But I like 'worst case' testing. 

Then why don't you store the objects in different folders?
For example, I have just implemented a way to store objects in
year/month/day folders automatically to handle incoming news on
a news site. It's almost what KMNetNews does, but automatically.
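
A rough sketch of the year/month/day idea, using nested dictionaries as a
stand-in for folders. The function names folder_path and store are
hypothetical, and a real Zope version would create Folder objects instead
of dictionaries; this only shows how the path is derived from the date.

```python
from datetime import date


def folder_path(day):
    """Return the year/month/day path for a date, e.g. 2000/09/18."""
    return day.strftime("%Y/%m/%d")


def store(tree, day, item_id, item):
    """Walk (and create as needed) the nested 'folders', then store the item."""
    node = tree
    for part in folder_path(day).split("/"):
        node = node.setdefault(part, {})
    node[item_id] = item
    return node


tree = {}
store(tree, date(2000, 9, 18), "story-1", "Incoming news item")
```

Each day's folder then holds only that day's objects, so no single
container grows to thousands of entries.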

Regards,

Maik Röder

-- 
Uzopia - Digging la vida Zopa - http://uzopia.editthispage.com





Re: [Zope] Design Question

2000-09-18 Thread Tim Cook

Maik Roeder wrote:

> 
> I guess the Catalog doesn't support OR searches, which would
> do what you need.
> 

Actually the ZCatalog solution works. But having 15,212 instances
of a ZClass in one BTree folder was pretty slow. I ASS-U-MEd that
an external python method might provide a better solution?

In the Real World(tm), there won't be but maybe one-half as many
codes in an installation. But I like 'worst case' testing. 

-- Tim Cook --
Cook Information Systems | Office: (901) 884-4126 8am-5pm CDT
* Specializing in Open Source Business Systems *
FreePM Project Coordinator http://www.freepm.org
OSHCA Founding Supporter http://www.oshca.org





Re: [Zope] Design Question

2000-09-18 Thread Maik Roeder

Hi Tim !

Tim Cook wrote:
> 
> Tim Cook wrote:
> 
> >
> > Anyway I pickled the dictionary and it's just over 1.3MB so I
> > thought I'd use an external method to read the pickled object,
> > pass it the paragraph and test for the correct code(s), then
> > return a list. Is this the most effective way to use Python/Zope
> > for this situation?

I guess this is effective, but not efficient whatever language
you use ;-)

> Adding to my own post:
> I did play around with shelving but did not see that there was
> much of an increase in speed. The real estate requirement went up
> to 8MB+.
> 
> Thinking out loud again...
> I think I'll take the original comma delimited file.
> Get all words and remove the common ones (the, of, and, etc)
> Find every line where each particular word appears.

That uses substring matching I guess, which is not efficient
in this case.

> Store those codes in a dictionary with the word as the key.
> (Basically, I'm turning the file around backwards I guess?)
> Then for every keyword hit from the paragraph I'll have a list of
> codes, so that I can count the number of positive hits on each code.

Here you take advantage of dictionaries, which provide quick
lookups.

How to search for the codes with the most hits is another story :-)
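
One possible way to find the codes with the most hits, sketched with
collections.Counter from the standard library. The function name
rank_codes and the sample data are made up for illustration; the
word-to-codes dictionary is assumed to already exist.

```python
from collections import Counter


def rank_codes(word_to_codes, paragraph_words, top=3):
    """Count one hit per (distinct word, code) pair and return the best codes."""
    hits = Counter()
    for word in set(paragraph_words):
        hits.update(word_to_codes.get(word, ()))
    return hits.most_common(top)


# Invented sample data: word -> codes whose description contains that word.
word_to_codes = {
    "fracture": ["A10", "B20"],
    "wrist": ["A10"],
    "ankle": ["B20", "C30"],
}
ranked = rank_codes(word_to_codes, ["fracture", "wrist", "left"])
```

Each word costs one dictionary lookup, so the work grows with the length
of the paragraph rather than with the number of codes.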
 
> Hmm, sounds like a search engine to me.  Anybody got one written
> in Python that's faster than a Zope Catalog?

I guess the Catalog doesn't support OR searches, which would
do what you need.
 
Regards,

Maik Röder

-- 
Uzopia - Digging la vida Zopa - http://uzopia.editthispage.com





Re: [Zope] Design Question

2000-09-17 Thread Tim Cook

Tim Cook wrote:

> 
> Anyway I pickled the dictionary and it's just over 1.3MB so I
> thought I'd use an external method to read the pickled object,
> pass it the paragraph and test for the correct code(s), then
> return a list. Is this the most effective way to use Python/Zope
> for this situation?
>

Adding to my own post:
I did play around with shelving but did not see that there was
much of an increase in speed. The real estate requirement went up
to 8MB+.

Thinking out loud again...
I think I'll take the original comma delimited file.
Get all words and remove the common ones (the, of, and, etc)
Find every line where each particular word appears.
Store those codes in a dictionary with the word as the key.
(Basically, I'm turning the file around backwards I guess?)
Then for every keyword hit from the paragraph I'll have a list of
codes, so that I can count the number of positive hits on each code.
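
A minimal sketch of turning the file "around backwards", assuming each
line has the form "code, description" as in the original example. The
function name build_inverted_index, the short STOP_WORDS set, and the
second sample line are assumptions for illustration only.

```python
from collections import defaultdict

# Assumed short stop-word list; a real one would be longer.
STOP_WORDS = {"the", "of", "and"}


def build_inverted_index(lines):
    """Map each non-common word to the set of codes whose description uses it."""
    index = defaultdict(set)
    for line in lines:
        # Split on the first comma only: code on the left, description on the right.
        code, _, description = line.partition(",")
        for word in description.lower().split():
            word = word.strip(".,;:()")
            if word and word not in STOP_WORDS:
                index[word].add(code.strip())
    return index


lines = [
    "1234, This list describes this code",
    "5678, The code of another list",  # invented sample line
]
index = build_inverted_index(lines)
```

Since the file only changes annually, the index could be built once and
pickled, so a request only pays for the lookups.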

Hmm, sounds like a search engine to me.  Anybody got one written
in Python that's faster than a Zope Catalog?

-- Tim Cook --
Cook Information Systems | Office: (901) 884-4126 8am-5pm CDT
* Specializing in Open Source Business Systems *
FreePM Project Coordinator http://www.freepm.org
OSHCA Founding Supporter http://www.oshca.org





[Zope] Design Question

2000-09-17 Thread Tim Cook


From a performance standpoint I'd like to know how to best
implement this. Speed is important:

I have a comma delimited file that I imported into a (Python)
dictionary. This may not be the best way to do this though. Maybe
just nested lists would be better? The comma delimited file will
only need to be updated annually. So it's pretty static.

Original line from file: 1234, This list describes this code

My decidedly backwards dictionary: {1234:  ['This','list',
'describes', 'this', 'code']}

I have a set of codes 2 - 6 characters long. A string of varying
length (usually 3 - 10 words) describes each code.  I will need to
take a paragraph of text and determine the most appropriate code
based on key words from the paragraph that match up to code
descriptions.

I was going to do this in DTML but 15,000+ codes (even using a
BTree folder)  is painfully slow.

Anyway I pickled the dictionary and it's just over 1.3MB so I
thought I'd use an external method to read the pickled object,
pass it the paragraph and test for the correct code(s), then
return a list. Is this the most effective way to use Python/Zope
for this situation?
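
For reference, the approach described above might look roughly like this
in modern Python. The function names load_codes and match_codes are
hypothetical, the second dictionary entry is invented, and the scoring
rule (number of distinct shared words) is an assumption; the post does
not say how "most appropriate" is decided.

```python
import pickle


def load_codes(path):
    """Read the pickled {code: [words]} dictionary (only unpickle trusted files)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def match_codes(codes, paragraph):
    """Return codes sharing at least one word with the paragraph, best first."""
    words = set(paragraph.lower().split())
    scored = []
    for code, description in codes.items():
        overlap = words & {word.lower() for word in description}
        if overlap:
            scored.append((len(overlap), code))
    # Most shared words first; ties are broken by code for a stable order.
    scored.sort(key=lambda pair: (-pair[0], str(pair[1])))
    return [code for _, code in scored]


codes = {
    1234: ["This", "list", "describes", "this", "code"],
    5678: ["Another", "code"],  # invented sample entry
}
matches = match_codes(codes, "code list")
```

Note that match_codes visits every code for every paragraph, so its cost
grows with the size of the dictionary (15,000+ entries here).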

Thanks,
-- Tim Cook --
Cook Information Systems | Office: (901) 884-4126 8am-5pm CDT
* Specializing in Open Source Business Systems *
FreePM Project Coordinator http://www.freepm.org
OSHCA Founding Supporter http://www.oshca.org
