Re: Databases and python

2006-02-20 Thread Bryan Olson
Dan Stromberg wrote: > I've been putting a little bit of time into a file indexing engine [...] To solve the O.P.'s first problem, the facility we need is an efficient externally-stored multimap. A multimap is like a map, except that each key is associated with a collection of values, not just a s
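
A minimal sketch of the multimap idea described above, kept in memory for brevity. The class name `MultiMap` and its methods are illustrative assumptions, not something from the thread, and the external storage Bryan asks for is not shown here.

```python
from collections import defaultdict


class MultiMap:
    """In-memory multimap: each key maps to a collection (here a set) of values."""

    def __init__(self):
        self._data = defaultdict(set)

    def add(self, key, value):
        # Adding the same (key, value) pair twice has no extra effect.
        self._data[key].add(value)

    def get(self, key):
        # .get() avoids creating an empty entry for a missing key.
        return sorted(self._data.get(key, ()))


# Hypothetical usage: words are keys, filenames are the values.
index = MultiMap()
index.add("python", "a.txt")
index.add("python", "b.txt")
index.add("database", "b.txt")
```

For the file indexer, the keys would be words and each collection would hold the filenames containing that word.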

Re: Databases and python

2006-02-17 Thread Rene Pijlman
Dan Stromberg:
>Rene Pijlman:
>> Right. My second attempt would be: a BTree with the word as key, and a
>> BTree of filenames as value
>Would ZODB let me do that?
Yes.
>I'm puzzled, because:
d1={}
d={}
d[d1] = ''
>TypeError: dict objects are unhashable
This is using a dict as _ke
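
The preview is cut off, but the error in Dan's snippet comes from using a dict as a key. A small sketch of the difference, using plain dicts rather than ZODB BTrees (the names `index` and `key_error` are illustrative only):

```python
d = {}
inner = {}

# A dict used as a *key* fails: dicts are mutable and therefore unhashable.
try:
    d[inner] = ""
    key_error = None
except TypeError as exc:
    key_error = str(exc)

# A dict used as a *value* is fine. Here the outer keys are words and
# each inner dict is keyed by filename.
index = {}
index.setdefault("python", {})["a.txt"] = True
index.setdefault("python", {})["b.txt"] = True
```

Nesting one mapping as the value of another is the shape Rene's "BTree of filenames as value" suggestion relies on.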

Re: Databases and python

2006-02-17 Thread Dan Stromberg
On Fri, 17 Feb 2006 12:32:52 +0100, Rene Pijlman wrote: > Dan Stromberg: >>> My first attempt would be: a BTree with the word as key, and a 'list of >>> filenames' as value. >>> http://www.zope.org/Wikis/ZODB/FrontPage/guide/node6.html#SECTION00063 >> >>This is basically what I'm d

Re: Databases and python

2006-02-17 Thread Rene Pijlman
Dan Stromberg: >> My first attempt would be: a BTree with the word as key, and a 'list of >> filenames' as value. >> http://www.zope.org/Wikis/ZODB/FrontPage/guide/node6.html#SECTION00063 > >This is basically what I'm doing now, Right. My second attempt would be: a BTree with the

Re: Databases and python

2006-02-17 Thread Bryan Olson
Dan Stromberg wrote:
> Bryan Olson wrote:
[...]
>> Well, you could use simple files instead of fancy database tables.
>
> That's an interesting thought. Perhaps especially if australopithecine
> were saved in a filename like:
>
> ~/indices/au/st/ra/lo/pi/th/ec/in/e
Right, though the better fi
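
A rough sketch of the directory layout Dan describes, splitting a word into two-character chunks. The function name `index_path` and its parameters are hypothetical, and Bryan's preferred alternative is cut off in the preview, so it is not reproduced here.

```python
import posixpath


def index_path(word, root="~/indices", chunk=2):
    """Split `word` into fixed-size chunks and join them as nested directories."""
    parts = [word[i:i + chunk] for i in range(0, len(word), chunk)]
    # posixpath keeps the "/" separator regardless of the host platform.
    return posixpath.join(root, *parts)


path = index_path("australopithecine")
```

The leading `~` is left unexpanded; `os.path.expanduser` could be applied before touching the filesystem.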

Re: Databases and python

2006-02-17 Thread Dan Stromberg
On Thu, 16 Feb 2006 10:09:42 +0100, Rene Pijlman wrote: > Dan Stromberg: >>is there a python database interface that would allow me to define a >>-lot- of tables? Like, each word becomes a table, and then the fields >>in that table are just the filenames that contained that word. > > Give ZODB

Re: Databases and python

2006-02-17 Thread Jonathan Gardner
About indexes everywhere: Yes, you don't have to be a DB expert to know that indexes everywhere is bad. But look at this example. There are really two ways that the data is going to get accessed in regular use. Either they are going to ask for all files that have a word (most likely) or they are go

Re: Databases and python

2006-02-17 Thread Jonathan Gardner
About the filename ID - word ID table: Any good database (good with large amounts of data) will handle the memory management for you. If you get enough data, it may make sense to get bothered with PostgreSQL. That has a pretty good record on handling very large sets of data, and intermediate sets a

Re: Databases and python

2006-02-16 Thread Dan Stromberg
On Wed, 15 Feb 2006 23:37:31 -0800, Jonathan Gardner wrote: > I'm no expert in BDBs, but I have spent a fair amount of time working > with PostgreSQL and Oracle. It sounds like you need to put some > optimization into your algorithm and data representation. > > I would do pretty much like you are

Re: Databases and python

2006-02-16 Thread Dan Stromberg
On Thu, 16 Feb 2006 13:45:28 +, Bryan Olson wrote: > Dan Stromberg wrote: >> I've been putting a little bit of time into a file indexing engine > [...] > >> So far, I've been taking the approach of using a single-table database >> like gdbm or dbhash [...] and making each entry keyed by >> a

Re: Databases and python

2006-02-16 Thread Bryan Olson
Dan Stromberg wrote: > I've been putting a little bit of time into a file indexing engine [...] > So far, I've been taking the approach of using a single-table database > like gdbm or dbhash [...] and making each entry keyed by > a word, and under the word in the database is a null terminated list
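
For reference, a sketch of the single-table layout Dan describes: one entry per word, with the filenames packed into a single value. It assumes Python 3's `dbm.dumb` as a stand-in for gdbm/dbhash and uses NUL as a separator between names; the helper names `add_file` and `files_for` are made up for illustration.

```python
import dbm.dumb
import os
import tempfile


def add_file(db, word, filename):
    """Add `filename` to the NUL-separated list stored under `word`."""
    key = word.encode("utf-8")
    names = db[key].split(b"\0") if key in db else []
    name = filename.encode("utf-8")
    if name not in names:
        names.append(name)
        # The whole value is rewritten on every update.
        db[key] = b"\0".join(names)


def files_for(db, word):
    """Return the filenames recorded for `word`, in insertion order."""
    key = word.encode("utf-8")
    if key not in db:
        return []
    return [n.decode("utf-8") for n in db[key].split(b"\0")]


# Hypothetical usage against a throwaway database file.
tmpdir = tempfile.mkdtemp()
db = dbm.dumb.open(os.path.join(tmpdir, "index"), "c")
add_file(db, "python", "a.txt")
add_file(db, "python", "b.txt")
add_file(db, "python", "a.txt")  # duplicate, ignored
result = files_for(db, "python")
db.close()
```

Because each update reads and rewrites the full list for a word, common words get progressively more expensive to update, which may be part of why the thread looks at multimap-style alternatives.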

Re: Databases and python

2006-02-16 Thread bruno at modulix
Jonathan Gardner wrote: > I'm no expert in BDBs, but I have spent a fair amount of time working > with PostgreSQL and Oracle. It sounds like you need to put some > optimization into your algorithm and data representation. > > I would do pretty much like you are doing, except I would only have the

Re: Databases and python

2006-02-16 Thread Rene Pijlman
Dan Stromberg:
>is there a python database interface that would allow me to define a
>-lot- of tables? Like, each word becomes a table, and then the fields
>in that table are just the filenames that contained that word.
Give ZODB a try.
http://www.zope.org/Wikis/ZODB/FrontPage
http://www.pytho

Re: Databases and python

2006-02-15 Thread Jonathan Gardner
I'm no expert in BDBs, but I have spent a fair amount of time working with PostgreSQL and Oracle. It sounds like you need to put some optimization into your algorithm and data representation. I would do pretty much like you are doing, except I would only have the following relations: - word to wo
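
The list of relations is cut off in the preview. As a sketch only, here is one plausible three-table layout (word to word ID, filename to filename ID, and a word ID to filename ID link table), using the standard-library `sqlite3` module instead of PostgreSQL or Oracle. All table, column, and function names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT UNIQUE NOT NULL);
CREATE TABLE files (id INTEGER PRIMARY KEY, filename TEXT UNIQUE NOT NULL);
CREATE TABLE word_file (
    word_id INTEGER NOT NULL REFERENCES words(id),
    file_id INTEGER NOT NULL REFERENCES files(id),
    PRIMARY KEY (word_id, file_id)
);
-- Second index so lookups by file are as cheap as lookups by word.
CREATE INDEX word_file_by_file ON word_file (file_id);
""")


def get_id(table, column, value):
    """Insert `value` if it is new, then return its integer ID."""
    # Table and column names come from this script only, never from user input.
    conn.execute(f"INSERT OR IGNORE INTO {table} ({column}) VALUES (?)", (value,))
    row = conn.execute(f"SELECT id FROM {table} WHERE {column} = ?", (value,))
    return row.fetchone()[0]


def add(word, filename):
    """Record that `filename` contains `word`; duplicates are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO word_file VALUES (?, ?)",
        (get_id("words", "word", word), get_id("files", "filename", filename)),
    )


def files_for(word):
    """Return all filenames containing `word`, sorted by name."""
    rows = conn.execute(
        "SELECT f.filename FROM files f "
        "JOIN word_file wf ON wf.file_id = f.id "
        "JOIN words w ON w.id = wf.word_id "
        "WHERE w.word = ? ORDER BY f.filename",
        (word,),
    )
    return [r[0] for r in rows]
```

Storing integer IDs in the link table keeps each (word, file) pair small, and the two indexes cover both access paths: files for a word, and words for a file.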