"Shitiz Bansal" <[EMAIL PROTECTED]> wrote > I need to implement a system which stores Strings(average length 50 > chars). > For every input String it would need to tell the user wether that > string > already exists in the system. It would also need to add that input > String to the system if it did not exist.

Sounds like a job for a dictionary, except...

> It will also be useful to know the last accessed datetime value of
> that string.

That can be done with a bit of effort.

> The number of strings is in millions and I also need persistence
> so keeping all Strings in memory is not an option.

10 million x 50 chars = 500MB of raw text, before any per-object
overhead. So if you have a Gig of RAM and not much else running on the
machine, memory might still be a valid option... but if not...

This rules out a normal dictionary, but what about a shelf?
Have a look at the shelve module; it makes a file look a lot like a
dictionary. It should solve your problem. Shelf keys must be strings,
which suits you, and the value can be any picklable object, so you can
store the last-accessed date against each string.

I'm not sure how a shelf would perform compared to a database,
but it's a lot simpler to manage.

> Would it be wiser to keep these Strings in an indexed column
> of the DB or would it be better to keep these strings as filenames
> on the filesystem in a folder hierarchy of some sort.

I'd definitely go for the database approach if not using shelve.

> Please also bear in mind the time required to insert the
> strings (for eg. I tried using a database but found the insertion
> time to be very high once I indexed the particular column.

That's common; it's the updating of the index on every insert that
takes the time. For a big initial load I'd suggest not indexing until
the data is in, then building the index once. Bear in mind that without
an index every existence check has to scan the whole table, so you will
probably want it back for day-to-day lookups.

Or if you can break the strings into categories to reduce the size of
the tables, that would help. But that depends on how easy it is to
categorise the strings such that you know where to insert/search.

Also consider using the dbm family of modules; for simple data access
they often outperform a full SQL database.

HTH,

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld
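
PS. To make the shelve suggestion a bit more concrete, here is a rough sketch of how it might look. It assumes a recent Python 3; the function name check_and_add and the file location are made up for illustration, and I have not measured how it behaves with millions of entries.

```python
import os
import shelve
import tempfile
from datetime import datetime


def check_and_add(shelf, text):
    """Report whether text was already stored, then record this access.

    The string itself is the key; the stored value is the datetime of
    the most recent access. Returns (already_existed, previous_access),
    where previous_access is None for a string seen for the first time.
    """
    previous = shelf.get(text)      # None if the string is new
    shelf[text] = datetime.now()    # add it, or refresh its timestamp
    return previous is not None, previous


if __name__ == "__main__":
    # Hypothetical location; a real system would use a fixed path so the
    # data survives between runs.
    path = os.path.join(tempfile.mkdtemp(), "strings")
    with shelve.open(path) as shelf:
        for candidate in ["first string", "second string", "first string"]:
            existed, last_seen = check_and_add(shelf, candidate)
            print(candidate, existed, last_seen)
```

Because the shelf lives on disk, reopening the same path later gives you the same strings and timestamps back. shelve is built on top of the dbm modules, so if the pickling overhead turns out to matter you could drop down to dbm directly and store the date as a plain string.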