Hi Philippe,

I think your question is a bit too generic. There are many ways to use PyTables, and although there is normally a sensible solution for each scenario, it would help if you could ask more focused questions. My advice is to read the docs and run your own experiments, then ask again.
Francesc

2011/4/3 Philippe Ombredanne <pombreda...@gmail.com>

> Howdy:
>
> anyone with an input?
>
> On 2011-03-25 02:19, Philippe Ombredanne wrote:
> > Hello Pytables world!
> > I am a python open source hacker and programmer.
> >
> > I need to store file metadata for several hundred terabytes /
> > billions of files and I am considering Pytables. Postgres is making
> > the job too hard.
> >
> > The metadata themselves are for files and directories, and represent a
> > few terabytes for now, up to the low tens of terabytes in the long run.
> > They are inherently hierarchical in the sense that directory-level
> > metadata apply down to all child files/dirs unless overridden at a sub
> > level.
> > The metadata are characterized by a reasonably high level of
> > redundancy: several files share the same value for a column, and in
> > some cases a couple of million files share the same value for a
> > certain column/attribute.
> > These highly duplicated columns need to be indexed for fast access
> > (think about an IR-style inverted index, at least conceptually), and
> > are the keys used for the look-ups/queries.
> > The metadata themselves can be either single values or a list of
> > values. Some nodes can have up to a few million values in a list of
> > variable length.
> >
> > The metadata are otherwise mostly numbers of well-defined types with a
> > pseudo-random distribution: the whole range of a numeric type is used
> > (typically 64-bit to 512-bit numbers).
> >
> > The metadata are mostly static: they are written once in batches of
> > several 100MB, and very rarely updated once written.
> >
> > The read load requires querying and possibly traversing the whole
> > file-system-like metadata tree about 100 to 1000 times per day.
> > The response time for such queries is not critical as long as it takes
> > less than 24 hours. The load can be spread over several (10 to 100)
> > hosts as needed, with data possibly replicated. The querying takes care
> > of de-duplication of duplicated retrieved records.
> >
> > Is Pytables suitable for the job?
> > Any tips? Examples of similar usage?
> > Is the right approach to use the object tree to model the file system
> > tree? (aka filenode? http://www.pytables.org/docs/manual/ch06.html )
> > though the file content is not meant to be stored in Pytables, only
> > metadata.
> >
> > Any tool to help with replication/distribution on several hosts?
> > I am not looking to get complete answers right away of course,
> > but any tips will be warmly welcomed!
>
> --
> Cordially
> Philippe
>
> philippe ombredanne | 1 650 799 0949 | pombredanne at nexb.com
> nexB - Open by Design (tm) - http://www.nexb.com
> http://eclipse.org/atf - http://eclipse.org/soc - http://eclipse.org/vep
> http://drools.org/ - http://easyeclipse.org - http://phpeclipse.com

--
Francesc Alted
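
As a starting point for the kind of experiment suggested above, here is a minimal pure-Python sketch of two ideas from the original question: directory-level metadata that applies to all children unless overridden, and an inverted index over a highly duplicated column. It does not use PyTables at all. The function names, the `explicit` mapping, and the sample paths and values are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def effective_metadata(path, explicit):
    """Merge metadata from the root down to `path`; deeper levels override."""
    target = PurePosixPath(path)
    # `parents` runs from the nearest ancestor up to the root, so reverse it
    # to apply the root first and the most specific directory last.
    chain = list(target.parents)[::-1] + [target]
    merged = {}
    for node in chain:
        merged.update(explicit.get(str(node), {}))
    return merged


def build_inverted_index(paths, explicit, column):
    """Map each value of `column` to the set of paths that carry it."""
    index = defaultdict(set)
    for path in paths:
        value = effective_metadata(path, explicit).get(column)
        if value is not None:
            index[value].add(path)
    return index


# Hypothetical sample data: only explicitly set values are stored.
explicit = {
    "/": {"owner": 1001},
    "/src": {"license": 7},
    "/src/vendor": {"license": 9},
}
paths = ["/src/main.c", "/src/vendor/lib.c", "/docs/readme.txt"]

index = build_inverted_index(paths, explicit, "license")
```

Storing only the explicitly set values keeps the redundancy out of the data and pushes the inheritance rule into the lookup. Whether that trade-off holds at billions of files, and whether it maps better onto indexed table columns or onto the object tree, is something only a focused experiment can show.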
_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users