Hi Philippe,

I think your question is a bit too generic.  There are many ways to use
PyTables, and although there is normally a sensible solution for each
scenario, it would help if you could ask more focused questions.  My
advice is to read the docs and run your own experiments; then try
asking again.
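
As one possible first experiment, the "directory-level metadata apply to
all children unless overridden" rule can be prototyped in plain Python
before deciding how to map it onto PyTables.  This is only a rough sketch
under my own assumptions: the sample paths, the keys and the resolve()
helper are all made up for illustration and are not part of PyTables.

```python
import posixpath

# Hypothetical sample data: metadata set explicitly on a few paths only.
explicit = {
    "/": {"owner": 1, "license": 10},
    "/src": {"license": 20},
    "/src/main.c": {"owner": 2},
}


def resolve(path, key, explicit_meta):
    """Return the value of `key` for `path`.

    The nearest ancestor (or the path itself) that sets the key wins.
    Returns None if nothing up to the root sets it.
    """
    current = posixpath.normpath(path)
    while True:
        value = explicit_meta.get(current, {}).get(key)
        if value is not None:
            return value
        if current == "/":
            return None
        # Move one level up the tree and try again.
        current = posixpath.dirname(current)


# The file sets no license itself, so the value comes from /src.
print(resolve("/src/main.c", "license", explicit))
```

Storing only the explicit values and resolving the rest at query time keeps
the redundancy out of the stored data, at the price of a walk up the tree
for each lookup.  Whether that trade-off suits your read load is something
your own experiments would have to show.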

Francesc

2011/4/3 Philippe Ombredanne <pombreda...@gmail.com>
> Howdy:
>
> Does anyone have any input?
>
> On 2011-03-25 02:19, Philippe Ombredanne wrote:
> > Hello PyTables world!
> > I am a Python open source hacker and programmer.
> >
> > I need to store file metadata for several hundred terabytes
> > (billions of files), and I am considering PyTables. Postgres is making
> > the job too hard.
> >
> > The metadata themselves are for files and directories, and represent a
> > few terabytes for now, up to the low tens of terabytes in the long run.
> > They are inherently hierarchical in the sense that directory-level
> > metadata apply down to all child files/dirs unless overridden at a
> > sub-level.
> > The metadata are characterized by a reasonably high level of
> > redundancy: several files share the same value for a column, and in
> > some cases a couple of million files share the same value for a
> > certain column/attribute.
> > These highly duplicated columns need to be indexed for fast access
> > (think of an IR-style inverted index, at least conceptually), and
> > are the keys used for the look-ups/queries.
> > The metadata themselves can be either single values or a list of
> > values. Some nodes can have up to a few million values in a list of
> > variable length.
> >
> > The metadata are otherwise mostly numbers of well-defined types with a
> > pseudo-random distribution: the whole range of a numeric type is used
> > (typically 64-bit to 512-bit numbers).
> >
> > The metadata are mostly static: they are written once in batches of
> > several hundred MB and very rarely updated afterwards.
> >
> > The read load requires querying and possibly traversing the whole
> > file-system-like metadata tree about 100 to 1000 times per day.
> > The response time for such queries is not critical as long as it takes
> > less than 24 hours. The load can be spread across several (10 to 100)
> > hosts as needed, with data possibly replicated. The querying takes care
> > of de-duplicating the retrieved records.
> >
> > Is PyTables suitable for the job?
> > Any tips? Examples of similar usage?
> > Is the right approach to use the object tree to model the file system
> > tree? (aka filenode? http://www.pytables.org/docs/manual/ch06.html )
> > Note that the file content is not meant to be stored in PyTables, only
> > the metadata.
> >
> > Are there any tools to help with replication/distribution on several
> > hosts? I am not looking for complete answers right away of course,
> > but any tips will be warmly welcomed!
> >
> >
>
> --
> Cordially
> Philippe
>
> philippe ombredanne | 1 650 799 0949 | pombredanne at nexb.com
> nexB - Open by Design (tm) - http://www.nexb.com
> http://eclipse.org/atf - http://eclipse.org/soc - http://eclipse.org/vep
> http://drools.org/ - http://easyeclipse.org - http://phpeclipse.com
>
>
>
> ------------------------------------------------------------------------------
> Create and publish websites with WebMatrix
> Use the most popular FREE web apps or write code yourself;
> WebMatrix provides all the features you need to develop and
> publish your website. http://p.sf.net/sfu/ms-webmatrix-sf
> _______________________________________________
> Pytables-users mailing list
> Pytables-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/pytables-users
>



-- 
Francesc Alted