On 5/10/12 12:14 PM, Alvaro Tejero Cantero wrote:
> The graphical explanation of the different containers is masterly and,
> I believe, supersedes the table that we had talked about for the
> documentation.
>
> I think the schematics deserve a prominent place in the web page.
> They are a very good symbolic explanation of the basics of PyTables.

Glad that you like it. In fact I think you are right: this is perhaps
the first time that schematics have been used to describe the basic
objects in PyTables. And my impression from the talk yesterday is that
people really got the gist of PyTables very quickly.

> As for the tables.Expr example of an in-kernel query,
>
> [ r['c1'] for r in table.where('(c2>2.1)&(c3==True)') ]
>
> now that there exists thanks to Josh a facility to obtain dataset
> sizes, perhaps some interesting things become possible

I think you are mixing concepts here. tables.Expr is for out-of-core
operations. I suppose you mean Numexpr here.

> a) I have always wondered why tables.Expr 'must' be used in an
> iterative context, i.e. pay the price of building the Python list,
> which is not the best container to iterate on afterwards. My
> explanation for it is that you don't know how big the result set will
> be, and thus want to avoid returning a big object in memory. But now
> it would be possible that if the size of the columns that are involved
> fits in memory (or, let's say, a fraction of the total RAM that is
> configurable), PyTables returns a numpy mask, or an index array, which
> are certainly very useful for further numpy work. A new function name
> could be provided for this functionality.

Hmm, the Table.where() iterator is very fast already (I can assure you
that a lot of optimizations and caching stuff is there), but I agree
that, for the indexed case, there would be situations where returning a
mask or an index array would be better (read: faster).

> b) more generally, expanding on this, knowing the size of datasets and
> the available memory, PyTables could eventually decide whether to
> perform operations in memory or in kernel.

In-memory or in-kernel? You probably mean indexed or in-kernel, right?
Yes, that's certainly another nice place for further optimizations.
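
To make the idea in (a) a bit more concrete, here is a rough sketch in
plain Python of the size-based decision: hand back a compact index array
when the worst case fits in a configurable memory budget, and fall back
to a lazy iterator otherwise. Everything in it (select_rows,
mem_budget_bytes) is a hypothetical name made up for illustration, not
PyTables API, and a real version would presumably return a NumPy array
instead of a stdlib array.

```python
from array import array


def select_rows(column, predicate, mem_budget_bytes):
    """Return indices of matching rows, eagerly or lazily.

    Hypothetical helper for illustration only; not part of PyTables.
    """
    # In the worst case every row matches, so budget one 64-bit index per row.
    worst_case_bytes = len(column) * array("q").itemsize
    matches = (i for i, value in enumerate(column) if predicate(value))
    if worst_case_bytes <= mem_budget_bytes:
        # Fits in the budget: materialize a compact, reusable index array.
        return array("q", matches)
    # Too big: hand back the lazy iterator so memory use stays constant.
    return matches


# Toy stand-in for a table column (assumed data, not from a real file).
c2 = array("d", [1.0, 2.5, 3.0, 0.5, 4.2])

# Generous budget: the result comes back as an index array.
small = select_rows(c2, lambda v: v > 2.1, mem_budget_bytes=1024)

# Tiny budget (8 bytes < 5 rows * 8 bytes): the result stays a lazy iterator.
lazy = select_rows(c2, lambda v: v > 2.1, mem_budget_bytes=8)
```

The caller gets the same row indices either way; only the container
changes, which is why a separate function name for the eager variant
might be cleaner than one function with two return types.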

-- Francesc Alted