On Thu, 14 Dec 2006 at 07:37 -0800, Curious Jan wrote:
> Well, that piece of information is certainly useful, because I didn't
> find it anywhere else.
> What is the default behavior of pytables 2.0 then going to be?
> It kind of changes the behavior of my code whether I have to construct
> a string by adding paths and '/', or traverse a hierarchy
> recursively.
>
> Actually what confuses me is that the behavior is different for
>     for item in table:
>         do_something(item)
> than it is for
>     for item in table[:]:
>         do_something(item)
>
> The first allows me to use
>     item['path/to/child']
> while in the latter case I have to write
>     item['path']['to']['child']
>
> Is that behavior going to be unified?

Mmm, that's a good question. First of all, the:

    for item in table:
        do_something(item)

approach is an iterator over the table *on-disk*, so it fetches one
record and offers it to the user wrapped in a tables.Row object for
manipulation (well, in fact, things are a bit more complicated because
the records are read in batches, for efficiency; but this is mostly
irrelevant for the end user).

However, in:

    for item in table[:]:
        do_something(item)

you read the whole table *in-memory* first and then iterate over each
row of the resulting NumPy object. Of course, if the table on disk is
large, this second approach is overkill and the first one should
normally be preferred.

So, this is the main reason why you are seeing different behaviours
when accessing the records of the table. In the first case, the Row
accessor implements a __getitem__() which understands that a '/' works
as a separator for nested records. In the second case, the NumPy
__getitem__() does not understand such a notation.

This is probably a matter of taste, but I prefer specifying nested
records the PyTables way (i.e. 'field/subfield/subsubfield') over the
NumPy way (i.e. ['field']['subfield']['subsubfield']), mainly for two
reasons:

1. It's more compact and easier to type
2. It's faster to retrieve a nested field

So, I'd say that we should keep using the slash-separated way in 2.0.

Regarding implementing the NumPy way in the Row accessor (in order to
unify the access), we can try to implement it if there is enough
interest from the users, but I'd rather not have to, mainly because we
should preferably have one and only one way to do things (even though
those ways differ between packages). My current position in that
regard is to keep things as they are now, and that in 2.0 the user
should be aware of whether they are using a Table iterator or a NumPy
one, and act accordingly.
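
In the meantime, the two access styles described above can be bridged on
the user side. The following is only a minimal sketch: get_nested is a
hypothetical helper (it is not part of PyTables or NumPy) that resolves a
slash-separated path by chained indexing, so the same path string could be
used on in-memory records. Plain nested dicts stand in for the records
here; the assumption is that any object indexable by field name at each
level would be traversed the same way.

```python
from functools import reduce


def get_nested(record, path, sep="/"):
    """Resolve 'a/b/c' as record['a']['b']['c'].

    Hypothetical helper for illustration only; not a PyTables or NumPy API.
    """
    return reduce(lambda node, key: node[key], path.split(sep), record)


# Nested dicts stand in for in-memory records (an assumption for this
# sketch); each level is indexed by field name, one key at a time.
rows = [
    {"path": {"to": {"child": 1}}},
    {"path": {"to": {"child": 2}}},
]

# The same slash-separated string works for every row.
children = [get_nested(row, "path/to/child") for row in rows]
print(children)
```

Note that this walks one level per path component, so it does not give
the speed advantage mentioned in point 2 above; it only saves typing.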

Who knows, perhaps the NumPy developers can be convinced that the
slash-separated way to specify nested records is a good one, and they
might agree to implement it.

Cheers,

--
Francesc Altet    |  Be careful about using the following code --
Carabos Coop. V.  |  I've only proven that it works,
www.carabos.com   |  I haven't tested it. -- Donald Knuth

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users