Hi,

Francesc Alted wrote:
>> I like pytables a lot, but this morning it is driving me up the walls. I have
>> a file with particles and I want to modify it. I basically want to split
>> _some_ of the particles into smaller subparticles. The HDF5 file is
>> generated by some other code (written in C++), but I don't want to use
>> the terrible C++ interface for this simple task.
>>
>> So my first idea was to use "for p in f.root.table:" to iterate over the
>> table, check if the condition is true (basically |v| > 0.1 c) and if
>> that is the case delete the particle using removeRows and append new
>> particles.
>
> Caveat emptor: Table.removeRows() is a slow operation due to how HDF5 is
> designed. In general, it is better to copy rows into separate tables and
> delete the old tables. See below.

Well yes. But then again, deleting the one million rows takes about 10
seconds on my system, which is fast enough.

>> This turned out to be a bad idea for two (or three) reasons. I would
>> have to avoid splitting the particles further and further. A bit ugly
>> but manageable.
>>
>> Second problem: p contains particles, but removeRows wants a row number.
>> How do I find that out? Well. The documentation doesn't say.
>
> It does: `Row.nrow`, as can be seen in:
> http://www.pytables.org/docs/manual/ch04.html#RowClassDescr

Ah. Thanks.

>> So. Second idea. Create a second file, copy the particles which are slow
>> enough, and add subparticles instead of the fast particles. Sounds good
>> right? Well. Problem. I can't find any way to say "create a table just
>> like this one over there, but without the couple of millions of rows".
>
> There are several ways to accomplish this. One is using `table1.description`
> as the description for your new table.

Now that I knew what to look for I found it in the documentation. It's
quite well hidden though.

>> Ok. So I thought "If I can't create an empty table, I can copy the file to
>> a new name, and drop all the rows in the table. That gets me a nice and
>> empty table I can fill." Turns out I can't:
>>
>> "NotImplementedError: You are trying to delete all the rows in table
>> "/table1". This is not supported right now due to limitations on the
>> underlying HDF5 library. Sorry!"
>
> As the error says, this is a limitation of the HDF5 1.6.x series. Link
> PyTables against HDF5 1.8.x and you will get rid of this error.

Problem is that the current C++ code relies on HDF5 1.6. But ok. Good to
know that it will get better once we switch to HDF5 1.8.

>> Except ... I can't. If I have the input file i and the output file o.
>> Both with identical tables defined. And I do:
>>
>> for p in i.root.table1:
>>     o.root.table1.append(p)
>>
>> it breaks with:
>>
>> File "/usr/lib/python2.5/site-packages/tables/table.py", line 1758,
>> in append
>> "rows parameter cannot be converted into a recarray object
>> compliant with table '%s'.
>> The error was: <%s>" % (str(self), exc)
>> ValueError: rows parameter cannot be converted into a recarray object
>> compliant with table '/table1 (Table(1L,), zlib(6)) 'table1''. The error
>> was: <objects of type ``Row`` are not supported in this context, sorry;
>> supported objects are: NumPy array, record or scalar; homogeneous list
>> or tuple, integer, float, complex or string>
>
> Yeah, two problems here. First, rows is a data *accessor*, not a container.
> If you want to get the *contents* of the row, you should use `p[:]`. Also,
> `p[:]` is a single row, so it is a scalar, but `Table.append()` wants an
> *array* (or list) of rows. With this, the next idiom:
>
> for p in i.root.table1:
>     o.root.table1.append([p[:]])
>
> will do the trick.

These two paragraphs were immensely helpful. I'm still not sure where I
could have found that in the documentation. But those two lines of code
along with your explanation saved my day.

>> At that point I decided that pytables is fine for reading but just
>> doesn't cut it for modifying tables. Which is a shame given its name.
>>
>> Patrick "Using pytables ro for now" Kilian
>
> Hope you can proceed with 'rw' mode soon ;-)

Doing that now. Thanks again.

> Ooops! I forgot the handy and efficient `Table.whereAppend()` method. With
> it, your problem is reduced to something like:
>
> # Fast particles...
> fast = f2.createTable(f2.root, 'fast', table1.description)
> table1.whereAppend(fast, "sqrt(x_g**2+y_g**2+z_g**2) > cut")
>
> # Slow particles...
> slow = f2.createTable(f2.root, 'slow', table1.description)
> table1.whereAppend(slow, "sqrt(x_g**2+y_g**2+z_g**2) <= cut")

This is not exactly what I need but quite elegant.

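For anyone finding this thread later, here is a rough sketch of the "copy the slow particles, replace the fast ones with subparticles" idea in plain Python, with lists of dicts standing in for the input and output tables. Everything specific in it is an assumption made up for illustration and is not taken from the real particle file: the field names `vx`, `vy`, `vz` and `w`, the `n_sub` parameter, and the rule that a fast particle becomes `n_sub` copies that share its weight equally.

```python
import math

C = 1.0  # speed of light in code units (assumption for this sketch)


def speed(p):
    """Magnitude of the velocity of one particle record."""
    return math.sqrt(p["vx"] ** 2 + p["vy"] ** 2 + p["vz"] ** 2)


def split_fast(particles, cut=0.1 * C, n_sub=4):
    """Return a new list: slow particles copied, fast ones split.

    The input is only read, never modified, so a freshly created
    subparticle is never visited (and never split) again.
    """
    out = []
    for p in particles:
        if speed(p) > cut:
            # Hypothetical splitting rule: n_sub identical copies,
            # each carrying an equal share of the original weight.
            for _ in range(n_sub):
                sub = dict(p)
                sub["w"] = p["w"] / n_sub
                out.append(sub)
        else:
            out.append(dict(p))
    return out


particles = [
    {"vx": 0.05, "vy": 0.0, "vz": 0.0, "w": 1.0},  # slow, kept as is
    {"vx": 0.3, "vy": 0.4, "vz": 0.0, "w": 2.0},   # fast, gets split
]
result = split_fast(particles)
```

Since the output is a separate container, this sidesteps both the repeated-splitting problem and the need to remove rows. With PyTables, the output list would correspond to a new table created from `table1.description` and filled with `append()`.
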
Patrick "a happy pytables user" Kilian

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users