On 10/31/12 4:05 PM, Francesc Alted wrote:
> On 10/31/12 4:02 PM, Francesc Alted wrote:
>> On 10/31/12 10:12 AM, Andrea Gavana wrote:
>>> Hi Francesc & All,
>>>
>>> On 31 October 2012 14:13, Francesc Alted wrote:
>>>> On 10/31/12 4:30 AM, Andrea Gavana wrote:
>>>>> Thank you for all your suggestions. I managed to slightly modify the
>>>>> script you attached and I am also experimenting with compression.
>>>>> However, in the newly attached script the underlying table is not
>>>>> modified, i.e., this assignment:
>>>>>
>>>>> for p in table:
>>>>>       p['results'][:NUM_SIM, :, :] = numpy.random.random(
>>>>>           size=(NUM_SIM, len(ALL_DATES), 7))
>>>>>       table.flush()
>>>> For modifying row values you need to assign a complete row object.
>>>> Something like:
>>>>
>>>> for i in range(len(table)):
>>>>       myrow = table[i]
>>>>       myrow['results'][:NUM_SIM, :, :] = numpy.random.random(
>>>>           size=(NUM_SIM, len(ALL_DATES), 7))
>>>>       table[i] = myrow
>>>>
>>>> You may also use Table.modifyColumn() for better efficiency. Look at
>>>> the different modification methods here:
>>>>
>>>> http://pytables.github.com/usersguide/libref/structured_storage.html#table-methods-writing
>>>>
>>>> and experiment with them.
>>> Thank you, I have tried different approaches and they all seem to run
>>> more or less at the same speed (see below). I had to slightly modify
>>> your code from:
>>>
>>> table[i] = myrow
>>>
>>> to
>>>
>>> table[i] = [myrow]
>>>
>>> to avoid exceptions.
>>>
>>> In the newly attached file, I switched to blosc for compression (but
>>> with compression level 1) and ran a few sensitivities. By calling the
>>> attached script as:
>>>
>>> python pytables_test.py NUM_SIM
>>>
>>> where "NUM_SIM" is an integer, I get the following timings and file sizes:
>>>
>>> C:\MyProjects\Phaser\tests>python pytables_test.py 10
>>> Number of simulations   : 10
>>> H5 file creation time   : 0.879s
>>> Saving results for table: 6.413s
>>> H5 file size (MB)       : 193
>>>
>>>
>>> C:\MyProjects\Phaser\tests>python pytables_test.py 100
>>> Number of simulations   : 100
>>> H5 file creation time   : 4.155s
>>> Saving results for table: 86.326s
>>> H5 file size (MB)       : 1935
>>>
>>>
>>> I don't think I will try the 1,000 simulations case :-) . I believe I
>>> still don't understand what the best strategy would be for my problem.
>>> I basically need to save all the simulation results for all the 1,200
>>> "objects", each of which has a timeseries matrix of 600x7 size. In the
>>> GUI I have, these 1,200 "objects" are grouped into multiple
>>> categories, and multiple categories can reference the same "object",
>>> i.e.:
>>>
>>> Category_1: object_1, object_23, object_543, etc...
>>> Category_2: object_23, object_100, object_543, etc...
>>>
>>> So my idea was to save all the "objects" results to disk and, upon the
>>> user's choice, build the categories results "on the fly", i.e. by
>>> seeking the H5 file on disk for the "objects" belonging to that
>>> specific category and summing up all their results (over time, i.e.
>>> the 600 time-steps). Maybe I would be better off with a 4D array
>>> (NUM_OBJECTS, NUM_SIM, TSTEPS, 7) as a table, but then I will lose the
>>> ability to reference the "objects" by their names...
>>
>> You should keep experimenting with different approaches to discover
>> the one that works best for you.  Regarding using the 4D array as a
>> table, I might be misunderstanding your problem, but you can still
>> reference objects by name by using:
>>
>> row = table.where("name == %s" % my_name)
>> table[row.nrow] = ...
>
> Uh, I rather meant:
>
> row = table.readWhere("name == %s" % my_name)
> table[row.nrow] = ...
>
> but you probably got the idea already.
>

Oops, that does not work either.  It is probably something more like:

rowid = table.getWhereList('name == "%s"' % my_name)[0]
myrow = table[rowid]
table[rowid] = ...

(assuming that 'name' is a primary key here, i.e. values are not repeated).
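
For reference, here is a minimal, self-contained sketch of the same
lookup-then-write-back idea. It rests on several assumptions that are not
from the thread: it uses the snake_case method names of PyTables 3.x
(get_where_list, modify_rows) instead of the camelCase names above, the
ObjectResults layout and the helper names build_demo_file and
update_results are hypothetical, and the 2x3 'results' block is only a small
stand-in for the real matrix. It reads the matching row as a one-row slice,
which is a slight variation on myrow = table[rowid], and it will not handle
names that contain double quotes.

```python
import os
import tempfile

import numpy as np
import tables  # assumption: PyTables 3.x (snake_case method names)


class ObjectResults(tables.IsDescription):
    # Hypothetical layout: a short name used as a unique key, plus a small
    # stand-in for the per-object results block discussed in the thread.
    name = tables.StringCol(16, pos=0)
    results = tables.Float64Col(shape=(2, 3), pos=1)


def build_demo_file(path, names):
    """Create a table with one row per name; 'results' keeps its zero default."""
    with tables.open_file(path, mode="w") as h5:
        table = h5.create_table("/", "objects", ObjectResults)
        row = table.row
        for name in names:
            row["name"] = name
            row.append()
        table.flush()


def update_results(table, my_name, new_values):
    """Replace the 'results' entry of the single row whose name is my_name."""
    # The name is a string, so it has to appear as a quoted (bytes) literal
    # inside the condition.
    coords = table.get_where_list('name == b"%s"' % my_name)
    if len(coords) != 1:
        raise KeyError("expected exactly one row named %r" % my_name)
    rowid = int(coords[0])

    block = table[rowid:rowid + 1]     # in-memory copy of that one row
    block["results"][0] = new_values   # this changes only the copy
    table.modify_rows(start=rowid, stop=rowid + 1, rows=block)  # write it back
    table.flush()
    return rowid
```

A possible way to try it: call build_demo_file() on a temporary path with a
few names, reopen the file in mode "a", and pass h5.root.objects to
update_results(). Only the row with the matching name should change.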

-- 
Francesc Alted


_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
