As requested, here is some feedback from timing tests of the different solutions to the problem mentioned earlier. As with any good experiment, each one was run three times and the averages were taken. The times include the overhead of reading the data, which is constant across all experiments and is therefore controlled for. My results are listed below.
The number of partitions in my type of data may cause issues with certain structures, but after my experimentation these factors only push me further towards my choice of design. Again, this is specific to my type of data (as you mentioned).

On Wed, Feb 06, 2008 at 05:33:38PM +0100, Ivan Vilata i Balaguer wrote:
> However, you may create a ``VLArray`` of ``ObjectAtom``, which will save
> every row as a pickled Python object.

Experiment 1: VLArray of ObjectAtom
    547 records
    Writing: 28.397 s
    Reading:  2.744 s

> Pickling into a fixed width field

This is not really possible given the nature of the data, as there is no maximum length for this structure. A maximum can be calculated from the current data, but it may need changing in the future, which I have read is possible with PyTables tables. I am avoiding this area for now.

> in a table (as you mention) or into a row in an enlargeable array are
> also possible solutions, but involve manual (un)pickling.

Experiment 2: two EArrays
    - one of single chars, holding the pickled data
    - one holding the data offsets
    Writing: 32.343 s
    Reading:  3.441 s

I am not sure if there is a better way of doing this. Maybe someone can inspire me.

> You may also use two ``VLArray`` nodes, one for the flat list of
> numbers and another one for the indexes where the list is splitted::
>
>   vlarray1 = [                          vlarray2 = [
>     [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],      [0, 3, 5, 6],
>     [1, 2, 3, 4, 5, 7, 8, 9, 6],          [0, 2, 5],
>     [4, 5, 6, 7, 3, 2, 1],                [0, 2, 3],
>     [1, 2, 3, 4, 5, 6],                   [0, 3, 5],
>     [1, 4, 5, 7, 8, 2, 3],                [0, 1, 3],
>     ...                                   ...
>   ]                                     ]

Experiment 3: two VLArrays, as quoted above
    This is essentially the same as Experiment 2, but more intuitive.
    Writing: 27.394 s
    Reading: 20.755 s

Type 3 was consistently the fastest at writing, which makes sense as it does not involve pickling, whereas recreating the structure by accessing two arrays slows reading down considerably. Type 2 is as consistent as Type 1 in how its reading and writing times vary, but it is a very convoluted way of doing things.
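The Experiment 2 layout (one array of pickled bytes, one array of offsets) can be sketched in plain Python as below. This is only an illustration, not the code that was timed: a bytearray and a list stand in for the two EArrays, and the class name PickleStore and its methods are hypothetical.

```python
import pickle


class PickleStore:
    """In-memory stand-in for the two EArrays of Experiment 2 (hypothetical)."""

    def __init__(self):
        self.data = bytearray()  # stand-in for the EArray of single bytes
        self.offsets = [0]       # stand-in for the EArray of offsets

    def __len__(self):
        # One more offset than records: offsets[i] is the start of record i
        # and offsets[i + 1] is its end.
        return len(self.offsets) - 1

    def append(self, obj):
        """Pickle obj, append its bytes, and return the new record index."""
        blob = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
        self.data.extend(blob)
        self.offsets.append(len(self.data))
        return len(self) - 1

    def read(self, i):
        """Slice record i out of the flat byte buffer and unpickle it."""
        if not 0 <= i < len(self):
            raise IndexError("record index out of range")
        start, end = self.offsets[i], self.offsets[i + 1]
        return pickle.loads(bytes(self.data[start:end]))


# Usage: store two nested lists and read the first one back by index.
store = PickleStore()
first = store.append([[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]])
store.append([[1, 2], [3, 4, 5], [7, 8, 9, 6]])
restored = store.read(first)
```

The manual pickling and slicing here is the bookkeeping that a VLArray of ObjectAtom does on your behalf, which is why Type 2 feels convoluted by comparison.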
Type 1 is the most intuitive to program and use, and it happens to be the fastest to read, which will be done more often than writing. It is also only slightly slower than Type 3 at writing.

I look forward to the inclusion of variable-length arrays, variable-length strings or pickled objects within the standard table, as that would be the most suitable for my application. Until then I believe I will go with pickling into a variable-length array, with a column in the table holding the index of each row's object in the VLArray.

Thanks,
-- 
Hatem Nassrat
BCSc. FCS - Dalhousie University