On 27/03/2013 09:27, Michael Schnell wrote:
On 03/26/2013 03:44 PM, Andrea Mauri wrote:
one more thing. my data is more similar to a huge spreadsheet than a
relational DB

"store":  Do you mean for working with it in "realtime" or for keeping
it for the next time the program is started ?

Ok I will explain better.

I have a GUI app.

The user loads samples (the rows); my app performs calculations on the samples and, for every sample, gives thousands of values as output (the columns). There can be from tens to hundreds of thousands of samples.
Columns can be tens to thousands.
Every column is an attribute of the sample defined by a unique name.
After the calculations, the app/user should be able to search for one or more samples (or for one or more columns) and get all or some of the values for that sample/column. In short, I need to be able to rapidly get some values from this huge data matrix.

I have two scenarios; both are possible, depending on the type of calculation:

1. All the columns (the number/type of attributes to be calculated on every sample) are defined by the user before performing the calculations, so I know the total number of columns before starting. Currently I use a sqlite db that I define before the calculations start. I create as many tables as I need, where every table has 200 fields. The samples are stored in one table, and every sample is a row. The attributes are stored in all the other tables (every sample has a row in every attribute table).

2. The number of columns can vary from one sample to another. The number of columns/attributes for a sample is known only after the calculation is performed on that sample, and the total number of columns/attributes is known only after all samples have been examined/calculated. During the calculations I need to modify the number of columns. Using sqlite is quite annoying here, since whenever a new column occurs I need to alter tables, which also means closing all the queries, modifying the insert queries and so on. During a calculation process thousands of columns can be created and need to be added.
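
As an illustration of scenario 2, one layout that would avoid altering tables is a "long" table with one row per (sample, attribute) value: a new attribute then becomes a new row in an attribute table instead of a new column. This is only a minimal sketch under stated assumptions, not the layout the app uses. It is written with Python's sqlite3 module for brevity, every table, column and function name is hypothetical, values are assumed to be floating-point numbers, and its speed at the data sizes above has not been measured.

```python
import sqlite3

def open_store(path=":memory:"):
    """Create the hypothetical long-format schema and return the connection."""
    con = sqlite3.connect(path)
    con.executescript("""
        CREATE TABLE IF NOT EXISTS samples (
            id   INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL
        );
        CREATE TABLE IF NOT EXISTS attributes (
            id   INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL
        );
        -- One row per calculated value; no ALTER TABLE is ever needed.
        CREATE TABLE IF NOT EXISTS sample_values (
            sample_id    INTEGER NOT NULL REFERENCES samples(id),
            attribute_id INTEGER NOT NULL REFERENCES attributes(id),
            value        REAL,
            PRIMARY KEY (sample_id, attribute_id)
        );
        -- Second index so that reading a whole column is also an index scan.
        CREATE INDEX IF NOT EXISTS idx_values_by_attribute
            ON sample_values (attribute_id, sample_id);
    """)
    return con

def _get_or_create(con, table, name):
    # `table` is one of the two fixed table names above, never user input.
    con.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
    row = con.execute(f"SELECT id FROM {table} WHERE name = ?", (name,)).fetchone()
    return row[0]

def store_value(con, sample, attribute, value):
    """Insert or overwrite one cell; unknown samples/attributes are created."""
    sid = _get_or_create(con, "samples", sample)
    aid = _get_or_create(con, "attributes", attribute)
    con.execute("INSERT OR REPLACE INTO sample_values VALUES (?, ?, ?)",
                (sid, aid, value))

def row_of(con, sample):
    """All stored values of one sample as {attribute name: value}."""
    cur = con.execute("""
        SELECT a.name, v.value
        FROM sample_values v
        JOIN attributes a ON a.id = v.attribute_id
        JOIN samples s    ON s.id = v.sample_id
        WHERE s.name = ?""", (sample,))
    return dict(cur)

def column_of(con, attribute):
    """All stored values of one attribute as {sample name: value}."""
    cur = con.execute("""
        SELECT s.name, v.value
        FROM sample_values v
        JOIN samples s    ON s.id = v.sample_id
        JOIN attributes a ON a.id = v.attribute_id
        WHERE a.name = ?""", (attribute,))
    return dict(cur)
```

With this layout a sample that lacks an attribute simply has no row for it, so samples with different numbers of columns can coexist. The trade-off is many more rows (one per cell), so whether it is fast enough for the interactive analysis step would have to be tested.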

The app can be split into 4 different tasks:

1. Loading of samples (the user can then look at the samples, load new samples and add them to the existing ones...).

2. Calculations (these can take seconds, minutes or hours, depending on the number of samples in the batch and on the number of attributes to be calculated). The user can leave the app working; there is no interaction, the user is just able to stop/pause the process.

3. Analysis of the calculated data (interactive). The user can perform analyses on the data (charts, data views, data manipulation [average, standard deviation...]...). This should be as fast as possible, so data retrieval from the db/datasheet must be fast.

4. Saving the calculated data to a txt file/spreadsheet... (not necessarily all calculated data will be saved by the user).

The user should also be able to delete/remove samples/columns or add samples/columns.

When doing this with a 64 bit program it might be possible and appropriate
just to use an array and have the OS do the dirty work of virtual memory
management, using a huge swap partition.

My app is currently compiled for both 32 and 64 bit. Maybe because I am old ;) (I started programming when RAM had to be used carefully; 1 MB of RAM was typical), I am in the habit of not putting all the data in memory; maybe I need to change my mind. Is it an option not to store the data in files, but to keep everything in memory and let the app and the OS do the dirty work?
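
To get a feeling for whether everything could stay in memory, the size of a dense matrix can be estimated as samples x columns x bytes per value. The sketch below assumes every value is stored as an 8-byte double (an assumption, the real storage type may differ), and the sizes in the loop are only hypothetical examples inside the ranges mentioned above. For instance, 100,000 samples x 1,000 columns x 8 bytes is 800 million bytes, about 0.75 GiB.

```python
GIB = 1024 ** 3

def matrix_bytes(n_samples, n_columns, bytes_per_value=8):
    """Size in bytes of a dense samples x columns matrix.

    bytes_per_value=8 assumes double-precision floats (an assumption).
    """
    return n_samples * n_columns * bytes_per_value

# Hypothetical example sizes, not measurements from the app.
for n_samples, n_columns in [(10_000, 1_000), (100_000, 1_000), (300_000, 5_000)]:
    size = matrix_bytes(n_samples, n_columns)
    print(f"{n_samples:>7} samples x {n_columns:>5} columns: {size / GIB:7.2f} GiB")
```

A 32-bit process can address at most 4 GiB (and usually less in practice), so the larger cases could only be held fully in memory by the 64-bit build.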


Andrea

--
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
