This is from your last e-mail: > So, I can't see the advantage of normalization here. (Note that I do not > plan to join or index > on measurement columns!)
This is from your first e-mail: > Queries on this big table will be rather straight-forward: either on the > table alone (SELECT * FROM table_large) or one join on one field to one of > the smaller tables. As you probably can see these sentences contradict to each other. If all queries that you need on your database are - get all measurements for some unit_id; - get some measurements for all unit_ids; then of course you won't gain anything from normalization and your queries will never contain any O(N^2) algorithms (because you have no joins on measurements). So I'd say go ahead and raise the limit to 32K. Besides some ugly-looking column names and some schema uncomfortable for database theory supporter, you won't get any troubles with that. Pavel On Sun, Jan 17, 2010 at 6:08 PM, Stefan_E <se_m...@hotmail.com> wrote: > > Hi all, > > based on Pavels input, I did some quick trials and thought I let you know > the results. > I'm running on an old pentiu...@3.2ghz with 3G memory and Win-7. So, don't > expect > super-fast times! > > 1. I have one DB with 5411 units and 1517 measurements with columns > unit_id, m_1, .. m_1517 > No indices and not 100% populated: .db size is 48'834kB, whereas for > full population > I'd expect 5411*1517*8 = 65'668kB. > > To dump the entire db into a .csv file with sqlite3.exe takes ~35s > To dump a random column takes 0.42 .. 1.02s with a mean of 0.65s > To dump 5 randomly selected columns takes 0.65 .. 1.16s with mean 0.88s > To dump 100 randomly selected columns takes 2.34 .. 3.11s with mean > 2.54s > (I always dump all units and the times include the time to start sqlite3 > - so it could be improved...; > The run on one column shows the most dispersion and it is not clear > which columns cost most... > Each series above includes 100 random selections) > > 2. I quickly created a DB for 5500 units and 1500 measurements in the format > unit_id, meas_id, meas_val > This database is fully populated with random values for meas_val and is > 191MByte in size. > Dumping the whole db costs 77s, but this may be an unfair measurement > since I simply run > select * from table; > rather than restoring the rectangular format of units x measurements. > > Of course, now dumping one meas_id for all unit_id is expensive, so I > need an index on meas_id > to run queries like 'select * from table where meas_id=n'. That > increases .db size to 293MByte. > Likely, I'd need another index on unit_id as well... so db size would > grow another 100MByte, > which I clearly woudn't like! > > With the index, dumping one meas_id with the query above costs 0.61s (I > didn't do multiple > runs here...), but if I now want to select multiple meas_id with queries > like > select * from table where meas_id IN (n1, n2, ..) > it costs for 5 meas_id 1.62s and for 100 meas_id 27.2s (!!), so I'm > penalized not only in > size but also in time compared to case 1 above. > > So, I can't see the advantage of normalization here. (Note that I do not > plan to join or index > on measurement columns!) > > So, back to my first question... what do I pay when I increase the compile > time limits of SQLite > from 2000 to 32k columns? To be honest, I still consider staying in the 2000 > column limit with > a blocked table layout similar as follows: > unit_id, block_id, meas_1, .... meas_2000 > so that measurements 2001 ... 4000 would come on a new row. (Yea, that's > 2002 cols, so I need > take care... :-) ) > > Thanks for your help anyway - and let me know if you have further input or > remarks, either regarding > above measurements (did I miss some important stuff?) or for the case of >>2000 measurements. > > Greetings - Stefan > > -- > View this message in context: > http://old.nabble.com/Limitation-on-Column-count-tp27117364p27203919.html > Sent from the SQLite mailing list archive at Nabble.com. > > _______________________________________________ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users