Re: [sqlite] Limitation on Column count

2010-01-19 Thread Stefan_E

Hi Pavel,

yes - you are quite right, I wasn't precise there; sorry about that!
What I meant was a join on unit_id, which will of course have an index if
one is needed.
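
For completeness, a minimal sketch of the kind of join I mean (hypothetical table
names; if unit_id is the INTEGER PRIMARY KEY of the small units table it is the
rowid, so no separate index is needed on that side):

    CREATE TABLE units (unit_id INTEGER PRIMARY KEY, name TEXT);   -- small lookup table
    -- table_large holds unit_id plus the measurement columns m_1 .. m_1517
    SELECT u.name, t.m_1
      FROM table_large AS t
      JOIN units AS u ON u.unit_id = t.unit_id;   -- the units side is a rowid lookup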

Thanks for your help!

Kind regards Stefan

Btw., in the same context, I noticed that sparse data (with lots of NULL
values), which would otherwise bloat the database, doesn't seem to be a
problem in SQLite with its type affinity scheme. Nice!
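
As far as I understand, the saving comes mainly from SQLite's record format
rather than affinity as such: a NULL column costs only a one-byte type code in
the row header and no data bytes. A rough way to see it in the sqlite3 shell,
with a throwaway table standing in for the wide layout:

    CREATE TABLE sparse_demo (unit_id INTEGER PRIMARY KEY,
                              m_1 REAL, m_2 REAL, m_3 REAL, m_4 REAL);
    INSERT INTO sparse_demo (unit_id) VALUES (1);   -- all measurement columns stay NULL
    PRAGMA page_size;    -- bytes per page
    PRAGMA page_count;   -- total pages: the file stays tiny despite the unused columns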



Re: [sqlite] Limitation on Column count

2010-01-19 Thread Pavel Ivanov
This is from your last e-mail:

> So, I can't see the advantage of normalization here. (Note that I do not
> plan to join or index
> on measurement columns!)

This is from your first e-mail:

> Queries on this big table will be rather straight-forward: either on the
> table alone (SELECT * FROM table_large) or one join on one field to one of
> the smaller tables.

As you can probably see, these sentences contradict each other.
If all the queries you need on your database are
- get all measurements for some unit_id;
- get some measurements for all unit_ids;
then of course you won't gain anything from normalization, and your
queries will never hit any O(N^2) algorithms (because you have no
joins on measurements). So I'd say go ahead and raise the limit to
32K. Apart from some ugly-looking column names and a schema that will
make database theory supporters uncomfortable, you won't run into any
trouble with that.


Pavel

On Sun, Jan 17, 2010 at 6:08 PM, Stefan_E  wrote:
>
> Hi all,
>
> based on Pavels input, I did some quick trials and thought I let you know
> the results.
> I'm running on an old pentiu...@3.2ghz with 3G memory and Win-7. So, don't
> expect
> super-fast times!
>
> 1. I have one DB with 5411 units and 1517 measurements with columns
>       unit_id, m_1, .. m_1517
>    No indices and not 100% populated: .db size is 48'834kB, whereas for
> full population
>    I'd expect 5411*1517*8 = 65'668kB.
>
>    To dump the entire db into a .csv file with sqlite3.exe takes ~35s
>    To dump a random column takes 0.42 .. 1.02s with a mean of 0.65s
>    To dump 5 randomly selected columns takes 0.65 .. 1.16s with mean 0.88s
>    To dump 100 randomly selected columns takes 2.34 .. 3.11s with mean
> 2.54s
>    (I always dump all units and the times include the time to start sqlite3
> - so it could be improved...;
>     The run on one column shows the most dispersion and it is not clear
> which columns cost most...
>     Each series above includes 100 random selections)
>
> 2. I quickly created a DB for 5500 units and 1500 measurements in the format
>       unit_id, meas_id, meas_val
>    This database is fully populated with random values for meas_val and is
> 191MByte in size.
>    Dumping the whole db costs 77s, but this may be an unfair measurement
> since I simply run
>       select * from table;
>    rather than restoring the rectangular format of units x measurements.
>
>    Of course, now dumping one meas_id for all unit_id is expensive, so I
> need an index on meas_id
>    to run queries like 'select * from table where meas_id=n'. That
> increases .db size to 293MByte.
>    Likely, I'd need another index on unit_id as well... so db size would
> grow another 100MByte,
>    which I clearly woudn't like!
>
>    With the index, dumping one meas_id with the query above costs 0.61s (I
> didn't do multiple
>    runs here...), but if I now want to select multiple meas_id with queries
> like
>        select * from table where meas_id IN (n1, n2, ..)
>    it costs for 5 meas_id 1.62s and for 100 meas_id 27.2s (!!), so I'm
> penalized not only in
>    size but also in time compared to case 1 above.
>
> So, I can't see the advantage of normalization here. (Note that I do not
> plan to join or index
> on measurement columns!)
>
> So, back to my first question... what do I pay when I increase the compile
> time limits of SQLite
> from 2000 to 32k columns? To be honest, I still consider staying in the 2000
> column limit with
> a blocked table layout similar as follows:
>     unit_id, block_id, meas_1,  meas_2000
> so that measurements 2001 ... 4000 would come on a new row. (Yea, that's
> 2002 cols, so I need
> take care... :-) )
>
> Thanks for your help anyway - and let me know if you have further input or
> remarks, either regarding
> above measurements (did I miss some important stuff?) or for the case of
>>2000 measurements.
>
> Greetings - Stefan
>
> --
> View this message in context: 
> http://old.nabble.com/Limitation-on-Column-count-tp27117364p27203919.html
> Sent from the SQLite mailing list archive at Nabble.com.
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


Re: [sqlite] Limitation on Column count

2010-01-17 Thread Stefan_E

Hi all,

based on Pavel's input, I did some quick trials and thought I'd let you know
the results. I'm running on an old Pentium @ 3.2 GHz with 3 GB of memory and
Win-7, so don't expect super-fast times!

1. I have one DB with 5411 units and 1517 measurements with columns
       unit_id, m_1, .. m_1517
   No indices and not 100% populated: .db size is 48'834 kB, whereas for
   full population I'd expect 5411*1517*8 = 65'668 kB.

   To dump the entire db into a .csv file with sqlite3.exe takes ~35 s.
   To dump a random column takes 0.42 .. 1.02 s with a mean of 0.65 s.
   To dump 5 randomly selected columns takes 0.65 .. 1.16 s with a mean of 0.88 s.
   To dump 100 randomly selected columns takes 2.34 .. 3.11 s with a mean of 2.54 s.
   (I always dump all units, and the times include the time to start sqlite3,
    so they could be improved. The single-column runs show the most dispersion,
    and it is not clear which columns cost most. Each series above includes
    100 random selections.)
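
In case it helps, the case-1 layout and queries look roughly like this
(hypothetical names; the real table has columns m_1 .. m_1517):

    CREATE TABLE wide (unit_id INTEGER PRIMARY KEY,
                       m_1 REAL, m_2 REAL, m_3 REAL);   -- ... up to m_1517

    SELECT * FROM wide;                    -- full dump (~35 s on the real table)
    SELECT unit_id, m_1, m_3 FROM wide;    -- a few selected columns for all units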

2. I quickly created a DB for 5500 units and 1500 measurements in the format
       unit_id, meas_id, meas_val
   This database is fully populated with random values for meas_val and is
   191 MByte in size. Dumping the whole db costs 77 s, but this may be an
   unfair measurement since I simply run
       select * from table;
   rather than restoring the rectangular format of units x measurements.

   Of course, dumping one meas_id for all unit_ids is now expensive, so I
   need an index on meas_id to run queries like
       select * from table where meas_id = n;
   That increases the .db size to 293 MByte. Likely I'd need another index
   on unit_id as well, so the db size would grow by another 100 MByte,
   which I clearly wouldn't like!

   With the index, dumping one meas_id with the query above costs 0.61 s
   (I didn't do multiple runs here), but if I now want to select multiple
   meas_id with queries like
       select * from table where meas_id IN (n1, n2, ..)
   it costs 1.62 s for 5 meas_id and 27.2 s (!!) for 100 meas_id, so I'm
   penalized not only in size but also in time compared to case 1 above.
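
And the corresponding sketch for the case-2 (normalized) layout, again with
hypothetical names:

    CREATE TABLE meas (unit_id INTEGER, meas_id INTEGER, meas_val REAL);
    CREATE INDEX meas_by_id ON meas (meas_id);         -- the index that takes the file
                                                       -- from 191 to 293 MByte
    SELECT * FROM meas WHERE meas_id = 7;              -- one measurement for all units (0.61 s)
    SELECT * FROM meas WHERE meas_id IN (3, 7, 250);   -- several measurements - the slow case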

So, I can't see the advantage of normalization here. (Note that I do not
plan to join or index
on measurement columns!)

So, back to my first question... what do I pay when I increase the compile-time
limit of SQLite from 2000 to 32k columns? To be honest, I'm still considering
staying within the 2000-column limit with a blocked table layout along the lines of
    unit_id, block_id, meas_1, .. meas_2000
so that measurements 2001 ... 4000 would come on a new row. (Yeah, that's
2002 cols, so I need to take care... :-) )
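
A sketch of what I mean by the blocked layout (hypothetical names):

    -- each (unit_id, block_id) row holds up to 2000 measurement slots
    CREATE TABLE blocked (unit_id INTEGER, block_id INTEGER,
                          meas_1 REAL, meas_2 REAL,    -- ... up to meas_2000
                          PRIMARY KEY (unit_id, block_id));
    -- measurement k goes to block_id = (k-1)/2000, column meas_((k-1)%2000 + 1),
    -- e.g. measurement 2001 lands in block 1, column meas_1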

Thanks for your help anyway - and let me know if you have further input or
remarks, either regarding the above measurements (did I miss something
important?) or for the case of >2000 measurements.

Greetings - Stefan



Re: [sqlite] Limitation on Column count

2010-01-12 Thread Pavel Ivanov
> When doing so: Any idea of what would be the worst case column select
> strategy on the huge table to compare with?

Take some query from your typical usage where only the last column of the
huge table is used and all other columns are either not used at all, or
just one or two of them are used for other conditions (assuming you don't
have an index on this last column). I believe that would be the worst case
for the huge table.
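
For example, with the wide layout discussed earlier that could be something
like the following (column names are just placeholders):

    -- only the last of ~1517 columns is referenced and there is no index on it,
    -- so SQLite has to walk almost the entire record of every row
    SELECT unit_id, m_1517
      FROM table_large
     WHERE m_1517 > 0.5;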

Pavel

On Tue, Jan 12, 2010 at 4:44 PM, Stefan_E  wrote:
>
> Hi Pavel,
>
> thanks a lot; I will reconsider this option and run some test cases to
> compare.
> But that will be tomorrow I guess...
> When doing so: Any idea of what would be the worst case column select
> strategy on the huge table to compare with?
>
> Stefan
>
>
> Pavel Ivanov-2 wrote:
>>
>>> so normalization would lead to a doubling
>>> of the storage space (add a measurement_id to each measurement).
>>
>> My strong belief is that when you try this normalization you'll see
>> that such doubling of storage is a good enough trade-off for the speed
>> you'll achieve. I don't think that speed of queries on the table with
>> 100+ columns would be any useful (of course unless you *always* select
>> all columns and *never* try to select only a few ones).
>>
>>> Second, the most common use case is to view the table in the currently
>>> foreseen format - so, I'd pay both in space and time...
>>
>> Most probably you view your table from your application which can
>> denormalize the table very quickly. Even if you view your table from
>> sqlite3 command line tool you still can write denormalizer even using
>> bash scripts and I believe it will still work fast enough and it will
>> be better than creating such huge table.
>>
>>
>> Pavel
>>
>> On Tue, Jan 12, 2010 at 1:09 PM, Stefan_E  wrote:
>>>
>>> Hi Adam,
>>>
>>> thanks for your suggestion. Unfortunately, it doesn't help in my case.
>>> Essentially, we are talking about a time series (rows) of n different
>>> measurements (columns) - so normalization would lead to a doubling
>>> of the storage space (add a measurement_id to each measurement).
>>>
>>> Second, the most common use case is to view the table in the currently
>>> foreseen format - so, I'd pay both in space and time...
>>>
>>> Anyway, thanks for the suggestion!
>>>
>>> Regards, Stefan
>>>
>>>
>>> Adam DeVita wrote:

 Good day,

 In general I try to work within the limits of any database engine that I
 am
 using.  Often, the limits are there for good reasons (such as speed
 problems).  I would suggest seeing if there is a way to normalize the
 big
 tables such that infrequently used columns are split into tables that
 aren't
 joined in often.  (The principal I'm using is borrowed from hardware
 architecture "Make the common case fast, and ensure the uncommon case is
 correct.")

 It may or may not be sensible given your data, but there may be an
 opportunity to reduce the number of columns  by making an encoded column
 to
 aggregate, such as lots of mutually exclusive binary flag fields.

 regards,
 Adam

 On Mon, Jan 11, 2010 at 3:46 PM, Stefan_E  wrote:

>
> Hello all,
>
> I'm considering using SQLite for a new application. The schema will
> contain
> a bunch of small tables with few columns (~10) plus one large table
> with
> many columns and 1000...1 rows.
>
> 'Many columns' typically fits into the default 2000 column limit, but
> can
> exceed it at times (that is, on some of the foreseen databases). It
> will
> never exceed the theoretical / compile time selectable limit of 32k
> columns.
> Queries on this big table will be rather straight-forward: either on
> the
> table alone (SELECT * FROM table_large) or one join on one field to one
> of
> the smaller tables.
>
> The  http://www.sqlite.org/limits.html Limits page  warns: "There are
> places
> in the SQLite code generator that use algorithms that are O(N²) where N
> is
> the number of columns." which is kind of discouraging to increase max.
> column count at compile time, but is not very specific about when this
> happens...
>
> I now have two design options:
> - increase max. column count at compile time (possibly setting
> SQLITE_LIMIT_COLUMN on databases where  I don't need more than (say)
> 1000
> columns) and accept the quoted performance degradation.
> - alternatively, in the client handle cases with more than 2000
> columns,
> splitting the storage up into two (or more) tables
>
> Any advise, experience - or more specifics on the "O(N²)" remark are
> highly
> welcome!
>
> Thanks for your help - Stefan
> --
> View this message in context:
> http://old.nabble.com/Limitation-on-Column-count-tp27117364p27117364.html
> Sent 

Re: [sqlite] Limitation on Column count

2010-01-12 Thread Stefan_E

Hi Pavel,

thanks a lot; I will reconsider this option and run some test cases to
compare - but that will be tomorrow, I guess...
When doing so: any idea what the worst-case column-select strategy on the
huge table would be to compare against?

Stefan


Pavel Ivanov-2 wrote:
> 
>> so normalization would lead to a doubling
>> of the storage space (add a measurement_id to each measurement).
> 
> My strong belief is that when you try this normalization you'll see
> that such doubling of storage is a good enough trade-off for the speed
> you'll achieve. I don't think that speed of queries on the table with
> 100+ columns would be any useful (of course unless you *always* select
> all columns and *never* try to select only a few ones).
> 
>> Second, the most common use case is to view the table in the currently
>> foreseen format - so, I'd pay both in space and time...
> 
> Most probably you view your table from your application which can
> denormalize the table very quickly. Even if you view your table from
> sqlite3 command line tool you still can write denormalizer even using
> bash scripts and I believe it will still work fast enough and it will
> be better than creating such huge table.
> 
> 
> Pavel
> 
> On Tue, Jan 12, 2010 at 1:09 PM, Stefan_E  wrote:
>>
>> Hi Adam,
>>
>> thanks for your suggestion. Unfortunately, it doesn't help in my case.
>> Essentially, we are talking about a time series (rows) of n different
>> measurements (columns) - so normalization would lead to a doubling
>> of the storage space (add a measurement_id to each measurement).
>>
>> Second, the most common use case is to view the table in the currently
>> foreseen format - so, I'd pay both in space and time...
>>
>> Anyway, thanks for the suggestion!
>>
>> Regards, Stefan
>>
>>
>> Adam DeVita wrote:
>>>
>>> Good day,
>>>
>>> In general I try to work within the limits of any database engine that I
>>> am
>>> using.  Often, the limits are there for good reasons (such as speed
>>> problems).  I would suggest seeing if there is a way to normalize the
>>> big
>>> tables such that infrequently used columns are split into tables that
>>> aren't
>>> joined in often.  (The principal I'm using is borrowed from hardware
>>> architecture "Make the common case fast, and ensure the uncommon case is
>>> correct.")
>>>
>>> It may or may not be sensible given your data, but there may be an
>>> opportunity to reduce the number of columns  by making an encoded column
>>> to
>>> aggregate, such as lots of mutually exclusive binary flag fields.
>>>
>>> regards,
>>> Adam
>>>
>>> On Mon, Jan 11, 2010 at 3:46 PM, Stefan_E  wrote:
>>>

 Hello all,

 I'm considering using SQLite for a new application. The schema will
 contain
 a bunch of small tables with few columns (~10) plus one large table
 with
 many columns and 1000...1 rows.

 'Many columns' typically fits into the default 2000 column limit, but
 can
 exceed it at times (that is, on some of the foreseen databases). It
 will
 never exceed the theoretical / compile time selectable limit of 32k
 columns.
 Queries on this big table will be rather straight-forward: either on
 the
 table alone (SELECT * FROM table_large) or one join on one field to one
 of
 the smaller tables.

 The  http://www.sqlite.org/limits.html Limits page  warns: "There are
 places
 in the SQLite code generator that use algorithms that are O(N²) where N
 is
 the number of columns." which is kind of discouraging to increase max.
 column count at compile time, but is not very specific about when this
 happens...

 I now have two design options:
 - increase max. column count at compile time (possibly setting
 SQLITE_LIMIT_COLUMN on databases where  I don't need more than (say)
 1000
 columns) and accept the quoted performance degradation.
 - alternatively, in the client handle cases with more than 2000
 columns,
 splitting the storage up into two (or more) tables

 Any advise, experience - or more specifics on the "O(N²)" remark are
 highly
 welcome!

 Thanks for your help - Stefan
 --
 View this message in context:
 http://old.nabble.com/Limitation-on-Column-count-tp27117364p27117364.html
 Sent from the SQLite mailing list archive at Nabble.com.

 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

>>>
>>>
>>>
>>> --
>>> VerifEye Technologies Inc.
>>> 905-948-0015x245
>>> 7100 Warden Ave, Unit 3
>>> Markham ON, L3R 8B5
>>> Canada
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users@sqlite.org
>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>
>>>
>>
>> --
>> View this message in context:
>> 

Re: [sqlite] Limitation on Column count

2010-01-12 Thread Adam DeVita
One may be able to make "measurement type" a column, thus eliminating the
need for a column for each type.

Some speed may be recoverable with indexing.
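
For instance, with the normalized layout from Stefan's test, a covering index
is one option (a sketch; it trades yet more file size for speed):

    -- all referenced columns are in the index, so the IN-query can be answered
    -- from the index alone without touching the table
    CREATE INDEX meas_cover ON meas (meas_id, unit_id, meas_val);
    SELECT unit_id, meas_val FROM meas WHERE meas_id IN (3, 7, 250);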

regards,
Adam

On Tue, Jan 12, 2010 at 1:21 PM, Pavel Ivanov  wrote:

> > so normalization would lead to a doubling
> > of the storage space (add a measurement_id to each measurement).
>
> My strong belief is that when you try this normalization you'll see
> that such doubling of storage is a good enough trade-off for the speed
> you'll achieve. I don't think that speed of queries on the table with
> 100+ columns would be any useful (of course unless you *always* select
> all columns and *never* try to select only a few ones).
>
> > Second, the most common use case is to view the table in the currently
> > foreseen format - so, I'd pay both in space and time...
>
> Most probably you view your table from your application which can
> denormalize the table very quickly. Even if you view your table from
> sqlite3 command line tool you still can write denormalizer even using
> bash scripts and I believe it will still work fast enough and it will
> be better than creating such huge table.
>
>
> Pavel
>
> On Tue, Jan 12, 2010 at 1:09 PM, Stefan_E  wrote:
> >
> > Hi Adam,
> >
> > thanks for your suggestion. Unfortunately, it doesn't help in my case.
> > Essentially, we are talking about a time series (rows) of n different
> > measurements (columns) - so normalization would lead to a doubling
> > of the storage space (add a measurement_id to each measurement).
> >
> > Second, the most common use case is to view the table in the currently
> > foreseen format - so, I'd pay both in space and time...
> >
> > Anyway, thanks for the suggestion!
> >
> > Regards, Stefan
> >
> >
> > Adam DeVita wrote:
> >>
> >> Good day,
> >>
> >> In general I try to work within the limits of any database engine that I
> >> am
> >> using.  Often, the limits are there for good reasons (such as speed
> >> problems).  I would suggest seeing if there is a way to normalize the
> big
> >> tables such that infrequently used columns are split into tables that
> >> aren't
> >> joined in often.  (The principal I'm using is borrowed from hardware
> >> architecture "Make the common case fast, and ensure the uncommon case is
> >> correct.")
> >>
> >> It may or may not be sensible given your data, but there may be an
> >> opportunity to reduce the number of columns  by making an encoded column
> >> to
> >> aggregate, such as lots of mutually exclusive binary flag fields.
> >>
> >> regards,
> >> Adam
> >>
> >> On Mon, Jan 11, 2010 at 3:46 PM, Stefan_E  wrote:
> >>
> >>>
> >>> Hello all,
> >>>
> >>> I'm considering using SQLite for a new application. The schema will
> >>> contain
> >>> a bunch of small tables with few columns (~10) plus one large table
> with
> >>> many columns and 1000...1 rows.
> >>>
> >>> 'Many columns' typically fits into the default 2000 column limit, but
> can
> >>> exceed it at times (that is, on some of the foreseen databases). It
> will
> >>> never exceed the theoretical / compile time selectable limit of 32k
> >>> columns.
> >>> Queries on this big table will be rather straight-forward: either on
> the
> >>> table alone (SELECT * FROM table_large) or one join on one field to one
> >>> of
> >>> the smaller tables.
> >>>
> >>> The  http://www.sqlite.org/limits.html Limits page  warns: "There are
> >>> places
> >>> in the SQLite code generator that use algorithms that are O(N²) where N
> >>> is
> >>> the number of columns." which is kind of discouraging to increase max.
> >>> column count at compile time, but is not very specific about when this
> >>> happens...
> >>>
> >>> I now have two design options:
> >>> - increase max. column count at compile time (possibly setting
> >>> SQLITE_LIMIT_COLUMN on databases where  I don't need more than (say)
> 1000
> >>> columns) and accept the quoted performance degradation.
> >>> - alternatively, in the client handle cases with more than 2000
> columns,
> >>> splitting the storage up into two (or more) tables
> >>>
> >>> Any advise, experience - or more specifics on the "O(N²)" remark are
> >>> highly
> >>> welcome!
> >>>
> >>> Thanks for your help - Stefan
> >>> --
> >>> View this message in context:
> >>>
> http://old.nabble.com/Limitation-on-Column-count-tp27117364p27117364.html
> >>> Sent from the SQLite mailing list archive at Nabble.com.
> >>>
> >>> ___
> >>> sqlite-users mailing list
> >>> sqlite-users@sqlite.org
> >>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >>>
> >>
> >>
> >>
> >> --
> >> VerifEye Technologies Inc.
> >> 905-948-0015x245
> >> 7100 Warden Ave, Unit 3
> >> Markham ON, L3R 8B5
> >> Canada
> >> ___
> >> sqlite-users mailing list
> >> sqlite-users@sqlite.org
> >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >>
> >>
> >
> > --
> > 

Re: [sqlite] Limitation on Column count

2010-01-12 Thread Pavel Ivanov
> so normalization would lead to a doubling
> of the storage space (add a measurement_id to each measurement).

My strong belief is that when you try this normalization you'll see
that such a doubling of storage is a good enough trade-off for the speed
you'll gain. I don't think the speed of queries on a table with
100+ columns would be of any use (unless, of course, you *always* select
all columns and *never* try to select only a few of them).

> Second, the most common use case is to view the table in the currently
> foreseen format - so, I'd pay both in space and time...

Most probably you view your table from your application, which can
denormalize the data very quickly. Even if you view your table from the
sqlite3 command-line tool you can still write a denormalizer, even with
bash scripts, and I believe it will still work fast enough and will be
better than creating such a huge table.
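
For instance, a denormalizer can be as simple as a pivot query over a
normalized meas(unit_id, meas_id, meas_val) table - a sketch for two
measurement ids; for 1500+ of them you would generate the CASE list
programmatically:

    SELECT unit_id,
           MAX(CASE WHEN meas_id = 1 THEN meas_val END) AS m_1,
           MAX(CASE WHEN meas_id = 2 THEN meas_val END) AS m_2
      FROM meas
     GROUP BY unit_id;    -- one wide row per unit, rebuilt from the narrow rows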


Pavel

On Tue, Jan 12, 2010 at 1:09 PM, Stefan_E  wrote:
>
> Hi Adam,
>
> thanks for your suggestion. Unfortunately, it doesn't help in my case.
> Essentially, we are talking about a time series (rows) of n different
> measurements (columns) - so normalization would lead to a doubling
> of the storage space (add a measurement_id to each measurement).
>
> Second, the most common use case is to view the table in the currently
> foreseen format - so, I'd pay both in space and time...
>
> Anyway, thanks for the suggestion!
>
> Regards, Stefan
>
>
> Adam DeVita wrote:
>>
>> Good day,
>>
>> In general I try to work within the limits of any database engine that I
>> am
>> using.  Often, the limits are there for good reasons (such as speed
>> problems).  I would suggest seeing if there is a way to normalize the big
>> tables such that infrequently used columns are split into tables that
>> aren't
>> joined in often.  (The principal I'm using is borrowed from hardware
>> architecture "Make the common case fast, and ensure the uncommon case is
>> correct.")
>>
>> It may or may not be sensible given your data, but there may be an
>> opportunity to reduce the number of columns  by making an encoded column
>> to
>> aggregate, such as lots of mutually exclusive binary flag fields.
>>
>> regards,
>> Adam
>>
>> On Mon, Jan 11, 2010 at 3:46 PM, Stefan_E  wrote:
>>
>>>
>>> Hello all,
>>>
>>> I'm considering using SQLite for a new application. The schema will
>>> contain
>>> a bunch of small tables with few columns (~10) plus one large table with
>>> many columns and 1000...1 rows.
>>>
>>> 'Many columns' typically fits into the default 2000 column limit, but can
>>> exceed it at times (that is, on some of the foreseen databases). It will
>>> never exceed the theoretical / compile time selectable limit of 32k
>>> columns.
>>> Queries on this big table will be rather straight-forward: either on the
>>> table alone (SELECT * FROM table_large) or one join on one field to one
>>> of
>>> the smaller tables.
>>>
>>> The  http://www.sqlite.org/limits.html Limits page  warns: "There are
>>> places
>>> in the SQLite code generator that use algorithms that are O(N²) where N
>>> is
>>> the number of columns." which is kind of discouraging to increase max.
>>> column count at compile time, but is not very specific about when this
>>> happens...
>>>
>>> I now have two design options:
>>> - increase max. column count at compile time (possibly setting
>>> SQLITE_LIMIT_COLUMN on databases where  I don't need more than (say) 1000
>>> columns) and accept the quoted performance degradation.
>>> - alternatively, in the client handle cases with more than 2000 columns,
>>> splitting the storage up into two (or more) tables
>>>
>>> Any advise, experience - or more specifics on the "O(N²)" remark are
>>> highly
>>> welcome!
>>>
>>> Thanks for your help - Stefan
>>> --
>>> View this message in context:
>>> http://old.nabble.com/Limitation-on-Column-count-tp27117364p27117364.html
>>> Sent from the SQLite mailing list archive at Nabble.com.
>>>
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users@sqlite.org
>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>
>>
>>
>>
>> --
>> VerifEye Technologies Inc.
>> 905-948-0015x245
>> 7100 Warden Ave, Unit 3
>> Markham ON, L3R 8B5
>> Canada
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>>
>
> --
> View this message in context: 
> http://old.nabble.com/Limitation-on-Column-count-tp27117364p27131144.html
> Sent from the SQLite mailing list archive at Nabble.com.
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


Re: [sqlite] Limitation on Column count

2010-01-12 Thread Stefan_E

Hi Adam,

thanks for your suggestion. Unfortunately, it doesn't help in my case. 
Essentially, we are talking about a time series (rows) of n different
measurements (columns) - so normalization would lead to a doubling 
of the storage space (add a measurement_id to each measurement).
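
(Back-of-the-envelope, assuming 8-byte REAL values: in the wide layout a
measurement costs the 8 data bytes plus about one byte of record header; in
the normalized layout each measurement carries its own rowid, record header,
unit_id and meas_id on top of the 8-byte value, so roughly 15-20 bytes per
measurement - a factor of about 2-2.5 in raw storage, before any indices.)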

Second, the most common use case is to view the table in the currently
foreseen format - so, I'd pay both in space and time...

Anyway, thanks for the suggestion!

Regards, Stefan


Adam DeVita wrote:
> 
> Good day,
> 
> In general I try to work within the limits of any database engine that I
> am
> using.  Often, the limits are there for good reasons (such as speed
> problems).  I would suggest seeing if there is a way to normalize the big
> tables such that infrequently used columns are split into tables that
> aren't
> joined in often.  (The principal I'm using is borrowed from hardware
> architecture "Make the common case fast, and ensure the uncommon case is
> correct.")
> 
> It may or may not be sensible given your data, but there may be an
> opportunity to reduce the number of columns  by making an encoded column
> to
> aggregate, such as lots of mutually exclusive binary flag fields.
> 
> regards,
> Adam
> 
> On Mon, Jan 11, 2010 at 3:46 PM, Stefan_E  wrote:
> 
>>
>> Hello all,
>>
>> I'm considering using SQLite for a new application. The schema will
>> contain
>> a bunch of small tables with few columns (~10) plus one large table with
>> many columns and 1000...1 rows.
>>
>> 'Many columns' typically fits into the default 2000 column limit, but can
>> exceed it at times (that is, on some of the foreseen databases). It will
>> never exceed the theoretical / compile time selectable limit of 32k
>> columns.
>> Queries on this big table will be rather straight-forward: either on the
>> table alone (SELECT * FROM table_large) or one join on one field to one
>> of
>> the smaller tables.
>>
>> The  http://www.sqlite.org/limits.html Limits page  warns: "There are
>> places
>> in the SQLite code generator that use algorithms that are O(N²) where N
>> is
>> the number of columns." which is kind of discouraging to increase max.
>> column count at compile time, but is not very specific about when this
>> happens...
>>
>> I now have two design options:
>> - increase max. column count at compile time (possibly setting
>> SQLITE_LIMIT_COLUMN on databases where  I don't need more than (say) 1000
>> columns) and accept the quoted performance degradation.
>> - alternatively, in the client handle cases with more than 2000 columns,
>> splitting the storage up into two (or more) tables
>>
>> Any advise, experience - or more specifics on the "O(N²)" remark are
>> highly
>> welcome!
>>
>> Thanks for your help - Stefan
>> --
>> View this message in context:
>> http://old.nabble.com/Limitation-on-Column-count-tp27117364p27117364.html
>> Sent from the SQLite mailing list archive at Nabble.com.
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
> 
> 
> 
> -- 
> VerifEye Technologies Inc.
> 905-948-0015x245
> 7100 Warden Ave, Unit 3
> Markham ON, L3R 8B5
> Canada
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 
> 



Re: [sqlite] Limitation on Column count

2010-01-12 Thread Adam DeVita
Good day,

In general I try to work within the limits of any database engine that I am
using.  Often, the limits are there for good reasons (such as speed
problems).  I would suggest seeing if there is a way to normalize the big
tables such that infrequently used columns are split into tables that aren't
joined in often.  (The principle I'm using is borrowed from hardware
architecture: "Make the common case fast, and ensure the uncommon case is
correct.")

It may or may not be sensible given your data, but there may be an
opportunity to reduce the number of columns by aggregating some of them into
an encoded column - for example, lots of mutually exclusive binary flag fields.
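
A sketch of the encoding idea (hypothetical flag names): independent yes/no
columns can be packed into one integer and tested with bitwise AND, while
truly mutually exclusive states would collapse into a single small code instead:

    -- bit 0 = calibrated, bit 1 = outlier, bit 2 = reviewed (made-up flags)
    CREATE TABLE unit_status (unit_id INTEGER PRIMARY KEY, flags INTEGER DEFAULT 0);
    INSERT INTO unit_status VALUES (1, 1 | 4);        -- calibrated and reviewed
    SELECT unit_id FROM unit_status WHERE flags & 4;  -- all reviewed units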

regards,
Adam

On Mon, Jan 11, 2010 at 3:46 PM, Stefan_E  wrote:

>
> Hello all,
>
> I'm considering using SQLite for a new application. The schema will contain
> a bunch of small tables with few columns (~10) plus one large table with
> many columns and 1000...1 rows.
>
> 'Many columns' typically fits into the default 2000 column limit, but can
> exceed it at times (that is, on some of the foreseen databases). It will
> never exceed the theoretical / compile time selectable limit of 32k
> columns.
> Queries on this big table will be rather straight-forward: either on the
> table alone (SELECT * FROM table_large) or one join on one field to one of
> the smaller tables.
>
> The  http://www.sqlite.org/limits.html Limits page  warns: "There are
> places
> in the SQLite code generator that use algorithms that are O(N²) where N is
> the number of columns." which is kind of discouraging to increase max.
> column count at compile time, but is not very specific about when this
> happens...
>
> I now have two design options:
> - increase max. column count at compile time (possibly setting
> SQLITE_LIMIT_COLUMN on databases where  I don't need more than (say) 1000
> columns) and accept the quoted performance degradation.
> - alternatively, in the client handle cases with more than 2000 columns,
> splitting the storage up into two (or more) tables
>
> Any advise, experience - or more specifics on the "O(N²)" remark are highly
> welcome!
>
> Thanks for your help - Stefan
> --
> View this message in context:
> http://old.nabble.com/Limitation-on-Column-count-tp27117364p27117364.html
> Sent from the SQLite mailing list archive at Nabble.com.
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
VerifEye Technologies Inc.
905-948-0015x245
7100 Warden Ave, Unit 3
Markham ON, L3R 8B5
Canada