Re: [sqlite] [EXTERNAL] Re: Limit on number of columns in SQLite table

2019-10-18 Thread Hick Gunter
If you happen to need to store columns that each have only a small set of
possible values, maybe you would be better off looking into FastBit, which has
column-oriented storage

-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Mitar
Sent: Thursday, 17 October 2019 15:46
To: SQLite mailing list 
Subject: Re: [sqlite] [EXTERNAL] Re: Limit on number of columns in SQLite table

 Hi!

Thanks for this input. So you are saying that calling sqlite3_column 100k times
per row is slow, but retrieving 100k rows to reconstruct one "original" row
will be faster? I am not sure I understand why reading and decoding cells
spread over multiple columns is so much slower than reading and decoding cells
spread over multiple rows?

Mitar

On Thu, Oct 17, 2019 at 3:38 PM Hick Gunter  wrote:
>
> I have the impression that you still do not grasp the folly of a 100k column
> schema.
>
> See the example below, which only has 6 fields. As you can see, each field
> requires a Column opcode and arguments (about 10 bytes) and a "register" to
> hold the value (48 bytes), which for 100k columns uses about 5.5 megabytes to
> retrieve a row from the database. It will also involve SQLite decoding 100k
> field values and your application calling the sqlite3_column interface 100k
> times for each and every row, which yields an expected performance of about
> 2 rows per second. Can you afford to use that much memory and time?
>
> asql> create temp table genes (id integer primary key, name char, f1 char, f2 char, f3 char, f4 char);
> asql> .explain
> asql> explain select * from genes;
> addr  opcode         p1    p2    p3    p4                p5  comment
> ----  -------------  ----  ----  ----  ----------------  --  -------------
> 0     Init           0     13    0                       00  Start at 13
> 1     OpenRead       0     2     1     6                 00  root=2 iDb=1; genes
> 2     Explain        2     0     0     SCAN TABLE genes  00
> 3     Rewind         0     12    0                       00
> 4       Rowid        0     1     0                       00  r[1]=rowid
> 5       Column       0     1     2                       00  r[2]=genes.name
> 6       Column       0     2     3                       00  r[3]=genes.f1
> 7       Column       0     3     4                       00  r[4]=genes.f2
> 8       Column       0     4     5                       00  r[5]=genes.f3
> 9       Column       0     5     6                       00  r[6]=genes.f4
> 10      ResultRow    1     6     0                       00  output=r[1..6]
> 11    Next           0     4     0                       01
> 12    Halt           0     0     0                       00
> 13    Transaction    1     0     1     0                 01  usesStmtJournal=0
> 14    Goto           0     1     0                       00
>
> -----Original Message-----
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Mitar
> Sent: Thursday, 17 October 2019 15:11
> To: SQLite mailing list 
> Subject: [EXTERNAL] Re: [sqlite] Limit on number of columns in SQLite table
>
> Hi!
>
> On Thu, Oct 17, 2019 at 3:04 PM Eric Grange  wrote:
> > my suggestion would be to store them as JSON in a blob, and use the
> > JSON functions of SQLite to extract the data
>
> JSON has some crazy limitations: by the standard it does not support the full
> floating point spec, so NaN and infinity cannot be represented. So JSON is
> really not a great format when you want to preserve as much of the input as
> possible (integers, floats, text, and binary). SQLite seems to be spot on in
> this regard.
>
> But yes, if there were some other standard, SQLite-supported format to embed,
> that approach would be useful. Like composite value types.
>
>
> Mitar
>
> --
> http://mitar.tnode.com/
> https://twitter.com/mitar_m
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> ___
>  Gunter Hick | Software Engineer | Scientific Games International GmbH
> | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013
> | (O) +43 1 80100 - 0
>
> May be privileged. May be confidential. Please delete if not the addressee.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



--
http://mitar.tnode.com/
https://twitter.com/mitar_m
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] [EXTERNAL] Re: Limit on number of columns in SQLite table

2019-10-17 Thread Hick Gunter
If you need to perform queries over the 100k fields, then normalization is the
only practical way, and reconstructing a row will be similarly slow.
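
A sketch of what such a normalized layout could look like - the table and
column names here are hypothetical, with one narrow row per (sample, field)
value - together with the query that reconstructs one "original" row, which is
exactly the part that stays expensive:

/* Hypothetical normalized layout: one row per (sample, field) value.
 * Reconstructing an original 100k-field row then means reading back
 * 100k of these narrow rows, which is why it is similarly slow. */
static const char *kNormalizedSchema =
    "CREATE TABLE gene_value ("
    "  sample_id  INTEGER NOT NULL,"
    "  field_name TEXT    NOT NULL,"
    "  value,"                        /* any SQLite type, decided per field */
    "  PRIMARY KEY (sample_id, field_name)"
    ");";

static const char *kReconstructRow =
    "SELECT field_name, value FROM gene_value "
    "WHERE sample_id = ?1 ORDER BY field_name;";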

But you have stated that your use case is "retrieving complete rows". In this
case, SQLite does not need to know about the 100k details of a row. Keep your
data in whatever format it comes in - which is another requirement you have
expressed. All you have to extract from the 100k details are the columns you
need to identify the row(s) you want back.

Create table gene_data (sample_id integer primary key, name char, raw_data 
blob);

Anything else you need to do with the row data goes into your application.
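
A minimal sketch of that retrieval pattern in C, assuming the gene_data table
above; decode_row() stands in for whatever application-side decoding you use.
The point is that SQLite hands back one blob per row instead of 100k separate
column values:

#include <sqlite3.h>
#include <stdio.h>

/* Fetch the raw row payload for one sample and hand it to the application.
 * decode_row() would be an application function that understands the blob's
 * internal layout; SQLite itself never looks inside it. */
static int fetch_sample(sqlite3 *db, const char *sample_name)
{
    sqlite3_stmt *stmt;
    int rc = sqlite3_prepare_v2(db,
        "SELECT sample_id, raw_data FROM gene_data WHERE name = ?", -1,
        &stmt, NULL);
    if (rc != SQLITE_OK) return rc;

    sqlite3_bind_text(stmt, 1, sample_name, -1, SQLITE_STATIC);

    while ((rc = sqlite3_step(stmt)) == SQLITE_ROW) {
        sqlite3_int64 id    = sqlite3_column_int64(stmt, 0);
        const void   *blob  = sqlite3_column_blob(stmt, 1);  /* one call ...  */
        int           nbyte = sqlite3_column_bytes(stmt, 1); /* ... one field */
        printf("sample %lld: %d bytes of raw data\n", (long long)id, nbyte);
        /* decode_row(blob, nbyte);  -- application-defined decoding */
        (void)blob;
    }
    sqlite3_finalize(stmt);
    return rc == SQLITE_DONE ? SQLITE_OK : rc;
}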

-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Mitar
Sent: Thursday, 17 October 2019 15:46
To: SQLite mailing list 
Subject: Re: [sqlite] [EXTERNAL] Re: Limit on number of columns in SQLite table

 Hi!

Thanks for this input. So you are saying that calling sqlite3_column 100k times
per row is slow, but retrieving 100k rows to reconstruct one "original" row
will be faster? I am not sure I understand why reading and decoding cells
spread over multiple columns is so much slower than reading and decoding cells
spread over multiple rows?

Mitar

On Thu, Oct 17, 2019 at 3:38 PM Hick Gunter  wrote:
>
> I have the impression that you still do not grasp the folly of a 100k column
> schema.
>
> See the example below, which only has 6 fields. As you can see, each field
> requires a Column opcode and arguments (about 10 bytes) and a "register" to
> hold the value (48 bytes), which for 100k columns uses about 5.5 megabytes to
> retrieve a row from the database. It will also involve SQLite decoding 100k
> field values and your application calling the sqlite3_column interface 100k
> times for each and every row, which yields an expected performance of about
> 2 rows per second. Can you afford to use that much memory and time?
>
> asql> create temp table genes (id integer primary key, name char, f1 char, f2 char, f3 char, f4 char);
> asql> .explain
> asql> explain select * from genes;
> addr  opcode         p1    p2    p3    p4                p5  comment
> ----  -------------  ----  ----  ----  ----------------  --  -------------
> 0     Init           0     13    0                       00  Start at 13
> 1     OpenRead       0     2     1     6                 00  root=2 iDb=1; genes
> 2     Explain        2     0     0     SCAN TABLE genes  00
> 3     Rewind         0     12    0                       00
> 4       Rowid        0     1     0                       00  r[1]=rowid
> 5       Column       0     1     2                       00  r[2]=genes.name
> 6       Column       0     2     3                       00  r[3]=genes.f1
> 7       Column       0     3     4                       00  r[4]=genes.f2
> 8       Column       0     4     5                       00  r[5]=genes.f3
> 9       Column       0     5     6                       00  r[6]=genes.f4
> 10      ResultRow    1     6     0                       00  output=r[1..6]
> 11    Next           0     4     0                       01
> 12    Halt           0     0     0                       00
> 13    Transaction    1     0     1     0                 01  usesStmtJournal=0
> 14    Goto           0     1     0                       00
>
> -----Original Message-----
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Mitar
> Sent: Thursday, 17 October 2019 15:11
> To: SQLite mailing list 
> Subject: [EXTERNAL] Re: [sqlite] Limit on number of columns in SQLite table
>
> Hi!
>
> On Thu, Oct 17, 2019 at 3:04 PM Eric Grange  wrote:
> > my suggestion would be to store them as JSON in a blob, and use the
> > JSON functions of SQLite to extract the data
>
> JSON has some crazy limitations: by the standard it does not support the full
> floating point spec, so NaN and infinity cannot be represented. So JSON is
> really not a great format when you want to preserve as much of the input as
> possible (integers, floats, text, and binary). SQLite seems to be spot on in
> this regard.
>
> But yes, if there were some other standard, SQLite-supported format to embed,
> that approach would be useful. Like composite value types.
>
>
> Mitar
>
> --
> http://mitar.tnode.com/
> https://twitter.com/mitar_m
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> ___
>  Gunter Hick | Software Engineer | Scientific Games International GmbH
> | Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013
> | (O) +43 1 80100 - 0

Re: [sqlite] [EXTERNAL] Re: Limit on number of columns in SQLite table

2019-10-17 Thread Mitar
 Hi!

Thanks for this input. So you are saying that calling sqlite3_column 100k
times per row is slow, but retrieving 100k rows to reconstruct one
"original" row will be faster? I am not sure I understand why reading
and decoding cells spread over multiple columns is so much slower than
reading and decoding cells spread over multiple rows?

Mitar

On Thu, Oct 17, 2019 at 3:38 PM Hick Gunter  wrote:
>
> I have the impression that you still do not grasp the folly of a 100k column
> schema.
>
> See the example below, which only has 6 fields. As you can see, each field
> requires a Column opcode and arguments (about 10 bytes) and a "register" to
> hold the value (48 bytes), which for 100k columns uses about 5.5 megabytes to
> retrieve a row from the database. It will also involve SQLite decoding 100k
> field values and your application calling the sqlite3_column interface 100k
> times for each and every row, which yields an expected performance of about
> 2 rows per second. Can you afford to use that much memory and time?
>
> asql> create temp table genes (id integer primary key, name char, f1 char, f2 
> char, f3 char, f4 char);
> asql> .explain
> asql> explain select * from genes;
> addr  opcode         p1    p2    p3    p4                p5  comment
> ----  -------------  ----  ----  ----  ----------------  --  -------------
> 0     Init           0     13    0                       00  Start at 13
> 1     OpenRead       0     2     1     6                 00  root=2 iDb=1; genes
> 2     Explain        2     0     0     SCAN TABLE genes  00
> 3     Rewind         0     12    0                       00
> 4       Rowid        0     1     0                       00  r[1]=rowid
> 5       Column       0     1     2                       00  r[2]=genes.name
> 6       Column       0     2     3                       00  r[3]=genes.f1
> 7       Column       0     3     4                       00  r[4]=genes.f2
> 8       Column       0     4     5                       00  r[5]=genes.f3
> 9       Column       0     5     6                       00  r[6]=genes.f4
> 10      ResultRow    1     6     0                       00  output=r[1..6]
> 11    Next           0     4     0                       01
> 12    Halt           0     0     0                       00
> 13    Transaction    1     0     1     0                 01  usesStmtJournal=0
> 14    Goto           0     1     0                       00
>
> -----Original Message-----
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Mitar
> Sent: Thursday, 17 October 2019 15:11
> To: SQLite mailing list 
> Subject: [EXTERNAL] Re: [sqlite] Limit on number of columns in SQLite table
>
> Hi!
>
> On Thu, Oct 17, 2019 at 3:04 PM Eric Grange  wrote:
> > my suggestion would be to store them as JSON in a blob, and use the
> > JSON functions of SQLite to extract the data
>
> JSON has some crazy limitations: by the standard it does not support the full
> floating point spec, so NaN and infinity cannot be represented. So JSON is
> really not a great format when you want to preserve as much of the input as
> possible (integers, floats, text, and binary). SQLite seems to be spot on in
> this regard.
>
> But yes, if there were some other standard, SQLite-supported format to embed,
> that approach would be useful. Like composite value types.
>
>
> Mitar
>
> --
> http://mitar.tnode.com/
> https://twitter.com/mitar_m
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> ___
>  Gunter Hick | Software Engineer | Scientific Games International GmbH | 
> Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) 
> +43 1 80100 - 0
>
> May be privileged. May be confidential. Please delete if not the addressee.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



-- 
http://mitar.tnode.com/
https://twitter.com/mitar_m
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [EXTERNAL] Re: Limit on number of columns in SQLite table

2019-10-17 Thread Hick Gunter
I have the impression that you still do not grasp the folly of a 100k column
schema.

See the example below, which only has 6 fields. As you can see, each field
requires a Column opcode and arguments (about 10 bytes) and a "register" to
hold the value (48 bytes), which for 100k columns uses about 5.5 megabytes to
retrieve a row from the database. It will also involve SQLite decoding 100k
field values and your application calling the sqlite3_column interface 100k
times for each and every row, which yields an expected performance of about
2 rows per second. Can you afford to use that much memory and time?

asql> create temp table genes (id integer primary key, name char, f1 char, f2 
char, f3 char, f4 char);
asql> .explain
asql> explain select * from genes;
addr  opcode         p1    p2    p3    p4                p5  comment
----  -------------  ----  ----  ----  ----------------  --  -------------
0     Init           0     13    0                       00  Start at 13
1     OpenRead       0     2     1     6                 00  root=2 iDb=1; genes
2     Explain        2     0     0     SCAN TABLE genes  00
3     Rewind         0     12    0                       00
4       Rowid        0     1     0                       00  r[1]=rowid
5       Column       0     1     2                       00  r[2]=genes.name
6       Column       0     2     3                       00  r[3]=genes.f1
7       Column       0     3     4                       00  r[4]=genes.f2
8       Column       0     4     5                       00  r[5]=genes.f3
9       Column       0     5     6                       00  r[6]=genes.f4
10      ResultRow    1     6     0                       00  output=r[1..6]
11    Next           0     4     0                       01
12    Halt           0     0     0                       00
13    Transaction    1     0     1     0                 01  usesStmtJournal=0
14    Goto           0     1     0                       00
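
To put the arithmetic in one place, a rough sketch of the per-row loop an
application would need against a hypothetical 100k-column table; the byte
figures in the comment just restate the estimates above:

#include <sqlite3.h>

/* Rough cost for a 100k-column table, using the figures above:
 *   100,000 Column opcodes x ~10 bytes ~= 1.0 MB of VDBE program
 *   100,000 registers      x  48 bytes ~= 4.8 MB of register space
 * i.e. the roughly 5.5 MB mentioned above just to retrieve one row, plus
 * 100,000 sqlite3_column_*() calls per row on the application side. */
#define NUM_FIELDS 100000

static void read_one_row(sqlite3_stmt *stmt)
{
    /* stmt is assumed to be a prepared "SELECT * FROM genes" that has just
     * returned SQLITE_ROW; every field still has to be pulled out one call
     * at a time. */
    for (int i = 0; i < NUM_FIELDS; i++) {
        const unsigned char *text = sqlite3_column_text(stmt, i);
        (void)text;  /* hand the value to the application, 100k times per row */
    }
}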

-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Mitar
Sent: Thursday, 17 October 2019 15:11
To: SQLite mailing list 
Subject: [EXTERNAL] Re: [sqlite] Limit on number of columns in SQLite table

Hi!

On Thu, Oct 17, 2019 at 3:04 PM Eric Grange  wrote:
> my suggestion would be to store them as JSON in a blob, and use the
> JSON functions of SQLite to extract the data

JSON has some crazy limitations: by the standard it does not support the full
floating point spec, so NaN and infinity cannot be represented. So JSON is
really not a great format when you want to preserve as much of the input as
possible (integers, floats, text, and binary). SQLite seems to be spot on in
this regard.

But yes, if there were some other standard, SQLite-supported format to embed,
that approach would be useful. Like composite value types.


Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick | Software Engineer | Scientific Games International GmbH | 
Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 
1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [EXTERNAL] Re: Limit on number of columns in SQLite table

2019-10-17 Thread Mitar
Hi!

In that case we would have to define a standard BLOB storage format, which
slightly defeats the idea of using SQLite to define such a standard,
future-proof format. :-)


Mitar

On Thu, Oct 17, 2019 at 11:19 AM Hick Gunter  wrote:
>
> Since your data is at least mostly opaque, in the sense that SQLite is not
> expected to interpret the contents, why not split your data into "stuff you
> want to query in SQLite" and "stuff you want to just store"? The former
> means individual columns, whereas the latter could be stored in a single
> BLOB field, which only your application knows how to extract data from.
>
> This allows SQLite to efficiently process the fields it needs to know about,
> and to return the BLOB data as one single field instead of having to pick it
> apart into 100k bits.
>
> -----Original Message-----
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Mitar
> Sent: Thursday, 17 October 2019 10:56
> To: Richard Hipp 
> Cc: SQLite mailing list 
> Subject: [EXTERNAL] Re: [sqlite] Limit on number of columns in SQLite table
>
> Hi!
>
> I can see how this is a reasonable limit when SQLite is used for the querying
> power it provides. In our case we are really focusing on it as a standard
> long-term storage format. The "Appropriate Uses For SQLite" document [1] has
> a section called "File archive and/or data container", and this is why we
> started considering SQLite as a dataset archive format. We would not store
> files directly, but the contents of those files (like the contents of a CSV),
> while trying not to modify them more than necessary. We got especially
> interested in the statement that "SQLite is a good solution for any situation
> that requires bundling diverse content into a self-contained and
> self-describing package for shipment across a network." So I can understand
> how supporting a large number of columns might be inappropriate when you want
> to run complicated SQL queries on the data. But our use is just to store data
> and then extract all rows for processing, or, at most, to extract a subsample
> of rows - not to run JOIN queries or anything like that. So apart from the
> artificial limit in SQLite, which exists because a huge column count is not
> useful in the general case, there seems to be no other reason why it could
> not be supported.
>
> So why not increase the limit to 2 billion, but have it limited at runtime to
> 2000 by default? Then, using a PRAGMA, one could increase it if needed, up to
> 2 billion. A PRAGMA can already decrease the limit, so we could keep the
> existing default of 2000, while people who need more could raise it all the
> way to 2 billion without having to recompile. Is there any significant
> performance downside to this?
>
> [1] https://www.sqlite.org/whentouse.html
>
>
> Mitar
>
> On Wed, Oct 16, 2019 at 8:21 PM Richard Hipp  wrote:
> >
> > SQLite could, in theory, be enhanced (with just a few minor tweaks) to
> > support up to 2 billion columns.  But having a relation with a large
> > number of columns seems like a very bad idea stylistically.  That's
> > not how relational databases are intended to be used.  Normally when a
> > table acquires more than a couple dozen columns, that is a good
> > indication that you need to normalize and/or refactor your schema. Schema
> > designers almost unanimously follow that design principle.  And so
> > SQLite is optimized for the overwhelmingly common case of a small
> > number of columns per table.
> >
> > Hence, adding the ability to have a table with a huge number of
> > columns is not something that I am interested in supporting in SQLite
> > at this time.
> >
> > --
> > D. Richard Hipp
> > d...@sqlite.org
>
>
>
> --
> http://mitar.tnode.com/
> https://twitter.com/mitar_m
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
>
> ___
>  Gunter Hick | Software Engineer | Scientific Games International GmbH | 
> Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) 
> +43 1 80100 - 0
>
> May be privileged. May be confidential. Please delete if not the addressee.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



-- 
http://mitar.tnode.com/
https://twitter.com/mitar_m
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [EXTERNAL] Re: Limit on number of columns in SQLite table

2019-10-17 Thread Hick Gunter
Since your data is at least mostly opaque, in the sense that SQLite is not
expected to interpret the contents, why not split your data into "stuff you
want to query in SQLite" and "stuff you want to just store"? The former means
individual columns, whereas the latter could be stored in a single BLOB field,
which only your application knows how to extract data from.

This allows SQLite to efficiently process the fields it needs to know about,
and to return the BLOB data as one single field instead of having to pick it
apart into 100k bits.
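
A minimal sketch of that split, with hypothetical table and column names: only
the few fields you actually filter on become real (indexable) columns, and
everything else rides along as one opaque blob:

#include <sqlite3.h>

/* Hypothetical split: the few fields worth querying become real columns
 * (and can be indexed); everything else travels as one opaque blob that
 * only the application knows how to unpack. */
static int create_container(sqlite3 *db)
{
    static const char *ddl =
        "CREATE TABLE IF NOT EXISTS dataset ("
        "  sample_id  INTEGER PRIMARY KEY,"
        "  name       TEXT,"     /* queryable */
        "  collected  TEXT,"     /* queryable */
        "  payload    BLOB"      /* the other ~100k values, app-encoded */
        ");"
        "CREATE INDEX IF NOT EXISTS dataset_name ON dataset(name);";
    return sqlite3_exec(db, ddl, NULL, NULL, NULL);
}

static int store_sample(sqlite3 *db, const char *name, const char *collected,
                        const void *payload, int nbyte)
{
    sqlite3_stmt *stmt;
    int rc = sqlite3_prepare_v2(db,
        "INSERT INTO dataset(name, collected, payload) VALUES (?, ?, ?)",
        -1, &stmt, NULL);
    if (rc != SQLITE_OK) return rc;
    sqlite3_bind_text(stmt, 1, name, -1, SQLITE_STATIC);
    sqlite3_bind_text(stmt, 2, collected, -1, SQLITE_STATIC);
    sqlite3_bind_blob(stmt, 3, payload, nbyte, SQLITE_STATIC);
    rc = sqlite3_step(stmt);
    sqlite3_finalize(stmt);
    return rc == SQLITE_DONE ? SQLITE_OK : rc;
}

Queries then filter on the real columns and get the payload back as a single
value per row.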

-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Mitar
Sent: Thursday, 17 October 2019 10:56
To: Richard Hipp 
Cc: SQLite mailing list 
Subject: [EXTERNAL] Re: [sqlite] Limit on number of columns in SQLite table

Hi!

I can see how this is a reasonable limit when SQLite is used for the querying
power it provides. In our case we are really focusing on it as a standard
long-term storage format. The "Appropriate Uses For SQLite" document [1] has a
section called "File archive and/or data container", and this is why we started
considering SQLite as a dataset archive format. We would not store files
directly, but the contents of those files (like the contents of a CSV), while
trying not to modify them more than necessary. We got especially interested in
the statement that "SQLite is a good solution for any situation that requires
bundling diverse content into a self-contained and self-describing package for
shipment across a network." So I can understand how supporting a large number
of columns might be inappropriate when you want to run complicated SQL queries
on the data. But our use is just to store data and then extract all rows for
processing, or, at most, to extract a subsample of rows - not to run JOIN
queries or anything like that. So apart from the artificial limit in SQLite,
which exists because a huge column count is not useful in the general case,
there seems to be no other reason why it could not be supported.

So why not increase the limit to 2 billion, but have it limited at runtime to
2000 by default? Then, using a PRAGMA, one could increase it if needed, up to
2 billion. A PRAGMA can already decrease the limit, so we could keep the
existing default of 2000, while people who need more could raise it all the way
to 2 billion without having to recompile. Is there any significant performance
downside to this?

[1] https://www.sqlite.org/whentouse.html
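
For what it is worth, a sketch of how the existing runtime knob behaves through
the C API (assuming a stock build): sqlite3_limit() can lower
SQLITE_LIMIT_COLUMN, but anything above the compile-time SQLITE_MAX_COLUMN is
capped, so today it can only go down:

#include <sqlite3.h>
#include <stdio.h>

/* Query and (attempt to) adjust the per-connection column limit.
 * Passing -1 as the new value just reports the current limit; any value
 * above the compile-time SQLITE_MAX_COLUMN is silently capped, so this
 * knob can currently only lower the limit, not raise it. */
static void show_column_limit(sqlite3 *db)
{
    int current = sqlite3_limit(db, SQLITE_LIMIT_COLUMN, -1);
    printf("current column limit: %d\n", current);

    sqlite3_limit(db, SQLITE_LIMIT_COLUMN, 2000000000);  /* capped, not raised */
    printf("after trying to raise: %d\n",
           sqlite3_limit(db, SQLITE_LIMIT_COLUMN, -1));
}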


Mitar

On Wed, Oct 16, 2019 at 8:21 PM Richard Hipp  wrote:
>
> SQLite could, in theory, be enhanced (with just a few minor tweaks) to
> support up to 2 billion columns.  But having a relation with a large
> number of columns seems like a very bad idea stylistically.  That's
> not how relational databases are intended to be used.  Normally when a
> table acquires more than a couple dozen columns, that is a good
> indication that you need to normalize and/or refactor your schema. Schema
> designers almost unanimously follow that design principle.  And so
> SQLite is optimized for the overwhelmingly common case of a small
> number of columns per table.
>
> Hence, adding the ability to have a table with a huge number of
> columns is not something that I am interested in supporting in SQLite
> at this time.
>
> --
> D. Richard Hipp
> d...@sqlite.org



--
http://mitar.tnode.com/
https://twitter.com/mitar_m
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick | Software Engineer | Scientific Games International GmbH | 
Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 
1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users