Hi! I'm late to the party, but I want to contribute. I have not read every message in the thread, but the ones I did read did not contain a question about the number of ROWS. We know how many columns you want, but how many rows are there?
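As an aside on the column limit discussed below: it is SQLite's compile-time SQLITE_MAX_COLUMN, which defaults to 2000. A minimal sketch to reproduce it, assuming Python's bundled sqlite3 module built with the default limits (a build compiled with a different SQLITE_MAX_COLUMN will behave differently):

```python
import sqlite3

def can_create_table(ncols):
    # Build a CREATE TABLE with ncols integer columns and see if
    # SQLite accepts it under its compile-time column limit.
    cols = ", ".join("c%d INTEGER" % i for i in range(ncols))
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (%s)" % cols)
        return True
    except sqlite3.OperationalError:  # "too many columns on t"
        return False
    finally:
        conn.close()

print(can_create_table(2000))  # True with default limits
print(can_create_table(2001))  # False: the CREATE TABLE is rejected
```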
As much as I like SQLite, I would not store this kind of data in it, neither short term nor long term. If there are millions or billions of rows in the data set, I consider it big data, and the only reasonable way to store it is a column-store format. It gives up to 100:1 compression (depending on the data, of course) and incredible querying speeds: if a query does not need some columns, the engine simply does not read them, and if it can skip entire row groups based on metadata, it will. No row-storage engine will give you comparable efficiency. And since you do not need transactional semantics, you can consider faster engines tailored to this use case (without WAL, locking, or MVCC). The ones I can think of right now are Apache Parquet, Cassandra, and MonetDB, not to mention the excellent columnstore implementation and support in Microsoft SQL Server. You will have to do your own research on this subject.

On Thu, Oct 17, 2019 at 7:38 PM Mitar <mmi...@gmail.com> wrote:
> Hi!
>
> On Thu, Oct 17, 2019 at 5:38 PM Jens Alfke <j...@mooseyard.com> wrote:
> > Why should SQLite make changes, which would introduce performance
> > problems if used, just to save your particular application the trouble
> > of concatenating some vectors into single columns, when it uses SQLite
> > for an edge use-case that’s pretty far removed from its main purpose?
>
> Then maybe the section "File archive and/or data container" in
> "Appropriate Uses For SQLite" should explain that this is not the
> purpose of SQLite anymore. Because "SQLite is a good solution for any
> situation that requires bundling diverse content into a self-contained
> and self-describing package for shipment across a network." seems to
> hold only when "diverse" means a table with fewer than 2000 columns.
> Somehow, describing a table with key/value columns can hardly be
> called self-describing.
> I am being ironic on purpose, because I am not sure that talking about
> the "main purpose" is really a constructive conversation here when
> there is a list of many "non-main" but still suggested use cases for
> SQLite. Not to mention the "Data analysis" use case, where again,
> someone used to doing analysis on datasets with many columns would now
> have to change their algorithms to adapt to the limited number of
> columns. It does not seem that putting vectors into single columns
> would really enable many "Data analysis" options inside SQLite. I am
> even surprised that it says "Many bioinformatics researchers use
> SQLite in this way." With a limit of 2000 columns this is a very
> strange claim. I would love to see a reference and see how they do
> that. I might learn something new.
>
>
> Mitar
>
> --
> http://mitar.tnode.com/
> https://twitter.com/mitar_m
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
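P.S. To make the column-store point above concrete, here is a toy sketch in plain Python of why a columnar layout lets a query skip data it does not need. The class, the json-encoded column chunks, and the bytes_read counter are my own illustration, not any real engine's format:

```python
import json

class ToyColumnStore:
    """Toy column store: each column is serialized (and read) on its
    own, so a scan that touches one column never decodes the others."""

    def __init__(self, rows):
        # Transpose row dicts into per-column value lists.
        self.columns = {}
        for row in rows:
            for name, value in row.items():
                self.columns.setdefault(name, []).append(value)
        # Serialize each column separately into its own "chunk".
        self.chunks = {name: json.dumps(vals).encode()
                       for name, vals in self.columns.items()}
        self.bytes_read = 0  # tracks how many stored bytes a scan touched

    def scan(self, names):
        # Decode only the requested columns; untouched chunks cost nothing.
        out = {}
        for name in names:
            chunk = self.chunks[name]
            self.bytes_read += len(chunk)
            out[name] = json.loads(chunk)
        return out

# A narrow integer column next to a wide payload column.
rows = [{"id": i, "payload": "x" * 100} for i in range(1000)]
store = ToyColumnStore(rows)
ids = store.scan(["id"])["id"]
total = sum(len(c) for c in store.chunks.values())
# Scanning only "id" reads a small fraction of the stored bytes,
# because the wide "payload" chunk is never touched.
```

A row-oriented file would have to read (or at least parse past) the payload of every row to extract the ids; here the payload chunk is simply never opened, which is the skip-what-you-don't-need property the engines above exploit.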