Re: [sqlite] Limit on number of columns in SQLite table
On Wed, Nov 6, 2019 at 7:22 PM Jens Alfke wrote:

> > On Nov 6, 2019, at 9:25 AM, Digital Dog wrote:
> >
> > If there are millions or billions of rows
> > in the data set I consider it big data and the only reasonable format for
> > storing it is a column store format.
>
> There are many types of stores for "big data". My employer, Couchbase, has
> customers who would disagree with you; they use our document database to
> store millions or billions of rows.

I was talking about this specific data set with very well defined columns.

However, I'm not sure whether Parquet can store this many columns, or query them efficiently.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Limit on number of columns in SQLite table
> On Nov 6, 2019, at 9:25 AM, Digital Dog wrote:
>
> If there are millions or billions of rows
> in the data set I consider it big data and the only reasonable format for
> storing it is a column store format.

There are many types of stores for "big data". My employer, Couchbase, has customers who would disagree with you; they use our document database to store millions or billions of rows.

It depends on the data type. For highly structured data with many sparse columns you may be right. For less structured data a key-value or document store is better. I'm sure there are other types of big-data storage I'm unaware of.

In the original poster's case, it didn't seem like the data set really had zillions of columns, since [IIRC] they didn't need to be queried separately. You could put something like that in a key-value store with each value an encoded C array, for example.

-- Jens
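A minimal sketch of the key-value idea above, assuming each record is a vector of doubles whose elements never need to be queried individually. The table name `vectors`, the helper names, and the little-endian float64 encoding are assumptions made for illustration; nothing in the thread prescribes them.

```python
import sqlite3
import struct


def encode_vector(values):
    """Pack a sequence of floats into a little-endian float64 BLOB."""
    return struct.pack(f"<{len(values)}d", *values)


def decode_vector(blob):
    """Unpack a BLOB written by encode_vector back into a list of floats."""
    count = len(blob) // 8  # 8 bytes per float64
    return list(struct.unpack(f"<{count}d", blob))


def open_store(path=":memory:"):
    """Create (if needed) a two-column key/value table; names are hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vectors ("
        "key TEXT PRIMARY KEY, value BLOB NOT NULL)"
    )
    return conn


def put(conn, key, values):
    """Store one whole vector under a single key."""
    conn.execute(
        "INSERT OR REPLACE INTO vectors (key, value) VALUES (?, ?)",
        (key, encode_vector(values)),
    )


def get(conn, key):
    """Return the decoded vector for key, or None if the key is absent."""
    row = conn.execute(
        "SELECT value FROM vectors WHERE key = ?", (key,)
    ).fetchone()
    return None if row is None else decode_vector(row[0])


if __name__ == "__main__":
    conn = open_store()
    # One row holds 5000 values; no per-value column is needed.
    put(conn, "sample-1", [float(i) for i in range(5000)])
    print(len(get(conn, "sample-1")))
```

The trade-off is the one discussed in the thread: the row is compact and has no column limit, but SQL can no longer filter or aggregate on individual elements without decoding the BLOB first.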
Re: [sqlite] Limit on number of columns in SQLite table
Hi!

I'm late to the party, but want to contribute. I did not read all messages in the thread, but those I've read did not contain a question about the number of ROWS. We know how many columns you desire, but how many rows are there?

No matter how much I like SQLite, I would not store this kind of data in it, neither short term nor long term. If there are millions or billions of rows in the data set I consider it big data and the only reasonable format for storing it is a column store format. It gives up to 1:100 compression (depending on the data of course) and incredible querying speeds (if it doesn't have to read some columns it just doesn't do that). If it can skip row groups altogether based on metadata, it will. No row storage engine will give you comparable efficiency.

And especially because you do not need transactional semantics, you can consider faster engines, tailored to this use case (without WAL, locking or MVCC). The ones I can think of now are: Apache Parquet, Cassandra and MonetDB, not to mention the excellent implementation and support in Microsoft SQL Server. You have to do your own research on this subject.

On Thu, Oct 17, 2019 at 7:38 PM Mitar wrote:

> Hi!
>
> On Thu, Oct 17, 2019 at 5:38 PM Jens Alfke wrote:
> > Why should SQLite make changes, which would introduce performance
> > problems if used, just to save your particular application the trouble of
> > concatenating some vectors into single columns, when it uses SQLite for an
> > edge use-case that’s pretty far removed from its main purpose?
>
> Then maybe section "File archive and/or data container" in
> "Appropriate Uses For SQLite" should explain that this is not the
> purpose of SQLite anymore. Because "SQLite is a good solution for any
> situation that requires bundling diverse content into a self-contained
> and self-describing package for shipment across a network." seems to
> work only when "diverse" is a table with fewer than 2000 columns. Somehow
> describing a table with key/value columns can hardly be called
> self-describing.
>
> I am on purpose ironic, because I am not sure if talking about "main
> purpose" is really a constructive conversation here if there is a list
> of many "non-main" but still suggested use cases for SQLite. Not to
> mention the "Data analysis" use case, where again, if I am used to doing
> analysis on datasets with many columns, I would now have to change the
> algorithms I use to adapt to a limited number of columns.
> It does not seem that putting vectors into single columns would really
> enable many "Data analysis" options inside SQLite. I am even surprised
> that it says "Many bioinformatics researchers use SQLite in this way."
> With a limit of 2000 columns this is a very strange claim. I would love
> to see a reference here and see how they do that. I might learn
> something new.
>
>
> Mitar
>
> --
> http://mitar.tnode.com/
> https://twitter.com/mitar_m
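The 2000-column figure discussed above is SQLite's default compile-time SQLITE_MAX_COLUMN; a custom build can raise it, but only up to 32767. The probe below is a small sketch for checking what a given build accepts, assuming Python's bundled sqlite3 module. The function name `can_create_table` and the table name `wide` are hypothetical, and the result for counts between 2000 and 32767 depends on how the linked SQLite library was compiled.

```python
import sqlite3


def can_create_table(column_count):
    """Return True if this SQLite build accepts a table with column_count columns."""
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f"c{i} REAL" for i in range(column_count))
    try:
        conn.execute(f"CREATE TABLE wide ({columns})")
        return True
    except sqlite3.Error:
        # Raised when the statement exceeds a compiled-in limit
        # (for example the column limit or the SQL length limit).
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    # 40000 is above the hard ceiling of 32767, so it fails on any build.
    for n in (100, 2000, 2001, 40000):
        print(n, can_create_table(n))
```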
Re: [sqlite] Klocwork static analysis report
On Wed, 6 Nov 2019 at 22:21, Richard Hipp wrote:

> On 11/6/19, Raitses, Alex wrote:
> > Hello,
> > Please find attached Klocwork static analysis report for “C source code as
> > an amalgamation”, version 3.30.1 (sqlite3.c).
> > Can you please review the report attached and update which bugs can be
> > fixed.
>
> (1) This mailing list strips attachments.
>
> (2) Klocwork does not find "bugs". Rather, it finds warnings. The
> overwhelming majority of warnings found by Klocwork are
> false-positives. I do not recall an occasion where Klocwork found
> an actual bug in SQLite. Mostly, Klocwork warning lists are just a
> distraction for the developers that takes their time away from finding
> real bugs. Please ignore Klocwork, as it is not a useful tool for
> finding errors in SQLite.

Having done a triage of Klocwork issues on an earlier amalgamation (to assuage organisational worry about open source), I found that by far the majority were null pointer warnings on code paths that could never be null. It did not inspire confidence.

Regards,
Donald Shepherd.
Re: [sqlite] Klocwork static analysis report
On 11/6/19, Raitses, Alex wrote:
> Hello,
> Please find attached Klocwork static analysis report for “C source code as
> an amalgamation”, version 3.30.1 (sqlite3.c).
> Can you please review the report attached and update which bugs can be
> fixed.

(1) This mailing list strips attachments.

(2) Klocwork does not find "bugs". Rather, it finds warnings. The overwhelming majority of warnings found by Klocwork are false-positives. I do not recall an occasion where Klocwork found an actual bug in SQLite. Mostly, Klocwork warning lists are just a distraction for the developers that takes their time away from finding real bugs. Please ignore Klocwork, as it is not a useful tool for finding errors in SQLite.

--
D. Richard Hipp
d...@sqlite.org
[sqlite] Klocwork static analysis report
Hello,

Please find attached Klocwork static analysis report for “C source code as an amalgamation”, version 3.30.1 (sqlite3.c).
Can you please review the report attached and update which bugs can be fixed.

Regards,
Alex

Intel Israel (74) Limited