Hi!

I'm late to the party, but I want to contribute. I did not read all the
messages in the thread, but none of those I read asked about the number of
ROWS. We know how many columns you want, but how many rows are there?

Much as I like SQLite, I would not store this kind of data in it, neither
short term nor long term. If there are millions or billions of rows in the
data set, I consider it big data, and the only reasonable format for storing
it is a column store. It gives up to 1:100 compression (depending on the
data, of course) and incredible query speeds: if a query does not need some
columns, they are simply not read, and if whole row groups can be skipped
based on metadata, they are. No row-storage engine will give you comparable
efficiency. And especially because you do not need transactional semantics,
you can consider faster engines tailored to this use case (without WAL,
locking or MVCC). The ones I can think of now are Apache Parquet, Cassandra
and MonetDB, not to mention the excellent implementation and support in
Microsoft SQL Server. You will have to do your own research on this subject.
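
To make the column pruning and row-group skipping concrete, here is a toy
sketch in plain Python. It is not how Parquet or any real engine is
implemented; the names build_row_groups and scan, the group size of 4 and the
sample columns are all made up for illustration. It only shows the idea: data
is stored column by column inside row groups, each group keeps min/max
statistics, and a query touches only the columns it needs and skips groups
whose statistics rule them out.

```python
# Toy column store (illustrative only; all names here are hypothetical).
from typing import Dict, List, Tuple

ROW_GROUP_SIZE = 4  # tiny on purpose; real formats use far larger groups


def build_row_groups(rows: List[Dict[str, int]], size: int = ROW_GROUP_SIZE):
    """Split rows into groups and store each group column by column."""
    groups = []
    for start in range(0, len(rows), size):
        chunk = rows[start:start + size]
        # One list per column instead of one record per row.
        columns = {name: [r[name] for r in chunk] for name in chunk[0]}
        # Per-group metadata: (min, max) for every column.
        stats = {name: (min(v), max(v)) for name, v in columns.items()}
        groups.append({"columns": columns, "stats": stats})
    return groups


def scan(groups, select: str, where: str, lo: int, hi: int) -> Tuple[List[int], int]:
    """Return values of `select` for rows with lo <= `where` <= hi.

    Only the two named columns are read. Groups whose min/max range cannot
    contain a match are skipped without looking at any values.
    """
    out, skipped = [], 0
    for g in groups:
        gmin, gmax = g["stats"][where]
        if gmax < lo or gmin > hi:
            skipped += 1
            continue
        for key, val in zip(g["columns"][where], g["columns"][select]):
            if lo <= key <= hi:
                out.append(val)
    return out, skipped


# 12 rows with three columns; column "a" is never read by the query below.
rows = [{"id": i, "a": i * 2, "b": i * 3} for i in range(12)]
groups = build_row_groups(rows)
values, skipped = scan(groups, select="b", where="id", lo=5, hi=6)
print(values, skipped)
```

With a row store, the same query would have to walk every row and drag all
the columns along with it, which is exactly the cost that grows with
thousands of columns.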


On Thu, Oct 17, 2019 at 7:38 PM Mitar <mmi...@gmail.com> wrote:

> Hi!
>
> On Thu, Oct 17, 2019 at 5:38 PM Jens Alfke <j...@mooseyard.com> wrote:
> > Why should SQLite make changes, which would introduce performance
> > problems if used, just to save your particular application the trouble of
> > concatenating some vectors into single columns, when it uses SQLite for an
> > edge use-case that’s pretty far removed from its main purpose?
>
> Then maybe section "File archive and/or data container" in
> "Appropriate Uses For SQLite" should explain that this is not the
> purpose of SQLite anymore. Because "SQLite is a good solution for any
> situation that requires bundling diverse content into a self-contained
> and self-describing package for shipment across a network." seems to
> work only when "diverse" is a table with fewer than 2000 columns. Somehow
> describing a table with key/value columns can hardly be called
> self-describing.
>
> I am being ironic on purpose, because I am not sure that talking about
> "main purpose" is really a constructive conversation here when there is a
> list of many "non-main" but still suggested use cases for SQLite. Not to
> mention the "Data analysis" use case, where again, if I am used to doing
> analysis on datasets with many columns, I would now have to change the
> algorithms I use to adapt to the limited number of columns.
> It does not seem that putting vectors into single columns would really
> enable many "Data analysis" options inside SQLite. I am even surprised
> that it says "Many bioinformatics researchers use SQLite in this way."
> With a limit of 2000 columns this is a very strange claim. I would love
> to see a reference here and see how they do that. I might learn
> something new.
>
>
> Mitar
>
> --
> http://mitar.tnode.com/
> https://twitter.com/mitar_m
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
