Re: [sqlite] Limit on number of columns in SQLite table

2019-11-06 Thread Digital Dog
On Wed, Nov 6, 2019 at 7:22 PM Jens Alfke wrote:

>
>
> > On Nov 6, 2019, at 9:25 AM, Digital Dog wrote:
> >
> > If there are millions or billions of rows
> > in the data set I consider it big data and the only reasonable format for
> > storing it is a column store format.
>
> There are many types of stores for "big data". My employer, Couchbase, has
> customers who would disagree with you; they use our document database to
> store millions or billions of rows.
>

I was talking about this specific data set, which has very well-defined
columns. However, I'm not sure whether Parquet can store this many columns
or query them efficiently.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Limit on number of columns in SQLite table

2019-11-06 Thread Jens Alfke


> On Nov 6, 2019, at 9:25 AM, Digital Dog wrote:
> 
> If there are millions or billions of rows
> in the data set I consider it big data and the only reasonable format for
> storing it is a column store format.

There are many types of stores for "big data". My employer, Couchbase, has 
customers who would disagree with you; they use our document database to store 
millions or billions of rows.

It depends on the data type. For highly structured data with many sparse 
columns you may be right. For less structured data a key-value or document 
store is better. I'm sure there are other types of big-data storage I'm unaware 
of.

In the original poster's case, it didn't seem like the data set really had 
zillions of columns, since [IIRC] they didn't need to be queried separately. 
You could put something like that in a key-value store with each value an 
encoded C array, for example.
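
A minimal sketch of that idea, assuming Python's standard sqlite3 and array modules. The table name `vectors` and the helper names are hypothetical and do not come from the thread; the encoding uses native byte order, so it is only portable between machines with the same endianness.

```python
import sqlite3
from array import array


def open_store(path=":memory:"):
    """Open a database holding one key/value table (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vectors ("
        "key TEXT PRIMARY KEY, value BLOB NOT NULL)"
    )
    return conn


def put_vector(conn, key, values):
    """Encode the values as a packed array of C doubles and store it as a BLOB."""
    blob = array("d", values).tobytes()
    conn.execute(
        "INSERT OR REPLACE INTO vectors (key, value) VALUES (?, ?)",
        (key, blob),
    )


def get_vector(conn, key):
    """Return the decoded list of floats for key, or None if it is absent."""
    row = conn.execute(
        "SELECT value FROM vectors WHERE key = ?", (key,)
    ).fetchone()
    if row is None:
        return None
    decoded = array("d")
    decoded.frombytes(row[0])
    return decoded.tolist()


# One row carries far more values than a table may have columns.
conn = open_store()
put_vector(conn, "sample-1", [float(i) for i in range(100_000)])
vector = get_vector(conn, "sample-1")
```

The trade-off is the one discussed in this thread: the individual values are opaque to SQL, so this only fits when they do not need to be queried separately.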

—Jens


Re: [sqlite] Limit on number of columns in SQLite table

2019-11-06 Thread Digital Dog
Hi!

I'm late to the party, but I want to contribute. I have not read every
message in the thread, but the ones I did read never asked about the
number of ROWS. We know how many columns you want, but how many rows are
there?

As much as I like SQLite, I would not store this kind of data in it,
neither short term nor long term. If there are millions or billions of rows
in the data set, I consider it big data, and the only reasonable format for
storing it is a column store. That gives up to 1:100 compression
(depending on the data, of course) and incredible query speeds: if a query
does not need some columns, they are simply never read, and if row groups
can be skipped altogether based on metadata, they are. No row-storage
engine will give you comparable efficiency. And because you do not need
transactional semantics, you can consider faster engines tailored to this
use case (without WAL, locking or MVCC). The ones I can think of right now
are Apache Parquet, Cassandra and MonetDB, not to mention the excellent
implementation and support in Microsoft SQL Server. You will have to do
your own research on this subject.
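
To illustrate the two effects described above (unneeded columns are never read, and whole row groups are skipped from metadata), here is a toy sketch in plain Python. It is not how Parquet or any other engine is implemented; `RowGroup`, `make_row_group` and `scan` are hypothetical names, and the min/max statistics stand in for the metadata a real column store keeps.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    columns: dict  # column name -> list of values, stored column by column
    stats: dict    # column name -> (min, max), the per-group metadata


def make_row_group(columns):
    """Build a row group and record min/max statistics for every column."""
    stats = {name: (min(vals), max(vals)) for name, vals in columns.items()}
    return RowGroup(columns, stats)


def scan(row_groups, select, where_col, lo, hi):
    """Return the `select` values of rows where lo <= where_col <= hi.

    Only the two named columns are ever touched, and a row group whose
    min/max range cannot match is skipped without reading its data.
    """
    matches = []
    groups_read = 0
    for group in row_groups:
        gmin, gmax = group.stats[where_col]
        if gmax < lo or gmin > hi:
            continue  # skipped from metadata alone
        groups_read += 1
        for key, value in zip(group.columns[where_col], group.columns[select]):
            if lo <= key <= hi:
                matches.append(value)
    return matches, groups_read


groups = [
    make_row_group({"ts": [1, 2, 3], "temp": [10.0, 11.0, 12.0], "unused": [0, 0, 0]}),
    make_row_group({"ts": [4, 5, 6], "temp": [13.0, 14.0, 15.0], "unused": [0, 0, 0]}),
]
values, groups_read = scan(groups, select="temp", where_col="ts", lo=5, hi=6)
```

In this example only the second row group is read, and the `unused` column is never touched.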


On Thu, Oct 17, 2019 at 7:38 PM Mitar wrote:

> Hi!
>
> On Thu, Oct 17, 2019 at 5:38 PM Jens Alfke wrote:
> > Why should SQLite make changes, which would introduce performance
> problems if used, just to save your particular application the trouble of
> concatenating some vectors into single columns, when it uses SQLite for an
> edge use-case that’s pretty far removed from its main purpose?
>
> Then maybe the section "File archive and/or data container" in
> "Appropriate Uses For SQLite" should explain that this is no longer a
> purpose of SQLite. Because "SQLite is a good solution for any
> situation that requires bundling diverse content into a self-contained
> and self-describing package for shipment across a network." seems to
> hold only when "diverse" means a table with fewer than 2000 columns. And
> a table with key/value columns can hardly be called
> self-describing.
>
> I am being ironic on purpose, because I am not sure that talking about
> the "main purpose" is really constructive here when there is a list
> of many "non-main" but still suggested use cases for SQLite. Not to
> mention the "Data analysis" use case, where again, if I am used to
> analyzing datasets with many columns, I would now have to change my
> analysis algorithms to fit a limited number of columns.
> It does not seem that putting vectors into single columns would really
> enable many "Data analysis" options inside SQLite. I am even surprised
> that it says "Many bioinformatics researchers use SQLite in this way."
> With a limit of 2000 columns, this is a very strange claim. I would love
> to see a reference here and see how they do that. I might learn
> something new.
>
>
> Mitar
>
> --
> http://mitar.tnode.com/
> https://twitter.com/mitar_m


Re: [sqlite] Klocwork static analysis report

2019-11-06 Thread Donald Shepherd
On Wed, 6 Nov 2019 at 22:21, Richard Hipp wrote:

> On 11/6/19, Raitses, Alex wrote:
> > Hello,
> > Please find attached the Klocwork static analysis report for “C source
> > code as an amalgamation”, version 3.30.1 (sqlite3.c).
> > Can you please review the attached report and let us know which bugs
> > can be fixed.
> >
>
> (1) This mailing list strips attachments.
>
> (2) Klocwork does not find "bugs".  Rather, it finds warnings.  The
> overwhelming majority of warnings found by Klocwork are
> false positives.  I do not recall an occasion where Klocwork found
> an actual bug in SQLite.  Mostly, Klocwork warning lists are just a
> distraction that takes developers' time away from finding
> real bugs.  Please ignore Klocwork, as it is not a useful tool for
> finding errors in SQLite.


When I triaged Klocwork issues on an earlier amalgamation (to assuage
organisational worry about open source), by far the majority were null
pointer warnings on code paths where the pointer could never be null.  It
did not inspire confidence.

Regards,
Donald Shepherd.


Re: [sqlite] Klocwork static analysis report

2019-11-06 Thread Richard Hipp
On 11/6/19, Raitses, Alex wrote:
> Hello,
> Please find attached the Klocwork static analysis report for “C source
> code as an amalgamation”, version 3.30.1 (sqlite3.c).
> Can you please review the attached report and let us know which bugs
> can be fixed.
>

(1) This mailing list strips attachments.

(2) Klocwork does not find "bugs".  Rather, it finds warnings.  The
overwhelming majority of warnings found by Klocwork are
false positives.  I do not recall an occasion where Klocwork found
an actual bug in SQLite.  Mostly, Klocwork warning lists are just a
distraction that takes developers' time away from finding
real bugs.  Please ignore Klocwork, as it is not a useful tool for
finding errors in SQLite.
-- 
D. Richard Hipp
d...@sqlite.org


[sqlite] Klocwork static analysis report

2019-11-06 Thread Raitses, Alex
Hello,
Please find attached the Klocwork static analysis report for “C source code
as an amalgamation”, version 3.30.1 (sqlite3.c).
Can you please review the attached report and let us know which bugs can be
fixed.


Regards,
Alex
-
Intel Israel (74) Limited