Re: [sqlite] SQLITE_MAX_COLUMN should be a runtime knob for Machine Learning

2019-06-18 Thread Amirouche Boubekki
On Mon, Jun 17, 2019 at 14:36, Simon Slavin wrote: > On 17 Jun 2019, at 9:14am, Dominique Devienne wrote: > > > SQLite4's LSM backend is now an extension in SQLite3 called LSM1 > > https://www.sqlite.org/cgi/src/dir?ci=trunk=ext/lsm1=tree > > > > Which has been discussed in this list before.

Re: [sqlite] SQLITE_MAX_COLUMN should be a runtime knob for Machine Learning

2019-06-17 Thread Simon Slavin
On 17 Jun 2019, at 9:14am, Dominique Devienne wrote: > SQLite4's LSM backend is now an extension in SQLite3 called LSM1 > https://www.sqlite.org/cgi/src/dir?ci=trunk=ext/lsm1=tree > > Which has been discussed in this list before. > Few people are using it, it seems, given the low volume of

Re: [sqlite] SQLITE_MAX_COLUMN should be a runtime knob for Machine Learning

2019-06-17 Thread Dominique Devienne
On Sun, Jun 16, 2019 at 9:02 PM Simon Slavin wrote: > On 16 Jun 2019, at 7:35pm, Amirouche Boubekki wrote: > > > Isn't this a use-case of LSM extension? > > It would seem a very good thing to do using LSM, but I can find > documentation for LSM only in SQLite4,

Re: [sqlite] SQLITE_MAX_COLUMN should be a runtime knob for Machine Learning

2019-06-16 Thread Simon Slavin
On 16 Jun 2019, at 7:54pm, Jens Alfke wrote: > As far as I know, there is no benefit to storing each element of such a > vector as a separate column in SQLite. Instead, the entire vector should be > stored as a single blob — for example, as a concatenation of 3072 IEEE floats > in some fixed
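The single-blob layout suggested in the quoted message can be sketched in Python as follows. This is only an illustration under stated assumptions: the table name `embeddings`, the column names, the helper functions, and the choice of little-endian single-precision floats are all hypothetical and do not come from the thread or from Magnitude.

```python
import sqlite3
import struct

DIM = 3072  # vector length mentioned in the thread


def pack_vector(values):
    """Pack floats as little-endian IEEE 754 single precision (assumed layout)."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_vector(blob):
    """Inverse of pack_vector: 4 bytes per element."""
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))


conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per word, the whole vector in one BLOB column.
conn.execute("CREATE TABLE embeddings (word TEXT PRIMARY KEY, vec BLOB NOT NULL)")

vector = [i / DIM for i in range(DIM)]
conn.execute(
    "INSERT INTO embeddings (word, vec) VALUES (?, ?)",
    ("example", pack_vector(vector)),
)

(blob,) = conn.execute(
    "SELECT vec FROM embeddings WHERE word = ?", ("example",)
).fetchone()
restored = unpack_vector(blob)
```

With this layout the table needs two columns regardless of the vector's dimensionality, so the column limit never comes into play; the trade-off is that individual elements are no longer addressable from SQL.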

Re: [sqlite] SQLITE_MAX_COLUMN should be a runtime knob for Machine Learning

2019-06-16 Thread Simon Slavin
On 16 Jun 2019, at 7:35pm, Amirouche Boubekki wrote: > Isn't this a use-case of LSM extension? It would seem a very good thing to do using LSM, but I can find documentation for LSM only in SQLite4, not SQLite3. I did find this: " I've had the

Re: [sqlite] SQLITE_MAX_COLUMN should be a runtime knob for Machine Learning

2019-06-16 Thread Jens Alfke
> On Jun 15, 2019, at 6:42 AM, Dan Kaminsky wrote: > > One of the more useful and usable packages for Natural Language > Processing, Magnitude[1], leverages SQLite to efficiently handle the real > valued but entirely abstract collections of numbers -- vector spaces -- > that modern machine

Re: [sqlite] SQLITE_MAX_COLUMN should be a runtime knob for Machine Learning

2019-06-16 Thread Amirouche Boubekki
On Sat, Jun 15, 2019 at 20:29, Simon Slavin wrote: > On 15 Jun 2019, at 2:42pm, Dan Kaminsky wrote: > > [about the 32767 hard limit on the number of columns in a table] > > > I spent quite a bit of time hacking large column support into a working > > Python pipeline, and I'd prefer never to

Re: [sqlite] SQLITE_MAX_COLUMN should be a runtime knob for Machine Learning

2019-06-16 Thread Dan Kaminsky
As it happens, there is in fact a formal analytical technique that yields appropriate numbers of "well normalized" dimensions for word embeddings: https://github.com/ziyin-dl/word-embedding-dimensionality-selection

Re: [sqlite] SQLITE_MAX_COLUMN should be a runtime knob for Machine Learning

2019-06-15 Thread Simon Slavin
On 15 Jun 2019, at 2:42pm, Dan Kaminsky wrote: [about the 32767 hard limit on the number of columns in a table] > I spent quite a bit of time hacking large column support into a working > Python pipeline, and I'd prefer never to run that in production. > Converting this compile time variable

[sqlite] SQLITE_MAX_COLUMN should be a runtime knob for Machine Learning

2019-06-15 Thread Dan Kaminsky
Sqlite3 has something of a normative declaration in its source code:

** This is the maximum number of
**
**    * Columns in a table
**    * Columns in an index
**    * Columns in a view
**    * Terms in the SET clause of an UPDATE statement
**    * Terms in the result set of a SELECT statement
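To see the limit in practice, a small probe like the one below may help. It is a sketch, not part of the original post: the helper `can_create_table` and the table name `probe` are hypothetical. It relies only on the documented facts that SQLITE_MAX_COLUMN defaults to 2000 and cannot be compiled higher than 32767, so a 40000-column table should be rejected by any build.

```python
import sqlite3


def can_create_table(conn, n_columns):
    """Return True if this SQLite build accepts a table with n_columns columns."""
    cols = ", ".join(f"c{i}" for i in range(n_columns))
    try:
        conn.execute(f"CREATE TABLE probe ({cols})")
    except sqlite3.OperationalError:
        # SQLite reports "too many columns" when the limit is exceeded.
        return False
    conn.execute("DROP TABLE probe")
    return True


conn = sqlite3.connect(":memory:")
small_ok = can_create_table(conn, 10)     # well under any configured limit
huge_ok = can_create_table(conn, 40000)   # above the 32767 compile-time ceiling
```

Note that the C API already exposes `sqlite3_limit(db, SQLITE_LIMIT_COLUMN, n)`, but it can only lower the limit below the compiled-in SQLITE_MAX_COLUMN, not raise it, which is the gap the post is complaining about.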