[sqlite] Does column order matter for performance?

2015-04-30 Thread Pol-Online
> (2) Even if the row is all on one page, SQLite has to compute sizes of
> all prior columns and add those sizes together in order to find the
> offset to a particular column in that row. The fewer prior columns
> there are, the less work is involved for this step.

I'm curious: how much work are we talking about here? If I have 99 INTEGER 
fields followed by a BLOB one, to access the BLOB field, does SQLite need to 
compute the size of each previous INTEGER field first?




-Pol


Pol-Online
info at pol-online.net (mailto:info at pol-online.net)



[sqlite] Segfault during FTS index creation from huge data

2015-04-30 Thread Dan Kennedy
On 04/30/2015 02:08 PM, Dominique Pellé wrote:
> Dan Kennedy  wrote:
>
>> On 04/30/2015 07:41 AM, Scott Robison wrote:
>>>
>>> I wrote up some test code today that definitely forces a SQLITE_NOMEM
>>> condition, and there are no leaks (though there are a couple really large
>>> outstanding allocations to force an error without needing to insert
>>> gigabytes of data).
>>>
>>> In thinking about what Dan wrote last night (many short word matches) I
>>> decided that maybe my supposition was wrong and it wasn't the number of
>>> relatively unique words in my old data set. Using the FTS4AUX virtual
>>> table, I recall looking at the info in the DB at the time and seeing a
>>> huge
>>> number of words with 1 or very few matches, and assumed that was the
>>> problem in that data set. However, given the random nature of that data
>>> set, and the fact that there are only 26 single letter "words" and 676
>>> double letter "words" (and so on), I could have easily missed the
>>> relatively few rows of data that had very large numbers of docids /
>>> occurrences in the aux table output.
>>>
>>> My test app goes to the other extreme. It inserts as many rows as possible
>>> consisting of the single letter word "a" 256 times, and in my case, it
>>> fails after 1,052,641 rows were inserted (1,048,576 rows were committed).
>>>
>>> In any case, my memory of the "merging" of things was correct, though the
>>> precise location may not have been.
>>>
>>> Notes:
>>>
>>> 1. Depending on how much memory pressure I put on the system by
>>> pre-allocating even more big chunks of memory, the NOMEM error moves
>>> around
>>> a bit. I've seen it happen in a malloc as well.
>>>
>>> 2. The reality is that FTS was designed around certain assumptions, and
>>> these extra large data sets don't fit into those cases. In my case, the
>>> only time I've seen errors was due to synthetic / unrealistic test data.
>>> The exponential growth related to segment directory merges seems to
>>> dictate
>>> that eventually, after a lot of inserts / updates, the data structures are
>>> going to get quite large.
>>
>> Thanks for doing this. I'll look at it properly later on today to see if the
>> results suggest anything we can do.
>>
>> I'm thinking there's another problem though. At some point soon, we run into
>> this:
>>
>>https://www.sqlite.org/limits.html#max_length
>>
>> The default limit is 10^9 bytes.
>>
>> The fully-merged doclist generated for the term "a" in the above is probably
>> around 256MiB in size. So if we could successfully merge it in memory, it
>> could be inserted into the database. However, once you get up over a billion
>> records there might be doclists for common terms that exceed this limit.
>>
>> Dan.
>
> What about FTS5? I understand that it will use less memory than
> FTS3/FTS4. Will it solve this problem?

In theory, yes. But I haven't actually tried it yet.

Dan.



[sqlite] dropping a constraint

2015-04-30 Thread Scott Doctor

I noticed that the SQLite documentation does not show the 
CONSTRAINT keyword, but it looks as though SQLite accepts the 
keyword. It also appears that SQLite does not support DROP 
CONSTRAINT, but does support DROP VIEW and other similar 
statements.

Is this correct?

-- 

Scott Doctor
scott at scottdoctor.com
--



[sqlite] SQLite using internally by Windows 10

2015-04-30 Thread E. Timothy Uy
Congratulations.

On Thu, Apr 30, 2015 at 8:24 PM, Richard Hipp  wrote:

> https://twitter.com/john_lam/status/593837681945092096
>
> --
> D. Richard Hipp
> drh at sqlite.org
>


[sqlite] Does column order matter for performance?

2015-04-30 Thread Simon Slavin

On 30 Apr 2015, at 6:16pm, Drago, William @ CSG - NARDA-MITEQ  wrote:

> Read Mr. Hipp's reply to me when I asked a similar question:

To rephrase (1) slightly, when reading a row from a table, SQLite reads up to 
the last column asked for and no further.  So if you have columns you rarely 
need to read, put them at the end of the list.

Simon.


[sqlite] Does column order matter for performance?

2015-04-30 Thread Drago, William @ CSG - NARDA-MITEQ
Read Mr. Hipp's reply to me when I asked a similar question:


"Two things to be aware of:

(1) When reading a row, SQLite reads from beginning to end.  So if you have
some small integer or boolean fields, it is better to put them first in the
table. Otherwise, SQLite has to read past the big BLOBs in order to get to
the smaller fields, even if the BLOBs themselves are not used.

(2) When changing any column of a row, the entire row is rewritten,
including the unchanged columns.  So if you have some smaller fields
(integers and booleans) that change frequently and also some large BLOBs
that change infrequently, you might consider factoring the BLOBs out into a
separate table just so they don't have to be rewritten every time a boolean
in the same row changes.

Both points above are purely performance considerations.  You should always
get the correct answer either way."


Here's the entire discussion:
http://sqlite.1065341.n5.nabble.com/BLOBs-and-NULLs-td75201.html
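
To make the two points concrete, here is a minimal sketch of a schema organized
along those lines (the table and column names are hypothetical, not from the
thread): small, frequently-read columns first, and the large, rarely-changed
BLOBs factored out into a side table keyed by the same id.

#include <cstdio>
#include "sqlite3.h"

int main()
{
    sqlite3* db = nullptr;
    if (sqlite3_open("example.db", &db) != SQLITE_OK) return 1;

    const char* schema =
        /* Point (1): small integers/booleans first, so reading them
           never has to skip past a large BLOB in the same row. */
        "CREATE TABLE IF NOT EXISTS item ("
        "  id      INTEGER PRIMARY KEY,"
        "  flag    INTEGER,"
        "  counter INTEGER,"
        "  name    TEXT"
        ");"
        /* Point (2): the BLOB lives in its own table, so updating
           'flag' or 'counter' never rewrites the BLOB. */
        "CREATE TABLE IF NOT EXISTS item_blob ("
        "  id   INTEGER PRIMARY KEY REFERENCES item(id),"
        "  data BLOB"
        ");";

    char* err = nullptr;
    if (sqlite3_exec(db, schema, nullptr, nullptr, &err) != SQLITE_OK) {
        std::fprintf(stderr, "schema error: %s\n", err ? err : "unknown");
        sqlite3_free(err);
    }
    sqlite3_close(db);
    return 0;
}

Either layout returns the same answers; only the amount of decoding and
rewriting per row changes.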

--
Bill Drago
Senior Engineer
L3 Narda-MITEQ
435 Moreland Road
Hauppauge, NY 11788
631-272-5947 / William.Drago at L-3COM.com


> -Original Message-
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-
> users-bounces at mailinglists.sqlite.org] On Behalf Of Pol-Online
> Sent: Thursday, April 30, 2015 12:45 PM
> To: sqlite-users at mailinglists.sqlite.org
> Subject: [sqlite] Does column order matter for performance?
>
> Hi,
>
> I wasn't able to find the answer to this question online: does the
> column order matter for SQLite performance? E.g. should you put fixed
> width columns like INTEGER before TEXT or BLOB?
>
>
> -Pol
>
> 
> Pol-Online
> info at pol-online.net (mailto:info at pol-online.net)
>


[sqlite] Does column order matter for performance?

2015-04-30 Thread Richard Hipp
On 4/30/15, James K. Lowden  wrote:
>
> SQLite keeps its rows in pages and reads whole pages.  Within a page it
> references rows and columns by offset.  Every location within a page is
> accessed in constant time.  So column order doesn't matter because
> page-offset doesn't matter.
>

It's more complicated than that.

(1)  A single row might span multiple pages.  SQLite will only read
pages from disk for the prefix of the row it actually needs.

(2) Even if the row is all on one page, SQLite has to compute sizes of
all prior columns and add those sizes together in order to find the
offset to a particular column in that row.  The fewer prior columns
there are, the less work is involved for this step.

-- 
D. Richard Hipp
drh at sqlite.org


[sqlite] Does column order matter for performance?

2015-04-30 Thread James K. Lowden
On Thu, 30 Apr 2015 09:45:14 -0700
Pol-Online  wrote:

> I wasn't able to find the answer to this question online: does the
> column order matter for SQLite performance? E.g. should you put fixed
> width columns like INTEGER before TEXT or BLOB?  

I'm going to check that box as No.  

SQLite keeps its rows in pages and reads whole pages.  Within a page it
references rows and columns by offset.  Every location within a page is
accessed in constant time.  So column order doesn't matter because
page-offset doesn't matter.  

Most database performance problems are design problems.  If you pay
attention to cardinality and index definition, the rest will usually
take care of itself.   

--jkl


[sqlite] Segfault during FTS index creation from huge data

2015-04-30 Thread James K. Lowden
On Wed, 29 Apr 2015 20:29:07 -0600
Scott Robison  wrote:

> > That code can fail on a system configured to overcommit memory. By
> > that standard, the pointer is invalid.
> >
> 
> Accidentally sent before I was finished. In any case, by "invalid
> pointer" I did not mean to imply "it returns a bit pattern that could
> never represent a valid pointer". I mean "if you dereference a
> pointer returned by malloc that is not null or some implementation
> defined value, it should not result in an invalid memory access".

Agreed.  And I don't think that will happen with malloc.  It might, and
I have a plausible scenario, but I don't think that's what happened.  

In the bizarre context of the Linux OOM killer, the OS may promise more
memory than it can supply.  This promise is made by malloc and
materialized by writes to memory allocated through the returned
pointer, because at the time of writing the OS must actually (and may
fail to) allocate the memory from RAM or swap.  

Exhaustion of overcommitted memory does *not* result in SIGSEGV,
however.  The OOM killer selects a process for SIGKILL, and the
straw-on-the-camel's-back process that triggered the OOM condition is
not necessarily the one that is selected.  

As far as "invalid" goes, I don't see how we can single out pointers
from malloc.  In the presence of overcommitted memory, *all* addresses,
including that of the program text, are invalid in the sense that they
are undependable.  The process may be killed through no fault of its
own by virtue of a heuristic.  I think it's fair to say it makes the
machine nondeterministic, or at least adds to the machine's
nondeterminism.  

Can writing through a pointer returned by malloc (within the
allocated range) ever result in SIGSEGV?  Maybe.   I have a plausible
scenario in the context of sparse files and mmap, which malloc uses.  

Let us say that you have two processes on a 64-bit machine, and a 1 TB
filesystem.  Each process opens a new file, seeks to the position 1 TB -
1, and writes 1 byte.  Each process now owns a file whose "size" is 1 TB
and whose block count is 1.  Most of the filesystem is empty, yet the
two files together claim 200% of the available space.  These are known
as "sparse" files; the unwritten locations are called "holes".  

Now each process calls mmap(2) on its file for the entire 1 TB.  Still
OK.  mmap will not fail.  The holes in the files return 0 when read.
When written to, the OS allocates a block from the filesystem and maps it
to a page of memory.  As each process begins writing 1's sequentially
to its memory, successive blocks are allocated.  Soon enough the last
block is allocated and the filesystem will be really and truly full.  

At the next allocation, no block can be allocated and no page mapped.
What to do?  When calling write(2) on a full filesystem we expect
ENOSPC, but there's nowhere to return an error condition when writing
to memory.  Consequently the OS has no choice but to signal the
process.  That signal will be, yes, SIGSEGV.  
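
A minimal sketch of that scenario (Linux-specific, purely illustrative; the
filename is made up), showing how a file can claim 1 TB while occupying a
single block until its pages are actually written:

#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main()
{
    const off_t size = 1024LL * 1024 * 1024 * 1024;   /* nominal 1 TB */

    int fd = open("sparse.bin", O_CREAT | O_RDWR, 0644);
    if (fd < 0) return 1;

    /* Seek to the last byte and write once: the file's "size" is now 1 TB,
       but only one block is allocated; everything else is a hole. */
    if (lseek(fd, size - 1, SEEK_SET) < 0 || write(fd, "", 1) != 1) return 1;

    /* Map the whole file.  mmap itself succeeds; filesystem blocks are only
       allocated as pages are dirtied.  If the filesystem fills up first, the
       process is signalled (SIGBUS or SIGSEGV, depending on the platform)
       rather than receiving an error code. */
    char* p = static_cast<char*>(
        mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
    if (p == MAP_FAILED) return 1;

    std::memset(p, 1, 4096);   /* touching a page forces a real block allocation */

    munmap(p, size);
    close(fd);
    unlink("sparse.bin");
    return 0;
}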

What does that have to do with malloc?  GNU malloc uses mmap for large
allocations; the pointer it returns is supplied by mmap for an
anonymous mapping to blocks in the swap partition.  If malloc creates
sparse files, writes through malloc'd pointers could result in SIGSEGV.
However, I do not know that that's what malloc does.  

I do not think that's what's happening in the OP's case.  I suspect the
OP's process sailed past any memory-allocation constraints because of
the overcommitted memory configuration, and eventually ran aground when
the stack was exhausted.  Others have already suggested fixing the
overcommit setting as a first step.  Others might be:

1.  Examine the core dump to determine if the SIGSEGV was triggered by
a write to heap or stack memory.  Or not, as the case may be.  ;-)  

2.  Investigate the malloc algorithm and/or replace it with one that
does not use sparse files.  

3.  Increase the stack space allocated to the process.  

It's an interesting problem.  I hope we learn the answer.   

--jkl


[sqlite] Segfault during FTS index creation from huge data

2015-04-30 Thread Dan Kennedy
On 04/30/2015 07:41 AM, Scott Robison wrote:
>
> I wrote up some test code today that definitely forces a SQLITE_NOMEM
> condition, and there are no leaks (though there are a couple really large
> outstanding allocations to force an error without needing to insert
> gigabytes of data).
>
> In thinking about what Dan wrote last night (many short word matches) I
> decided that maybe my supposition was wrong and it wasn't the number of
> relatively unique words in my old data set. Using the FTS4AUX virtual
> table, I recall looking at the info in the DB at the time and seeing a huge
> number of words with 1 or very few matches, and assumed that was the
> problem in that data set. However, given the random nature of that data
> set, and the fact that there are only 26 single letter "words" and 676
> double letter "words" (and so on), I could have easily missed the
> relatively few rows of data that had very large numbers of docids /
> occurrences in the aux table output.
>
> My test app goes to the other extreme. It inserts as many rows as possible
> consisting of the single letter word "a" 256 times, and in my case, it
> fails after 1,052,641 rows were inserted (1,048,576 rows were committed).
>
> In any case, my memory of the "merging" of things was correct, though the
> precise location may not have been.
>
> Notes:
>
> 1. Depending on how much memory pressure I put on the system by
> pre-allocating even more big chunks of memory, the NOMEM error moves around
> a bit. I've seen it happen in a malloc as well.
>
> 2. The reality is that FTS was designed around certain assumptions, and
> these extra large data sets don't fit into those cases. In my case, the
> only time I've seen errors was due to synthetic / unrealistic test data.
> The exponential growth related to segment directory merges seems to dictate
> that eventually, after a lot of inserts / updates, the data structures are
> going to get quite large.

Thanks for doing this. I'll look at it properly later on today to see if 
the results suggest anything we can do.

I'm thinking there's another problem though. At some point soon, we run 
into this:

   https://www.sqlite.org/limits.html#max_length

The default limit is 10^9 bytes.

The fully-merged doclist generated for the term "a" in the above is 
probably around 256MiB in size. So if we could successfully merge it in 
memory, it could be inserted into the database. However, once you get up 
over a billion records there might be doclists for common terms that 
exceed this limit.
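
For reference, that limit is exposed at runtime as SQLITE_LIMIT_LENGTH, so an
application can at least inspect it, or lower it per connection. A short
sketch, assuming nothing beyond the public API:

#include <cstdio>
#include "sqlite3.h"

int main()
{
    sqlite3* db = nullptr;
    if (sqlite3_open(":memory:", &db) != SQLITE_OK) return 1;

    /* Passing -1 as the new value leaves the limit unchanged and
       returns the current setting (10^9 bytes by default). */
    int current = sqlite3_limit(db, SQLITE_LIMIT_LENGTH, -1);
    std::printf("maximum row/BLOB length: %d bytes\n", current);

    /* The limit can be lowered per connection; it can never be raised
       above the compile-time SQLITE_MAX_LENGTH. */
    sqlite3_limit(db, SQLITE_LIMIT_LENGTH, 1024 * 1024);

    sqlite3_close(db);
    return 0;
}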

Dan.








>
> 3. One possible change that might have an impact for Artem: right now the
> FTS_MERGE_COUNT is set to 16. 16 also seems to be hard coded in FTS in a
> few places, though I could be mistaken and the instances of "16" as a magic
> number could be coincidence. Regardless, I wonder if a different value of
> FTS_MERGE_COUNT might tweak the system so that it takes a lot longer for
> Artem to encounter the problem.
>
> Finally, the details:
>
> In this exact case, the call stack looks as follows at the time of the
> SQLITE_NOMEM error code:
>
> sqlite3MemRealloc, line 17109, sqlite3-1.c
> sqlite3Realloc, line 20996, sqlite3-1.c
> sqlite3_realloc, line 21022, sqlite3-1.c
> sqlite3Fts3SegReaderStep, line 2946, sqlite3-6.c (attempting to realloc to
> 135,265,788 bytes)
> fts3SegmentMerge, line 3214, sqlite3-6.c
> fts3AllocateSegdirIdx, line 1150, sqlite3-6.c
> fts3SegmentMerge, line 3199, sqlite3-6.c
> fts3AllocateSegdirIdx, line 1150, sqlite3-6.c
> fts3SegmentMerge, line 3199, sqlite3-6.c
> sqlite3Fts3PendingTermsFlush, line 3252, sqlite3-6.c
> fts3PendingTermsDocid, line 878, sqlite3-6.c
> sqlite3Fts3UpdateMethod, line 5619, sqlite3-6.c
> fts3UpdateMethod, line 21701, sqlite3-5.c
> sqlite3VdbeExec, line 24064, sqlite3-3.c
> sqlite3Step, line 16367, sqlite3-3.c
> sqlite3_step, line 16434, sqlite3-3.c
> main, line 68: if (chk(sqlite3_step(stmt)) != SQLITE_DONE)
>
> Here is my test code (C++, compiled with Visual C++ 2010, though I'm quite
> certain that is not the problem):
>
> #include <iostream>
> #include <sstream>
> #include <string>
> #include <vector>
>
> #include "sqlite3.h"
>
> sqlite3* db = nullptr;
>
> int chk(int errcode)
> {
> if ((errcode > 0) && (errcode < 100))
> {
> int ext_errcode = sqlite3_extended_errcode(db);
> const char* p = sqlite3_errmsg(db);
> if (!p || !*p)
> p = "{missing errmsg}";
> std::ostringstream oss;
> oss << ext_errcode << '-' << p;
> throw oss.str();
> }
>
> return errcode;
> }
>
> int main()
> {
> unsigned long long ull = 0;
>
> // allocate a bunch of memory to put pressure on malloc
> std::vector<char*> allocs;
> allocs.push_back(new char[1024*1024*1024]);
> allocs.push_back(new char[256*1024*1024]);
> //allocs.push_back(new char[128*1024*1024]);
> //allocs.push_back(new char[64*1024*1024]);
>
> try
> {
> chk(sqlite3_open("test.db", &db));
> chk(sqlite3_exec(db, "BEGIN", nullptr, nullptr, nullptr));
> chk(sqlite3_exec(db, "CREATE VIRTUAL TABLE IF NOT EXISTS data USING
> fts4();", nullptr, nullptr, nullptr));
>
> 

[sqlite] Segfault during FTS index creation from huge data

2015-04-30 Thread Scott Robison
On Thu, Apr 30, 2015 at 11:42 AM, James K. Lowden 
wrote:

> On Wed, 29 Apr 2015 20:29:07 -0600
> Scott Robison  wrote:
>
> > > That code can fail on a system configured to overcommit memory. By
> > > that standard, the pointer is invalid.
> > >
> >
> > Accidentally sent before I was finished. In any case, by "invalid
> > pointer" I did not mean to imply "it returns a bit pattern that could
> > never represent a valid pointer". I mean "if you dereference a
> > pointer returned by malloc that is not null or some implementation
> > defined value, it should not result in an invalid memory access".
>
> Agreed.  And I don't think that will happen with malloc.  It might, and
> I have a plausible scenario, but I don't think that's what happened.
>

The Linux man page for malloc documents that the pointer returned may not
be usable in the case of optimistic memory allocations, as the eventual use
of the pointer may trigger the need to commit a page of memory to the
address space and that a page of memory may not be available at that point
in time. Thus malloc, on Linux, makes no guarantee as to the viability of
using the returned pointer.
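
An illustrative sketch of that behaviour (Linux-specific, dependent on the
vm.overcommit_memory setting, and best not run on a machine you care about):
malloc keeps returning non-NULL pointers, and the failure, if any, only shows
up when the pages are touched.

#include <cstdio>
#include <cstdlib>
#include <cstring>

int main()
{
    const size_t chunk = 1024ull * 1024 * 1024;   /* 1 GiB per allocation */

    for (int i = 0; i < 1024; ++i) {              /* ask for up to 1 TiB */
        void* p = std::malloc(chunk);
        if (!p) {
            /* The "classic" contract: allocation failure reported here. */
            std::printf("malloc returned NULL after %d GiB\n", i);
            return 0;
        }
        /* Touching the memory is what forces the kernel to commit pages.
           Under the default overcommit policy this is where the OOM killer
           may step in, and not necessarily against this process. */
        std::memset(p, 0, chunk);
        std::printf("committed %d GiB\n", i + 1);
    }
    return 0;
}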

Perhaps you are correct and "sigsegv" is not the literal signal that is
triggered in this case. I don't care, really. The fact is that an
apparently valid pointer was returned from a memory allocation function yet
can result in an invalid access for whatever reason (out of memory, in this
case). The Linux OOM killer may kill the offending process (which is what
one would expect, but one would also expect malloc to return null, so we
already know not to expect the expected). Or it may kill some other process
which has done nothing wrong! Sure, the OS is protecting the two processes'
address spaces from one another, but it seems to me that if one process can
kill another process, there is a problem.

I can see the utility of a memory allocation strategy like this. It should
in no way be the *default* memory allocation strategy, especially for a
system that tries to be posix compliant, because this is in direct
violation of posix compliance (by my reading) from
http://pubs.opengroup.org/onlinepubs/009695399/functions/malloc.html

Upon successful completion with *size* not equal to 0, *malloc*() shall
> return a pointer to the allocated space.
>

Or maybe posix just needs a better definition for "allocated space". Sure,
an address was allocated in the process's address space, but actual memory
was not allocated.

The decades old interface contract was "if you call malloc with a non-zero
size, you can depend on malloc to return a null pointer or a pointer to the
first byte of an uninitialized allocation". Thus your application could
decide what to do if the memory was not available: abort, exit, select an
alternative code path that can get the job done with less or no memory,
ignore the return value and let sigsegv handle it later.

Now with optimistic memory allocation, you do not have a choice. If your
malloc call results in an overcommit, your process can be killed later when
it tries to access the memory. Or some other innocent process might be
killed.

I really wonder how many man hours have been wasted trying to debug
problems with processes just to find out that the killed process did
nothing wrong, it was some other process overcommitting memory. Or worse,
how many man hours were wasted and no good reason was ever learned.

I came across this last night while learning more about OOM:
https://lwn.net/Articles/104179/ -- in particular, the analogy, which I think
is spot on.

I realize that there is no one right answer to how an OS should handle
memory exhaustion. There are various tradeoffs. However, C is not an
operating system, it is a language, and the standards tell you how you can
expect it to behave. In this case, the C API is broken on Linux by default,
so it becomes impossible to write fault tolerant applications in the face
of this feature.

-- 
Scott Robison


[sqlite] Segfault during FTS index creation from huge data

2015-04-30 Thread Eduardo Morras
On Fri, 3 Apr 2015 18:16:18 +0300
Artem  wrote:

> Hi!
> 
> The situation is like this. There's a SQLite database with around 3
> billion records. Each record consists of a certain CHAR field and
> several other additional fields with different types. The file size
> is approx. 340 GB. The maximum content length in the doc field is 256
> symbols, the content is in Russian.


You can extend the FTS3/4 tokenizers to recognize Russian stop words[1] and exclude 
them from the FTS index. I don't know Russian, but in English, examples of stop 
words are: 'a', 'the', 'of', etc.

See https://www.sqlite.org/fts3.html#section_8_1 for how to implement your own 
tokenizer or extend the unicode one to exclude your stop words. A quick hack would 
be to add code at the end of the icuNext[2] function (file ext/fts3/fts3_icu.c) that 
checks whether the token is in your stop word list and, if so, skips it [3] (that's 
a pointer to the current string), something like this:

233 *piEndOffset = pCsr->aOffset[iEnd];
234 *piPosition = pCsr->iToken++;
235 if ( token_is_stop_word(ppToken,nByte) ){
236   *ppToken = Pointer_to_empty_string;
237   *pnBytes = 0;
238   *piStartOffset = pCsr->aOffset[iStart+nByte];
239   *piEndOffset = pCsr->aOffset[iEnd+nByte];
240   *piPosition = pCsr->iToken--;
241 }
242 return SQLITE_OK;

N.B. It's a quick hack and I haven't compiled, run, or checked it against the full 
SQLite3 documentation.

There are lists of stop words available[4][5] on the internet.

[1] https://en.wikipedia.org/wiki/Stop_words
[2] 
http://www.sqlite.org/src/info/e319e108661147bcca8dd511cd562f33a1ba81b5?txt=1&ln=177
[3] 
http://www.sqlite.org/src/info/e319e108661147bcca8dd511cd562f33a1ba81b5?txt=1&ln=235
[4] https://code.google.com/p/stop-words/ (Warning!! GPLv3 code)
[5] http://www.ranks.nl/stopwords/russian (Warning!! Unknow licence)

> Thank you.

HTH

---   ---
Eduardo Morras 


[sqlite] Segfault during FTS index creation from huge data

2015-04-30 Thread Scott Doctor

This makes my head spin. It's ridiculous that overcommit even exists. Any 
slight performance gain is totally nullified by a crashing program.

-
Scott Doctor
scott at scottdoctor.com
-

On 4/30/2015 11:47 AM, Scott Robison wrote:
> On Thu, Apr 30, 2015 at 11:42 AM, James K. Lowden  schemamania.org>
> wrote:
>
>> On Wed, 29 Apr 2015 20:29:07 -0600
>> Scott Robison  wrote:
>>
>>>> That code can fail on a system configured to overcommit memory. By
>>>> that standard, the pointer is invalid.

>>> Accidentally sent before I was finished. In any case, by "invalid
>>> pointer" I did not mean to imply "it returns a bit pattern that could
>>> never represent a valid pointer". I mean "if you dereference a
>>> pointer returned by malloc that is not null or some implementation
>>> defined value, it should not result in an invalid memory access".
>> Agreed.  And I don't think that will happen with malloc.  It might, and
>> I have a plausible scenario, but I don't think that's what happened.
>>
> The Linux man page for malloc documents that the pointer returned may not
> be usable in the case of optimistic memory allocations, as the eventual use
> of the pointer may trigger the need to commit a page of memory to the
> address space and that a page of memory may not be available at that point
> in time. Thus malloc, on Linux, makes no guarantee as to the viability of
> using the returned pointer.
>
> Perhaps you are correct and "sigsegv" is not the literal signal that is
> triggered in this case. I don't care, really. The fact is that an
> apparently valid pointer was returned from a memory allocation function yet
> can result in an invalid access for whatever reason (out of memory, in this
> case). The Linux OOM killer may kill the offending process (which is what
> one would expect, but one would also expect malloc to return null, so we
> already know not to expect the expected). Or it may kill some other process
> which has done nothing wrong! Sure, the OS is protecting the two processes'
> address spaces from one another, but it seems to me that if one process can
> kill another process, there is a problem.
>
> I can see the utility of a memory allocation strategy like this. It should
> in no way be the *default* memory allocation strategy, especially for a
> system that tries to be posix compliant, because this is in direct
> violation of posix compliance (by my reading) from
> http://pubs.opengroup.org/onlinepubs/009695399/functions/malloc.html
>
> Upon successful completion with *size* not equal to 0, *malloc*() shall
>> return a pointer to the allocated space.
>>
> Or maybe posix just needs a better definition for "allocated space". Sure,
> an address was allocated in the process's address space, but actual memory
> was not allocated.
>
> The decades old interface contract was "if you call malloc with a non-zero
> size, you can depend on malloc to return a null pointer or a pointer to the
> first byte of an uninitialized allocation". Thus your application could
> decide what to do if the memory was not available: abort, exit, select an
> alternative code path that can get the job done with less or no memory,
> ignore the return value and let sigsegv handle it later.
>
> Now with optimistic memory allocation, you do not have a choice. If your
> malloc call results in an overcommit, your process can be killed later when
> it tries to access the memory. Or some other innocent process might be
> killed.
>
> I really wonder how many man hours have been wasted trying to debug
> problems with processes just to find out that the killed process did
> nothing wrong, it was some other process overcommitting memory. Or worse,
> how many man hours were wasted and no good reason was ever learned.
>
> I came across this last night while learning more about OOM:
> https://lwn.net/Articles/104179/ -- in particular, the analogy, which I think
> is spot on.
>
> I realize that there is no one right answer to how an OS should handle
> memory exhaustion. There are various tradeoffs. However, C is not an
> operating system, it is a language, and the standards tell you how you can
> expect it to behave. In this case, the C API is broken on Linux by default,
> so it becomes impossible to write fault tolerant applications in the face
> of this feature.
>



[sqlite] Segfault during FTS index creation from huge data

2015-04-30 Thread Scott Robison
On Apr 29, 2015 11:50 PM, "Dan Kennedy"  wrote:
>
> On 04/30/2015 07:41 AM, Scott Robison wrote:
>>
>>
>> I wrote up some test code today that definitely forces a SQLITE_NOMEM
>> condition, and there are no leaks (though there are a couple really large
>> outstanding allocations to force an error without needing to insert
>> gigabytes of data).
>>
>> In thinking about what Dan wrote last night (many short word matches) I
>> decided that maybe my supposition was wrong and it wasn't the number of
>> relatively unique words in my old data set. Using the FTS4AUX virtual
>> table, I recall looking at the info in the DB at the time and seeing a
huge
>> number of words with 1 or very few matches, and assumed that was the
>> problem in that data set. However, given the random nature of that data
>> set, and the fact that there are only 26 single letter "words" and 676
>> double letter "words" (and so on), I could have easily missed the
>> relatively few rows of data that had very large numbers of docids /
>> occurrences in the aux table output.
>>
>> My test app goes to the other extreme. It inserts as many rows as
possible
>> consisting of the single letter word "a" 256 times, and in my case, it
>> fails after 1,052,641 rows were inserted (1,048,576 rows were committed).
>>
>> In any case, my memory of the "merging" of things was correct, though the
>> precise location may not have been.
>>
>> Notes:
>>
>> 1. Depending on how much memory pressure I put on the system by
>> pre-allocating even more big chunks of memory, the NOMEM error moves
around
>> a bit. I've seen it happen in a malloc as well.
>>
>> 2. The reality is that FTS was designed around certain assumptions, and
>> these extra large data sets don't fit into those cases. In my case, the
>> only time I've seen errors was due to synthetic / unrealistic test data.
>> The exponential growth related to segment directory merges seems to
dictate
>> that eventually, after a lot of inserts / updates, the data structures
are
>> going to get quite large.
>
>
> Thanks for doing this. I'll look at it properly later on today to see if
the results suggest anything we can do.
>
> I'm thinking there's another problem though. At some point soon, we run
into this:
>
>   https://www.sqlite.org/limits.html#max_length
>
> The default limit is 10^9 bytes.
>
> The fully-merged doclist generated for the term "a" in the above is
probably around 256MiB in size. So if we could successfully merge it in
memory, it could be inserted into the database. However, once you get up
over a billion records there might be doclists for common terms that exceed
this limit.

Excellent points. It all comes back to "FTS3/4 was not designed with
datasets this large in mind, and nothing simple is going to make things
better."

The stop word list seems like it could help, if it is viable in this use
case, or partitioning as I described earlier. Neither is necessarily
optimal though.

Just as there are use cases where SQLite is not recommended as a general
database solution, it seems we've identified one where FTS is not either.
Might be able to force it, but you aren't going to want to run your Google
killer with it. :)

>
> Dan.
>
>
>
>
>
>
>
>
>>
>> 3. One possible change that might have an impact for Artem: right now the
>> FTS_MERGE_COUNT is set to 16. 16 also seems to be hard coded in FTS in a
>> few places, though I could be mistaken and the instances of "16" as a
magic
>> number could be coincidence. Regardless, I wonder if a different value of
>> FTS_MERGE_COUNT might tweak the system so that it takes a lot longer for
>> Artem to encounter the problem.
>>
>> Finally, the details:
>>
>> In this exact case, the call stack looks as follows at the time of the
>> SQLITE_NOMEM error code:
>>
>> sqlite3MemRealloc, line 17109, sqlite3-1.c
>> sqlite3Realloc, line 20996, sqlite3-1.c
>> sqlite3_realloc, line 21022, sqlite3-1.c
>> sqlite3Fts3SegReaderStep, line 2946, sqlite3-6.c (attempting to realloc
to
>> 135,265,788 bytes)
>> fts3SegmentMerge, line 3214, sqlite3-6.c
>> fts3AllocateSegdirIdx, line 1150, sqlite3-6.c
>> fts3SegmentMerge, line 3199, sqlite3-6.c
>> fts3AllocateSegdirIdx, line 1150, sqlite3-6.c
>> fts3SegmentMerge, line 3199, sqlite3-6.c
>> sqlite3Fts3PendingTermsFlush, line 3252, sqlite3-6.c
>> fts3PendingTermsDocid, line 878, sqlite3-6.c
>> sqlite3Fts3UpdateMethod, line 5619, sqlite3-6.c
>> fts3UpdateMethod, line 21701, sqlite3-5.c
>> sqlite3VdbeExec, line 24064, sqlite3-3.c
>> sqlite3Step, line 16367, sqlite3-3.c
>> sqlite3_step, line 16434, sqlite3-3.c
>> main, line 68: if (chk(sqlite3_step(stmt)) != SQLITE_DONE)
>>
>> Here is my test code (C++, compiled with Visual C++ 2010, though I'm
quite
>> certain that is not the problem):
>>
>> #include <iostream>
>> #include <sstream>
>> #include <string>
>> #include <vector>
>>
>> #include "sqlite3.h"
>>
>> sqlite3* db = nullptr;
>>
>> int chk(int errcode)
>> {
>> if ((errcode > 0) && (errcode < 100))
>> {
>> int ext_errcode = sqlite3_extended_errcode(db);
>> const char* p = 

[sqlite] SQLite and Scientific Computing: Arrays and Complex Numbers

2015-04-30 Thread Dominique Devienne
On Thu, Apr 30, 2015 at 2:52 AM, James K. Lowden 
wrote:

> On Tue, 28 Apr 2015 09:24:56 +0200 Dominique Devienne 
> wrote:
> > On Tue, Apr 28, 2015 at 4:16 AM, James K. Lowden <
> jklowden at schemamania.org> wrote:
> > > A major hurdle is the memory model: because array-programming
> > > libraries normally mandate the data be in contiguous memory,
> > > there's a cost to converting to/from the DBMS's B+ tree.  The more
> > > array-like the physical storage of the DBMS, the more it cedes
> > > transactional and update capability.
> >
> > Well, just look at how Oracle solved that problem. The row data
> > doesn't hold the blob itself, like in SQLite, but an index to
> > separate blob pages. This proverbial level of indirection brings
> > tremendous benefits, because you can then update a blob w/o having to
> > rewrite the whole blob (you copy the "blob" page(s) being updated
> > only, and copy the row into a new "data" page with an updated "blob
> > index" with just a few entries changed to point to the updated pages.
>
> Does Oracle provide access to the BLOB as a contiguous memory image in
> the Oracle server's virtual address space?
>

Of course not. But the same blob APIs any OCI client uses can be used by a
server extension that runs out-of-process (if native), or in-process (if
pure Java inside the Oracle server-side JVM), and the implementation of
those APIs do take advantage of the locality. No one in his right mind
would allow arbitrary 3rd party code to run in your "enterprisy" DB server
(unless "safe" like Java or sandboxed if native; Oracle doesn't do the
latter).

But processing a page of values is still much more efficient than processing
individual values one at a time (coming from rows), and if the pages happen to
be contiguous, all the better. Think of it in terms of how aggregate functions
work, if you will. Perhaps you can use your BLAS library as-is even if the data
is not contiguous, but again, given the aggregate-function analogy, you can
allocate a contiguous buffer in the begin/init block, do efficient page-sized
copies in the step block, and do all the computation in the end/finalize block
on that temporary contiguous buffer. Having such blob-specific pages would
enable this more efficient scenario.
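
To spell out the aggregate-function analogy, here is a rough sketch of an
aggregate UDF (the name blob_bytes and the tables are hypothetical) that
gathers each row's BLOB into one contiguous buffer in its step callback and
only does the "math" in its finalize callback; a real version would run its
BLAS-style kernel over the gathered buffer instead of just reporting its size:

#include <vector>
#include "sqlite3.h"

struct Gather { std::vector<unsigned char>* buf; };

static void gatherStep(sqlite3_context* ctx, int argc, sqlite3_value** argv)
{
    (void)argc;
    Gather* g = static_cast<Gather*>(
        sqlite3_aggregate_context(ctx, sizeof(Gather)));   /* zeroed on first call */
    if (!g) return;                                        /* out of memory */
    if (!g->buf) g->buf = new std::vector<unsigned char>();

    const unsigned char* p =
        static_cast<const unsigned char*>(sqlite3_value_blob(argv[0]));
    int n = sqlite3_value_bytes(argv[0]);
    if (p && n > 0)
        g->buf->insert(g->buf->end(), p, p + n);           /* chunk-wise copy */
}

static void gatherFinal(sqlite3_context* ctx)
{
    Gather* g = static_cast<Gather*>(sqlite3_aggregate_context(ctx, 0));
    sqlite3_int64 total = (g && g->buf) ? (sqlite3_int64)g->buf->size() : 0;
    /* ...the actual computation over the contiguous buffer would go here... */
    sqlite3_result_int64(ctx, total);
    if (g && g->buf) { delete g->buf; g->buf = nullptr; }
}

/* Registration and use (names are made up):
     sqlite3_create_function(db, "blob_bytes", 1, SQLITE_UTF8, nullptr,
                             nullptr, gatherStep, gatherFinal);
     SELECT blob_bytes(data) FROM item_blob;                              */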


> The benefit you're describing reduces the I/O required to update the
> BLOB and to traverse the table without referencing it.


Don't forget it also enables growing blobs w/o rewriting them in full, and
updating them in place transactionally, neither of which SQLite supports
right now, which is my main issue with SQLite. Blobs are basically second
class citizens in SQLite, and I wish that would change in future versions.


> That's a *very* old technique; it was available in Sybase in 1980s.


So? I never made any claims to the originality or age of the technique. I
merely stated that this (well known, thank you for the precision) technique
is not used by SQLite, and that using it would make it possible to solve the
two main issues with SQLite regarding blobs.


> To support use of
> arbitrary linear algebra functions on those BLOBs, they would have to
> supplied to user-defined functions in linear memory with known
> addresses.


See my aggregate-function analogy above. My point is that some functions
will work just fine operating incrementally on the blob data in page-sized
chunks (or multiples thereof). You can use the blob APIs to load the blob
into a user-allocated buffer, i.e. a scalar user-defined function can drive
the loop over the blob; but if instead you do some kind of "inversion of
control", where it's SQLite that drives the loop over blob pages, passing
control to an aggregate-like user-defined function that operates
specifically over blobs and their pages, SQLite can pass in a read-only
pointer to its internal page cache, for example, and avoid copies. Just
thinking aloud.

My main argument is that I'd like the two current limitations on blobs
(can't extend w/o fully rewriting, can't incrementally update w/o either
losing transactionality or being forced to again rewrite in full) gone from
SQLite. Because then SQLite can model a proper transactional random-access
file on top of its blobs, and I can then use SQLite blobs as the back end
to a 3rd party VFS (like for HDF5 for example). I can do that (and did)
with Oracle, and I can't do that with SQLite. And I'd really like to be
able to do it. And then blobs would no longer be second class citizens in
SQLite. --DD


[sqlite] FTS and min token length

2015-04-30 Thread Pol-Online
Hi,

It seems that the built-in tokenizers (or at least the unicode61 one) have no 
lower limit on the number of characters in a token. For instance, looking 
for records containing `t` will return the ones with sentences containing 
'don't'.

Does this mean FTS is indexing every 'I' and 'a' in English sentences, as well 
as every single-digit occurrence, or is there some higher-level exclusion 
heuristic?

Is there any way to configure the tokenizer to ignore tokens shorter than 2 
characters?
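
One way to answer the first question empirically is to inspect the index with
the fts4aux virtual table. A short sketch, assuming an existing FTS4 table
named docs (the file and table names here are made up):

#include <cstdio>
#include "sqlite3.h"

static int print_row(void*, int argc, char** argv, char** col)
{
    for (int i = 0; i < argc; ++i)
        std::printf("%s=%s  ", col[i], argv[i] ? argv[i] : "NULL");
    std::printf("\n");
    return 0;
}

int main()
{
    sqlite3* db = nullptr;
    if (sqlite3_open("docs.db", &db) != SQLITE_OK) return 1;

    const char* sql =
        "CREATE VIRTUAL TABLE IF NOT EXISTS docs_terms USING fts4aux(docs);"
        /* List the single-character terms and how often they occur. */
        "SELECT term, documents, occurrences FROM docs_terms "
        " WHERE col = '*' AND length(term) < 2 ORDER BY occurrences DESC;";

    char* err = nullptr;
    if (sqlite3_exec(db, sql, print_row, nullptr, &err) != SQLITE_OK) {
        std::fprintf(stderr, "%s\n", err ? err : "error");
        sqlite3_free(err);
    }
    sqlite3_close(db);
    return 0;
}

As far as I know, the built-in tokenizers have no minimum-length option, so
filtering short tokens means wrapping or replacing the tokenizer, much like
the stop-word approach discussed elsewhere in this digest.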


-Pol


Pol-Online
info at pol-online.net (mailto:info at pol-online.net)



[sqlite] Does column order matter for performance?

2015-04-30 Thread Pol-Online
Hi,

I wasn't able to find the answer to this question online: does the column order 
matter for SQLite performance? E.g. should you put fixed width columns like 
INTEGER before TEXT or BLOB?  


-Pol


Pol-Online
info at pol-online.net (mailto:info at pol-online.net)



[sqlite] Segfault during FTS index creation from huge data

2015-04-30 Thread Dominique Pellé
Dan Kennedy  wrote:

> On 04/30/2015 07:41 AM, Scott Robison wrote:
>>
>>
>> I wrote up some test code today that definitely forces a SQLITE_NOMEM
>> condition, and there are no leaks (though there are a couple really large
>> outstanding allocations to force an error without needing to insert
>> gigabytes of data).
>>
>> In thinking about what Dan wrote last night (many short word matches) I
>> decided that maybe my supposition was wrong and it wasn't the number of
>> relatively unique words in my old data set. Using the FTS4AUX virtual
>> table, I recall looking at the info in the DB at the time and seeing a
>> huge
>> number of words with 1 or very few matches, and assumed that was the
>> problem in that data set. However, given the random nature of that data
>> set, and the fact that there are only 26 single letter "words" and 676
>> double letter "words" (and so on), I could have easily missed the
>> relatively few rows of data that had very large numbers of docids /
>> occurrences in the aux table output.
>>
>> My test app goes to the other extreme. It inserts as many rows as possible
>> consisting of the single letter word "a" 256 times, and in my case, it
>> fails after 1,052,641 rows were inserted (1,048,576 rows were committed).
>>
>> In any case, my memory of the "merging" of things was correct, though the
>> precise location may not have been.
>>
>> Notes:
>>
>> 1. Depending on how much memory pressure I put on the system by
>> pre-allocating even more big chunks of memory, the NOMEM error moves
>> around
>> a bit. I've seen it happen in a malloc as well.
>>
>> 2. The reality is that FTS was designed around certain assumptions, and
>> these extra large data sets don't fit into those cases. In my case, the
>> only time I've seen errors was due to synthetic / unrealistic test data.
>> The exponential growth related to segment directory merges seems to
>> dictate
>> that eventually, after a lot of inserts / updates, the data structures are
>> going to get quite large.
>
>
> Thanks for doing this. I'll look at it properly later on today to see if the
> results suggest anything we can do.
>
> I'm thinking there's another problem though. At some point soon, we run into
> this:
>
>   https://www.sqlite.org/limits.html#max_length
>
> The default limit is 10^9 bytes.
>
> The fully-merged doclist generated for the term "a" in the above is probably
> around 256MiB in size. So if we could successfully merge it in memory, it
> could be inserted into the database. However, once you get up over a billion
> records there might be doclists for common terms that exceed this limit.
>
> Dan.


What about FTS5? I understand that it will use less memory than
FTS3/FTS4. Will it solve this problem?

Regards
Dominique