Re: [sqlite] sqlite 3.31.1 crashes in SVN on OpenBSD/sparc64

2020-03-10 Thread Dominique Pellé
Stefan Sperling wrote:

> > Does valgrind give any clues?
>
> Valgrind does not run on the OpenBSD/sparc64 platform, unfortunately.

Would the address sanitizer (i.e. gcc -fsanitize=address, or
clang -fsanitize=address) work on OpenBSD/sparc64?

Regards
Dominique


Re: [sqlite] Performance Issue on Large Table

2020-02-23 Thread Dominique Pellé
Chip Beaulieu  wrote:


> I also recently vacuumed the file. The issue really showed up after the
> FTS5 was set up on the table. I suspect it’s got something to do with the
> triggers more than the indexes. I am definitely open to suggestions. I’ve
> not been able to find much info on the internet to optimize updates to
> tables with FTS.

If the performance issue is about updating the FTS index, then maybe
tweaking the 'automerge' feature can help. See:
https://www.sqlite.org/fts5.html#the_automerge_configuration_option

If you don't need NEAR or phrase queries, then perhaps setting
detail=column can help speed up re-indexing (at least
it reduces the index size). See:
https://www.sqlite.org/fts5.html#the_detail_option
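
For illustration, a minimal sketch of both options (assuming an FTS5
table named ft; adjust names to your schema):

  -- detail=column stores less position information (NEAR and phrase
  -- queries stop working), which makes the index smaller:
  CREATE VIRTUAL TABLE ft USING fts5(content, detail=column);

  -- Tune the automerge parameter (default is 4; 0 disables
  -- automatic incremental merging):
  INSERT INTO ft(ft, rank) VALUES('automerge', 8);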

Regards
Dominique


[sqlite] Remarks/questions about SQLITE_OMIT_LOOKASIDE & SQLITE_OMIT_TWOSIZE_LOOKASIDE

2020-02-06 Thread Dominique Pellé
Hi

Here are a few remarks and questions about
SQLITE_OMIT_LOOKASIDE and
SQLITE_OMIT_TWOSIZE_LOOKASIDE:

1) SQLite has SQLITE_OMIT_LOOKASIDE to
disable lookaside memory completely at compilation time.
However, I see that some of the lookaside code
is not disabled by SQLITE_OMIT_LOOKASIDE.

For example, the struct sqlite3 contains this field:

Lookaside lookaside;  /* Lookaside malloc configuration */

... which is not wrapped in #ifdef SQLITE_OMIT_LOOKASIDE.

It seems that we could save some memory per connection
there if there were more #ifdef SQLITE_OMIT_LOOKASIDE,
or am I missing something?

2) SQLITE_OMIT_TWOSIZE_LOOKASIDE is new in SQLite-3.31.
It's not documented at:
https://sqlite.org/compile.html#_options_to_omit_features
SQLITE_OMIT_WINDOWFUNC is also not documented on that page,
and perhaps others.

3) I assume that whether SQLITE_OMIT_TWOSIZE_LOOKASIDE
is defined or not does not make any difference when SQLite
is built with SQLITE_OMIT_LOOKASIDE.
Is my assumption correct?
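
For reference, a minimal build sketch that omits both (the flags and
file names are illustrative; adjust to your build):

  $ gcc -O2 -DSQLITE_OMIT_LOOKASIDE -DSQLITE_OMIT_TWOSIZE_LOOKASIDE \
        -c sqlite3.c -o sqlite3.o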

Regards
Dominique


Re: [sqlite] CVE's opened on 3.30.1 status

2019-12-25 Thread Dominique Pellé
Raitses, Alex  wrote:

> Thanks a lot for the prompt response,
> As far as I can tell, fixes for all CVEs, except the erroneously
> submitted CVE-2019-19646, were merged to Fossil.
> Can you please estimate the next official release of SQLite including these fixes?

According to https://sqlite.org/draft/releaselog/3_31_0.html
SQLite-3.31 is scheduled for 2019-12-31.
I have no idea how accurate this date is, though.

Regards
Dominique


Re: [sqlite] Heap Out of Bound Read in Sqlite

2019-12-24 Thread Dominique Pellé
Yongheng Chen  wrote:

> This seems to be a problem with gcc (Ubuntu 5.5.0-12ubuntu5~16.04). When
> I use this specific version to compile sqlite, the problem can be reproduced.

Unlikely to be a compiler issue.
I can reproduce the bug under valgrind with the
SQLite shell built with:

* gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
* or clang-6.0.0-1ubuntu2

Dominique


Re: [sqlite] Heap Out of Bound Read in Sqlite

2019-12-24 Thread Dominique Pellé
On Tue, Dec 24, 2019 at 5:48 PM Richard Hipp  wrote:
>
> On 12/24/19, Yongheng Chen  wrote:
> >
> > When we run it with sqlite compiled with asan, we got a heap overflow crash.
> >
> > The bug exists in the latest development code.
>
> Unable to repro.  Tried tip of trunk and release, using gcc and clang,
> all with various combinations of -fsanitize=memory,
> -fsanitize=address, -fsanitize=undefined, and running under valgrind.

Hi

I just tried the latest from trunk in fossil
(1c0a05b09 2019-12-24 16:20:05 UTC) and I can
reproduce the bug when running with valgrind:

$ valgrind ./sqlite3
==6674== Memcheck, a memory error detector
==6674== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==6674== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==6674== Command: ./sqlite3
==6674==
SQLite version 3.31.0 2019-12-24 16:20:05
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE TABLE v0 ( v2 NOT NULL PRIMARY KEY , v1 ) ;
sqlite> CREATE TEMP TRIGGER y AFTER INSERT ON v0 BEGIN DELETE FROM v0 ; END ;
sqlite> CREATE TRIGGER x DELETE ON v0 BEGIN INSERT INTO v0 ( v2 )
VALUES ( 10.1 ) ,( '' ) ,('') ,( 1) ,( 1) ,( 1) ,( 1 ) ON CONFLICT DO
NOTHING ; END ;
sqlite> INSERT INTO v0 VALUES ( 10 , 10 ) ;
sqlite> INSERT INTO v0 VALUES ( 10 , 10 ) ;
sqlite> INSERT INTO v0 VALUES ( 10 , 10 ) ;
sqlite> SELECT v2 + zipfile ( v2 , v1 + v2 ) == '1' , quote ( v1 LIKE
'1' ) FROM v0 ;
==6674== Invalid read of size 1
==6674==at 0x12944E: zipfileStep (shell.c:6243)
==6674==by 0x1B8711: sqlite3VdbeExec (sqlite3.c:91052)
==6674==by 0x1BD58F: sqlite3Step (sqlite3.c:82703)
==6674==by 0x1BD58F: sqlite3_step (sqlite3.c:82768)
==6674==by 0x13161D: exec_prepared_stmt (shell.c:11379)
==6674==by 0x13161D: shell_exec (shell.c:11684)
==6674==by 0x132C7B: runOneSqlLine (shell.c:18265)
==6674==by 0x13CCE4: process_input (shell.c:18365)
==6674==by 0x11DD65: main (shell.c:19123)
==6674==  Address 0x5d15ccf is 1 bytes before a block of size 120,000 alloc'd
==6674==at 0x4C2FB0F: malloc (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6674==by 0x16AAF0: sqlite3MemMalloc (sqlite3.c:23180)
==6674==by 0x1459E1: mallocWithAlarm (sqlite3.c:27071)
==6674==by 0x1459E1: sqlite3Malloc (sqlite3.c:27101)
==6674==by 0x1516A4: setupLookaside (sqlite3.c:158620)
==6674==by 0x1E463D: openDatabase (sqlite3.c:161240)
==6674==by 0x1327FF: open_db.part.64 (shell.c:12696)
==6674==by 0x132EA6: open_db (stdio2.h:97)
==6674==by 0x132EA6: runOneSqlLine (shell.c:18261)
==6674==by 0x13CCE4: process_input (shell.c:18365)
==6674==by 0x11DD65: main (shell.c:19123)
==6674==
0|NULL
sqlite>

The overflow is detected at shell.c:6243:

 6242   }else{
!6243 if( zName[nName-1]!='/' ){
 6244   zName = zFree = sqlite3_mprintf("%s/", zName);

Regards
Dominique


Re: [sqlite] Performance vs. memory trade-off question

2019-12-14 Thread Dominique Pellé
Richard Hipp  wrote:

> On 12/14/19, Dominique Pellé  wrote:
> >
> > I'm curious about what kind of change reduces memory per
> > connection.
> >
>
> The branch is here: 
> https://www.sqlite.org/src/timeline?r=mini-lookaside-take-2
>
> I'll try to add more description later - busy with other things right
> this moment.

Ah, it's about lookaside. I build with SQLITE_OMIT_LOOKASIDE,
so I suppose the mini-lookaside-take-2 branch will do nothing
in my case.

Regards
Dominique


Re: [sqlite] Performance vs. memory trade-off question

2019-12-14 Thread Dominique Pellé
Richard Hipp  wrote:

> A new feature on a branch has the following disadvantages:
>
> (1)  It uses about 0.25% more CPU cycles.  (Each release of SQLite is
> normally about 0.5% faster, so enabling this feature is sort of like
> going back by one-half of a release cycle.)
>
> (2)  The code space (the size of the library) is between 400 and 500
> bytes larger (depending on compiler and optimization settings).
>
> And this one advantage:
>
> (3)  Each database connection uses about 72 KB less heap space.
>
> QUESTION:  Should this feature be default-on or default-off?
>
> What's more important to you?  0.25% fewer CPU cycles or about 72KB
> less heap space used per database connection?
>
> The feature can be activated or deactivated at start-time, but you
> take the disadvantages (the performance hit and slightly larger
> library size) regardless, unless you disable the feature at
> compile-time.  If the feature is compile-time disabled, then the
> corresponding code is omitted and it cannot be turned on at
> start-time.
>
> If you have opinions, you can reply to this mailing list, or directly to me.

It depends of course, but for my project, saving 72KB of memory
per connection is more important, as many databases are opened,
each consuming memory. It may be an unusual scenario though.

Furthermore, since I set a heap soft limit for SQLite, saving
memory per connection should leave more memory for page
caching, hence more SQLite page cache hits and fewer I/Os.

In any case, if the user can choose at compile time or runtime,
the default behavior does not matter to me.

I'm curious about what kind of change reduces memory per
connection.

Regards
Dominique


Re: [sqlite] https://www.sqlite.org/draft/gencol.html Typo

2019-10-29 Thread Dominique Pellé
Keith Medcalf wrote:

> At the end of the second paragraph of section 2.1:
>
> Only VIRTUAL tables can be added using ALTER TABLE.
>
> should be
>
> Only VIRTUAL columns can be added using ALTER TABLE.

The above typo has already been corrected, but here are two other
typos on the same page: https://sqlite.org/draft/gencol.html

section 2.1:  The value of a VIRTUAL *columns* (-> column) is computed
section 2.3: The value of a generated *columns* (-> column) is always
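
As an aside, a small sketch of the ALTER TABLE rule quoted above
(table and column names are made up):

  CREATE TABLE t1(a INTEGER, b INT GENERATED ALWAYS AS (a*2) VIRTUAL);
  ALTER TABLE t1 ADD COLUMN c INT AS (a+1);         -- OK: VIRTUAL column
  ALTER TABLE t1 ADD COLUMN d INT AS (a+2) STORED;  -- error: cannot add a STORED column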

Regards
Dominique


Re: [sqlite] Why sqlite3 running at different time can have very different speed?

2019-10-19 Thread Dominique Pellé
Peng Yu  wrote:

> I have never seen such a dramatic difference in non-sqlite3
> operations. For example, cat has some difference but is ~13%.
>
> $ time cat file.sqa  > /dev/null
>
> real    0m7.282s
> user    0m0.067s
> sys     0m2.371s
> $ time cat file.sqa  > /dev/null
>
> real    0m6.316s
> user    0m0.062s
> sys     0m2.319s
>
> How to prove the large time difference of sqlite3 is indeed due to
> cache and where is the cache?
>

You can try clearing the Linux file system cache to convince
yourself that cache misses contribute to the performance drop.

Run this as root:

  # sync; echo 3 > /proc/sys/vm/drop_caches

See:
https://www.tecmint.com/clear-ram-memory-cache-buffer-and-swap-space-on-linux/

Then run your test again (not as root!).

Clearing the Linux file system cache can of course degrade
performance a lot the first time you re-run your test
if it's I/O bound. The second time, things will probably be
cached and fast again.


> Why the caching used by sqlite3 can not persist for a longer time?


The file system cache only holds the most recently used pages.
If other applications do I/Os, Linux will eventually have
to discard old pages to make room for more recently used
ones. To persist longer, you need to have more memory and/or
use less memory (kill unused programs...). All otherwise unused
memory is used to cache files.

Dominique


Re: [sqlite] [patch] avoid dynamic alloc in vdbeSorterSort(...)

2019-10-09 Thread Dominique Pellé
Keith Medcalf  wrote:

>
> On Monday, 7 October, 2019 14:58, Dominique Pellé <
> dominique.pe...@gmail.com> wrote:
>
> >Here the allocated size is fixed (always 64 pointers), so alloca does
> >not seem needed.
>
> >I wonder how many other functions could avoid dynamic allocation
> >like this one (either with a stack array or alloca).
>
> Probably a lot. [...]


I don't think that there are a lot of other such avoidable mallocs.
I searched the source code and could not find any others (though I
may have missed some, of course).

For the record, I see that the proposed change has been merged
(slightly modified):
https://www.sqlite.org/src/info/5d76dbc5b0584c15

Regards
Dominique


Re: [sqlite] [patch] avoid dynamic alloc in vdbeSorterSort(...)

2019-10-07 Thread Dominique Pellé
Here the allocated size is fixed (always 64 pointers), so alloca does
not seem needed.

I wonder how many other functions could avoid dynamic allocation
like this one (either with a stack array or alloca).

Regards
Dominique

On Mon, Oct 7, 2019 at 10:26 PM Mateusz Wajchęprzełóż wrote:
>
> What about sqlite3StackAllocZero and SQLITE_USE_ALLOCA?
>
> On Mon, 7 Oct 2019 at 20:51, Dominique Pellé
> wrote:
>
> > Hi
> >
> > Below is a patch which avoids a dynamic
> > allocation in vdbeSorterSort(...), using a local
> > stack array instead (faster and smaller code).
> > I assume that a local array of 64 pointers is small
> > enough to be in the stack.
> >
> > Is this worth merging?
> >
> > $ fossil diff  src/vdbesort.c
> > Index: src/vdbesort.c
> > ==================================================================
> > --- src/vdbesort.c
> > +++ src/vdbesort.c
> > @@ -1394,25 +1394,20 @@
> >  ** SQLITE_OK if successful, or an SQLite error code (i.e. SQLITE_NOMEM) if
> >  ** an error occurs.
> >  */
> >  static int vdbeSorterSort(SortSubtask *pTask, SorterList *pList){
> >int i;
> > -  SorterRecord **aSlot;
> > +  SorterRecord *aSlot[64] = { 0 };
> >SorterRecord *p;
> >int rc;
> >
> >rc = vdbeSortAllocUnpacked(pTask);
> >if( rc!=SQLITE_OK ) return rc;
> >
> >p = pList->pList;
> >pTask->xCompare = vdbeSorterGetCompare(pTask->pSorter);
> >
> > -  aSlot = (SorterRecord **)sqlite3MallocZero(64 * sizeof(SorterRecord *));
> > -  if( !aSlot ){
> > -return SQLITE_NOMEM_BKPT;
> > -  }
> > -
> >while( p ){
> >  SorterRecord *pNext;
> >  if( pList->aMemory ){
> >if( (u8*)p==pList->aMemory ){
> >  pNext = 0;
> > @@ -1438,11 +1433,10 @@
> >  if( aSlot[i]==0 ) continue;
> >  p = p ? vdbeSorterMerge(pTask, p, aSlot[i]) : aSlot[i];
> >}
> >pList->pList = p;
> >
> > -  sqlite3_free(aSlot);
> >assert( pTask->pUnpacked->errCode==SQLITE_OK
> > || pTask->pUnpacked->errCode==SQLITE_NOMEM
> >);
> >return pTask->pUnpacked->errCode;
> >  }
> >
> > Regards
> > Dominique


[sqlite] [patch] avoid dynamic alloc in vdbeSorterSort(...)

2019-10-07 Thread Dominique Pellé
Hi

Below is a patch which avoids a dynamic
allocation in vdbeSorterSort(...), using a local
stack array instead (faster and smaller code).
I assume that a local array of 64 pointers is small
enough to be in the stack.

Is this worth merging?

$ fossil diff  src/vdbesort.c
Index: src/vdbesort.c
==================================================================
--- src/vdbesort.c
+++ src/vdbesort.c
@@ -1394,25 +1394,20 @@
 ** SQLITE_OK if successful, or an SQLite error code (i.e. SQLITE_NOMEM) if
 ** an error occurs.
 */
 static int vdbeSorterSort(SortSubtask *pTask, SorterList *pList){
   int i;
-  SorterRecord **aSlot;
+  SorterRecord *aSlot[64] = { 0 };
   SorterRecord *p;
   int rc;

   rc = vdbeSortAllocUnpacked(pTask);
   if( rc!=SQLITE_OK ) return rc;

   p = pList->pList;
   pTask->xCompare = vdbeSorterGetCompare(pTask->pSorter);

-  aSlot = (SorterRecord **)sqlite3MallocZero(64 * sizeof(SorterRecord *));
-  if( !aSlot ){
-return SQLITE_NOMEM_BKPT;
-  }
-
   while( p ){
 SorterRecord *pNext;
 if( pList->aMemory ){
   if( (u8*)p==pList->aMemory ){
 pNext = 0;
@@ -1438,11 +1433,10 @@
 if( aSlot[i]==0 ) continue;
 p = p ? vdbeSorterMerge(pTask, p, aSlot[i]) : aSlot[i];
   }
   pList->pList = p;

-  sqlite3_free(aSlot);
   assert( pTask->pUnpacked->errCode==SQLITE_OK
|| pTask->pUnpacked->errCode==SQLITE_NOMEM
   );
   return pTask->pUnpacked->errCode;
 }

Regards
Dominique


Re: [sqlite] SQLITE and the memory

2019-09-08 Thread Dominique Pellé
Philippe RIO <51...@protonmail.ch> wrote:

> Hello,
> I have an application composed of 256 databases. Each database occupies
> 42MB for more than 950,000 records.
>
> [samedi 7 septembre 2019 13:11:45] : Number of passwords stored : 244 152 645
> [samedi 7 septembre 2019 13:19:28] : Closing Log file
> [samedi 7 septembre 2019 13:19:28] : *** Log Stoped ***
>
> These 42MB are measured after a VACUUM.
> When I launched my application it only opens the databases (all) and the
> memory is filled by 57MB just for SQLite!
> When I launch a job to generate random passwords the memory fills
> to 87% (>2GB).
> I would like to reduce this used memory but I don't know how.
>
> I have the following pragmas :
>
> main.page_size=65536
> wal_autocheckpoint=0
> encoding "UTF_8"
> temp_store=FILE
> case_sensitive_like=TRUE
> foreign_keys=OFF
> legacy_file_format=OFF
> cache_size=-200
> main.cache_size=-200
> main.auto_vacuum=NONE
> main.journal_mode=OFF
> main.secure_delete=OFF
> main.synchronous=OFF
> main.locking_mode=EXCLUSIVE
> main.user_version=230714
> main.application_id=241259
> shrink_memory
>
> Some pragmas make sense when creating the database and the others when
> creating the tables.
>
> I have recompiled SQLITE with the following options :
>
> SQLITE_ENABLE_COLUMN_METADATA = 1
> SQLITE_THREAD_SAFE = 2
> SQLITE_DEFAULT_MEMSTATUS = 0
> SQLITE_DEFAULT_FILE_FORMAT = 4
> SQLITE_DEFAULT_PCACHE_INITSZ = 0
> SQLITE_WIN32_MALLOC = 1
> SQLITE_TEMP_STORE = 0
> SQLITE_CASE_SENSITIVE_LIKE = 1
> SQLITE_ENABLE_API_ARMOR 1
> SQLLITE_ENABLE_FTS3 = 1
> SQLLITE_ENABLE_FTS3_PARENTHESIS = 1
> SQLLITE_ENABLE_FTS3_TOKENIZER = 1
> SQLLITE_ENABLE_FTS4 = 1
> SQLLITE_ENABLE_FTS5 = 1
> SQLLITE_ENABLE_GEOPOLY = 1
> SQLLITE_ENABLE_DESERIALIZE = 1
> SQLLITE_ENABLE_JSON1 = 1
> SQLLITE_ENABLE_MEMORY_MANAGEMENT = 1
> SQLLITE_ENABLE_RTREE = 1
> SQLLITE_ENABLE_ENABLE_SESSION = 1
> SQLLITE_ENABLE_SOUNDEX = 1
>
> The software is running under W7 Pro
> I have no problem with it, I find it very fast.
> See my blog at https://md5finder.blogspot.com/2019/08/md5finder.html
> But this memory usage is really a problem.
> I have a small machine with only 3GB of RAM and this one program needs 87%!
> How can I reduce that?
>
> I would appreciate some help.
> Thank You

How about running a memory profiler to find
out where memory is used?  On Linux I'd use
massif, but I don't know what's available on Windows.

If you open many DB connections, then:
1) each connection uses memory for the schema;
2) each connection uses memory for its prepared statements;
3) each connection uses page cache.

You can use sqlite3_soft_heap_limit64() to
limit the total memory use in SQLite. It's a soft limit
so SQLite may allocate more if it has no other choice.

Open DBs in read-only mode if possible; it saves memory,
as constraints are then not stored in memory.
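
For illustration, a rough sketch of both ideas from the shell (the
limit value is made up; -readonly and the soft_heap_limit pragma map
to SQLITE_OPEN_READONLY and sqlite3_soft_heap_limit64() in the C API):

  $ sqlite3 -readonly mydb.sqlite
  sqlite> PRAGMA soft_heap_limit=100000000;  -- soft cap of ~100MB on SQLite's heap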

Regards
Dominique


[sqlite] demoRead() function in test_demovfs.c not consistent with documentation

2019-06-15 Thread Dominique Pellé
Hi

SQLite page https://sqlite.org/c3ref/io_methods.html says
at the bottom:

=== BEGIN QUOTE ===
If xRead() returns SQLITE_IOERR_SHORT_READ it must
also fill in the unread portions of the buffer with zeros. A VFS
that fails to zero-fill short reads might seem to work.
However, failure to zero-fill short reads will eventually lead
to database corruption.
=== END QUOTE ===

Yet, I see that function demoRead() in the demo
VFS example at https://www.sqlite.org/src/doc/trunk/src/test_demovfs.c
does not zero-fill the buffer in case of SQLITE_IOERR_SHORT_READ.

It looks like a bug in the demo code, or is the
SQLite documentation incorrect?

Regards
Dominique


Re: [sqlite] Should SQLite distinguish between +0.0 and -0.0 on output?

2019-06-12 Thread Dominique Pellé
James K. Lowden  wrote:

> On Wed, 12 Jun 2019 09:35:13 -0400
> Richard Hipp  wrote:
>
> > Question:  Should SQLite be enhanced to show -0.0 as "-0.0"?
>
> No.
>
> 1.  Prior art.  I can't think of a single programming language that
> displays -0.0 without jumping through hoops.

I disagree. I think most languages distinguish between
+0.0 and -0.0, +inf, -inf and NaN. At least C, C++, Java,
Tcl and probably many others have +0.0 and -0.0. Even my
text editor Vim knows about +0.0 and -0.0, +inf, -inf and NaN.

In Vim:
:echo 1.0 / -0.0
-inf
:echo 1.0 / 0.0
inf

In tclsh:

% expr 1.0 / -0.0
-Inf
% expr 1.0 / 0.0
Inf

> 2.  Math.  Negative zero is not a mathematical concept.

It's about IEEE 754 which is widely used, and not about
math i.e. real numbers.

> 3.  Utility.  There is none.  The user is only inconvenienced.

There can be uses for it.  +0.0 or -0.0 can result
in different outcomes when doing intermediate
computations.  If SQLite deviates from IEEE 754, it's
likely to cause issues.

I think that a database should be generic enough and not
assume that users don't need -0.0.  SQLite is written in C
and all all C functions which use double already understand
+0.0 and -0.0. So I assume that there is hardly any
overhead in supporting -0.0 in SQLite.

Regards
Dominique


Re: [sqlite] Stack allocation upper bound with SQLITE_USE_ALLOCA

2019-04-03 Thread Dominique Pellé
Jonathan Brandmeyer  wrote:

> What is the upper bound for stack consumption under the
> SQLITE_USE_ALLOCA compile-time option?  I see that there are a number
> of configurable size limits available as compile-time and/or run-time
> options.  Which ones affect the maximum alloca?
>
> Thanks,
> Jonathan Brandmeyer

I recall using this tool to measure stack usage on Linux:

  https://github.com/d99kris/stackusage

It gives you the stack high watermarks for each thread,
with no noticeable runtime overhead.

Of course, there is no guarantee that you hit the theoretical
worst case for stack usage while using the tool, but you at
least get the typical stack usage.

You could try measuring with and without -DSQLITE_USE_ALLOCA.

Other relevant defines that may influence stack usage:

SQLITE_SPELLFIX_STACKALLOC_SZ
SQLITE_SMALL_STACK

Dominique


Re: [sqlite] FTS5 Transaction Leads to OOB Read

2019-03-18 Thread Dominique Pellé
Chu  wrote:

> The code:
>
> ```
> CREATE VIRTUAL TABLE t1 USING fts5(content);
>
> BEGIN;
> INSERT INTO t1 (content) VALUES('');
> SELECT * FROM 
> t1('*');
> END;
> ```
>
> As you can see, it creates a virtual table with fts5 and runs a transaction
> on it; this leads to an OOB read. The ASAN report:
>
> ```
> ➜  sqlite-crashes ../sqlite-autoconf-3270200/sqlite3 < 2-oob-read.sql
> =
> ==21007==ERROR: AddressSanitizer: heap-buffer-overflow on address 
> 0x60d02898 at pc 0x7f0cad16e6a3 bp 0x7ffdc88ddc80 sp 0x7ffdc88dd430
> READ of size 81 at 0x60d02898 thread T0
> #0 0x7f0cad16e6a2  (/lib/x86_64-linux-gnu/libasan.so.5+0xb86a2)
> #1 0x563324ca4013 in fts5HashEntrySort 
> /root/Documents/sqlite-autoconf-3270200/sqlite3.c:207762
> #2 0x563324e685f9 in sqlite3Fts5HashScanInit 
> /root/Documents/sqlite-autoconf-3270200/sqlite3.c:207820
> #3 0x563324e685f9 in fts5SegIterHashInit 
> /root/Documents/sqlite-autoconf-3270200/sqlite3.c:210321

...snip..

Just to confirm: there is a bug when running your
queries in the SQLite-3.27.2 shell with valgrind.

I did not get a heap overflow, but valgrind complains
about uninitialized memory in the same fts5HashEntrySort
function as in your stack:

$ valgrind --track-origins=yes --num-callers=50 sqlite3_shell
==10856== Memcheck, a memory error detector
==10856== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==10856== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==10856== Command: ./Output/Binary/x86_64-Linux-clang/Debug/bin/sqlite3_shell
==10856==
SQLite version 3.27.2 2019-02-25 16:06:06
NDSeV devkit 3.27.2.1 2019-02-26 16:04:39 990c4f90c3340db5
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE VIRTUAL TABLE t1 USING fts5(content);
sqlite> BEGIN;
sqlite> INSERT INTO t1 (content) VALUES('');
sqlite> SELECT * FROM
t1('*');
==10856== Conditional jump or move depends on uninitialised value(s)
==10856==at 0x4C362B2: __memcmp_sse4_1 (vg_replace_strmem.c:)
==10856==by 0x4ECC86: fts5HashEntrySort (nds_sqlite3.c:207811)
==10856==by 0x4EC55D: sqlite3Fts5HashScanInit (nds_sqlite3.c:207869)
==10856==by 0x4EBB45: fts5SegIterHashInit (nds_sqlite3.c:210370)
==10856==by 0x4EB1BE: fts5MultiIterNew (nds_sqlite3.c:211319)
==10856==by 0x4EB5A8: fts5SetupPrefixIter (nds_sqlite3.c:212995)
==10856==by 0x4EAE4A: sqlite3Fts5IndexQuery (nds_sqlite3.c:213324)
==10856==by 0x4EAA61: fts5ExprNearInitAll (nds_sqlite3.c:205310)
==10856==by 0x4EA6F3: fts5ExprNodeFirst (nds_sqlite3.c:205827)
==10856==by 0x4EA5B4: sqlite3Fts5ExprFirst (nds_sqlite3.c:205885)
==10856==by 0x4E9D1A: fts5CursorFirst (nds_sqlite3.c:215420)
==10856==by 0x4E2DD3: fts5FilterMethod (nds_sqlite3.c:215702)
==10856==by 0x4672A7: sqlite3VdbeExec (nds_sqlite3.c:90382)
==10856==by 0x42F876: sqlite3Step (nds_sqlite3.c:81765)
==10856==by 0x42F51C: sqlite3_step (nds_sqlite3.c:81830)
==10856==by 0x4236B8: exec_prepared_stmt (shell.c:10469)
==10856==by 0x4104EB: shell_exec (shell.c:10776)
==10856==by 0x42414E: runOneSqlLine (shell.c:16136)
==10856==by 0x410C5A: process_input (shell.c:16236)
==10856==by 0x40752D: main (shell.c:16995)
==10856==  Uninitialised value was created by a heap allocation
==10856==at 0x4C2FE56: malloc (vg_replace_malloc.c:299)
==10856==by 0x4E0098: sqlite3MemMalloc (nds_sqlite3.c:22886)
==10856==by 0x441ED0: mallocWithAlarm (nds_sqlite3.c:26718)
==10856==by 0x425E0B: sqlite3Malloc (nds_sqlite3.c:26748)
==10856==by 0x425E7F: sqlite3_malloc64 (nds_sqlite3.c:26772)
==10856==by 0x4FE464: sqlite3Fts5HashWrite (nds_sqlite3.c:207636)
==10856==by 0x4FE1F0: sqlite3Fts5IndexWrite (nds_sqlite3.c:213247)
==10856==by 0x4FACB9: fts5StorageInsertCallback (nds_sqlite3.c:217629)
==10856==by 0x505B04: fts5UnicodeTokenize (nds_sqlite3.c:218923)
==10856==by 0x4F623D: sqlite3Fts5Tokenize (nds_sqlite3.c:204268)
==10856==by 0x4FA0DD: sqlite3Fts5StorageIndexInsert (nds_sqlite3.c:217984)
==10856==by 0x4F9E44: fts5StorageInsert (nds_sqlite3.c:215929)
==10856==by 0x4E359D: fts5UpdateMethod (nds_sqlite3.c:216036)
==10856==by 0x4677E6: sqlite3VdbeExec (nds_sqlite3.c:90593)
==10856==by 0x42F876: sqlite3Step (nds_sqlite3.c:81765)
==10856==by 0x42F51C: sqlite3_step (nds_sqlite3.c:81830)
==10856==by 0x4236B8: exec_prepared_stmt (shell.c:10469)
==10856==by 0x4104EB: shell_exec (shell.c:10776)
==10856==by 0x42414E: runOneSqlLine (shell.c:16136)
==10856==by 0x410C5A: process_input (shell.c:16236)
==10856==by 0x40752D: main (shell.c:16995)
==10856==
sqlite> END;


Re: [sqlite] Was there an announcement of 3.27?

2019-02-08 Thread Dominique Pellé
David Raymond wrote:

> SQLite version 3.27.1 is now available on the SQLite website:
>
>   https://sqlite.org/
>   https://sqlite.org/download.html
>   https://sqlite.org/releaselog/3_27_1.html

Release notes https://sqlite.org/releaselog/3_27_1.html say:

=== BEGIN QUOTE ===
Added the remove_diacritics=2 option to FTS3 and FTS5.
=== END QUOTE ===

I wonder what this does.  The FTS3 and FTS5 docs were not updated:
they only document remove_diacritics=0 and 1.

I also use the opportunity to report a few typos in
https://sqlite.org/fts5.html:

- the second character replaced with an *asterix* (-> asterisk)
- fts5 extension function made as part *of of* (repeated word "of")
-  *an the* (-> the) auxiliary data is set to NULL

Regards
Dominique


[sqlite] Missing SQLITE_OMIT_* in https://www.sqlite.org/compile.html

2018-10-15 Thread Dominique Pellé
Hi

I noticed that the recently added options
SQLITE_OMIT_WINDOWFUNC and
SQLITE_OMIT_UPSERT are not described at:

  https://www.sqlite.org/compile.html.

I suppose that they should be described on that page.
Perhaps other SQLITE_OMIT_* options are also missing
from the page. I did not check all of them.

Regards
Dominique


Re: [sqlite] Bug Report: corrupted double-linked list

2018-10-10 Thread Dominique Pellé
Daniel Espinosa  wrote:

> Thanks for the response.
>
> Until now I just used GDB, because it is easy with Meson.  I will try
> Valgrind, but I will be slow until I find a way to reproduce the problem.
>
> I decided to report this bug because the trace places the fault deep in
> the SQLite destruction sequence, but anyway, I will check whether the
> issue is related to multi-threaded destruction in GDA.

As an alternative to valgrind, build your app and SQLite with
-fsanitize=address (assuming you use gcc or clang) and rerun
the test normally. It's likely that it will then pinpoint the
problem in your application rather than in SQLite.
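
For example, a minimal sketch (flags only; adjust file names and the
link line to your build):

  $ gcc -g -fsanitize=address -c sqlite3.c -o sqlite3.o
  $ gcc -g -fsanitize=address myapp.c sqlite3.o -o myapp -lpthread -ldl
  $ ./myapp        # ASan reports the first invalid access with a stack trace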

Dominique


Re: [sqlite] LIMIT with OFFSET is not working in android platform

2018-05-21 Thread Dominique Pellé
On Mon, May 21, 2018 at 6:04 PM Richard Hipp  wrote:

> > LOCAL_CFLAGS += -O0
> >
> > LOCAL_CPPFLAGS += -O0

> Looks like you have found a bug in clang.  I recommend updating to the
> latest version of the compiler that you can get your hands on, and see
> if that doesn't fix the problem.

> Running SQLite with compiler optimizations off will make it slower.

Either it is a bug in clang, or SQLite has undefined behavior,
which could cause differences between debug and release builds.

Can you try to build with -fsanitize=undefined and see if it
reports anything?

Dominique


Re: [sqlite] Lots of enhancements coming to version 3.24.0 - please test

2018-05-09 Thread Dominique Pellé
Richard Hipp  wrote:

> There are a lot of important changes in the SQLite code base since the
> previous release.  See
> https://www.sqlite.org/draft/releaselog/current.html for the latest
> summary.

Some corrections in the draft doc:

In https://www.sqlite.org/draft/releaselog/current.html

"[...] EXPLAIN QUERY PLAN output *an* reformats it [...]"
-> typo: an -> and


In https://www.sqlite.org/draft/c3ref/keyword_check.html

"Put all *indentifier* names [...]"
-> typo: indentifier -> identifier


In https://www.sqlite.org/draft/eqp.html

"EXPLAIN QUERY PLAN is most useful on a SELECT statement, but may also *be*
appear with other statements that [...]"
-> The word "be" should be removed.

"In the *In the* following example [...]"
-> Repeated "In the"

"[...] is *acctually* evaluated [...]
-> typo: acctually -> actually

Regards
Dominique


[sqlite] SQLite performance graph not updated + 3 typos

2018-04-09 Thread Dominique Pellé
Hi

The SQLite performance graph at https://www.sqlite.org/cpu.html
has not been updated since SQLite-3.21 (the current release being
SQLite-3.23).

Was it forgotten, or is the graph updated only once in a while?

Also here are 3 typos found at https://www.sqlite.org/cpu.html
in the same sentence:

Recent *version* *so* SQLite use less *then* a third of the CPU [...]

->
Recent versions of SQLite use less than a third of the CPU [...]

Regards
Dominique


Re: [sqlite] printf() problem padding multi-byte UTF-8 code points

2018-02-17 Thread Dominique Pellé
Richard Hipp  wrote:

> On 2/17/18, Ralf Junker  wrote:
>> Example SQL:
>>
>> select
>>length(printf ('%4s', 'abc')),
>>length(printf ('%4s', 'äöü')),
>>length(printf ('%-4s', 'abc')),
>>length(printf ('%-4s', 'äöü'))
>>
>> Output is 4, 3, 4, 3. Padding seems to take into account UTF-8 bytes
>> instead of UTF-8 code points.
>>
>> Should padding not work on code points and output 4 in all cases as
>> requested?
>
> The current behavior of the printf() function in SQLite, goofy though
> it may be, exactly mirrors the behavior of the printf() C function in
> the standard library in this regard.
>
> So I'm not sure whether or not this is something that ought to be "fixed".


For what it's worth, this is what bash does, which looks
consistent with SQLite:

$ printf '[%4s]\n' 'abc'
[ abc]
$ printf '[%4s]\n' 'äöü'
[äöü]
$ printf '[%-4s]\n' 'abc'
[abc ]
$ printf '[%-4s]\n' 'äöü'
[äöü]

Perl does the same:

$ perl -e 'printf("[%4s]\n", "äöü")'
[äöü]

Vim's printf() function does the same, but Vim also
has a more convenient %S not present in the C printf();
see :help printf()

  %s    string
  %6S   string right-aligned in 6 display cells
  %6s   string right-aligned in 6 bytes
  %.9s  string truncated to 9 bytes

:echo printf('[%4s]', 'äöü')
[äöü]
:echo printf('[%4S]', 'äöü')
:[ äöü]

Perhaps SQLite could add %S along those lines.
After all, SQLite already added "%q", "%Q", "%w"
and "%z" which are not present in the C printf().

Vim uses the number of display cells (not the number of
code points). East Asian characters generally take
twice the width of Latin characters on screen, so
such characters occupy 2 cells. Vim also
provides functions to find string length in bytes strlen(),
in display cells strwidth() and number of characters
strchars():

:echo strlen('äöü')
6
:echo strwidth('äöü')
3
:echo strchars('äöü')
3

With a more interesting string containing
East Asian characters:

:echo strlen('äöü中文')
12
:echo strwidth('äöü中文')
7
:echo strchars('äöü中文')
5

Regards
Dominique


Re: [sqlite] custom rank function with FTS3/FTS4: potential crash.

2017-10-06 Thread Dominique Pellé
Dan Kennedy  wrote:

On 10/06/2017 11:58 PM, Richard Hipp wrote:
>
>> On 10/6/17, Jens Alfke  wrote:
>>
>>> Thanks for posting this — my adaptation of that sample code in my project
>>> had the same bug.
>>>
>>> I’m not sure how the SQL in your application gets generated, but if you
>>> allow untrusted SQL, it’s still possible to create a query that can cause
>>> the rank function to crash.
>>>
>> We (developers) have already made a note to update the code example in
>> the documentation to include lots of validity checking on the
>> matchinfo() blob.
>>
>
> New version on the draft website here:
>
>   http://sqlite.org/draft/fts3.html#appendix_a
>
> Any further bug reports or feedback welcome!
>
> Dan.



A few corrections on this page about FTS (hopefully not too nit-picky):

1)
=== BEGIN QUOTE ===
-- ("driver" may also appear in the title, but this alone will not satisfy
the.
-- query criteria).
=== END QUOTE ===

There should not be a dot in "the."

2)
=== BEGIN QUOTE ===
putting the keyword "NEAR" between two phrase
=== END QUOTE ===

two phrase -> two phrases

3)
=== BEGIN QUOTE ===
The do not influence the results
=== END QUOTE ===

The do not -> They do not...

Regards
Dominique


Re: [sqlite] Seeking advice on improving batch sql update performance on zipvfs

2017-09-08 Thread Dominique Pellé
Yue Wu  wrote:

> Hello,
> As mentioned in the subject, our goal is to improve the performance of
> batch SQL updates.
> The update sql as follow,
>
>> UPDATE ITEM SET FIELD4 =? WHERE DS=?
>
> We run 100,000 updates in a single transaction. The zipvfs version takes
> about 20 min while uncompressed version takes about 7 min.

Which compression algorithm do you use with zipvfs?
Try LZ4: it is several times faster than zlib at compression and
decompression, at the cost of compressing a bit less:

https://github.com/lz4/lz4

Or try zstd, which compresses better than zlib and is faster
(especially on 64-bit architectures), though not as fast as LZ4:

http://facebook.github.io/zstd/

Also, what is your SQLite page size?
Reducing the SQLite page size probably helps to speed up
updates, since zipvfs compresses page by page.

Dominique


Re: [sqlite] Version 3.20.0 coming soon...

2017-07-13 Thread Dominique Pellé
Richard Hipp  wrote:

> The 3.20.0 release of SQLite is expected in about a week.  Please
> report any issues that you have with the beta as soon as possible.
>
> Code is available in the "prerelease snapshot" at
> https://sqlite.org/download.html
>
> An overview of changes is at https://sqlite.org/draft/releaselog/3_20_0.html
>
> You can send issues to this email list, or directly to me.


Looking at the description of the new SQLITE_PREPARE_PERSISTENT
at https://sqlite.org/draft/c3ref/c_prepare_persistent.html
it's not clear to me what the benefits are.
Shouldn't the description say why it can be beneficial to
use it or not?

Regards
Dominique


Re: [sqlite] Tiny typo in doc.

2017-05-22 Thread Dominique Pellé
Richard Hipp wrote:

> On 5/22/17, Donald Griggs  wrote:
>> http://www.sqlite.org/releaselog/3_19_0.html
>>
>> "containing two ore more"
>>
>
> Thank you.  And since "ore" is a real word, the spelling checker
> didn't catch this.  :-\

Another typo not caught by the spelling checker
in the same release notes page:

"using the index rather that stepping"  (that -> than)

Dominique


Re: [sqlite] json_valid() crash on Windows - was: json() number value parsing

2017-04-11 Thread Dominique Pellé
Ralf Junker  wrote:

> SQLite on Windows crashes when running this test:
>
>   n_structure_10_opening_arrays.json
>
> The crash results from a stack overflow because json_valid() is implemented
> using a recursive parser. All versions of json1.c up to current trunk are
> affected.
>
> Here is a small SQL snippet that also triggers the overflow:
>
>   SELECT json_valid(json) FROM (
> WITH RECURSIVE
>   cnt(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM cnt
> WHERE x < 4) -- Eventually increase this!
> SELECT group_concat('[', '') AS json FROM cnt);
>
> Depending on compiler and available stack, you may need to increase 4 to
> a larger number to reproduce the problem. sqlite3.exe 3.18.0 on Windows 7
> ran out of stack at around 35000.
>
> The problem might escape Linux testing because it usually has a much larger
> default stack size than Windows.
>
> One solution would be to limit the parser's nesting depth as RFC 7159
> allows:
>
>   https://tools.ietf.org/html/rfc7159#section-9
>
> Ralf

A better solution is to avoid recursion when parsing JSON.

JSON parsing in Vim had the same kind of problem and it was
reworked in vim-8.0.169 to avoid recursion to be able to
parse n_structure_10_opening_arrays.json.
It can be a big change though.

Dominique


Re: [sqlite] json() number value parsing

2017-04-10 Thread Dominique Pellé
Dominique Devienne  wrote:

> On Sun, Apr 9, 2017 at 10:34 AM, Olivier Mascia  wrote:
>
>> > Le 9 avr. 2017 à 03:08, Jens Alfke  a écrit :
>> >
>> >> On Apr 7, 2017, at 5:26 PM, Rolf Ade  wrote:
>> >> ./sqlite3
>> >> SQLite version 3.19.0 2017-04-07 20:20:08
>> >> [...]
>> >> sqlite> select json(' { "this" : 000.23 } ');
>> >> {"this":000.23}
>> >> If I read RFC 7159 (http://www.rfc-editor.org/rfc/rfc7159.txt <
>> http://www.rfc-editor.org/rfc/rfc7159.txt>) correct
>> >> this should return: "Error: malformed JSON".
>> >
>> > In this case I would go with Postel’s Law, paraphrased as “Be strict in
>> what you write, but lenient in what you read.” I don’t see a point in
>> disallowing something as trivial as redundant leading zeroes.
>>
>> If you'd go with Postal's Law, you would make it so:
>>
>> sqlite> select json(' { "this" : 000.23 } ');   // be lenient in what you
>> read
>> {"this":0.23}   // be strict in what you
>> write
>>
>
> I disagree. There's a spec. it should be followed, by default.
>
> A separate, explicitly enabled "lenient mode" could be added,
> for leading zeros, NaNs, C-comments, binary strings, etc...
> but it should not be the default.
>
> Otherwise you end up forever having to support non-conformity,
> possibly simply because of an oversights.
>
> Plus json_valid() is then lying about checking well-formed'ness. --DD
>
> sqlite> select json_valid('');
> 0
> sqlite> select json_valid('00.1');
> 1
> sqlite> select json_valid('0.1');
> 1
> sqlite> select json_valid('[]');
> 1
> sqlite> select json_valid('[00]');
> 1
> sqlite> select json_valid('[00.00]');
> 1


SQLite's JSON parser could be added to https://github.com/nst/JSONTestSuite
(in the parsers/ directory).

This JSON test suite checks many kinds of invalid and valid JSON
inputs to make sure that they are rejected or accepted as expected
by RFC 7159. It surprisingly finds issues in many JSON parsers.

Dominique


Re: [sqlite] Compressed schema in memory?

2017-03-17 Thread Dominique Pellé
On Thu, Mar 16, 2017 at 11:19 PM, Bob Friesenhahn
 wrote:
> On Thu, 16 Mar 2017, Richard Hipp wrote:
>>
>>
>> Your 664K is a conservative estimate.  On my (64-bit linux) desktop,
>> I'm showing 1.58MB of heap space used to store the schema.  (Hint:
>> bring up the database in the command-line shell, load the schema by
>> doing something like ".tables", then type ".stats".  There will be a
>> line that shows you the total amount of heap memory devoted to storing
>> the schema.  I'm showing 1583864 bytes.)
>
>
> We are still using version 3.10.2 and when I type ".stats" I get only
> "Usage: .stats on|off".

Consider upgrading to a more recent version of SQLite.
In release notes at https://sqlite.org/changes.html  I see:

SQLite 3.12.0:
- Reduce the amount of memory needed to hold the schema.

No idea how much this saves though.

Also, defining some SQLITE_OMIT_* options can save memory
when storing the schema, I think.

I also recall that at some point, SQLite introduced a memory saving
by not storing constraints in memory for read-only connections.
I don't know which SQLite version introduced this, but make sure
that you open connections as read-only if you can.  In my case,
it roughly halved the amount of memory used for the schema.

I'm also interested in saving memory in schema by the way.
So +1 if there is any potential improvement left.

Dominique


Re: [sqlite] Why is this so much more efficient?

2017-02-16 Thread Dominique Pellé
Cecil Westerhof  wrote:

> I have a table vmstat that I use to store vmstat info. ;-)
> At the moment it has more than 661 thousand records.
>
> In principle the values of usertime, systemtime, idletime, waittime and
> stolentime should add up to 100. I just wanted to check it. Of course there
> could be a rounding error, so I wrote the following query:
> SELECT date
> ,  time
> ,  usertime
> ,  systemtime
> ,  idletime
> ,  waittime
> ,  stolentime
> ,  (usertime + systemtime + idletime + waittime + stolentime) AS
> totaltime
> FROM   vmstat
> WHERE  totaltime  < 99 OR totaltime > 101
>
> I did not like that, so I rewrote the WHERE to:
> WHERE  ABS(100 - totaltime) > 1
>
> The funny thing is that the second WHERE is more efficient than the first,
> where I would have expected it to be the other way around.
> The first takes around 1050 milliseconds.
> The second takes around 950 milliseconds.
> So the second is around 10% more efficient. Why is this?
>
> In case it is important: I did this in sqlitebrowser 3.7.0, which uses
> SQLite 3.8.10.2.


I suspect that when you use "WHERE ABS(100 - totaltime) > 1"
SQLite cannot use an index, since it's an expression, so it does a
full table scan, whereas the other solution, which does
"WHERE  totaltime  < 99 OR totaltime > 101",
may use an index on totaltime (assuming that there is an index).

In general using an index is good.  But if most of the records
satisfy the condition "ABS(100 - totaltime) > 1" then an index
can be more harmful than useful.   And that could explain
why using "WHERE ABS(100 - totaltime) > 1" is a bit faster.
You could try "EXPLAIN QUERY PLAN" on your queries to
see if they use an index or if they do a full table scan.
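
For example, a sketch in the shell (the exact output format varies by
SQLite version, and the condition here is only illustrative):

  sqlite> EXPLAIN QUERY PLAN
     ...> SELECT * FROM vmstat WHERE ABS(100 - usertime) > 1;
  0|0|0|SCAN TABLE vmstat

"SCAN TABLE" means a full table scan; "SEARCH TABLE ... USING INDEX"
means an index is used.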

Dominique


Re: [sqlite] Smallest reasonable cache size

2017-02-16 Thread Dominique Pellé
Kim Gräsman  wrote:

> Hi all,
>
> In my battles with lots of connections competing over precious cache
> memory, I've considered giving some non-critical connections zero
> cache using `PRAGMA cache_size=0`.
>
> Is this a reasonable thing to do? If zero is too extreme, what might a
> more moderate small cache size be? 32? 64? 100?
>
> Some of these connections are only opened to do `PRAGMA quick_check;`,
> I'm guessing that operation is not very cache-intensive, assuming it
> reads pages sequentially and checks them?
>
> Thanks for any advice on this,
> - Kim


I think it's probably best to give a large cache_size to
each connection, and limit the total amount of memory
used by SQLite with sqlite3_soft_heap_limit64().
This will effectively limit the global amount of cache
used, provided cache pages come from the heap.

For example, if you give 20MB to each connection
(PRAGMA cache_size=-20000) and limit the global SQLite
heap usage to 30MB with sqlite3_soft_heap_limit64(), then even
if you have say 100 connections, SQLite will not use
more than 30MB of heap. If a connection needs to
cache pages, unpinned cache pages from other
connections will be discarded in LRU order. Strictly
speaking, SQLite could use more than the 30MB
soft limit if it has no other choice to allocate memory
(hence a soft limit), but in general that does not happen.
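
A sketch of that setup from the shell (values are illustrative; the
pragma is the shell-level equivalent of sqlite3_soft_heap_limit64()):

  sqlite> PRAGMA cache_size=-20000;         -- ~20MB of page cache for this connection
  sqlite> PRAGMA soft_heap_limit=30000000;  -- ~30MB global soft heap cap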

That way, inactive connections (connections that
are open, but on which no queries have been run in a
long time) do not hog pages in memory. Their pages
get discarded after a while. Memory is automatically
put to best use caching the pages of the most active
connections. At least that's my understanding.

Dominique


Re: [sqlite] SQLite 3.16.0 enters testing

2016-12-29 Thread Dominique Pellé
Bob Friesenhahn  wrote:

> On Thu, 29 Dec 2016, Darko Volaric wrote:
>
>> What are you basing that theory on?
>
>
> Perf is claimed to provide very good results but they are real results based
> on real measurements.  Due to this, the measured results are very different
> for the first time the program is executed and the second time it is
> executed.  Any other factor on the machine would impact perf results.
>
> It seems that cachegrind produces absolutely consistent results which do not
> depend on I/O, multi-core, or VM artifacts.

You're right. Consistency can matter more when measuring
small improvements that add up.

I just tried "valgrind --tool=cachegrind ..." and "perf stat ..."
with the same command. The valgrind results were indeed
more consistent across multiple runs.

Regarding speed of measurement, the same command
took 13.8 sec with cachegrind vs only 0.28 sec with "perf stat"
and 0.27 sec with neither cachegrind nor perf stat. So
perf stat has almost no overhead whereas cachegrind has a
big overhead, making it impractical when measuring slow
commands.

Dominique


Re: [sqlite] SQLite 3.16.0 enters testing

2016-12-29 Thread Dominique Pellé
Richard Hipp  wrote:

> On 12/29/16, Bob Friesenhahn  wrote:
>> Is there a way to know how well cachegrind CPU
>> cycles map to real-world CPU usage?
>
> Not that I know of.  If you have any suggestions, please speak up.

The 'stat' command of the Linux perf tool would be a good
candidate to replace the cachegrind stats. It's much faster
than cachegrind. I suppose that it also gives more realistic
results. See:

https://perf.wiki.kernel.org/index.php/Tutorial#Counting_with_perf_stat

Dominique


Re: [sqlite] Any performance penalty for SELECTing more columns? (C API)

2016-10-15 Thread Dominique Pellé
Jens Alfke  wrote:

> In a simple SELECT query of a single table, using the C API,
> is there any difference in performance for requesting more or
> fewer columns of the table in the result? Or is the performance
> penalty only incurred when actually reading the column values?

There can sometimes be a big difference.

The fewer columns you select, the better the chance
of being able to use a covering index.  If the query can use
a covering index, it's likely to see a significant speed-up.
According to section 8.0 at https://www.sqlite.org/optoverview.html
being able to use a covering index makes the query about
twice as fast.
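
A small illustration (table and index names are made up):

  CREATE TABLE t(a, b, c);
  CREATE INDEX t_ab ON t(a, b);

  SELECT b FROM t WHERE a=1;     -- satisfiable from the index alone (covering)
  SELECT b, c FROM t WHERE a=1;  -- must also visit the table rows to fetch c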

Regards
Dominique


Re: [sqlite] transaction during reading on read-only database

2016-10-06 Thread Dominique Pellé
Andrii Motsok  wrote:

> Hi,
>
>
> My understanding is that any read from the database automatically
> starts a read transaction. The question: why does sqlite need to do it on
> a read-only database?
>
>
> Regards,
>
> Andrii

You may open a database as read-only, but another process can open it
as read-write.  So SQLite needs to check the lock even when the DB
is opened as read-only.

Now if the database is on a read-only file system, it is possible
to tell SQLite by opening it with file:foo.sqlite?immutable=1
(see https://www.sqlite.org/uri.html), and that can save some time.
But make sure to read the caveats at the above URL.
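
For example (assuming your build has URI filenames enabled, which the
standard sqlite3 shell does):

  $ sqlite3 'file:foo.sqlite?immutable=1'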

Regards
Dominique


Re: [sqlite] Bug in CREATE INDEX

2016-08-07 Thread Dominique Pellé
Kevin O'Gorman wrote:

> CREATE INDEX has two problems:
> 1) poor default location of temporary storage.
> 2) gets wedged on very large indexes.
>
> I'm using the sqlite that came with Xubuntu 14.04, I think it's version
> 3.8.2.

SQLite-3.8.2 is old (Dec 2013). It's better to download and
compile SQLite yourself.

There have been several optimizations since that release.
In particular, the release notes at
https://sqlite.org/changes.html mention the following
improvement, which may be relevant to your issue:

=== BEGIN QUOTE https://sqlite.org/changes.html ===
3.8.10:

Performance improvements for ORDER BY, VACUUM,
CREATE INDEX, PRAGMA integrity_check, and
PRAGMA quick_check.
=== END QUOTE ===

Regards
Dominique


Re: [sqlite] Typographical error in sqlite.org/howtocorrupt.html

2016-07-22 Thread Dominique Pellé
Hi

I see another typo in the same page: https://www.sqlite.org/howtocorrupt.html

=== BEGIN QUOTE ===
6.2 Failures of mmap() on QNX

There exists some subtle problem with mmap() on QNX such that
making a second mmap() call against *the a* single file descriptor
can cause the memory [...]
=== END QUOTE ===

I suppose that it's meant to be: [...] against *a* single file descriptor [...]

Regards
Dominique

On Fri, Jul 22, 2016 at 5:26 PM, Richard Hipp  wrote:
> Fix checked in by Joe.  Thanks for reporting.
>
> On 7/20/16, Eubank, Tom  wrote:
>> I believe that there is a typographical error on the following page:
>>
>> https://www.sqlite.org/howtocorrupt.html
>>
>> *7.4 Database pages leak from the free page list*
>>
>> When content is deleted from an SQLite database, pages that are no
>> longer used are added to a free list and are reused to hold content
>> added *but* subsequent inserts.
>>
>> Change 'added *but* subsequent inserts.'
>> to 'added *by* subsequent inserts.'
>>
>> Peace,
>> Tom Eubank
>>
>> ---
>> Explanation, only if needed:
>>  * It seems most likely that this paragraph intends to mean:
>>   "pages in the free list are reused ... by subsequent inserts."
>>  * The sentence as written seems awkward and incomplete.
>>  * If some other reading is intended, such reading is not obvious.
>>  * If 'but' is being used as a conjunction, then something else is
>>missing from the conjunctive clause.
>>
>> This 'Explanation' was added after reading How to Report Bugs Effectively,
>> in case my suggested change is not obvious.
>
>
> --
> D. Richard Hipp
> d...@sqlite.org


Re: [sqlite] Fastest way to add many simple rows to a table?

2016-05-26 Thread Dominique Pellé
Richard Hipp  wrote:

> On 5/26/16, Eric Grange  wrote:
>>
>> I am looking for the fastest way to insert many rows to a simple table.
>
> (1) Create a prepared statement:  "INSERT INTO xyzzy VALUES(?1,?2,...)"
> (2) Run "BEGIN"
> (3) Loop over each row you want to insert, bind values to the prepared
> statement, then call sqlite3_step() and sqlite3_reset().
> (4) Run "COMMIT"
> (5) Call sqlite3_finalize() on the prepared statement to avoid a memory leak.


Additionally, "PRAGMA synchronous=off;" gives significant
speed up for insertions, if you don't mind a corrupted database
in case of system crash. See:

https://www.sqlite.org/pragma.html#pragma_synchronous
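
A sketch of the whole recipe at the shell level (xyzzy is the table
from the recipe above; in C, the INSERTs would be one prepared
statement that is bound, stepped and reset in a loop, once per row):

  sqlite> PRAGMA synchronous=OFF;   -- optional: faster, but not crash-safe
  sqlite> BEGIN;
  sqlite> INSERT INTO xyzzy VALUES(1, 'one');
  sqlite> INSERT INTO xyzzy VALUES(2, 'two');
  sqlite> COMMIT;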

Dominique


[sqlite] Use of __builtin_expect in SQLite

2016-02-07 Thread Dominique Pellé
Hi

I see that SQLite has many small optimizations being
checked-in. Wouldn't it help to use the following macros
and use unlikely(...) for error paths:

#ifdef __GNUC__
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define likely(x)   (x)
#define unlikely(x) (x)
#endif

It helps branch prediction and instruction cache hits,
as cold branches are moved away from hot branches.
CPUs have branch predictors, but for complex code with
many 'if's, static hints can still help. It is better to compile
with PGO (profile guided optimization), but few people
use PGO in practice.

Doing a Google search, I found that __builtin_expect was
disabled because it did not work with old gcc and was
then completely removed in 2012:

http://dev.gitblit.com:8080/cgi-bin/repo/sqlite/info/e01f9ed9450d3e23
http://sqlite.org/src4/info/d618b9b1069e66779c298798acb24044664b5109

However, it's possible to check for the gcc version
using something like this, as found here:

http://www.opensource.apple.com/source/X11proto/X11proto-57/xproto/xproto-7.0.21/Xfuncproto.h.in?txt

#if defined(__GNUC__) && ((__GNUC__ * 100 + __GNUC_MINOR__) >= 303)
# define _X_LIKELY(x)   __builtin_expect(!!(x), 1)
# define _X_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else /* not gcc >= 3.3 */
# define _X_LIKELY(x)   (x)
# define _X_UNLIKELY(x) (x)
#endif
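
For illustration, here is a small self-contained sketch (hypothetical
code, not from SQLite) showing the intended usage with gcc/clang:
error checks are wrapped in unlikely() so the compiler lays out the
hot path as straight-line code:

#include <stdio.h>

#define unlikely(x) __builtin_expect(!!(x), 0)  /* gcc/clang only */

int copyByte(FILE *in, FILE *out)
{
  int c = fgetc(in);
  if (unlikely(c == EOF)) {
    return -1;                        /* cold error path */
  }
  if (unlikely(fputc(c, out) == EOF)) {
    return -1;                        /* cold error path */
  }
  return 0;                           /* hot path */
}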

I'm curious about the outcome on SQLite benchmarks.

Regards
Dominique


[sqlite] Find out how many times does SQLite hit the disk?

2016-01-31 Thread Dominique Pellé
dpb  wrote:
> Dear SQLite Community,
>
> On windows, is there a way to find out how many times does my SQLite DB hit
> the disk?
>
> This will help me in deciding if moving to In-memory SQLite will improve my
> application performance. I am done with adding indexes to my tables in
> SQLite DB.
>
> If possible, please point me URLs that will help me.

You could print the number of SQLite page misses.
Each miss causes an I/O of page size bytes.
See sqlite3_db_status():

https://www.sqlite.org/c3ref/db_status.html
https://www.sqlite.org/c3ref/c_dbstatus_options.html
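
For example, here is a minimal sketch printing the page cache
hit/miss counters of one connection (each miss translates into an
I/O of page size bytes):

#include <stdio.h>
#include <sqlite3.h>

static void printCacheStats(sqlite3 *db)
{
  int hit = 0, miss = 0, hiwtr = 0;  /* high-water values unused here */
  sqlite3_db_status(db, SQLITE_DBSTATUS_CACHE_HIT,  &hit,  &hiwtr, 0);
  sqlite3_db_status(db, SQLITE_DBSTATUS_CACHE_MISS, &miss, &hiwtr, 0);
  printf("page cache: hits=%d misses=%d\n", hit, miss);
}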

On Linux, I would also use strace to see I/Os. I suppose
that procmon on Windows can also show I/Os.

Regards
Dominique


[sqlite] why is searching for a range of rowids in an FTS table a slow operation?

2016-01-24 Thread Dominique Pellé
skywind mailing lists  wrote:

> Hi,
>
> according to the documentation this is a slow query for FTS tables:
>
> SELECT * FROM mail WHERE rowid BETWEEN 15 AND 20;
>
> while
>
> SELECT * FROM mail WHERE rowid=15;
>
> is fast.
> As far as I know both queries are fast on normal tables.
> Where is the difference / what is the reason for the performance difference?

That's what the FTS3/4 doc says:

== BEGIN QUOTE https://sqlite.org/fts3.html ===
CREATE VIRTUAL TABLE mail USING fts3(subject, body);

SELECT * FROM mail WHERE rowid = 15;-- Fast. Rowid lookup.
[...]
SELECT * FROM mail WHERE rowid BETWEEN 15 AND 20;   -- Slow. Linear scan.
==

But it might not be true anymore, since
in fossil timeline, I see this checkin:

=== 2015-01-29 ===
11:52:22 [85dc12625d] *BRANCH* Optimize range constraints on the rowid
 column of fts3/4 tables even if there is no MATCH clause in the
 query. (user: dan tags: trunk)
===

And for FTS5, I don't see anything about it in the doc, but
I see this checkin in fossil:

=== 2015-06-05 ===
19:05:57 [32cbc0ed36] Make use of range constraints on the rowid field
 of an fts5 table in full-text queries. (user: dan tags: fts5)
===

Regards
Dominique


[sqlite] In the case of ZIPVFS

2015-12-23 Thread Dominique Pellé
??? <2004wqg2008 at 163.com> wrote:

> HI,all
> SQLite retrieves the compressed records is slower than the uncompressed 
> records about 30%.
> How can improve the problem?  and anyone any suggustion?
>
> best regards
> wqg


What is your SQLite page size?  If you access small
records, you're better off with a small page size (fewer
bytes to read and decompress).  If you have large BLOBs
you can afford large page sizes (better compression, which
can result in less I/Os).
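
As a sketch for a plain (uncompressed) SQLite database (zipvfs may
have its own way of configuring this), the page size can be changed
with a pragma, but it only takes effect on an empty database or
after the file is rebuilt with VACUUM:

#include <sqlite3.h>

static void setSmallPageSize(sqlite3 *db)
{
  sqlite3_exec(db, "PRAGMA page_size = 1024;", NULL, NULL, NULL);
  sqlite3_exec(db, "VACUUM;", NULL, NULL, NULL);  /* rebuild with the new page size */
}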

Which compression algorithm are you using? You can
see the compression algorithm in the first few bytes of
the database file. Some compression algorithms are
faster than others. For example, lz4hc decompression
is much faster than zlib (about 10x faster I think) at
the cost of compressing less than zlib.

Are you reading only or also writing to the DB? Some
algorithms have very different speed for compression
and decompression.

And of course, make sure that your queries minimize
the number of SQLite pages read, which is always good
whether you compress, but especially true with compressed
DB. Watch for queries doing full table scan, etc.

Regards
Dominique


[sqlite] about attach database

2015-12-16 Thread Dominique Pellé
Scott Robison  wrote:

> On Tue, Dec 15, 2015 at 11:19 PM, Dominique Pellé  gmail.com
>> wrote:
>
>> ??? <2004wqg2008 at 163.com> wrote:
>>
>> >
>> > After testing the Sqlite3_open and ATTACH DATABASE,
>> > I found that the attach database is slower than sqlite3_open.
>> > there is attachment after the mail which includ the speed
>> > information ( millisecond ).
>>
>>
>> Your attachment was discarded (attachment not allowed in this
>> mailing list).
>>
>> Anyway, I remember observing that:
>>
>> - sqlite3_open_v2(...) is lazy.  In other words, it does not parse the
>>   schema of the DB until the first query is performed after opening
>>   the database.
>> - ATTACH is not lazy. The schema is parsed as soon as you
>>   attach a database.
>>
>> That could explain the difference in speed.
>> Would there be a way to make ATTACH lazy by the way?
>>
>
> Why would that be of benefit to you? Are you intending to attach a database
> and never use it? It seems to me the same amount of time will be taken
> either way.
>
> When it comes to opening a database, there may be a need to do some
> connection specific configuration prior to actually opening the database
> file and parsing the schema. I believe this is the reason why open defers
> that processing until later, giving you a chance to finish configuring your
> connection before locking it down. Once that configuration is complete,
> there is no advantage to deferring the open of the database.
>
> I say no advantage ... maybe I just can't think of one. Why do you think
> there would be an advantage to deferring the open & schema processing of an
> attached database?

Laziness can be useful in some cases. I have an application
that opens a hundred or so database connections. Being able to open
all connections at start-up is simple. Since it's lazy, it's also fast and does
not use memory to store schemas until the databases are actually
used later. In my application, queries happen in only a few connections
after start-up, out of all opened connections. For many connections,
queries happen much later or sometimes do not happen at all. Laziness
is thus useful to make start-up fast and simple, without the application
having to implement laziness itself.

I see that the original message from ??? says "Because there are so
many database [...]", so it seems to be the same scenario as in my
application in which laziness is quite useful. I'm not 100% sure but I'm
quite confident that laziness is the explanation for the performance
discrepancy between sqlite3_open*() and ATTACH.

If laziness was useless, why would it then be already implemented
for sqlite3_open_v2(...)?

Having said all that, reading https://www.sqlite.org/c3ref/open.html
I see no mention of the fact that sqlite3_open*() is lazy.
Is it documented somewhere?

Regards
Dominique


[sqlite] about attach database

2015-12-16 Thread Dominique Pellé
??? <2004wqg2008 at 163.com> wrote:

>
> After testing the Sqlite3_open and ATTACH DATABASE,
> I found that the attach database is slower than sqlite3_open.
> there is attachment after the mail which includ the speed
> information ( millisecond ).


Your attachment was discarded (attachment not allowed in this
mailing list).

Anyway, I remember observing that:

- sqlite3_open_v2(...) is lazy.  In other words, it does not parse the
  schema of the DB until the first query is performed after opening
  the database.
- ATTACH is not lazy. The schema is parsed as soon as you
  attach a database.

That could explain the difference in speed.
Would there be a way to make ATTACH lazy by the way?

Regards
Dominique


[sqlite] Database corruption using zipvfs on android 64 bits (x86_64)

2015-11-06 Thread Dominique Pellé
Alexandre Mainville wrote:

> Hi,
>
> I am experiencing database corruption using zipvfs on android 64
> bits (x86_64).
> The same program executing on 32 bits runs without problems.
>
> The program creates a new database on every run and does a series of
> inserts and updates (always the same series of operations).
>
> Both programs produce the same SQLITE_DEBUG trace up until the point the 64
> bit version outputs:
> zipvfs database is corrupt. Line 1100 of [9d6c1880fb]
> because of a zero size payload in a slot header.
>
> I am waiting on a device to test behavior with arm64-v8a.
>
> I am using sqlite 3.8.8.3 and zipvfs 3.8.8.
>
> If anyone has experienced the same or has some insight on this, it would be
> greatly appreciated.
>
> Thanks
>
> Alex

I'm not sure whether it's the same problem, but on our
project we had memory corruptions on Android with zipvfs.
The bug happened only on Android. Valgrind and asan on
Linux showed no bugs.

The bug turned out not to be in SQLite, but in zlib.
Android was using zlib-1.2.7. After using zlib-1.2.8
instead, corruption did not happen anymore.

Reading the changes in zlib-1.2.8 at http://www.zlib.net I saw:

"Version 1.2.8 fixes a very rare bug in decompression.
All users are encouraged to upgrade immediately."

Can you check whether you use zlib-1.2.7 or zlib-1.2.8?
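
A small sketch to check this: print both the zlib version the
program was compiled against and the one it actually runs with:

#include <stdio.h>
#include <zlib.h>

int main(void)
{
  printf("compiled with zlib %s, running with zlib %s\n",
         ZLIB_VERSION, zlibVersion());
  return 0;
}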

Dominique


[sqlite] Documentation change suggestion: formatchng.html

2015-10-30 Thread Dominique Pellé
Richard Hipp  wrote:

> On 10/29/15, Simon Slavin  wrote:
> >
> > On 29 Oct 2015, at 6:58pm, Richard Hipp  wrote:
> >
> >> That page was really old and mostly obsolete - at least the header
> >> text was.  I've rewritten it.  Please press "Reload"
> >
> > Good.  Will it be updated for everything up to and including partial indexes
> > ?  It's a lot of work so I don't expect it right now.
>
> I don't feel a pressing need to document forwards compatibility breaks.


A typo in the first sentence of that page:

in *incompatable* ways -> in incompatible ways.

Regards
Dominique


[sqlite] version 3.9.0 doc errors

2015-10-09 Thread Dominique Pellé
R.Smith  wrote:

> More draft doc errors:
> https://www.sqlite.org/draft/versionnumbers.html

An additional typo near the end of above page:

=== BEGIN QUOTE ===
Again, the *destinction* (-> distinction) between "major" and "minor"
is subjective.
=== END QUOTE ===

Regards
Dominique


[sqlite] Unable to find the file 'memleak.txt'

2015-07-17 Thread Dominique Pellé
Richard Hipp  wrote:

> On 7/17/15, Sairam Gaddam  wrote:
>> I ran the test files which are present in SQLite for my custom build and I
>> got some memory leaks.
>> I got a message like,
>>
>> Writing unfreed memory log to "./memleak.txt"
>>
>> But I couldn't able to locate that file.
>> Can anyone kindly tell where to find that file or where the unfreed memory
>> log is written ???
>
> I think memleak.txt is only created if you compile with -DSQLITE_MEMDEBUG.
>
> Use valgrind instead.  It works much better.

Or use the leak sanitizer (part of address sanitizer) which is faster
than valgrind:

- compile your program with gcc or clang with: -fsanitize=address
-fno-omit-frame-pointer
- add detect_leaks=1 to ASAN_OPTIONS environment variable. Ex:
export 
ASAN_OPTIONS=abort_on_error=1:detect_stack_use_after_return=1:check_initialization_order=1:strict_init_order=0:color=1:detect_leaks=1:detect_odr_violation=2
- run your program. It will report leaks if any, and other memory bugs.

https://code.google.com/p/address-sanitizer/wiki/LeakSanitizer
https://code.google.com/p/address-sanitizer/wiki/Flags
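
As a quick way to try it, here is a deliberately leaky sketch: the
prepared statement is never finalized, so the leak sanitizer should
flag the allocation made by sqlite3_prepare_v2():

#include <sqlite3.h>

int main(void)
{
  sqlite3 *db = NULL;
  sqlite3_stmt *stmt = NULL;
  sqlite3_open(":memory:", &db);
  sqlite3_prepare_v2(db, "SELECT 1", -1, &stmt, NULL);
  /* missing sqlite3_finalize(stmt): reported as a leak at exit */
  sqlite3_close(db);  /* returns SQLITE_BUSY, statement still open */
  return 0;
}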

Regards
Dominique


[sqlite] SQLite version 3.8.11 coming soon...

2015-07-15 Thread Dominique Pellé
On Wed, Jul 15, 2015 at 8:43 PM, Richard Hipp  wrote:
> On 7/15/15, Török Edwin  wrote:
>> On 07/15/2015 08:34 PM, Török Edwin wrote:
>>
>> In fact such detection could be added to fuzzcheck.c too (I didn't know that
>> you can detect presence of address sanitizer at compile time until now):
>
> It's a clever idea.  I'm not so sure I want to add this to fuzzcheck.
> I think I'd like to have the ability to get ASAN to fail using
> fuzzcheck.


Not only can asan be detected at runtime, but asan also
provides an API to "poison" memory regions:

https://code.google.com/p/address-sanitizer/wiki/ManualPoisoning

It could be useful for SQLite. For example, SQLite could use the
API to poison (see the sketch after this list):
- memory not used in MEMSYS5 buffer provided by
  sqlite3_config(SQLITE_CONFIG_HEAP, ...).
- unused lookaside buffers
- unused scratch buffers
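
A minimal sketch of how such poisoning could look (a hypothetical
integration, not existing SQLite code), poisoning the unused tail of
a static heap handed to sqlite3_config(SQLITE_CONFIG_HEAP, ...):

#include <stddef.h>
#include <sanitizer/asan_interface.h>  /* macros are no-ops without asan */

static char heapBuf[1024*1024];

static void poisonUnusedTail(size_t usedBytes)
{
  /* Any later access inside this region triggers an asan report. */
  ASAN_POISON_MEMORY_REGION(heapBuf + usedBytes, sizeof(heapBuf) - usedBytes);
}

static void unpoisonTail(size_t usedBytes)
{
  ASAN_UNPOISON_MEMORY_REGION(heapBuf + usedBytes, sizeof(heapBuf) - usedBytes);
}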

Regards
Dominique


[sqlite] Segfault during FTS index creation from huge data

2015-04-30 Thread Dominique Pellé
Dan Kennedy  wrote:

> On 04/30/2015 07:41 AM, Scott Robison wrote:
>>
>>
>> I wrote up some test code today that definitely forces a SQLITE_NOMEM
>> condition, and there are no leaks (though there are a couple really large
>> outstanding allocations to force an error without needing to insert
>> gigabytes of data).
>>
>> In thinking about what Dan wrote last night (many short word matches) I
>> decided that maybe my supposition was wrong and it wasn't the number of
>> relatively unique words in my old data set. Using the FTS4AUX virtual
>> table, I recall looking at the info in the DB at the time and seeing a
>> huge
>> number of words with 1 or very few matches, and assumed that was the
>> problem in that data set. However, given the random nature of that data
>> set, and the fact that there are only 26 single letter "words" and 676
>> double letter "words" (and so on), I could have easily missed the
>> relatively few rows of data that had very large numbers of docids /
>> occurrences in the aux table output.
>>
>> My test app goes to the other extreme. It inserts as many rows as possible
>> consisting of the single letter word "a" 256 times, and in my case, it
>> fails after 1,052,641 rows were inserted (1,048,576 rows were committed).
>>
>> In any case, my memory of the "merging" of things was correct, though the
>> precise location may not have been.
>>
>> Notes:
>>
>> 1. Depending on how much memory pressure I put on the system by
>> pre-allocating even more big chunks of memory, the NOMEM error moves
>> around
>> a bit. I've seen it happen in a malloc as well.
>>
>> 2. The reality is that FTS was designed around certain assumptions, and
>> these extra large data sets don't fit into those cases. In my case, the
>> only time I've seen errors was due to synthetic / unrealistic test data.
>> The exponential growth related to segment directory merges seems to
>> dictate
>> that eventually, after a lot of inserts / updates, the data structures are
>> going to get quite large.
>
>
> Thanks for doing this. I'll look at it properly later on today to see if the
> results suggest anything we can do.
>
> I'm thinking there's another problem though. At some point soon, we run into
> this:
>
>   https://www.sqlite.org/limits.html#max_length
>
> The default limit is 10^9 bytes.
>
> The fully-merged doclist generated for the term "a" in the above is probably
> around 256MiB in size. So if we could successfully merge it in memory, it
> could be inserted into the database. However, once you get up over a billion
> records there might be doclists for common terms that exceed this limit.
>
> Dan.


What about FTS5? I understand that it will use less memory than
FTS3/FTS4. Will it solve this problem?

Regards
Dominique


[sqlite] Segfault during FTS index creation from huge data

2015-04-29 Thread Dominique Pellé
Simon Slavin  wrote:

> On 29 Apr 2015, at 9:39am, Scott Robison  wrote:
>
>> On windows, malloc returns null if the allocation fails. Sqlite detects
>> this and returns an error.
>>
>> On linux, malloc may return a non null yet invalid pointer and only fail
>> when the memory is accessed because it wasn't really available.
>>
>> If Sqlite is not at fault when posix APIs lie about file locking, I don't
>> think Sqlite is responsible when malloc lies about what should be
>> considered a failed memory allocation which should return null.
>
> Okay, I see your distinction.  Yes, Linux is lying about malloc having 
> succeeded.
> And there's no further way to check the block it handed back apart from 
> trying to
> use it, which triggers the error.  A program can't do anything about a lying 
> OS,
> so SQLite is not at fault.

I'd suggest trying to reproduce the problem on Linux after disabling
memory overcommit (at least temporarily, for the experiment; see the
commands after this list):

* If it returns SQLITE_NOMEM without segfault, then it's not a bug
  but a limitation of SQLite which uses lots of memory for FTS, or
  a leak in the application.
* If it still segfaults, then there is a bug in SQLite or in the application.
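
For reference (assuming a recent Linux kernel), overcommit can be
toggled with sysctl:

$ sudo sysctl vm.overcommit_memory=2   # 2 = never overcommit
$ sudo sysctl vm.overcommit_ratio=100  # commit limit = swap + 100% of RAM
... run the test ...
$ sudo sysctl vm.overcommit_memory=0   # restore the heuristic default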

Regards
Dominique


[sqlite] Segfault during FTS index creation from huge data

2015-04-03 Thread Dominique Pellé
Artem  wrote:

...snip...
> SQLite is functioning for about 4 hours, after which Segmentation Fault error 
> occurs inevitably.
...snip...

You did not provide any stack dump, so I don't think that SQLite developers
can do anything to help you. Anyway, it's more likely that the bug is in
your application.

Since you use Linux, you should try to recompile and link with
-fsanitize=address -g -O0 -fno-omit-frame-pointer

-fsanitize option is available in recent versions of gcc or clang.
It will likely tell you where the bug is.

Regards
Dominique


[sqlite] Questions about SQLITE_CONFIG_SCRATCH and SQLITE_CONFIG_PAGECACHE

2015-01-29 Thread Dominique Pellé
Hi

1) Question about SQLITE_CONFIG_SCRATCH

In SQLite documentation about SQLITE_CONFIG_SCRATCH,
I read:

=== BEGIN QUOTE https://sqlite.org/c3ref/c_config_getmalloc.html ===
   SQLite will never require a scratch buffer that is more
   than 6 times the database page size. If SQLite needs needs
   additional scratch memory beyond what is provided by this
   configuration option, then sqlite3_malloc() will be used to
   obtain the memory needed.
=== END QUOTE ===

I stumbled upon code where the scratch buffer size is configured
to only 10 KB only, yet some DB have 16KB page sizes:

const int KSize = 10*1024;
const int KBufferCount = 8;
static uint64_t sqliteScratchBuffer[KSize*KBufferCount/sizeof(uint64_t)];

status = sqlite3_config(
   SQLITE_CONFIG_SCRATCH,
   &sqliteScratchBuffer[0],
   KSize,
   KBufferCount);

Is it safe to have only 10KB of scratch buffer when DB page size
can be 16KB?  Is it ideal?  I don't find guidelines about configuring
SQLITE_CONFIG_SCRATCH.

What happens if the scratch buffer was bigger than 6*page size?
Would memory just be wasted?  (since doc says it never allocate
more than 6*page size).


2) Question about SQLITE_CONFIG_PAGECACHE

In order to reduce the number of malloc calls, I consider
configuring SQLITE_CONFIG_PAGECACHE with
a static buffer. However, the application opens multiple
databases with various page sizes (1KB, 4KB, 8KB, 16KB).

So what happens if if do for example:

   // Max DB page size is 16KB. SQLite doc says to add 40 bytes for page header.
   const int KPageSize = 16*1024 + 40;
   const int KPageCount = 512;
   static uint64_t sqlitePageCache[KPageSize*KPageCount/sizeof(uint64_t)];

   status = sqlite3_config(
  SQLITE_CONFIG_PAGECACHE,
  &sqlitePageCache[0],
  KPageSize,
  KPageCount);

Will SQLite use 16KB (=KPageSize) in that buffer to store each page
of DBs even for the DBs where page size is only 1KB or 4KB or 8KB?
If so, it will waste memory for 1KB, 4KB or 8KB pages and
SQLITE_CONFIG_PAGECACHE does not look like a good
idea in such case.

Regards
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Logging and timing queries

2014-07-31 Thread Dominique Pellé
Ketil Froyn  wrote:

> Hi,
>
> In debugging and inspecting applications using sqlite, I find it would be
> very useful to log some or all queries sent to the database, ideally also
> with timing information on how long it took to gather and send the
> response, so that it can be determined what queries are received and how
> long each query actually takes. It's possible to do this at the application
> level, but I think it would be very useful to do this using sqlite itself.
> If detailed query logging could be enabled in the database with a pragma or
> by using a debug version of the libarry somehow, it would let users and
> developers get standardised details about what queries are run and how long
> they take, even temporarily on a live system.
>
> There's lots of situations where this could be useful, like switching on
> and off debugging on the fly in live/production systems to identify
> transient problems or bottlenecks, and debugging software using sqlite with
> a live dataset without necessarily having to recreate the working
> conditions in a controlled environment, and without recompiling, and
> without developing/enabling support for sql debugging in the application.
>
> I've used mysql-proxy in the past to debug mysql databases in this way, and
> it was very effective. But since there's no network connection involved in
> sqlite, this method can't be used to temporarily redirect the queries in
> the same way.
>
> Have I missed a way to do something like this in sqlite already, or would
> you regard this as a feature request?
>
> Thanks, Ketil

Hi Ketil

SQLite allows you to do that.  See:

sqlite3_profile(...);
sqlite3_trace(...);

http://www.sqlite.org/c3ref/profile.html

You call sqlite3_profile(...) once to register a callback.
I use something like this:

sqlite3_profile(db, SqliteProfileCallback, db);

And for the callback, I use something like this:

static void SqliteProfileCallback(void* aDb, const char* aQueryStr,
                                  sqlite3_uint64 aTimeInNs)
{
  sqlite3* db = static_cast<sqlite3*>(aDb);
  const char* dbName = sqlite3_db_filename(db, "main");

  // Statistics per DB connection.
  // See: http://www.sqlite.org/c3ref/db_status.html
  int cacheUsed[2]  = { 0, 0 };
  int schemaUsed[2] = { 0, 0 };
  int stmtUsed[2]   = { 0, 0 };
  int cacheHit[2]   = { 0, 0 };
  int cacheMiss[2]  = { 0, 0 };
  sqlite3_db_status(db, SQLITE_DBSTATUS_CACHE_USED,  &cacheUsed[0],  &cacheUsed[1],  0);
  sqlite3_db_status(db, SQLITE_DBSTATUS_SCHEMA_USED, &schemaUsed[0], &schemaUsed[1], 0);
  sqlite3_db_status(db, SQLITE_DBSTATUS_STMT_USED,   &stmtUsed[0],   &stmtUsed[1],   0);
  sqlite3_db_status(db, SQLITE_DBSTATUS_CACHE_HIT,   &cacheHit[0],   &cacheHit[1],   0);
  sqlite3_db_status(db, SQLITE_DBSTATUS_CACHE_MISS,  &cacheMiss[0],  &cacheMiss[1],  0);

  fprintf(stderr, "SQLite profile: msec=[%llu] mem/high/lim=[%lld/%lld/%lld] "
                  "schema=[%d] stmt=[%d] cache=[%d] hit/miss=[%d/%d] db=[%s] sql=[%s]",
          aTimeInNs/1000000,             // Time taken by the query in milliseconds.

          sqlite3_memory_used(),         // Global memory used by SQLite now in bytes.
          sqlite3_memory_highwater(0),   // Global high water mark of memory used by SQLite in bytes.
          sqlite3_soft_heap_limit64(-1), // Global current heap limit in bytes (a hint only).

          schemaUsed[0],                 // Memory used by this connection for the schema.
          stmtUsed[0],                   // Memory used by this connection for statements.
          cacheUsed[0],                  // Memory used by this connection for cache.

          cacheHit[0], cacheMiss[0],     // SQLite cache hit/miss stats.
          dbName, aQueryStr);
}


That gives lots of useful information to find slow queries, memory
usage, etc.  But I'm curious whether there is anything else useful
that could be logged inside the profile or trace callback.

One thing that I would like is to be able to call sqlite3_stmt_status(...)
to get information about the statement inside the callback. But
the sqlite3_stmt object is unfortunately not available inside the
callback and I don't see how to make it available, other
than changing the SQLite source (which I have not tried). It would
have been nice if it was available somehow.

Regards
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Problem with many connections

2014-07-02 Thread Dominique Pellé
Grzegorz Sikorski  wrote:

> Hi Hick,
>
> You were right I was not finalizing statements and this prevented close to
> complete. When I modified my code to finalize all statements before close it
> works fine.

This implies that you have not checked your application with
valgrind, which would have found the leaks. Even if you fixed the
missing sqlite3_finalize(...), I would still advise running with
valgrind, which may find other hidden issues (a sketch of the
correct pattern follows).
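
For reference, here is a minimal sketch of the correct pattern:
every sqlite3_prepare_v2() is paired with sqlite3_finalize(),
otherwise sqlite3_close() fails with SQLITE_BUSY:

#include <sqlite3.h>

static int queryAndClose(sqlite3 *db)
{
  sqlite3_stmt *stmt = NULL;
  if (sqlite3_prepare_v2(db, "SELECT 1", -1, &stmt, NULL) == SQLITE_OK) {
    while (sqlite3_step(stmt) == SQLITE_ROW) {
      /* ... read columns with sqlite3_column_*() ... */
    }
  }
  sqlite3_finalize(stmt);    /* harmless no-op if stmt is NULL */
  return sqlite3_close(db);  /* succeeds now that nothing is left open */
}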

Regards
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Problem with many connections

2014-07-02 Thread Dominique Pellé
Eduardo Morras  wrote:

> On Wed, 02 Jul 2014 12:02:27 +0100
> Grzegorz Sikorski  wrote:
>
>> Hi,
>>
>> I am not sure if my previous email had reached the list, so I just
>> repeat it:
>>
>> I am developing sharding database using SQLite3 for embedded
>> application. My code works fine up to about 1020 connections to the
>> database. After around this number, I get an error "unable to open
>> database file". I double checked, permissions are OK and I think I
>> always properly close all connections and never open the same file
>> twice (I normally keep one connection opened for whole application
>> lifetime and open others on demand for each transaction). I found
>> this
>> topic:http://stackoverflow.com/questions/22801987/sqlite3-unable-to-open-database-file-ios
>> and I am not sure if there is any reason why keeping opened
>> connection in whole application lifetime is really something I should
>> do? I would prefer to open database only when it is needed, to avoid
>> risk of file corruption on power loss. Is there any known issue with
>> multiple open/close operations?
>
> 1020 simultaneus connections? Perhaps iOS exhausted del max. file descriptor 
> per process.

That's probably the problem.
On my Linux machine, I see that the limit for number of opened
file descriptors is 1024 by default:

$ ulimit -n
1024

But you can increase the limit. Ex:

$ ulimit -n 4096
$ ulimit -n
4096

$ help ulimit
ulimit: ulimit [-SHacdefilmnpqrstuvx] [limit]
Modify shell resource limits.
...snip...
  -nthe maximum number of open file descriptors
...snip...

Regards
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Building multiple indexes in one scan over a table

2014-06-10 Thread Dominique Pellé
Fabian Giesen wrote:

> Simon Slavin wrote:
...
>> Also, store your time as a numeric value (Unix epoch or Julian day) rather
>> than a string.  Numbers are faster to handle than strings.  INTEGERs are
>> faster to handle than REALs.
>
>
> :-)
>
> The time fields are 64-bit ints. The exact meaning depends on the
> capture target platform (we support several) but usually it's either
> a system-synchronized CPU clock cycle counter (when available) or
> ticks for a high-frequency bus/event timer.


It's worth adding that SQLite stores integers in variable length
(from 1 to 9 bytes). Small integers are more compact to store
and are thus more efficient (both in the table and in its indexes).
So if you can store numbers relative to a known minimum
value, you can make records smaller.

See: http://sqlite.org/src4/doc/trunk/www/varint.wiki

Multiple indexes are independent from each other.
So it would be nice if creation of multiple indexes could
be done in parallel (one thread per index).  That's
assuming that creation of indexes is CPU bound and
not I/O bound.

Regards
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Documentation typo

2014-04-25 Thread Dominique Pellé
Drago, William  wrote:

> In section 2 of http://sqlite.org/foreignkeys.html the word
> "separately" appears twice in the following sentence:
>
> Foreign key constraints are disabled by default (for backwards
> compatibility), so must be enabled separately for each
> database connection separately.

Other sentences in the same page, where words "can" and
"not" appear incorrectly twice:

- The application *can can* also use a PRAGMA foreign_keys

- The PRAGMA recursive_triggers setting does *not not* affect
  the operation of foreign key actions

Regards
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SQLITE_ENABLE_SQLLOG and sqlite3_init_sqllog

2014-01-10 Thread Dominique Pellé
Hi

I wanted to use the SQLite error logging documented here...

http://www.sqlite.org/errlog.html

But I found that the implementation in SQLite is odd: when
defining SQLITE_ENABLE_SQLLOG, SQLite calls
a function sqlite3_init_sqllog() inside sqlite3_initialize().
That function sqlite3_init_sqllog() is not defined in SQLite
itself and has to be defined by the application.

This is not nice, as it causes a circular dependency
between the application which calls SQLite API, and
SQLite itself which calls sqlite3_init_sqllog() defined
in the application.

Why wasn't a callback used instead?  In fact, there is
such a callback which can be registered when the application
calls sqlite3_config(SQLITE_CONFIG_LOG, ...).

So why does SQLite impose that the application defines
sqlite3_init_sqllog()?  Even trying to compile and link the
SQLite standard shell.c with -DSQLITE_ENABLE_SQLLOG
fails without defining a sqlite3_init_sqllog() function.

In my build, I create a library for SQLite.  Several programs
link with that SQLite lib. If I define SQLITE_ENABLE_SQLLOG,
it breaks the build as sqlite3_init_sqllog() is undefined.
I can of course define it but I have to fix all programs
(a pain). Also when linking with the SQLite lib, I still get link errors
anyway because of the circular dependencies it introduces
between libraries. When linking, libraries should be given
to ld in order, from the general one to the low level one. But
that assumes no circular dependencies between libraries.

To fix it, why not simply remove the following lines from sqlite3.c
in sqlite3_initialize()?

  {
extern void sqlite3_init_sqllog(void);
sqlite3_init_sqllog();
  }

... and let the application register the logging callback with
something like:

  /* logCallback is the application's logging function (the original
     name was lost in the archive; any function with the right
     signature works). */
  if (sqlite3_config(SQLITE_CONFIG_LOG, logCallback, NULL) != SQLITE_OK)
  {
...
  }

I've done that. It looks OK, but am I missing something?
I also think that this should be done by default in SQLite.

Regards
Dominique

PS: I'm using SQLite-3.8.2.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQlite Memory Leakage with 65536 Page Size

2013-10-28 Thread Dominique Pellé
Raheel Gupta  wrote:

> Sir, is there any way to not allow malloc to hold memory ? I mean shouldnt
> free(), be freeing the memory ?

free() frees memory in the heap, making it available to later
malloc calls again. But it does not necessarily shrink the amount of
memory used by the process, as was already explained in
multiple replies.

The OS does not necessarily shrink the memory of the process when
free() is called, because that memory may be reused when the process
allocates again.  Shrinking/growing has a cost. But does it really matter to
you that it does not shrink anyway? Pages that are not used will
be paged out if other processes need more memory, and if the pages
are really unused, then they will remain paged-out (no swapping), which
probably has no performance penalty. It will use more virtual memory
but that does not matter (unless of course you run out of virtual memory).
How much physical memory is actively being used is what matters.

Note also that just because you freed plenty of memory does
not mean the OS can always reclaim the pages from the process
anyway, in case your heap was fragmented with tiny blocks still
allocated, scattered all over the heap.

Worth reading:

http://unix.stackexchange.com/questions/53447/does-free-unmap-the-memory-of-a-process

http://ftp.gnu.org/old-gnu/Manuals/glibc-2.2.3/html_chapter/libc_3.html

Quoting the above link:

=== BEGIN QUOTE ===
Occasionally, free can actually return memory to the operating system
and make the process smaller. Usually, all it can do is allow a later call
to malloc to reuse the space. In the meantime, the space remains in
your program as part of a free-list used internally by malloc.
=== END QUOTE ===

Regards
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQlite Memory Leakage with 65536 Page Size

2013-10-27 Thread Dominique Pellé
Raheel Gupta wrote:

> Hi,
>
> Yes, I tried Valgrind and it shows no leaks.
> But why would this happen with 64K pages ? In 1024 Sqlite is able to
> release all the memory.
>
>>> It might also be that your memory allocator is holding onto freed memory
> rather than releasing it back to the OS.
>
> How should I free it ?


Most certainly the memory has been freed.  But that does not
mean that the OS will reduce memory usage of the process.

Valgrind memcheck already showed that there is no leak
as you said.  If you still think that memory is allocated, you
can use a heap profiler such as "valgrind --tool=massif" to
find where memory is allocated, and how much of it is
allocated.  See:

http://valgrind.org/docs/manual/ms-manual.html

There are other heap profilers, such as this one:
http://google-perftools.googlecode.com/svn/trunk/doc/heapprofile.html

Regards
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Segmentation Fault With Trigger

2013-10-15 Thread Dominique Pellé
techi eth wrote:

You should compile your code & sqlite3.c with -g -O0 when
sending a stack trace.

Most likely, the bug is in your program. Since you're on Linux,
try running with valgrind memcheck tool to find bugs (access
to free memory, uninitialized memory, etc.)

If you compile with clang-3.3 or newer, or gcc-4.8.* or newer, you
can also compile & link with -fsanitize=address to find bugs
(much quicker than using valgrind if your program is slow).

But the best is to try both (valgrind & gcc/clang address sanitizer,
not at the same time) since they can find different bugs:
* only valgrind will find uninitialized memory access
* only the address sanitizer will find global and stack overflows
* both will find other kind of errors (double free, use of freed memory...)

Dominique


> Please provide me hint to solve the issue.
>
>
> Thanks..
>
>
> On Fri, Oct 11, 2013 at 7:58 PM, techi eth  wrote:
>
>> It is giving same trace what i have sent last time.
>>
>> It is possible by some one to send test code snapshot of scenario of
>> selecting user defined function while trigger execution.
>>
>> Thanks a lot...
>>
>>
>> On Fri, Oct 11, 2013 at 7:41 PM, Richard Hipp  wrote:
>>
>>> On Fri, Oct 11, 2013 at 10:06 AM, techi eth  wrote:
>>>
>>> > Please provide some hint.
>>> >
>>>
>>> Recompile with -O0 and -g and then post the stack trace after the
>>> segfault.
>>>
>>> --
>>> D. Richard Hipp
>>> d...@sqlite.org
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users@sqlite.org
>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>
>>
>>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Missing optimization with id BETWEEN xxx AND xxx?

2013-05-15 Thread Dominique Pellé
Hi

The SQL script at http://dominique.pelle.free.fr/query-between.sql
shows 2 SELECT queries:

* Select query #1 takes 20.2 sec  (slow!)
* Select query #2 takes 0.004 sec  (fast!)

Yet the 2 queries are equivalent.

Here is how to download and run the script:

=
$ wget http://dominique.pelle.free.fr/query-between.sql
$ rm -f foo.sqlite ; sqlite3 foo.sqlite < query-between.sql
--- Populating table...
--- Query #1: measure time to select with 150 BETWEEN clauses with
equal min/max -> SLOW!
0
CPU Time: user 20.237265 sys 0.036003 <--- SLOW!

--- Query #2: measure time to select using: IN (...150 values...)  -->
equivalent and FAST!
0
CPU Time: user 0.004001 sys 0.00 <--- FAST!
=


If I add "EXPLAIN QUERY PLAN" in front of each SELECT query, I see:

--- Query #1: measure time to select with 150 BETWEEN clauses with
equal min/max -> SLOW!
0|0|0|SCAN TABLE t (~50 rows)

--- Query #2: measure time to select using: IN (...150 values...)  -->
equivalent and FAST!
0|0|0|SEARCH TABLE t USING INTEGER PRIMARY KEY (rowid=?) (~150 rows)
0|0|0|EXECUTE LIST SUBQUERY 1


Shouldn't SQLite be able to use the index (implicit index
since 'id' is a PRIMARY KEY) when using BETWEEN clauses?

I understand that using...
 "WHERE id BETWEEN xxx AND xxx OR id BETWEEN yyy AND yyy"
... is a bit silly.  It's of course simpler to write the equivalent...
  "WHERE id IN (xxx, yyy)"

However, I found this query which was automatically
generated.  Not all ranges have equal min/max
but often they do.  It was slow.  Replacing all the equal
ranges using "OR id IN (xxx, yyy...)" resulted in a big
speed up.

Can't the SQLite optimizer do such optimization automatically?

I'm using SQLite 3.7.16.2 2013-04-12 11:52:43 on Linux x86_64.

Regards
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to achieve fastest possible write performance for a strange and limited case

2013-03-29 Thread Dominique Pellé
Pavel Ivanov  wrote:

> On Fri, Mar 29, 2013 at 12:17 PM, Jeff Archer
>  wrote:
>> I have previously made an apparently bad assumption about this so now I
>> would like to go back to the beginning of the problem and ask the most
>> basic question first without any preconceived ideas.
>>
>> This use case is from an image processing application.  I have a large
>> amount of intermediate data (way exceeds physical memory on my 24GB
>> machine).  So, I need to store it temporarily on disk until getting to next
>> phase of processing.  I am planning to use a large SSD dedicated to holding
>> this temporary data.  I do not need any recoverability in case of hardware,
>> power or other failure.   Each item to be stored is 9 DWORDs, 4 doubles and
>> 2 variable sized BLOBS which are images.
>>
>> I could write directly to a file myself.  But I would need to provide some
>> minimal indexing, some amount of housekeeping to manage variable
>> sized BLOBS and some minimal synchronization so that multiple instances of
>> the same application could operate simultaneously on a single set of data.
>>
>> So, then I though that SQLite could manage these things nicely for me so
>> that I don't have to write and debug indexing and housekeeping code that
>> already exists in SQLite.
>>
>> So, question is:  What is the way to get the fastest possible performance
>> from SQLite when I am willing to give up all recoverability guarantees?
>
> Use
> pragma journal_mode = off;
> pragma synchronous = off;
> pragma locking_mode = exclusive;
>
> In addition to that you may issue BEGIN statement at the beginning of
> the application and never COMMIT.


Yes, that should be the fastest (a small sketch of those settings
follows the list below).

In addition:

* make sure that you prepare your INSERT query just once
  before the insert loop, and use bind/step/reset in the
  insertion loop.

* avoid duplicating the implicit uid index with another
  primary key by using INTEGER PRIMARY KEY
  and not something like INT PRIMARY KEY

* if you have indexes, make sure that you create them
  after all the inserts, rather than before.

* if you have several tables to populate, you can consider
  storing them in different databases, and populate them
  in parallel in different processes or threads. Then later
  you can ATTACH all of them to see them as a single
  database.
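
A minimal sketch of Pavel's settings, applied right after opening
the connection (return codes should be checked in real code; all
durability is lost on a crash):

#include <sqlite3.h>

static void applyFastWriteSettings(sqlite3 *db)
{
  sqlite3_exec(db, "PRAGMA journal_mode = OFF;",       NULL, NULL, NULL);
  sqlite3_exec(db, "PRAGMA synchronous = OFF;",        NULL, NULL, NULL);
  sqlite3_exec(db, "PRAGMA locking_mode = EXCLUSIVE;", NULL, NULL, NULL);
  sqlite3_exec(db, "BEGIN;", NULL, NULL, NULL);  /* and never COMMIT, as suggested */
}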

Regards
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] random, infrequent disk I/O errors

2013-02-10 Thread Dominique Pellé
Greg Janée wrote:

> Hello, I'm running a web service that uses SQLite that throws a disk I/O
> exception every once in a while, meaning once every few weeks.
...snip...
> Any ideas?  Unfortunately, the (standard) Python SQLite wrapper I'm using
> doesn't provide access to any more information (if there is any to be had).


I've never tried it, but doing a Google search
for "python errno" suggests that you can access
errno in Python:

http://stackoverflow.com/questions/661017/access-to-errno-from-python

Hopefully that can give better information about the error
(file system full, etc.)

Regards
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] populating a table as quickly as possible

2013-02-04 Thread Dominique Pellé
Paul Sanderson wrote:

> I want to populate a large table (millions of rows) as quickly as possible,
>
> The data set will not be operated on until the table is fully populated and
> if the operation fails I will be starting again. the operation is a one off
> and the table will not be added to at a future date.
>
> What are optimisations will work best for me?


Do you have only 1 table to populate or several tables?

If you have several tables, you could consider this:

* put your tables in different databases;

* perform the INSERT in different processes (1 per
  database) so tables can be populated in parallel;

* wait for all processes to finish;

* ATTACH all databases, so it behaves as
  a single database.

Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] populating a table as quickly as possible

2013-02-03 Thread Dominique Pellé
Simon Slavin  wrote:

>
> On 3 Feb 2013, at 6:34pm, Paul Sanderson  wrote:
>
>> I want to populate a large table (millions of rows) as quickly as possible,
>>
>> The data set will not be operated on until the table is fully populated and
>> if the operation fails I will be starting again. the operation is a one off
>> and the table will not be added to at a future date.
>>
>> What are optimisations will work best for me?
>
> Don't worry about hacking into SQLite.  You should initially try it doing 
> something like
>
> DROP all your indexes
> DELETE FROM myTable
> BEGIN
> INSERT ...
> INSERT ...
> INSERT ...
> ...
> END
> CREATE whatever indexes you want
>
> If that proves not to be fast enough for you, you can mess
> with making a new transaction every 100,000 rows or
> something: just do "END; BEGIN" every so-often.
> If it's still not fast enough you can mess with
> 'PRAGMA journal_mode = OFF'.

In addition to all this, you can also try:

PRAGMA synchronous=off

But read about the risks:

http://www.sqlite.org/pragma.html#pragma_synchronous

Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] FTS SQLite "optimize" command creates very large BLOBs in *_segment table.

2013-01-31 Thread Dominique Pellé
Hi

I have a database using SQLite-3.7.14 with an FTS4 virtual table (Full-Text
Search). The FTS table contains several million small documents.

The FTS DB is created on a server (where creation time does not matter)
and then used on an embedded device as a read-only database for FTS
queries (where queries need to be as fast as possible and use
as little memory as possible).

Since the DB is used read-only on the device, I ran the "optimize"
command (see http://www.sqlite.org/fts3.html#optimize) on the FTS index
after the last INSERT was done on the server.

Unfortunately, I found that the "optimize" command creates very large
BLOBs in the database since it merges all of the inverted index b-trees
together into one very large complete b-tree. For frequent terms, merged
BLOBs can be up to ~50MB in my case in the ftsTable_segments table, as a
result of running "optimize":

  sqlite> SELECT *, length(block) AS len_blob
FROM ftsTable_segments
ORDER BY len_blob DESC LIMIT 3;

  blockid|block|len_blob
  336808||51867353 <-- 51 MB BLOB!
  311724||19375541 <-- 19 MB BLOB
  334719||19223423 <-- 19 MB BLOB

Such large BLOBs (~50MB) are a problem in my case as they consume
a large amount of memory on an embedded device when doing FTS
queries with several frequent terms. The SQLite memory high watermark
reported by sqlite3_memory_highwater() reaches ~200MB when a query
contains several frequent terms, which is too much for an embedded
device, even though I set a soft memory limit of only 3MB using
sqlite3_soft_heap_limit64(...).

As a result, I have disabled running "optimize" on the FTS index
after creating the FTS DB on the server.

However, it got me thinking: it would be nice to be able to pass an
optional parameter to the FTS "optimize" command in order to
avoid merging b-trees when BLOBs reach a certain size.  In other
words, instead of doing...

INSERT INTO ftsTable(ftsTable) VALUES('optimize');

... I would like to be able to do something like this...

INSERT INTO ftsTable(ftsTable) VALUES('optimize=1048576');

... where optimize=1048576 indicates to *partially* optimize
reverse index b-trees in such a way that BLOBs do not exceed
1MB (1048576 bytes) in this example. It's OK if it's a fuzzy soft limit.


1/ Wouldn't such partial optimization of FTS index be useful?

2/ I also suggest that the documentation at
http://www.sqlite.org/fts3.html#optimize
indicates that optimizing an FTS index can create very large BLOBs.
This may be OK on a desktop or server, but it can be a problem
on embedded devices with limited amount of memory.

Regards
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] integrity_check "out of memory"

2013-01-22 Thread Dominique Pellé
Max Vlasov wrote:

> Hi,
>
> I found with the web search that this error (out of memory  for PRAGMA
> integrity_check) appeared in the wild and there were some fixes related to
> it.
>
> Today I decided to do the check for a base that lives with its own
> encrypted vfs and with an sqlite version older than the current. But the
> check was made with the latest version (3.7.15.2). I got the same error
> (out of memory) for max values larger than 800, for example PRAGMA
> integrity_check(801). If I provide the number below or equal to 800, the
> result is only the (truncated) list of "Page .. is never used". What could
> go wrong? Can it be related to inadequate code of my vfs? The reason why I
> would like to see the entries after 800 is that there can be more serious
> errors than "Page is never used".
>
> Thanks,
>
> Max

Hi Max

It looks perhaps like the bug that was fixed in this checkin:

http://www.sqlite.org/src/info/120c82d56e

Can you try and confirm?

-- Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How do you load a ".csv" and skip the first line?

2012-09-02 Thread Dominique Pellé
joe.fis...@tanguaylab.com  wrote:

> Is there some way to load a ".csv" file into a SQLite database table when
> the first row (record) of the file contains headers?
>
> This does not work?
> --
> .separator ,
> .import test.csv ld_assay
>
> Here's my database / table:
> CREATE TABLE [ld_assay] (
>   [id] INTEGER PRIMARY KEY,
>   [barcode_id] NCHAR(9),
>   [light] BOOLEAN,
>   [bpm] REAL);
>
> Here's my "test.csv":
> id,barcode_id,light,bpm
> 1,02455,1,180.2
> 2,02455,1,168.3
> 3,02455,1,189.4
> --
>
> This works (example from different program):
> --
> When R loads a ".csv" file it uses the [header = TRUE] flag to skip the
> first row of the file:
> data.frame_ld_assay <- read.table("test.csv", header = TRUE, sep = ",",
> colClasses = c("integer", "character", "integer", "numeric"))
>
> Most likely utilizing the "skip, integer: the number of lines of the data
> file to skip before beginning to read data." option.
> --


Spatialite extension of SQLite has VirtualText() which
creates a virtual table from a CSV files.  It uses the
first lines to name the columns in the virtual table.

Example:

$ cat test.csv
id,barcode_id,light,bpm
1,02455,1,180.2
2,02455,1,168.3
3,02455,1,189.4


spatialite> CREATE VIRTUAL TABLE virt_test
   ...> USING VirtualText(test.csv, UTF8, 1, COMMA, DOUBLEQUOTE, ',');

spatialite> SELECT * FROM virt_test;
0|1|2455|1|180.2
1|2|2455|1|168.3
2|3|2455|1|189.4

spatialite> CREATE TABLE test AS SELECT * FROM virt_test;

spatialite> SELECT * FROM test;
0|1|2455|1|180.2
1|2|2455|1|168.3
2|3|2455|1|189.4

spatialite> .schema test
CREATE TABLE TEST(
  ROWNO INT,
  id INT,
  barcode_id INT,
  light INT,
  bpm TEXT
);

-- Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] VACUUM fails once in a while with status=14 (unable to open database file)

2012-08-21 Thread Dominique Pellé
Richard Hipp <d...@sqlite.org> wrote:

> On Tue, Aug 21, 2012 at 5:30 AM, Dominique Pellé
> <dominique.pe...@gmail.com> wrote:
>
>> Hi
>>
>> Once in a while, I see an error when doing a "VACUUM" operation.
>>
>> sqlite3_exec3(...) returns status=14 (unable to open database file).
>> I suppose that it fails to open a temporary database when doing
>> the VACUUM operation, but I don't see why.
>>
>
> You might have plenty of disk space left, but do you have plenty of *temp*
> disk space left?  I assume you are on Linux?  Is /var/tmp mounted on a
> separate filesystem?  Do you have plenty of disk space in /var/tmp?

It's on Linux x86_84.

/var/tmp is on the / file system, which has 364Gb available space
The size of the Database that I try to vacuum is "only" 2.5 Gb.
So running out of disk space does not look possible.

Shouldn't I get another error message if somehow there was not
enough disk space anyway?  Like SQLITE_FULL = 13?


Simon Slavin wrote:

> On 21 Aug 2012, at 10:30am, Dominique Pellé <dominique.pe...@gmail.com>
> wrote:
>
> > I wonder whether VACUUM of different databases happening
> > in parallel in different processes could use the same temporary
> > file names, causing conflicts.
>
> I don't think so.  SQLite makes up a random filename for the temporary
> database, then checks to see that the file doesn't already exist before
> it makes it.

It seemed unlikely indeed that such a bug (conflicting tmp files) would be
in SQLite, since SQLite is so widely used, but I had no other explanation
so far.

But... you say (1) SQLite creates a random file name and THEN (2) checks
to see if it does not already exist (3) before it makes it. It seems that there
is a small window here between (2) and (3) if two processes create the same
file name at the same time, if there is no mutual exclusion. I still doubt
that would be the issue in practice.

I asked in case I was missing something obvious. It looks like no.

I'll debug further, look at errno (etc.) or come up with a simple program
that reproduces it.

Regards
-- Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] VACUUM fails once in a while with status=14 (unable to open database file)

2012-08-21 Thread Dominique Pellé
Hi

Once in a while, I see an error when doing a "VACUUM" operation.

sqlite3_exec3(...) returns status=14 (unable to open database file).
I suppose that it fails to open a temporary database when doing
the VACUUM operation, but I don't see why.

sqlite3_extended_errcode(db) also returns extendedErrorCode=14

The code to vacuum looks simple, it does these operations:

==
sqlite3* db;
int status = sqlite3_open(dbName, &db);
if (status != SQLITE_OK) {
  ...  no error happening here.
}

char* errMsg = NULL;
status = sqlite3_exec(db, "PRAGMA synchronous=OFF;", NULL, NULL, &errMsg);
if (status != SQLITE_OK) {
  ... no error happening here.
}

status = sqlite3_exec(db, "VACUUM;", NULL, NULL, &errMsg);
if (status != SQLITE_OK) {
   ... and here I get status 14 once in a while?!
}

status = sqlite3_close(db);
if (status != SQLITE_OK) {
   ... no error happening here, even when the above VACUUM failed.
}
==

I precise that:

- It's doing VACUUM of fairly large DBs (1 or a couple of Gb in general)
  but the file system has far more free space.  I realize that
  VACUUM needs free space (>= 2 or 3 times the size of the DB?)
  That should not be the problem. And if it was, I would expect
  a different error code than 14.

- I'm doing several VACUUM in parallel of different DBs
  in different processes. I'm not using threads.
  Each process VACUUM a different database which is simple.

- The DB is opened just before doing the VACUUM (as in above code)
  There is no other opened connection to it.

- It's not 100% reproducible. It happens maybe 1/10
  of the times, which suggests that it could be a race condition.
  Yet it's not using threads. Different processes uses
  different databases, so it does not seem possible to have
  race conditions in those conditions.

- Even though VACUUM fails, the DB looks fine.

- It's using SQLite-3.7.3  (actually, Spatialite-2.4.0) on Linux x86_64.
  (OK, maybe time to upgrade SQLite)

- I've checked with Valgrind memory checker on tiny
  databases: it sees no problem.

-> Any idea what could be wrong?

I wonder whether VACUUM of different databases happening
in parallel in different processes could use the same temporary
file names, causing conflicts.

Regards
-- Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to know what terms were created using FTS

2012-08-17 Thread Dominique Pellé
Mohit Sindhwani  wrote:

> Hi Ralf,
>
>
>
> On 17/8/2012 3:50 PM, Ralf Junker wrote:
>>
>> On 17.08.2012 09:30, Mohit Sindhwani wrote:
>>
>>> We're using FTS4 and it works well for many things.  One of the things
>>> that we'd like to do is to see what terms are being created by the
>>> tokenizer in use.  What would be the easiest way to do that?
>>>
>>> I tried looking through the fts_aux table and the segments and content
>>> tables, but nothing struck me directly as usable.  Any suggestions?
>>
>> http://www.sqlite.org/fts3.html#fts4aux
>
>
> I did look at this - but I couldn't figure out a way that allowed me to see
> what terms were created by the tokenizer for a particular expression.
> Example "SOME TEXT" becomes "SOME", "TEXT" - is there a way to find that?
>
> Best Regards,
> Mohit.

This gives the tokens:

sqlite> CREATE VIRTUAL TABLE ft USING fts4(x);
sqlite> INSERT INTO ft VALUES("hello world");
sqlite> INSERT INTO ft VALUES("hello there");

sqlite> CREATE VIRTUAL TABLE ft_terms USING fts4aux(ft);

sqlite> .header on
sqlite> SELECT * FROM ft_terms;
term|col|documents|occurrences
hello|*|2|2
hello|0|2|2
there|*|1|1
there|0|1|1
world|*|1|1
world|0|1|1

Regards
-- Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] VERY weird rounding error

2012-06-16 Thread Dominique Pellé
Etienne  wrote:

> Hi Simon,
>
> Once again, I really do not care of the accuracy.
>
> I KNOW 0.1 CAN NOT BE STORED EXACTLY IN A REAL VARIABLE.
>
> I just try to figure out why the SAME CODE (sqlite3 library) with the SAME 
> INPUT gives DIFFERENT RESULTS (no matter how accurate/exact they individually 
> are!) when linked to different shells.
>
>
> Etienne


The compiler may not produce strict IEEE-754 floating point
computations: it's a trade-off between optimizations and strict
conformance. Floating point registers have more bits
than an IEEE-754 double (80-bit registers, i.e. more accuracy
than a 64-bit double). So strict conformance to IEEE would
not only be slower but can also be less accurate.

Now if you need strict conformance to IEEE-754, you
can give options to the compiler.  Try compiling with
gcc option -ffloat-store for example. There might be
other relevant options. Also turn off optimizations (-O0).

Some links:

http://stackoverflow.com/questions/7295861/enabling-strict-floating-point-mode-in-gcc
http://stackoverflow.com/questions/982421/how-to-write-portable-floating-point-arithmetic-in-c/

Regards
-- Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] (shell dependent) library rounding error

2012-06-15 Thread Dominique Pellé
Etienne  wrote:

> R:\>sqlite NUL "select 0.1;"
> 0.1
>
>
> JSDB (www.jsdb.org) output:
>
> js>var db = new SQLite();
> js>db.exec("select 0.1", function(r){writeln(r)});
> 0.1=0.11
> true
>
>> You are not doing the same thing both times.  One time you are writing the 
>> result using SQLite.  The other time you are writing the result using a 
>> 'writeln()' command.  There is no 'writeln()' command in SQLite.
>
> THE PROBLEM OCCURS AT SQLITE LEVEL! The caller does not matter.
>
> Behaviors diverge from sqlite3(071300).c: line 19911 (var "realvalue") very 
> precisely.
>
>
> Regards,
> Etienne

More likely, the printf(...) (or the equivalent function) is done
with different precision in each case, so it prints the same number
differently.

Anyway, you should never rely on two floating point values being
exactly equal. Floating point values should be compared with a
small margin.
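
A small example (just a sketch) showing both points: the same double
printed at two precisions, and an equality test done with a margin:

  /* cmp.c */
  #include <math.h>
  #include <stdio.h>

  int main(void)
  {
      double d = 0.1;
      printf("%.15g\n", d);  /* prints 0.1 */
      printf("%.17g\n", d);  /* prints 0.10000000000000001 */

      double x = 0.1 * 3;    /* not exactly 0.3 */
      printf("%d\n", x == 0.3);              /* typically 0 */
      printf("%d\n", fabs(x - 0.3) < 1e-9);  /* 1 */
      return 0;
  }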

-- Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] NaN in, 0.0 out?

2012-03-27 Thread Dominique Pellé
Francis J. Monari, Esquire  wrote:

> All,
>
> How are +infinity and -infinity handled?
>
> Frank.

Don't forget signed zero, signaling and quiet NaN.

Here is a short C++ program showing how +infinity, -infinity,
zero, -zero, quiet NaN and signaling NaN are internally represented
in IEEE-754:

$ cat double.cpp
#include <iostream>
#include <limits>
#include <stdint.h>

using namespace std;
int main()
{
  union
  {
    double   d;
    uint64_t i;
  } inf, ninf, zero, nzero, qnan, snan;

  if (numeric_limits<double>::has_infinity)
  {
    inf.d = numeric_limits<double>::infinity();
    ninf.d = -inf.d;
    cout << "+infinity:\t[" <<  inf.d << "]" << "\t0x" << hex <<  inf.i << dec << endl;
    cout << "-infinity:\t[" << ninf.d << "]" << "\t0x" << hex << ninf.i << dec << endl;

    zero.d  = 1/inf.d;
    nzero.d = 1/ninf.d;
    cout << "zero\t\t["  <<  zero.d << "]" << "\t0x" << hex <<  zero.i << dec << endl;
    cout << "-zero\t\t[" << nzero.d << "]" << "\t0x" << hex << nzero.i << dec << endl;
  }
  if (numeric_limits<double>::has_quiet_NaN)
  {
    qnan.d = numeric_limits<double>::quiet_NaN();
    cout << "quiet nan:\t[" << qnan.d << "]" << "\t0x" << hex << qnan.i << dec << endl;
  }
  if (numeric_limits<double>::has_signaling_NaN)
  {
    snan.d = numeric_limits<double>::signaling_NaN();
    cout << "signaling nan:\t[" << snan.d << "]" << "\t0x" << hex << snan.i << dec << endl;
  }
  return 0;
}

$ g++ -std=gnu++0x -Wall double.cpp
$ ./a.out
+infinity:      [inf]   0x7ff0000000000000
-infinity:      [-inf]  0xfff0000000000000
zero            [0]     0x0
-zero           [-0]    0x8000000000000000
quiet nan:      [nan]   0x7ff8000000000000
signaling nan:  [nan]   0x7ffc000000000000


In SQLite:

sqlite> CREATE TABLE t (pos REAL, neg REAL, nn REAL);
sqlite> INSERT into t VALUES (1, 2, 3);
sqlite> .nullvalue ThisIsANull
sqlite> insert into t values ((1e305 * 1e305), (-1e305 * 1e305), 1/0);
sqlite> insert into t values (1/(1e305 * 1e305), 1/(-1e305 * 1e305), 1/0);
sqlite> SELECT * FROM t;
1.0|2.0|3.0
Inf|-Inf|ThisIsANull
0.0|0.0|ThisIsANull

The signed zero seems lost in SQLite.

Regards
-- Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bulk Insert

2011-08-13 Thread Dominique Pellé
Stephan Beal wrote:

> On Fri, Aug 12, 2011 at 1:51 PM, Sumit Gupta wrote:
>
>> I was try to to Read a Binary file in to the Sqlite Database using .NET.
>> All
>> NET to do bulk upload. I use that Dataset method to upload about 70,000
>> records in to SQL server and it take 10-15 sec, but in Sqlite it is still
>> taking minutes to record that data.
>>
>
> The answer is the same as it was yesterday: wrap the inserts in a
> transaction.


Besides putting all the INSERTs within a transaction, you can also
try "PRAGMA synchronous=off" in case you don't mind risking a
corrupted database if the OS crashes.  Some operations can be
much faster with synchronous=off (50 times faster or so, according
to the documentation):

http://www.sqlite.org/pragma.html#pragma_synchronous
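
As a rough sketch with the C API (error handling omitted; the table
name t and its single integer column are made up for illustration):

  #include <sqlite3.h>

  /* Insert n rows inside a single transaction. */
  int bulk_insert(sqlite3 *db, int n)
  {
      sqlite3_stmt *stmt;
      int i;
      sqlite3_exec(db, "BEGIN", 0, 0, 0);
      sqlite3_prepare_v2(db, "INSERT INTO t VALUES(?)", -1, &stmt, 0);
      for (i = 0; i < n; i++) {
          sqlite3_bind_int(stmt, 1, i);
          sqlite3_step(stmt);
          sqlite3_reset(stmt);
      }
      sqlite3_finalize(stmt);
      return sqlite3_exec(db, "COMMIT", 0, 0, 0);
  }

Without the BEGIN/COMMIT pair, each INSERT is its own transaction
and pays a sync to disk; that is usually where the minutes go.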

-- Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Best solution to import big DBF file?

2011-06-27 Thread Dominique Pellé
Gilles Ganault  wrote:

> Hello
>
> I have a 87MB file in DBF format that I'd like to import into SQLite.
>
> What is the best tool for this, free or affordable?
>
> Thank you.


Spatialite (extension of SQLite) implements VirtualDbf and VirtualShp.

To create a virtual table from a DBF file "foo.dbf", using UTF-8 as
the encoding for TEXT columns:

spatialite> CREATE VIRTUAL TABLE virt_tbl USING VirtualDbf(foo.dbf, UTF-8);

If you want to create a real SQLite table (not virtual), you can
copy the content of the virtual table into a real table:

spatialite> CREATE TABLE tbl AS SELECT * FROM virt_tbl;


The Spatialite shell also provides the following dot command for convenience:

spatialite> .loaddbf foo.dbf tbl UTF-8

-- Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Big difference in performance between Python and gcc

2011-06-02 Thread Dominique Pellé
Roger Binns wrote:

> On 06/01/2011 12:25 PM, Dominique Pellé wrote:
>> [Various optimisations]
>
> While those are all valid, they don't address the underlying issue which is
> C code taking five times longer than Python code for the same SQLite
> operations.  In addition that same "redundant" code is executed behind the
> scenes in Python so it is fair for these comparisons/benchmark.
>
> Other things having been ruled out, it looks like Jan's suggestion of
> compilation options and code is likely the cause.
>
> Roger

That's true of course, and I should have mentioned it.
However, for a benchmark, it's best to avoid things that
can be trivially optimized.  I don't know how the Python
code translates into SQLite C API calls.

-- Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Big difference in performance between Python and gcc

2011-06-01 Thread Dominique Pellé
Alessandro Marzocchi wrote:

> Hello,
>    I made some preliminary tests for an application storing big
> chunks of data in a sqlite database. I did firsts tests with python
> and they gave me quite impressive results. I then tried to make the
> same test using C. I expected to get execution times to be the same of
> those of python. However I was surprised as the performance got a lot
> worse, with execution times being more than 3 times more.

...snip...

>  if(SQLITE_OK!=(ret=sqlite3_prepare_v2(db,
>    "INSERT INTO helloworld VALUES (?,?,?)",
>    -1,
>    &db_stm,
>    NULL
>  )))

Since you always insert 1 into the first column,
why not use:

INSERT INTO helloworld VALUES (1,?,?)


>  {
>    fprintf(stderr,"sqlite error in prepare() [%d]",ret);
>    return -1;
>  };
>
>  int i;
>  char data[1024*8+1];
>  for(i=0;i<1024*8;i++)data[i]='0';
>  data[1024*8]='\0';

Since the data column also does not change,
you could bind it only once, before entering
the for loop, rather than binding it at each
loop iteration.


>  for(i=0;i<10;i++)
>  {
>    if(!(i%1))printf("%d\n",i);
>
>    if(SQLITE_OK!=(ret=sqlite3_bind_int(db_stm, 1, 1)))
>    {
>      fprintf(stderr,"sqlite error in bind()");
>      return -1;
>    }

The bind above is not needed if you use:
INSERT INTO helloworld VALUES (1,?,?)
(note that the two remaining parameters then become indexes 1 and 2)


>    if(SQLITE_OK!=(ret=sqlite3_bind_int(db_stm, 2, i)))
>    {
>      fprintf(stderr,"sqlite error in bind()");
>      return -1;
>    }
>    //if(SQLITE_OK!=(ret=sqlite3_bind_blob(db_stm, 3, data,-1,
> SQLITE_STATIC/*SQLITE_TRANSIENT*/)))
>    if(SQLITE_OK!=(ret=sqlite3_bind_blob(db_stm, 3, data,8192,
> SQLITE_STATIC/*SQLITE_TRANSIENT*/)))
>    {
>      fprintf(stderr,"sqlite error in bind_blob() [%d] ok=%d",ret,SQLITE_OK);
>      return -1;
>    }

The bind to the blob above can be moved outside the loop.


>    ret=sqlite3_step(db_stm);
>    if(ret!=SQLITE_DONE)
>    {
>      fprintf(stderr,"sqlite error in sqlite3_step() [%d]",ret);
>      return -1;
>    }
>    if(SQLITE_OK!=(ret=sqlite3_reset(db_stm)))
>    {
>      fprintf(stderr,"sqlite error in sqlite3_reset() [%d]",ret);
>      return -1;
>    }
>    sqlite3_clear_bindings(db_stm);

Calling sqlite3_clear_bindings(...) at each iteration
is not needed in your example.  In fact, you should
not call it at all if you decide to move the bind to the blob
outside the loop (sqlite3_reset() does not clear bindings).

That should make it faster.
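
Putting all of the above together, the prepared statement and the
loop might look like this (a sketch only, error checks omitted;
nRows stands in for your actual row count):

  sqlite3_prepare_v2(db, "INSERT INTO helloworld VALUES (1,?,?)",
                     -1, &db_stm, NULL);
  sqlite3_bind_blob(db_stm, 2, data, 1024*8, SQLITE_STATIC); /* once */
  for (i = 0; i < nRows; i++)
  {
      sqlite3_bind_int(db_stm, 1, i);  /* the only varying value */
      sqlite3_step(db_stm);
      sqlite3_reset(db_stm);           /* keeps the bindings */
  }
  sqlite3_finalize(db_stm);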

-- Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] usleep bug?

2010-09-27 Thread Dominique Pellé
Dustin Sallings wrote:

> On Sep 27, 2010, at 5:43, Simon Slavin wrote:
>
>> I think this differs for different platforms.  For instance:
>>
>> http://www.manpagez.com/man/3/usleep/
>>
>> makes no mention of that fault.
>
>        It kind of does indirectly.  It says it's built on nanosleep(), and 
> links to that doc which says this in their error section:
>
>     [EINVAL]           rqtp specified a nanosecond value less than zero or
>                        greater than or equal to 1000 million.
>
>        I didn't actually run into a problem with it, I just noticed that its 
> result isn't checked and documentation I was reading had it defined to fail.
>
>> Since the time for sleeping is an integer number of seconds it's probably 
>> simplest just to replace this call with one to sleep(10).  sleep() is 
>> defined in POSIX.1 so I believe it should be available in a library for 
>> every platform SQLite is expected to run on.
>
>        I think that makes sense.  It also means you don't need a
> comment describing that it's intended to sleep for exactly ten seconds.  :)


Yes, that might be a problem.

I see that in the Vim source code, usleep() is used as follows
to avoid calling usleep() with a large argument:

#ifdef HAVE_NANOSLEEP
    ...
#else
# ifdef HAVE_USLEEP
    while (msec >= 1000)
    {
        usleep((unsigned int)(999 * 1000));
        msec -= 999;
    }
    usleep((unsigned int)(msec * 1000));
# else
    ...
# endif
#endif
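
For what it's worth, a portable millisecond sleep can also be built
directly on nanosleep(), which takes whole seconds plus a
0..999999999 nanosecond remainder, so no argument-splitting loop is
needed.  A sketch:

  #include <time.h>

  static void sleep_ms(long msec)
  {
      struct timespec ts;
      ts.tv_sec  = msec / 1000;
      ts.tv_nsec = (msec % 1000) * 1000000L;
      nanosleep(&ts, NULL);
  }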

-- Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] [patch] fixed memory leak in the ".dump" command of command line tool sqlite-3.7.0.1

2010-08-07 Thread Dominique Pellé
Hi

I noticed a memory leak when using the .dump command of the
command line tool "sqlite3".  I initially saw it when using
spatialite, but the same bug exists in the latest sqlite-3.7.0.1.
A block of memory is leaked each time the ".dump" command
is used.

In the following example, I run .dump 3 times and valgrind
finds 3 blocks of memory leaked:

 $ valgrind  --trace-children=yes --leak-check=yes \
 --track-fds=yes ./sqlite3 2> vg.log
SQLite version 3.7.0.1
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> CREATE TABLE t (a INT);
sqlite> .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE t (a INT);
COMMIT;
sqlite> .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE t (a INT);
COMMIT;
sqlite> .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE t (a INT);
COMMIT;
sqlite> .quit

And file vg.log contains:

==5015== 12 bytes in 3 blocks are definitely lost in loss record 10 of 52
==5015==    at 0x4024F70: malloc (vg_replace_malloc.c:236)
==5015==    by 0x4024FFA: realloc (vg_replace_malloc.c:525)
==5015==    by 0x804997A: appendText (shell.c:895)
==5015==    by 0x8049F83: dump_callback (shell.c:1148)
==5015==    by 0x4094027: sqlite3_exec (sqlite3.c:82322)
==5015==    by 0x804AA14: T.339 (shell.c:1197)
==5015==    by 0x804E3EB: do_meta_command (shell.c:1460)
==5015==    by 0x804BC35: process_input (shell.c:2195)
==5015==    by 0x804EED8: main (shell.c:2615)

The following patch fixes it:

$ diff -c shell.c.ORIG shell.c
*** shell.c.ORIG    2010-08-07 11:38:07.715422705 +0200
--- shell.c         2010-08-07 11:38:13.555496088 +0200
***************
*** 1148,1153 ****
--- 1148,1154 ----
 zTmp = appendText(zTmp, zTable, '"');
 if( zTmp ){
   zSelect = appendText(zSelect, zTmp, '\'');
+   free(zTmp);
 }
 zSelect = appendText(zSelect, " || ' VALUES(' || ", 0);
 rc = sqlite3_step(pTableInfo);


Regards
-- Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to update many rows efficiently?

2010-08-05 Thread Dominique Pellé
Igor Tandetnik wrote:

> Dominique Pellé <dominique.pe...@gmail.com> wrote:
>> For example, given 2 tables t1 and t2, both with 2 columns as follows...
>>
>> Table t1:
>>
>>    ID   name
>>    --   ------
>>    1    <name>
>>    2    NULL
>>    3    NULL
>>    4    <name>
>>    (~1 million records)
>>
>> Table t2:
>>
>>    ID   name
>>    --   ------
>>    2    <name>
>>    4    <name>
>>    (~500,000 records)
>>
>> ... I would like to update the 'name' column in records of table t1, so that
>> they are the same as the 'name' column in table t2 when the ID columns
>> match.
>
> update t1 set name = (select name from t2 where t1.ID=t2.ID)
> where ID in (select ID from t2);
>
> Or, if t1.ID is a primary key or otherwise has a unique constraint:
>
> insert or replace into t1(ID, name)
> select ID, name from t2;


Thanks Igor.  That helped. I need to brush up my SQL.

I'm still curious: what if I need to update several columns?

Using almost the same example as in the previous mail, I'd
like to update columns l_nm and r_nm in table t1 using
information in table t2, as depicted below:

Table t1:

   ID_PK   l_nm     r_nm
   -----   ------   ------
   1       <name>   NULL
   2       NULL     NULL
   3       NULL     NULL
   4       <name>   NULL
   (~1 million records)

Table t2:

   ID_FK   l_nm     r_nm
   -----   ------   ------
   2       <name>   <name>
   4       <name>   <name>
   (~500,000 records)


Table t1 (after update):

   ID_PK   l_nm     r_nm
   -----   ------   ------
   1       <name>   NULL
   2       <name>   <name>   <- l_nm & r_nm columns updated
   3       NULL     NULL
   4       <name>   <name>   <- l_nm & r_nm columns updated
   (~1 million records)

Using the information from the previous reply, I can do it with 2
UPDATE queries as follows (but I suspect that there is a better
solution):

  UPDATE t1 SET l_nm = (SELECT l_nm FROM t2 WHERE t1.ID_PK = t2.ID_FK)
  WHERE ID_PK IN (SELECT ID_FK FROM t2);

  UPDATE t1 SET r_nm = (SELECT r_nm FROM t2 WHERE t1.ID_PK = t2.ID_FK)
  WHERE ID_PK IN (SELECT ID_FK FROM t2);
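
Igor's INSERT OR REPLACE suggestion generalizes to several columns
in one statement, assuming ID_PK is a PRIMARY KEY or has a UNIQUE
constraint (note that REPLACE rewrites the whole row, which is fine
here since t1 has no other columns):

  INSERT OR REPLACE INTO t1(ID_PK, l_nm, r_nm)
  SELECT ID_FK, l_nm, r_nm FROM t2;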


I found information about updating multiple columns at...

http://it.toolbox.com/wiki/index.php/How_do_I_update_multiple_fields_with_single_update_statement_using_subqueries%3F

... but SQLite does not accept the kind of queries shown on that page.

Regards
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] How to update many rows efficiently?

2010-08-04 Thread Dominique Pellé
Hi

I'm new to SQLite and I'm still learning.  It's unclear to me how to update
many records efficiently in a table.

For example, given 2 tables t1 and t2, both with 2 columns as follows...

Table t1:

ID   name
--   ------
1    <name>
2    NULL
3    NULL
4    <name>
(~1 million records)

Table t2:

ID   name
--   ------
2    <name>
4    <name>
(~500,000 records)

... I would like to update the 'name' column in records of table t1, so that
they are the same as the 'name' column in table t2 when the ID columns
match.

In other words, given the above tables t1 and t2, I would like t1 to
look as follows after the update:

Table t1:

ID   name
--   ------
1    <name>
2    <name>
3    NULL
4    <name>
(~1 million records)

What is an efficient way of doing it?
Can this be done in one SQL query?
I'm using the SQLite C API.

Thanks
Dominique
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users