Re: [sqlite] the sqlite3 documentation would be pretty good if it wasn't tragic...

2019-06-27 Thread _ph_
For your second point: it is a reference, not a tutorial. For a gentler
introduction, you might want to go to http://www.sqlitetutorial.net/


First and foremost, though, it is some of the best documentation I've worked
with. Yes, it's wordy, but it's not verbose. It's precise and aims for
formal correctness, making it very much unambiguous. 

Yes, the habit of writing in full sentences makes it an unusual read, but
after using it as a foundation for a large, long-running project, I'm deeply
thankful for its existence. 








[sqlite] behavior with inconsistent pragma journal_mode

2019-06-25 Thread _ph_
Steps to reproduce:

 - create a new db and set pragma journal_mode=PERSIST
 - create a table
 - open the same db again, without setting journal mode
 - create another table

Result: 

The second CREATE TABLE statement returns SQLITE_IOERR (code 10) after a
notable delay.

Honestly, I'm not very proud of this use case (one client tries to optimize
access times, the other doesn't care), but more graceful handling or a
warning in the documentation would be nice. 

Example code:

#include "sqlite3.h"
#include 
#include 

void Check(sqlite3 * db, int err)
{
   char const * msg = sqlite3_errmsg(db);
   assert(err == SQLITE_OK);
}

void Exec(sqlite3 * db, char const * sql)
{
   Check(db, sqlite3_exec(db, sql, nullptr, nullptr, nullptr));
}

int main()
{
   char const * path = "c:\\temp\\db1.sqlite";

   std::remove(path);
   sqlite3 * db = 0;
   Check(db, sqlite3_open(path, ));

   Exec(db, "pragma journal_mode=PERSIST");
   Exec(db, "CREATE TABLE t1 (id INTEGER PRIMARY KEY AutoIncrement, value
STRING)");

   sqlite3 * db2 = 0;
   Check(db2, sqlite3_open(path, ));
   Exec(db2, "CREATE TABLE t2 (id INTEGER PRIMARY KEY AutoIncrement, value
STRING)");

   Check(db2, sqlite3_close(db2));
   Check(db, sqlite3_close(db));
}


sqlite amalgamation 3.28.0.
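(If the mode mismatch is indeed the trigger, an obvious workaround - an
assumption based on the repro above, not on the documentation - would be to
set the same journal_mode on every connection right after opening it, using
the helpers from the example:

   sqlite3 * db2 = 0;
   Check(db2, sqlite3_open(path, &db2));
   Exec(db2, "pragma journal_mode=PERSIST");  // match the first connection
)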





Re: [sqlite] Proposed enhancement to the sqlite3.exe command-line shell

2014-02-21 Thread _ph_
Suggestion: Warning banner, and a .saveas command that copies the db to a
file. 

(I haven't read the entire thread, sorry if this already came up.)





Re: [sqlite] auto_vacuum default setting

2013-08-29 Thread _ph_
auto_vacuum=FULL is slower:  

When you delete some data, SQLite shrinks the file to the minimum size. If,
in the next transaction, you add new data, SQLite again has to extend the
file by at least a page. This takes time. Without auto_vacuum=FULL, the
INSERT statement would simply reuse the page that became free.  

File system fragmentation (how the blocks of your file are arranged on disk)
also likely becomes worse. 

In-file fragmentation (how close related data is w.r.t. the logical file
offset) might also become worse; it should not be worse than random
inserts and deletes, though. 

---

auto_vacuum might be useful if disk space is very precious, or if you have to
be "always on" with guaranteed response times, where even an infrequent
short "downtime" is problematic. 

If you are not sure, you can set auto_vacuum to INCREMENTAL to keep your
options open. This allows you to add incremental vacuum later, without
having to change database files. The overhead, AFAIK, is small. 

---

It makes more sense to trigger maintenance when it pays. 

You can always use pragma freelist_count to query how many pages would be
released by vacuuming. 
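A minimal sketch of that query through the C API (error handling elided;
db is an open connection):

   #include "sqlite3.h"

   // Returns the number of pages an (incremental) vacuum could release.
   sqlite3_int64 FreePages(sqlite3 * db)
   {
      sqlite3_stmt * stmt = 0;
      sqlite3_int64 pages = 0;
      if (sqlite3_prepare_v2(db, "PRAGMA freelist_count;", -1, &stmt, 0) == SQLITE_OK
          && sqlite3_step(stmt) == SQLITE_ROW)
      {
         pages = sqlite3_column_int64(stmt, 0);
      }
      sqlite3_finalize(stmt);
      return pages;
   }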

For example: 

 - at some points - e.g. after a DELETE, after a while of inactivity, or when
the user closes the application - check whether a significant (!) number of
pages would be freed by vacuuming. If yes, do an incremental vacuum. 

(Do not use incremental vacuum after every delete! Only if there is
significant space to be freed!)

 - you can put a time limit on the incremental vacuum: if there's a
significant number of pages to be freed, run incremental_vacuum(1)
repeatedly until there are no more free pages, but also stop after a short
time (e.g. 50 ms) has passed. This may not clear everything, but it also does
not interrupt the user (see the sketch after this list).

 - at a suitable point, e.g. when the user is closing the application, and
you find there are very many pages that could be freed, start an
incremental or full vacuum of everything, but allow the user to cancel it.
(You might know something similar from closing Outlook.)

 - When doing a full vacuum, store the write counter in the database
somewhere. If the current write counter is far ahead of the one from the last
full vacuum, recommend a full rather than an incremental vacuum to the user.
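A minimal sketch of the time-boxed cleanup from the second bullet, using the
FreePages() helper sketched above (the 50 ms budget is an illustrative value,
not a recommendation):

   #include <chrono>
   #include "sqlite3.h"

   sqlite3_int64 FreePages(sqlite3 * db);  // see the freelist_count sketch above

   // Release free pages one at a time until none are left or the budget is spent.
   void IncrementalVacuumFor(sqlite3 * db, std::chrono::milliseconds budget)
   {
      auto const start = std::chrono::steady_clock::now();
      while (FreePages(db) > 0
             && std::chrono::steady_clock::now() - start < budget)
      {
         sqlite3_exec(db, "PRAGMA incremental_vacuum(1);", nullptr, nullptr, nullptr);
      }
   }

Called, e.g., as IncrementalVacuumFor(db, std::chrono::milliseconds(50)) when
the application closes a file.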

---

We use SQLite as an application file format, and in some common use cases, we
add and remove a lot of data. Furthermore, with the previous solution,
people have complained about "databases not getting smaller" (we are talking
about hundreds of megabytes). So now I am using a combination of these:

 - when closing the app, and there's more than 1 MB or more than 20% to 
reclaim, I release as many pages as possible for 50 ms. 
 - when, after this, there's still a lot to reclaim (50% or 10 MB, IIRC), I
continue calling incremental_vacuum, but allow the user to cancel the
process. 
 - There's a manual maintenance operation for integrity and validity checks
and other cleanup. This also includes an option for a full VACUUM. When the
last full VACUUM was a long time ago (100k or a million writes or so), I
pre-select the "full vacuum" option. 








Re: [sqlite] Collation advice

2013-08-26 Thread _ph_
> In Hungarian, yes, that's what happens.   

I shouldn't be surprised :) 

I looked up digraphs
(http://en.wikipedia.org/wiki/Digraph_(orthography)#Digraphs_in_Unicode) and
found that at least some of them have Unicode code points. But if you can't
cover all of them, yeah, normalization won't help. 

(btw. Muenster / Münster would fall back to full comparison due to the ü)







Re: [sqlite] BLOB & Other Data Type

2013-08-26 Thread _ph_
You can see that e.g. with a select statement:

CREATE TABLE Test (Name TEXT, Age INT);
INSERT INTO Test VALUES ('Klaus', 22);
INSERT INTO Test VALUES ('Meier', '022');
SELECT * FROM Test WHERE Age=22;

In this case, the comparison is made on integers; '022' converted to an
integer will be equal to 22:

   Klaus|22
   Meier|22

If you use a blob for the age, 

CREATE TABLE TestB (Name TEXT, Age BLOB);
INSERT INTO TestB VALUES ('Klaus', 22);
INSERT INTO TestB VALUES ('Meier', '022');
SELECT * FROM TestB WHERE Age=22;

you get a binary comparison, returning only

   Klaus|22






Re: [sqlite] Collation advice

2013-08-26 Thread _ph_
What Simon says extends to another case: if you change the visible behavior
of that function, even if it's a necessary bug fix, you end up with broken
indices. 

That's true for every collation sequence, though. 

If I interpret your code correctly, however, the user can switch the
preferred language on the fly, changing the collation locale and thus the
"response" of your function. This will *definitely* break the index. 

In other words: if s1 < s2 in one language, but not in another (something
that can occur, AFAIK), your index is valid only for the language it was
created with.  

You would either have to avoid using an index on those columns (making sorting
not exactly fast), or re-index when switching languages (making the switch
expensive and file-global).

---

Other things:

Should "AD" + "ZV" really compare as a "A" + "DZ" digraph +"V" in the
respective language? I am not sure about the intended behavior, but it seems
strange. (OTOH, language. It's always strange.)
 
Anyway, I would definitely Unicode-normalize the strings *before* putting
them into the database. You might avoid the special handling for the
digraphs if you normalize /towards/ the digraph code points: only strings
actually containing digraphs would escape your optimization.  



With the problem above in mind: If you have to change the DB anyway when
switching language, you could also store the sequence of comparison weights
in a separate column, and index/sort on this. 

I don't know the collation algorithm well enough to say, but it seems to me
you could just throw away the "case" weight and then do a full binary
comparison. But pure ASCII strings could be stored in a native lowercase
representation and take "only" twice the space. 

Of course, this trades file size, working set, and "switch performance" for
comparison speed; I'm not sure if that makes sense for you. 



Unit Tests:  
I would isolate the comparison in a core function, and primarily test that
core function, basically

assert( cmp("Ape", "monkey") < 0);

Select strings for each branch (pure ASCII, "bad European ASCII",
non-ASCII).   
For each pair, check that you get the opposite result when you switch the
arguments. 
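A sketch of such a test, with std::strcmp standing in for the real core
function (all names here are made up; the point is the symmetric checks):

   #include <cassert>
   #include <cstring>

   // Stand-in for the collation core under test; the real implementation
   // would branch into memcmp / ASCII / full Unicode comparison.
   int my_collate(char const * a, char const * b) { return std::strcmp(a, b); }

   void CheckPair(char const * smaller, char const * larger)
   {
      assert(my_collate(smaller, larger) < 0);   // expected order
      assert(my_collate(larger, smaller) > 0);   // swapped arguments flip the sign
      assert(my_collate(smaller, smaller) == 0); // equal input compares equal
   }

   int main()
   {
      CheckPair("Ape", "monkey");  // one such pair per branch
   }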

There's no need to test that, say, "DZ" sorts differently in Hungarian than
in Albanian - unless you distrust the core functions. You just want to make
sure you pick the right branch (memcmp, ASCII compare, full compare). 

It is hard to test for speed optimizations, as the variance of normal
execution speed trumps most gains. What I do in this case is add
"performance counters" and test for them: in your case, three simple
counters, one for each of the branches:

  assert(mycollate.perf.asciicmp == 0);
  assert( cmp("Ape", "monkey") < 0);
  assert(mycollate.perf.asciicmp == 1);

This is rather brittle, as such optimizations tend to change, so don't
overindulge. 

Of course, I assume you do a "manual" performance measurement, too. 





I am not sure why you go to the full length of doing it right, but then balk
at the A vs. B define, picking the "dangerous" one. 

There are some oddities in the code, e.g. comparing g_bUseAscii to
explicit 0 / 1 / -1 values, but that's not the topic of your post...





Re: [sqlite] How to "upgrade" a DB without write access to its file, or to the direction it's in?

2013-08-19 Thread _ph_
Another option: copy the meta data to an in-memory DB. As you "load all meta
data into memory" anyway, this seems a viable option.

ATTACH ':memory:' AS mem;
INSERT INTO mem.Table SELECT * FROM Table;

(repeat for each table; you may want to create/recreate constraints and
indices in the attached DB, of course)

During this copy, you can make the schema adjustments. 
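A sketch of such an adjustment during the copy (table and column names are
made up for illustration; error handling elided):

   #include "sqlite3.h"

   // Copy one table into the attached in-memory DB, renaming a column on the way.
   void SnapshotItems(sqlite3 * db)
   {
      char const * sql =
         "ATTACH ':memory:' AS mem;"
         "CREATE TABLE mem.Items (id INTEGER PRIMARY KEY, name TEXT);"
         // the on-disk schema still calls the column 'old_name'
         "INSERT INTO mem.Items (id, name) SELECT id, old_name FROM Items;";
      sqlite3_exec(db, sql, nullptr, nullptr, nullptr);
   }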

---

I'm storing both raw data and meta data in an SQLite DB, create that
in-memory copy only for the meta data, and use this as the
in-memory representation - no separate C++ class graph for the data
structure. 

(There's basically a C++ class for each table holding one row, filled on
demand from the in-memory DB. It's not blazingly fast - you can quickly
accumulate a few ms for a bunch of queries - but certainly good enough, and
fun to use.)

When creating that "snapshot", I make adjustments to the schema, so read
compatibility is handled in one place. 





Re: [sqlite] SQLITE_DEFAULT_AUTOVACUUM not preserved for in-memory-db

2013-08-19 Thread _ph_
[edit] this is of course 3.*7*.12 and 3.*7*.17





[sqlite] SQLITE_DEFAULT_AUTOVACUUM not preserved for in-memory-db

2013-08-19 Thread _ph_
When compiling the amalgamation with 

#define SQLITE_DEFAULT_AUTOVACUUM 2

and creating an in-memory-DB, 

pragma auto_vacuum 

will return 0 (confirmed on 3.1.12, 3.1.17)

Doing this in a disk DB, or after changing it at runtime (pragma
auto_vacuum=N; VACUUM;), auto_vacuum will return the expected value.
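A compact repro sketch of the observation, assuming the amalgamation was
compiled with the define above:

   #include <cstdio>
   #include "sqlite3.h"

   int main()
   {
      sqlite3 * db = 0;
      sqlite3_open(":memory:", &db);

      sqlite3_stmt * stmt = 0;
      sqlite3_prepare_v2(db, "PRAGMA auto_vacuum;", -1, &stmt, 0);
      if (sqlite3_step(stmt) == SQLITE_ROW)
         std::printf("auto_vacuum = %d\n", sqlite3_column_int(stmt, 0));  // observed: 0, expected: 2
      sqlite3_finalize(stmt);
      sqlite3_close(db);
   }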

--- 

This is not an issue for me - and I'm not sure auto_vacuum is even
relevant for an in-memory DB, so a note in the documentation might be
sufficient. Yet it did trip a unit test (and it took me longer than I am
willing to admit to find the cause). 





[sqlite] incremental_vacuum within or outside transaction?

2013-07-17 Thread _ph_
Is there any rationale for (not) running incremental_vacuum inside a
transaction?

Unlike a full VACUUM, it seems to work with or without one. 

Purpose:
We are running with auto_vacuum = INCREMENTAL, and when closing the file in
our software, we do an on-demand cleanup like so:

if (Query('pragma freelist_count') > threshold)
   while (time_passed < max_time)
      Exec('pragma incremental_vacuum(1)');

In practice, the condition involves more than a single query, and we
escalate from silent to "with UI" if there's a lot to be reclaimed. 



A quick (statistically insignificant) test suggests that we do make more
progress inside a transaction; OTOH, it feels "strange". 





Re: [sqlite] Strategies for checking the version of a database?

2012-09-21 Thread _ph_
> http://semver.org/  

Yikes! A good example of how many words it takes to formalize something
that "feels" obvious. 





Re: [sqlite] Strategies for checking the version of a database?

2012-09-21 Thread _ph_
I'm using SQLite as application data file format for our software product
line, and being compatible in both directions is important for us (and
honored by our users). I've only recently moved to SQLite, so the thoughts
listed here haven't stood the test of time yet, but I'd assume they are
helpful nonetheless.



1. Add a version scheme, and reject changes of the major version in the very
first software version you release. 

Instead of user_version, I am using a major.minor field, and reject (in the
first through the current release) everything that differs in the major
version.

Note that a future release can always opt to transparently support older
versions.

I've also added a text field containing an optional message that is displayed
when the database is rejected. This allows future software versions to "send"
instructions to the users on how to upgrade (e.g. where to download a
conversion tool).
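A sketch of that check, assuming a single-row Version table (all names are
illustrative, not a prescribed schema):

   #include <string>
   #include "sqlite3.h"

   int const kKnownMajor = 1;  // major version this build understands

   // Returns true if the file may be opened; otherwise fills 'message' with
   // the rejection hint stored in the database (if any).
   bool CheckVersion(sqlite3 * db, std::string & message)
   {
      sqlite3_stmt * stmt = 0;
      bool ok = false;
      if (sqlite3_prepare_v2(db, "SELECT major, reject_message FROM Version;",
                             -1, &stmt, 0) == SQLITE_OK
          && sqlite3_step(stmt) == SQLITE_ROW)
      {
         ok = sqlite3_column_int(stmt, 0) == kKnownMajor;
         if (!ok && sqlite3_column_text(stmt, 1))
            message = reinterpret_cast<char const *>(sqlite3_column_text(stmt, 1));
      }
      sqlite3_finalize(stmt);
      return ok;
   }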



2. If you can't reject the database, you still have to deal with the cases:

 - the schema of the file is older than the one the application knows
 - the schema of the file is newer than the one the application knows

In the first case (the code "knows" the old schema), the easiest way is to
upgrade the schema automatically when the database is opened. This way, all
the "different version" support is conveniently located in one place, and
doesn't creep into the entire database layer.

If the database is opened only for reading, or the upgrade would break
previous versions of the software, I could still fall back to adjusting some
queries on the fly. From my past experience, that would be rare. 

In the second case, the database has a schema you don't know. You can only
trust the newer application not to have broken anything for you. 

It is fairly easy to allow future versions to add columns, though:

 - SELECT statements need to query for explicit columns, rather than using
SELECT *
 - All columns added in later versions need a default value, so they don't
need to be specified in INSERTs

(That probably won't help for complex queries.)
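For illustration, the kind of change a future version could then make without
breaking older readers (names made up):

   // Run by the *newer* application. Older readers keep working because their
   // explicit-column SELECTs and INSERTs never mention the new column.
   char const * upgrade =
      "ALTER TABLE Items ADD COLUMN rating INTEGER NOT NULL DEFAULT 0;";

   char const * old_reader_query = "SELECT id, name FROM Items;";  // never SELECT *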




3. Reformat on the fly  

There's another option to simplify handling old data that might help in some
cases: when opening the database, you can create a temporary copy of key
tables, and reformat / update that temporary copy.

For tables that see complex changes, don't hold large amounts of data, and
are mostly read from, this again moves the versioning problems to a single
point; much of the remaining code can work as if the database were of the
newer format. 

(You would still need to handle the different versions for non-SELECT
statements, or maybe provide a generic mechanism that writes the changes made
to the in-memory copy back to the file.)


-

4. Maintenance Operation

I've added a "Maintenance" operation that is presented to the user as "try
this first if the database behaves strangely". This operation can
transparently make some changes.

-

Use case: After schema 1.0, I added the ability to de-duplicate the data
held in large-ish blobs. For this I added a "Hash" column that
default-initializes to NULL. When opening an older database, it only needs to
be upgraded to contain this column. When storing a record, I calculate the
hash of the new record and look for an existing record with the same hash.
During maintenance, I can calculate missing hashes and fold duplicates of the
existing data.
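A sketch of the lookup on the store path under that scheme (table and column
names are placeholders; the hash would come from whatever digest the
application uses):

   #include "sqlite3.h"

   // Returns the rowid of an existing blob with this hash, or 0 if none.
   sqlite3_int64 FindDuplicate(sqlite3 * db, sqlite3_int64 hash)
   {
      sqlite3_stmt * stmt = 0;
      sqlite3_int64 id = 0;
      if (sqlite3_prepare_v2(db, "SELECT rowid FROM Blobs WHERE Hash = ?;",
                             -1, &stmt, 0) == SQLITE_OK)
      {
         sqlite3_bind_int64(stmt, 1, hash);
         if (sqlite3_step(stmt) == SQLITE_ROW)
            id = sqlite3_column_int64(stmt, 0);
      }
      sqlite3_finalize(stmt);
      return id;
   }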


Hope that gives you some ideas.











Re: [sqlite] index has higher cache priority than data?

2012-09-21 Thread _ph_
There is also the compile-time option

SQLITE_DIRECT_OVERFLOW_READ

(see here: http://www.sqlite.org/compile.html)
which makes reads of content in overflow pages bypass the SQLite page cache.
In my understanding, that should help if the majority of the data consists of
large BLOB / string cells. 

(It would be nice if this option could be set through a pragma, for
performance comparisons.)
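For reference, compile-time options like this one are set when building the
amalgamation, e.g.:

   // cc -DSQLITE_DIRECT_OVERFLOW_READ -c sqlite3.c
   // or, when compiling the amalgamation source directly in your project:
   #define SQLITE_DIRECT_OVERFLOW_READ 1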







Re: [sqlite] index has higher cache priority than data?

2012-09-21 Thread _ph_
I have a little experience with a somewhat similar setup: a typical "real"
file is 200 MB...2 GB. I am making a "snapshot" of the data structure (a few
hundred KB) into an attached in-memory database. 

I've seen that the time required to create that snapshot depends largely on
the size of the entire table, even if only selected columns go into the
snapshot. 

I.e. 

   ATTACH ':memory:' AS mem;
   INSERT INTO mem.Snapshot SELECT col1, col2 FROM Data;

is much slower if 'Data' contains an additional column with large data.
Moving my item meta data (small) into a table separate from the possibly
large blobs helped immensely. 
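The split that helped, roughly (illustrative names):

   // Meta data and bulk data in separate tables, joined by id, so snapshot
   // scans of the meta table never touch pages holding the large blobs.
   char const * schema =
      "CREATE TABLE ItemMeta (id INTEGER PRIMARY KEY, col1, col2);"
      "CREATE TABLE ItemData (id INTEGER PRIMARY KEY REFERENCES ItemMeta(id),"
      "                       payload BLOB);";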

Note: I haven't investigated much, as separating the large data column into a
separate table makes sense for other reasons. It could be that the
significant difference - even though very consistent to observe across
multiple files - was due more to OS/disk caching than to SQLite itself. 






Re: [sqlite] Fw: create table question

2012-08-27 Thread _ph_
> our Visual Studio project will not  breakpoint in certain places

For SQLite, this is usually caused by Visual Studio only supporting
breakpoints in the first 64K lines of a file. (VS doesn't even tell you the
breakpoints don't work; they just don't get hit.)





Re: [sqlite] When to call VACUUM - revisited

2012-07-05 Thread _ph_
Hi Simon,

I already read your previous replies, but to revisit my scenario:

 - My OS is "sensitive to fragmentation"
 - We are running with auto-vacuum enabled, so the freelist_count is usually
small (not a good indicator), but fragmentation supposedly gets worse
 - We use SQLite as an application data format; a typical user has dozens of
files. This makes "during a support call" not an option.

So for me, from reading the documentation and this list, there's an "ominous
cloud of defragmentation looming" - without much data on how bad it is or can
be. 
