Re: [sqlite] Sizeof tables

2010-08-18 Thread Max Vlasov
On Tue, Aug 17, 2010 at 4:28 PM, Lukas Haase  wrote:

> Hi,
>
> My sqlite database is about 65 MB. The data is split into serval tables.
>
> Is there a way to enumerate the space requirements for each table so
> that I can see which tables are the memory consumers?
>
>
Look at this discussion:
http://www.mail-archive.com/sqlite-users@sqlite.org/msg53997.html
There are at least two variants of a solution there.

Max


Re: [sqlite] Same db, same query, very slow query performance in sqlite 3.7.0.1

2010-08-16 Thread Max Vlasov
> And let us know if the problem persists.  Perhaps this has been fixed by
>
>http://www.sqlite.org/src/info/e4b8a2ba6e
>
>
Richard,
I tried to investigate the reported problem in parallel; I think the original
poster will report his own results. From my tests it seems the snapshot has
fixed the problem, but the cause appears to be different from the one you
mentioned.

When I narrowed the query to

select * from catalogues
inner join
   (select max(validfrom), idcatalogue from CRONOcatalogues) as tbl
on catalogues.idcatalogue=tbl.idcatalogue and catalogues.active='s'

The difference between the plans was in these lines:

TABLE catalogues
TABLE  AS tbl WITH AUTOMATIC INDEX
in the case of 3.7.0.1, and

TABLE  AS tbl
TABLE catalogues USING PRIMARY KEY
in the case of 3.6.23.

For some reason, sqlite 3.7.0.1 preferred creating an automatic index on the
temporary result (tbl) and full-scanning catalogues, while 3.6.23 full-scanned
the sub-query (or even fed the results while performing the sub-query) and
used the primary index for catalogues. The snapshot you mentioned now
correctly uses the index. Maybe it was due to a false assumption that the
index cannot be used when a field not covered by the index (active = 's')
exists in the expression.
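
For reference, plan lines like the ones above come from EXPLAIN QUERY PLAN; a
sketch with the same query:

EXPLAIN QUERY PLAN SELECT * FROM catalogues
inner join
   (select max(validfrom), idcatalogue from CRONOcatalogues) as tbl
on catalogues.idcatalogue=tbl.idcatalogue and catalogues.active='s';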

Max


Re: [sqlite] Reserve database pages

2010-08-13 Thread Max Vlasov
On Fri, Aug 13, 2010 at 1:38 PM, Max Vlasov <max.vla...@gmail.com> wrote:

>
>> I can approximately calculate how big the new database will grow. Is
>> there a way to tell SQLite to reserve an initial space or number of pages
>> instead of letting the database file grow again and again?
>>
>
>
> Thought about this recently. Another idea is to tweak VFS. Since xWrite
> method is supposed to accept iOfst that is bigger than the current file
> size, one can check whether the new write query is going to allocate new
> space for the file (vs internal space writing), and before actual call make
> prior call of the same function writing for example a single zero byte a
> little far and after that perform the original request.


It didn't take much time to try it. The good news is that the tweak
technically works (thanks to the flexible VFS sqlite uses); the bad news is
that it gives no significant improvement, at least on Windows XP, on both
FAT32 and NTFS. Only in some specific cases was a little difference
noticeable.

Max


Re: [sqlite] Reserve database pages

2010-08-13 Thread Max Vlasov
> I can approximately calculate how big the new database will grow. Is
> there a way to tell SQLite to reserve an initial space or number of pages
> instead of letting the database file grow again and again?
>


I thought about this recently. Another idea is to tweak the VFS. Since the
xWrite method is supposed to accept an iOfst that is bigger than the current
file size, one can check whether a new write request is going to allocate new
space for the file (vs. writing inside existing space), and before the actual
call make a prior call to the same function writing, for example, a single
zero byte somewhat farther out, and after that perform the original request.
The only thing that can be affected in this case is the routines that use
xFileSize. I looked at the sources; at least the backup API uses it, and it
either should be straightforward in this case or should "pretend" and report
the expected file size. From what I see, it seems that sqlite internally
doesn't rely on the file size for the core functionality, and xFileSize looks
more like a service function, but I may be wrong.

Max Vlasov,
www.maxerist.net


Re: [sqlite] Coping with database growth/fragmentation

2010-07-28 Thread Max Vlasov
On Fri, Jul 23, 2010 at 5:11 AM, Taras Glek <tg...@mozilla.com> wrote:

>  Hello,
> Recently I spent some time investigating sqlite IO patterns in Mozilla.
> Two issues came up: keeping sqlite files from getting fragmented and
> fixing fragmented sqlite files.
>
>
Funny, that's why I like reading questions on this list: it sometimes helps
solve old tasks :). As many have noticed, the Windows system cache is hard to
control. For example, it is almost impossible to clear the read cache for
testing purposes: once you read the file, it's in the cache, so the following
timing numbers are irrelevant. There's an option for CreateFile,
FILE_FLAG_NO_BUFFERING, that can be used to disable the cache for a file when
one wants to work with it. I thought about changing the sources to prepare a
special version of sqlite that allows opening without the cache. But a better
solution, at least on XP, came up: if I "touch" a file with
CreateFile(..FILE_FLAG_NO_BUFFERING) and close it, the cache won't be used
for the next opening of that file. So for testing purposes I just made a
checkbox in an admin tool that "touches" the file before passing it to
sqlite. And it seems it really works.

So, Taras, thanks for your post :)

Also with this approach I tried to test places.sqlite, particularly
moz_places table, the query was

SELECT * FROM moz_places WHERE url Like "%double%"

I suppose the mozilla team probably uses different queries, but since there
is no recognizable FTS table, there must be some kind of full scan.

So, my tests on two hard drives showed that Windows fragmentation had a
small effect on the performance of the query, while VACUUM had a significant
one. Before VACUUM, my long-lived places.sqlite, 13M in size, with a
moz_places table of 16,893 records, returned results after 8-10 seconds,
depending on the drive it lived on; after VACUUM, the results came in
between 150ms and 300ms.

I think this can be due to the nature of the Windows read-ahead cache: when
sqlite tables are placed in consecutive blocks of the file, Windows loads the
same pages sqlite expects. So before any file system defragmentation,
internal sqlite defragmentation (VACUUM) should be applied.
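
A quick way to see the effect (a sketch, not part of the original test):

PRAGMA page_count;      -- pages in the file before compaction
PRAGMA freelist_count;  -- unused pages trapped inside the file
VACUUM;
PRAGMA page_count;      -- should be smaller after the rebuild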

Max Vlasov,
maxerist.net


Re: [sqlite] VACUUM not reclaiming space

2010-07-27 Thread Max Vlasov
On Mon, Jul 26, 2010 at 11:28 PM, Taras Glek  wrote:

>  Hi,
> I noticed an interesting disk-space behavior with VACUUM. If I vacuum my
> places.sqlite(Firefox database), it's 49mb. ...
> ...
> Then vacuum, the db becomes 24mb.
>

Taras,
49 is almost equal to 24*2. Can it be related to some encoding conversion,
like UTF-16 in the first case and UTF-8 in the other?
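
A quick way to check this guess (a sketch):

PRAGMA encoding;   -- returns UTF-8, UTF-16le or UTF-16be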

Max


Re: [sqlite] suggestion: bitwise shift right optimization

2010-07-20 Thread Max Vlasov
>   In the case of SQLite, it is also very unlikely to save space.  In
>  fact, because of the way integer values are stored, it is very
>  likely to use more space.
>

Jay, in most cases yes, but there are cases with several integers that
should be used together in ordering; space is wasted, since sqlite saves the
table b-tree with the rowid plus these integers and also an index b-tree
with the integers and the rowid reference. When such tables contain
thousands or millions of records, it starts making sense. Using the rowid in
this case as a packed value should help (see the sketch below).
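
A hypothetical sketch of such packing (the field meanings and bit widths are
made up):

CREATE TABLE events ([Id] INTEGER PRIMARY KEY);  -- Id packs category and time
INSERT INTO events (Id) VALUES ((42 << 32) | 1279641600);
-- a lookup for category 42 becomes a primary-key range scan:
SELECT Id >> 32 AS category, Id & 4294967295 AS ts FROM events
WHERE Id BETWEEN (42 << 32) AND ((43 << 32) - 1);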

Ironically, sqlite, being a db right in front of you with all the statistics
available, encourages experimenting, as opposed to, for example, mysql,
which looks like a mountain far away :)

Max


Re: [sqlite] Horrendous slowdown when updating versions

2010-07-20 Thread Max Vlasov
On Tue, Jul 20, 2010 at 7:38 PM, Ed Hawke
<edward.ha...@hawkeyeinnovations.co.uk> wrote:

> I was running a system using an old sqlite3.dll (version 3.0.8, I believe).
>
> Updating to the latest version (3.6.23) causes my program to run
> incredibly slowly.
>
>
Can you provide more details about the query? Maybe narrow it down to some
easily understandable SQL fragment.

Max


Re: [sqlite] suggestion: bitwise shift right optimization

2010-07-20 Thread Max Vlasov
On Tue, Jul 20, 2010 at 5:14 PM, Pavel Ivanov  wrote:

> > Are there reasons not to implement optimization in the first case? Except
> > for this is not most requested one :)
>
> I guess because this case is highly specific and it's behavior should
> depend on particular constants used. Put there for example Id >> 54 =
> 1000 and now we should make optimizer guess right away that query will
> never return any rows.
>
>
Pavel, I thought about this a little more, and I can see a little problem.

For example, if we have an abstract function F for which we can guarantee:
- if a <= b then F(a) <= F(b)
- if a >= b then F(a) >= F(b)
then we could actually perform a kind of range search, but one less
effective than a general range search. A general range search knows what the
limits are, so it searches only for them regardless of the rows to be found,
while this query would have to find any matching value and then move left
while F() is true and move right while F() is true.

On the other hand, this kind of search would have either the same
effectiveness as a full scan (in the worst case) or better. I suppose this
limitation is also why queries with complex left-hand sides (even WHERE id +
1 = ...) do not use the optimizer (CMIIW).
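
A small illustration of that limitation, in the format of my original
message (a sketch):

EXPLAIN QUERY PLAN SELECT * FROM TestTable WHERE Id + 1 = 1000;
: TABLE TestTable

EXPLAIN QUERY PLAN SELECT * FROM TestTable WHERE Id = 1000 - 1;
: TABLE TestTable USING PRIMARY KEY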

Max


Re: [sqlite] error in sum function

2010-07-20 Thread Max Vlasov
SQLite Expert (which I use extensively in the Pro version) enforces
> displaying types as declared.  So if you declare your column as INTEGER
> (or INT, ...) then it will display integers regardless of the actual
> individual data type using common conversions.  This is a side effect
> of the Delphi grid component that Bogdan uses.
>
>
Although the topic is no longer active, the problem is interesting and I
thought about it recently.
This is something sqlite-aware administration tools could implement. It's
not that hard, while data retrieval is in progress, to compare the type one
wants to retrieve with the result of the sqlite3_column_type function. Since
some rows of the conversion table listed at
http://www.sqlite.org/c3ref/column_blob.html will likely involve lossy
conversions, it is always possible to detect such a state and show a warning
about possible conflicts. If the user does not want it, ok: select "never
show again" and forget about it, but personally I'd never switch it off.
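
As a sketch, using the table from this thread, the check could boil down to
a query like:

SELECT rowid, typeof(basket) FROM data WHERE typeof(basket) <> 'integer';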

Sure, it's not a problem for tools that always use sqlite3_column_text for
display purposes.

Max


[sqlite] suggestion: bitwise shift right optimization

2010-07-20 Thread Max Vlasov
Hi,
as far as I can see, bitwise shift right currently does not use an index:

CREATE TABLE [TestTable] ([Id] INTEGER PRIMARY KEY AUTOINCREMENT);

EXPLAIN QUERY PLAN SELECT * FROM TestTable WHERE Id >> 12 = 1000;
: TABLE TestTable

Sure I can replace it with the following query

EXPLAIN QUERY PLAN SELECT * FROM TestTable WHERE Id BETWEEN 1000 << 12 AND
(1001 << 12)-1;
: TABLE TestTable USING PRIMARY KEY

But the first syntax is more straightforward.
Are there reasons not to implement the optimization in the first case,
except that it's not the most requested one? :)
Btw, sometimes, when several small fields exist and they should be indexed,
one can pack them into the id (rowid) to save space, and the first syntax
would allow querying more naturally.

Thanks,

Max Vlasov
maxerist.net


Re: [sqlite] How much a table takes (will this query always work)

2010-07-17 Thread Max Vlasov
On Fri, Jul 16, 2010 at 5:14 PM, Jim Wilcoxson  wrote:

> >>
> > You mean, like this?
> >
> > pragma page_count
> > pragma page_size
> > size = page_count*page_size
> >
> >
> Doh.  Nevermind.  I see you said table, not database. :)
>
>
If you ask me, the syntax PRAGMA page_count(table_name/index_name) would be
great, if such a function could be included in future versions. And the old
syntax without a name would work as before.

Max


Re: [sqlite] How much a table takes (will this query always work)

2010-07-17 Thread Max Vlasov
On Fri, Jul 16, 2010 at 4:52 PM, Jay A. Kreibich  wrote:

>
>  Check out sqlite3_analyzer.  This is available in binary format only
>  on the SQLite website.   http://sqlite.org/download.html
>
>
Thanks, the tool is really great, but I'm just wondering whether the code
used there can be part of another project in C. As far as I can see from the
search results, the source code for this functionality is spaceanal.tcl. I
tried to find the method used, and it seems that the sqlite Tcl extension
has a special way of accessing btrees directly. So in this case I should
probably find the corresponding functions in the C source and make them
public, right?

Max


[sqlite] How much a table takes (will this query always work)

2010-07-16 Thread Max Vlasov
Hi,

I've always wanted a way to calculate how much space a table occupies.
As far as I can see from the archive, there's no out-of-the-box solution
(CMIIW).

Recently I made a query that probably solves this, but it has some
limitations and there are still unsolved issues.

So, let's say TableToCheck is the table we want to know about.

-

CREATE TEMP TABLE [tblsizetemptable1] AS SELECT * FROM TableToCheck;
CREATE TEMP TABLE [tblsizetemptable2] ([Id] INTEGER PRIMARY KEY);
SELECT (SELECT rootpage FROM sqlite_temp_master WHERE tbl_name='tblsizetemptable2')
     - (SELECT rootpage FROM sqlite_temp_master WHERE tbl_name='tblsizetemptable1');

The result can be used as a page count, or be multiplied by the result of
PRAGMA page_size; to get bytes.

After that we should drop the tables
DROP TABLE tblsizetemptable2;
DROP TABLE tblsizetemptable1;

I hope these queries are self-explanatory
-

So, the questions and problems:
- Is it possible to create a similar query for indexes, i.e. one where I
only have to change the name of the index? As far as I understand, there's
no syntax like CREATE INDEX ... AS (see the sketch after this list for one
untested idea).
- If temp tables already exist, the temp database file might be fragmented,
so this method can give bad results. Is there something like VACUUM for the
temp database, or a similar action leading to a compacted temp database?
- Are there cases when a wrong result is possible even under ideal
conditions (no temp database beforehand)?
- This method is not good for large databases, since all the data has to be
re-saved just to get the value. If someone can suggest an algorithm solving
this, that would be great.
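
For the first question, here is an untested sketch of the same rootpage
trick extended to an index (SomeColumn is a placeholder; the same
no-prior-temp-data assumption applies):

CREATE TEMP TABLE [tblsizetemptable1] AS SELECT * FROM TableToCheck;
CREATE INDEX [tblsizetempidx] ON [tblsizetemptable1](SomeColumn);
CREATE TEMP TABLE [tblsizetemptable2] ([Id] INTEGER PRIMARY KEY);
SELECT (SELECT rootpage FROM sqlite_temp_master WHERE name='tblsizetemptable2')
     - (SELECT rootpage FROM sqlite_temp_master WHERE name='tblsizetempidx');
DROP TABLE tblsizetemptable2;
DROP TABLE tblsizetemptable1;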

Thanks,

Max


Re: [sqlite] error in sum function

2010-07-14 Thread Max Vlasov
Riccardo,

please run the following query:

SELECT typeof(basket) AS tp FROM Data WHERE tp <> 'integer'

As you may know, sqlite accepts any value regardless of the declared column
type, so you could possibly have inserted a real value without knowing it.
This query will show the rows with such values.

Max

On Wed, Jul 14, 2010 at 2:43 PM, Riccardo Cohen
wrote:

> Hello
> I've been using sqlite in many projects (thanks for providing it) and
> found something strange today with the sum function
>
> I have a simple database:
>
> CREATE TABLE data (irisid char(9),postid varchar(20),basket integer);
> CREATE INDEX irisididx on data (irisid);
> CREATE INDEX postididx on data (postid);
>
> created by my c#.net program using sqlite ado component from
> http://sqlite.phxsoftware.com/ v1.0.65.0 (cannot say which sqlite version)
>
> This works perfectly until I need to use sum. I use SQLiteExpertPersonal
> 3.0.19 to do some selects :
>
> NO SUM :
> 
> select irisid,basket from data where irisid in ('37203') and postid
> in ('A_02_001_0001') group by irisid
>
> irisid  basket
> 37203   696
>
> DO THE SUM OF THIS ITEM :
> 
> select irisid,sum(basket) from data where irisid in ('37203') and
> postid in ('A_02_001_0001')
>
> irisid  sum(basket)
> 37203   695.81315226
>
> same problem if multiple rows, with or without group by, with '='
> instead of 'in'... same also with min() and max(), there is no null, and
> no float (all ints) in this record (or any other)
>
> I cannot understand why this approximation. I tried to build a new small
> database in sqlite expert and cannot reproduce the problem.
> I cannot send you the database; it is 1.5 GB (15M records).
> I found nothing when googling for this problem.
>
> I would like to know if there is a simple explanation, before trying to
> rebuild my huge database with some tests.
> thanks a lot for your help.
>
> --
> Riccardo Cohen
> Architecte du Logiciel
> http://www.architectedulogiciel.fr
> +33 (0)6.09.83.64.49
> Membre du réseau http://www.reflexe-conseil-centre.org
>
>


Re: [sqlite] New 3.7.0 snapshot - release estimated for July 22

2010-07-14 Thread Max Vlasov
On Wed, Jul 14, 2010 at 2:12 PM, D. Richard Hipp  wrote:

> The signature of the VFS has changed slightly - the xShmOpen() method
> has been removed.  Implementations are now expected to automatically
> create the shared memory on the first call to xShmMap().  And the
> xShmClose() method has been renamed to xShmUnmap().  A new snapshot
> with these changes is now available in the usual place:
>
>
I also see that the order of two methods (lock and map) was changed. No
problem, just checking: is this correct?

struct sqlite3_io_methods {

(201007091257)

  int (*xShmOpen)(sqlite3_file*);
  int (*xShmLock)(sqlite3_file*, int offset, int n, int flags);
  int (*xShmMap)(sqlite3_file*, int iPage, int pgsz, int, void volatile**);

(201007140820)

  int (*xShmMap)(sqlite3_file*, int iPg, int pgsz, int, void volatile**);
  int (*xShmLock)(sqlite3_file*, int offset, int n, int flags);

Max


Re: [sqlite] WAL: no longer able to use read-only databases?

2010-07-13 Thread Max Vlasov
>
> > In testing the latest SQLite snapshot with WAL enabled, it seems that
> > there's no way to use a database in a read-only location.
> >
>
> Documentation on the developers' current view of this issue can be found
> here:
>
>
>
In my opinion it's ok: WAL already has special conditions under which it
operates, and the current documentation describes them all thoroughly. I
only wish the "advantages" section of WAL also grew a little, maybe giving
more details about the speed improvements and encouraging more frequent use
of this mode.

Thanks

Max


Re: [sqlite] Suggestion for sqlite shell

2010-07-12 Thread Max Vlasov
On Mon, Jul 12, 2010 at 5:00 PM, Alexey Pechnikov wrote:

> Maxim, please show example here and the link to your implementation.
>
>
Alexey,

to illustrate, I prepared a screenshot:

http://www.maxerist.net/downloads/temp/bind_iterate.png

Excel and the cells to be copied are at the top, the query is in the center,
the dialog asks for the user's confirmation, and the results are below. It's
just an example of complex data manipulation "on the fly" with this feature;
the code itself is very simple, as you may see from the input and output.
Knowing that the sqlite shell can already parse csv content, I think this
feature would require no more than a couple of hundred lines of code.
Max


[sqlite] Suggestion for sqlite shell

2010-07-12 Thread Max Vlasov
Hi,

recently I implemented a feature for a GUI admin tool I use internally, but
I think it would also be useful for the sqlite shell. Sorry if it already
exists in some form; I could not find anything similar.

The idea is to use bind syntax together with csv (or clipboard-compatible)
import. If one enters a query with question marks or the other symbols that
allow parameter numbering (but not naming, obviously) and also provides a
csv file, the shell would iterate through each row of the file and perform
the query using that row as the parameter set. This would allow not only
complex inserts with concatenation and so on, but also mass updates and
other conditional operations. Errors are possible, like a mismatch between
the expected and the provided number of parameters, but this could be
resolved either by returning an error (% expected, but % found) or by
allowing extra fields to be ignored or missing ones padded.

I don't know whether clipboard content fits sqlite ideologically, but in my
case it saved additional time, allowing imports, for example, from Excel
without an intermediate file (csv).

Thanks

Max Vlasov
www.maxerist.net


Re: [sqlite] How to determine when to VACUUM?

2010-07-08 Thread Max Vlasov

> >
> > Simon, you gave an interesting explanation, but does this rule work in
> > general? I mean there are many models, many sizes and so on.
>
> Don't know.  You could test it.  Write a program that creates a file half
> the size of the drive, then writes to random parts of it timing each
> command.  If all the write commands take about the same amount of time then
> it doesn't work the way I described.  I just repeated the description I had
> read of the way SSD drives work.
>

Thanks, that's what I thought :)
SSD is an interesting thing to research, especially from a sqlite
perspective. For reading, it's the access times that make SSD a winner, so
reading a large sqlite database randomly can probably have some benefits on
an SSD, but I don't know of any real-world measurements. Do you know any?

Max


Re: [sqlite] WAL questions

2010-07-08 Thread Max Vlasov
> Second Proposed Change:
>
> Remove the WAL mode from "PRAGMA journal_mode".  The journal_mode pragma
> only specifies the various rollback journal modes.  Enable the WAL using a
> separate pragma such as  "PRAGMA wal=ON; PRAGMA wal=OFF;"
>
> Thoughts?  Comments?  Other suggestions?
>
>
Maybe it's not the right time, but could it be implemented as a separate
pragma, as in your second proposed change, but with more options, like OFF,
PERSISTENT, TEMPORARY? I keep thinking about the fact that WAL, being a new
software feature, actually amounts to a new file format change; the latter
option could solve this (if it's technically possible and not hard to add,
of course). The logic could be like this: with WAL=TEMPORARY, the first
reader/writer that accesses the base changes bytes 18 and 19 (to 2), so it
and the following readers/writers work in WAL mode, and the last
reader/writer that closes the base reverts them (to 1), so the file format
stays the same. Sure, there are cases when the format would still end up
changed (for example, an unexpected program end), but the next successful
open/close would fix this.

Max


Re: [sqlite] WAL questions

2010-07-08 Thread Max Vlasov
> You want "PRAGMA main.journal_mode"
>
> A "PRAGMA journal_mode;" (without the "main.") shows you the default
> journal
> mode used by newly created databases, which is always "DELETE" unless you
> have changed it with a prior "PRAGMA journal_mode=MODE" command.
>
> Though, I will admit, this is confusing, and I was thinking last night
> about
> ways we could possibly change it
>

Thanks for pointing that out. I have two observations:
- According to the docs this has been present for ages, so shame on me for
not knowing it in the first place. So probably it should stay as it is now.

- But this is the first time this important pragma really affects the sqlite
file format and, more importantly, not only for the current session. You
probably had reasons for implementing WAL the way it is now, but imagine:
once changed, this setting doesn't remind the developer of itself. All his
current code base works once it starts using this version of sqlite, but
consequently a small (or maybe large) part of his databases becomes
WAL-enabled (number 2 in the file format). The latter may happen because of
his own WAL-on without a WAL-off, or a WAL-on followed by an unexpected
program interruption. Everything is ok until these databases have to be
distributed or used in other environments that probably use sqlite3 versions
lower than 3.7, where they stop working, complaining about an unknown file
format. I think it's a potential source of a new wave of mass bug reports or
simply confusion.

I may be exaggerating; I suggest other participants of the list share their
thoughts.

Thanks,

Max


Re: [sqlite] How to determine when to VACUUM?

2010-07-08 Thread Max Vlasov
> >> (I guess it well might not on an SSD disk, but on a conventional
> >> rotational disk, pager could read several pages ahead with one seek -
> >> but does it?)
> >
> >  No, the pager does not.  Among other things, my feeling is that the
> >  locality of pages is not very strong, unless the database was just
> >  VACUUMed.
>
> Actually the SSD possibility makes it worse, not better.
>

Simon, you gave an interesting explanation, but does this rule work in
general? I mean, there are many models, many sizes and so on. For example,
the SanDisk SSD used in my Asus T91MT claims to have some internal write
cache, so its controller can have its own logic working independently of the
installed software. Also, allowing several chips to write at the same time
might conflict with an OS's own caching mechanism. Besides, I'm not sure the
caching in any OS is smart enough to take this into account. For example
(I'm not sure Windows is the best example), the fact that XP didn't have
proper partition alignment for SSDs, and that it took some time for
enthusiasts to let MS know about it, makes me think the cache system in
Windows is still not smart enough even for much easier SSD-related tasks :)

Max


Re: [sqlite] WAL questions

2010-07-07 Thread Max Vlasov
Alexey,

I read this sentence, but it didn't help, so I suppose there's a bug in the
PRAGMA journal_mode logic.
Steps to reproduce:

1. Create an empty base with some table.
Look at offsets 18 and 19: both are 1. That's ok, the base is compatible
with the full range of sqlite3 versions.

2. Do PRAGMA journal_mode=WAL; , it returns wal, so WAL is set.
Look at offsets 18 and 19: both are 2, so the base is no longer compatible
with older versions. Checking... yes, older versions say "encrypted" or
something.
Query PRAGMA journal_mode; alone (just to re-check the setting): it still
says "wal", ok.

3. Close the database. Offsets 18 and 19 are still 2, so the WAL setting is
kept between sessions.

4. Open the db again; offsets 18 and 19 are still 2.
Query PRAGMA journal_mode; , it says "delete", but it definitely should
return "wal".

Max


On Thu, Jul 8, 2010 at 12:09 AM, Alexey Pechnikov wrote:

> See http://sqlite.org/draft/wal.html :
>
> "An SQLite database _connection_
> defaults
> to journal_mode=DELETE. To convert to WAL mode, use the following
> pragma: PRAGMA journal_mode=WAL;"
>
> --
> Best regards, Alexey Pechnikov.
> http://pechnikov.tel/


Re: [sqlite] Why "select count(distinct f1, f2) from tbl_test" can not work?

2010-06-26 Thread Max Vlasov
On Sat, Jun 26, 2010 at 7:27 AM, zhangzhenggui wrote:

> tbl_test maybe like this:
> create table tbl_test(f1, f2, f3);
>
> Now, I want to get the num of records which (f1, f2) are distinct.
>
> I try "select count(distinct f1, f2) from tbl_test", but error occur: SQL
> error: wrong number of arguments to function count()
>
>
> although, I can do like this, but I think it not very good :
> select count(1) from (select distinct f1, f2 from tbl_test);
>
>

If you know the type and the maximum range (in the case of integers), you
can do tricks like this:

select count(distinct ((Value1 << 32) | Value2)) FROM TestTable

but I did a quick test on a table consisting of thousands of integers, and
the speed is similar to that of the other query you called not very good. I
think both need some temporary storage, so whatever you like aesthetically
is the best :)

Max


Re: [sqlite] database corruption problem

2010-06-08 Thread Max Vlasov
On Mon, Jun 7, 2010 at 7:31 PM, Daniel Stutzbach
<dan...@stutzbachenterprises.com> wrote:

> I'm the author of a small Windows application that embeds SQLite.  I'm
> currently using version 3.5.9 with PRAGMA locking_mode = EXCLUSIVE
> and PRAGMA journal_mode = TRUNCATE.  I just received a crash report from a
> user with the dreaded error: "database disk image is malformed".
>
> My application is multi-threaded, although only the main thread calls
> SQLite
> functions (the other threads do non-DB stuff).
>
>
You didn't mention your language/development tools; that would also help.

I also suggest implementing artificial tests, like constantly emulating user
actions in your program. Once this helped me find the cause of the same
corruption error. It could be some dev-only menu item that activates a loop
constantly reading and writing randomly until specifically interrupted.
While the program reads and writes, you check PRAGMA integrity_check; from
time to time and stop if the result shows corruption.
If you can't reproduce the problem on your machine, you can send a specially
prepared version to the user and ask him to perform similar actions.
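
The check itself is a one-liner; a healthy database returns a single row:

PRAGMA integrity_check;
-- ok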

Max Vlasov,
maxerist.net


Re: [sqlite] Is the absence of msvcrt.dll a known issue with SQLite Windows 2000

2010-06-06 Thread Max Vlasov
On Sat, Jun 5, 2010 at 11:01 AM, Frank Church  wrote:

>
> On checking the sqlite3.dll docs it states the only additional requirement
> for it is msvcrt.dll.
>

Frank, as far as I know it's just a variation of sqlite3.dll that links the
MS C run-time dynamically rather than statically. You can compile
sqlite3.dll without the msvcrt dependency with any other compiler, for
example the freely available Borland command-line compiler; I just checked a
dll made with it, and it has only kernel32.dll and user32.dll functions in
the import section. Also, the sqlite3 source has no msvcrt* mentions at all
(if it had, that could be some indication of a dynamic dependency).

Max,
maxerist.net


[sqlite] Returning empty result set

2010-06-01 Thread Max Vlasov
Hi,

I tried to figure out the simplest query that returns an empty result set
without relying on any existing table.

The query
SELECT 1 WHERE 1=2

worked, but it looks a little strange :) and besides, it didn't work in
mysql. Will it keep working in future versions of sqlite, or would I be
better off using a query that works in both worlds:

SELECT * FROM (SELECT 1) AS TBL WHERE 1=2

?

Thanks

Max


Re: [sqlite] SQLite turns 10 years old

2010-05-29 Thread Max Vlasov
Mr. D. Richard Hipp,

The 10-year anniversary is great news, and it's a real pleasure to see how
your and your supporters' skills have turned this library into such a big
power. Sometimes it makes me sad that I can't explain to a non-developer
what is so great about sqlite and what makes it appear in so many software
packages. If I had the power, I'd be glad to calculate how many
sqlite-related reads and writes are taking place on the whole earth. I'm
sure it's a big number :)

Thank you for sqlite!

Max Vlasov

On Sat, May 29, 2010 at 5:57 PM, D. Richard Hipp <d...@hwaci.com> wrote:

>
> Thanks, everybody, for helping to make SQLite the most widely deployed
> SQL database engine in the world.  And Happy 10th Birthday to SQLite!
>
> D. Richard Hipp
> d...@sqlite.org
>
>


Re: [sqlite] Aggregate and join query very slow

2010-05-27 Thread Max Vlasov
On Thu, May 27, 2010 at 3:07 PM, Michael Ash  wrote:

> ...These are large tables (52,355 records in facility and 4,085,137 in
> release_cl).
>
> ...
> sqlite> explain query plan
>   ...> SELECT name,score
>   ...> FROM facility f, (SELECT facilitynumber,SUM(score_rev) AS score
>   ...> FROM release_cl
>   ...> WHERE media<3
>   ...> AND year=2006
>   ...> GROUP BY facilitynumber) r
>


Michael, from what I see, if your release_cl table is not properly indexed
for this aggregation (the media and year fields), the query will lead to a
full table read (so all the data of your 4M records). If it is properly
indexed and the result set of this select is big, consider adding the
non-indexed fields to this (or a brand-new) index, since otherwise sqlite
quickly finds records with the index but has to look up the main data tree
to retrieve the other fields.
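
A sketch of such a covering index for the quoted query (the column order,
equality field first, is my assumption):

CREATE INDEX release_cl_cover ON release_cl (year, media, facilitynumber, score_rev);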

Max,
maxerist.net


Re: [sqlite] Does sqlite always allow creating own field named 'rowid'?

2010-05-26 Thread Max Vlasov
On Wed, May 26, 2010 at 6:19 PM, Pavel Ivanov  wrote:

> > But just
> > wondering, was allowing to create such field intentional? As I suppose
> such
> > field is complete phantom since most sql queries will interpret rowid
> > internally and won't allow access this user field at all.
>
> Situation is completely opposite. When you declare your field named
> "rowid" the real rowid becomes phantom and all queries will select
> your field instead. But real rowid will be still accessible via other
> names like "_rowid_" or "oid" (is there a way to access real rowid
> when all three built-in names are declared by user in the table, I
> wonder?). And this behavior is documented (see
> http://www.sqlite.org/lang_createtable.html) and thus I suppose
> intended and introduced long ago.
>
>
>
Ok, it makes sense. My only complaint is that although rowid is a
sqlite-specific word, it behaves more like a "reserved" word. One example
(maybe too specific): one works with tables in some sqlite admin tool, and
some tables imported from another database engine contain their own rowid
field. As everyone knows, every sqlite table has an id regardless of the
developer's intention, so I would assume that I can query SELECT rowid ...
for almost every existing sqlite table in the world. But in this
hypothetical case the results of such a query can be puzzling: for example,
if the data doesn't declare a uniqueness constraint on its own rowid field,
we could see repeating values or even non-numeric values. I think using
_rowid_ might be an additional guarantee against such problems, but I
suppose many developers still prefer rowid.
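
A small sketch of the shadowing described above (not from the original
thread):

CREATE TABLE t(rowid TEXT);           -- the user column shadows the built-in name
INSERT INTO t(rowid) VALUES ('oops');
SELECT rowid FROM t;                  -- returns 'oops' (the user column)
SELECT _rowid_ FROM t;                -- returns 1 (the real rowid)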

Max


[sqlite] Does sqlite always allow creating own field named 'rowid'?

2010-05-26 Thread Max Vlasov
While implementing a table intended for arbitrary table storage, I
automatically named one of its fields rowid, not realizing at the moment
that it would have a name conflict with sqlite's internal rowid. Later I
discovered this and (not a big deal) recreated the table with a different
field name. But just wondering: was allowing the creation of such a field
intentional? As I supposed, such a field is a complete phantom, since most
sql queries will interpret rowid internally and won't allow access to this
user field at all.

Max


Re: [sqlite] Update detail record based on master properties

2010-05-26 Thread Max Vlasov
> where exists (
>  select 1 from master_table where
>master_table.id=detail_table.masterid and
>masterfieldtocheck = okvalue and
>masterfield2tocheck = okvalue2);
>
> -- or
>
> where detail_table.masterid in (
>  select id from master_table where
>masterfieldtocheck = okvalue and
>masterfield2tocheck = okvalue2);
>
> --
> Igor Tandetnik
>
>
Thanks, Igor, it works; I see that sql itself is smart enough :)
Both of your suggestions are similar in speed, but the latter looks more
self-explanatory to me.

Max


[sqlite] Update detail record based on master properties

2010-05-26 Thread Max Vlasov
We all know UPDATE is sometimes limited, so in order to do some complex
conditional updating we can only rely on the complexity of the WHERE clause.

I would like to update my detail table based on master properties (that is,
set some value only if the corresponding master record fits some
conditions). In the case of only one property the statement can look like
this:

UPDATE detail_table SET somefield=somevalue
  WHERE
  (SELECT masterfieldtocheck FROM master_table WHERE master_table.id
=detail_table.masterid)=okvalue

Is there a way to alias the master table when I need to check several fields
of the master table? Or is the only choice to write

  WHERE
  ((SELECT masterfieldtocheck FROM master_table WHERE
master_table.id=detail_table.masterid)=okvalue)
AND
  ((SELECT masterfield2tocheck FROM master_table WHERE master_table.id
=detail_table.masterid)=okvalue2)

If I still need to repeat a similar select statement in the second section,
is the sqlite optimizer smart enough to recognize that both parts query the
same record, and not to perform this operation several times?

Thanks

Max,
maxerist.net


Re: [sqlite] What languages can include SQLite statically?

2010-05-24 Thread Max Vlasov
Gilles,

For Delphi I successfully used files from http://www.aducom.com to
statically link sqlite files, compiled with bcc (the Borland command-line C
compiler, freely available now), into Delphi. The components from aducom.com
will also let you use all the power of Delphi database components with
sqlite without the need to ship any additional dlls.

But some adjustments are needed. If you use the C run-time from the msvcrt
dll (the forum at aducom.com shows how to do this), no other actions are
needed. But if you want to be free of the msvcrt dependency, you can use the
C run-time object files also available on aducom.com; the version I used had
a serious bug, which Albert (from aducom.com) promised to fix on the site,
so you can probably use it as of now.

Max
maxerist.net


On Fri, May 21, 2010 at 1:31 PM, Gilles Ganault wrote:

> Hello
>
> My C skills are very basic. I was wondering: After compiling SQLite
> into an .OBJ or .LIB file, what languages can be used to include this
> output into a main program, so we end up with a single EXE.
>
> I assume we have the choice of:
> - C
> - C++
> - Delphi (?)
> - Other?
>
> Thank you.
>


Re: [sqlite] Id ranges to be used with joins

2010-05-07 Thread Max Vlasov
> > I have a table with two fields defining a range of ids of another table,
> > IdFrom and IdTo. I would like to use this information (multiple rows as a
> > SELECT result of this IdFrom and IdTo) in joins.
>
> Something like this?
>
> select * from MyTable join Ranges
> on (MyTable.id between IdFrom and IdTo);
>

Yes, thanks, Igor,

Max


[sqlite] Id ranges to be used with joins

2010-05-07 Thread Max Vlasov
Hi,
I doubt this is a sqlite-specific question; it's probably sql in general.
I have a table with two fields, IdFrom and IdTo, defining a range of ids of
another table. I would like to use this information (multiple rows as a
SELECT result of this IdFrom and IdTo) in joins. Ideally it would be great
to find a way to use them in joins directly, but if that's not possible, at
least to find an effective way to populate these ranges into a temporary
table. I know there's always a way to do this algorithmically, but I'm just
wondering if it's possible with sql.

Thanks,
Max Vlasov
maxerist.net


Re: [sqlite] Exclusive transactions over network

2010-04-28 Thread Max Vlasov
Jean-Christophe,
from my experience it depends. We have several clients accessing a database
shared on a 2003 server and no corruption has taken place so far, but
sometimes a client could freeze. Also, when I did some artificial tests
where several clients tried to write on a constant basis, there were cases
when one of them could also freeze. Consider doing some die-hard tests with
your configuration. They need not follow the same scheme as yours; the only
thing you should additionally do from time to time is PRAGMA
integrity_check. After a whole night of testing and thousands of successful
writes from several computers, you will at least have probability arguments
on your side :)

Max Vlasov,
maxerist.net

On Wed, Apr 28, 2010 at 9:43 AM, Jean-Christophe Deschamps
<j...@q-e-d.org> wrote:

> Hi gurus,
>
> I'm aware of the limitations that generally preclude using SQLite over
> a network.
> Anyway do you think that doing so with every read or write operation
> wrapped inside an explicit exclusive transaction can be a safe way to
> run a DB for a group of 10 people under low load (typically 2Kb read or
> 100b writes per user per minute)?
> Schema will be very simple and queries / inserts as well.  Speed is not
> a real concern.
>
> So do you believe DB corruption can still occur in this context,
> knowing that the use will be for a very limited time (2-3 weeks) and
> low volume (~50K rows)?
>
> Using one of the available client/server wrappers is not a suitable option.
> This is targeted at Windows, XP or later.
>
> Do you have a better idea to make the thing more robust, even at
> additional cost in concurrency and/or speed.
>


Re: [sqlite] Direct access of table data

2010-04-23 Thread Max Vlasov
> ...As I add more restrictions on the where-clause it
> tends to slow down.  I realize that this is due to my indexes, but can't
> add a lot of indexes because it slows down the insert speed which is
> more important than the query speed.
>


Nathan, maybe you already knew this, but just in case:

if your select relies on an index and also queries fields not present in
that index, consider appending these extra fields to the index. It doesn't
make sense in terms of search speed (moreover, it will increase the db
size), but it will save time since no extra lookup will be taking place.

So if you have a table (placeholder names)

CREATE TABLE t(a, b, c);

and an index

CREATE INDEX idx ON t(a, b);

and use a query similar to

SELECT a, b, c FROM t WHERE a = ? AND b = ?;

sqlite will do an extra lookup to get c from the table, but if you change
the index to

CREATE INDEX idx ON t(a, b, c);

the same query will get all the data from the index itself, saving time and
the amount of data flow.

I did a quick test and it showed not only a noticeable difference in time,
but also a significant difference in the amount of data read.

Max,
maxerist.net


Re: [sqlite] Please help test the latest query planner changes

2010-04-16 Thread Max Vlasov
> We are looking forward to your feedback, both positive and negative.
>
>
Mr. D. Richard Hipp,

I found some time to compare this variant with its direct predecessor
(3.6.23.1).
Both dlls were compiled with the Borland command-line compiler with
identical options.

I tested a simple multiple-lookup-driven query (SELECT .. LEFT JOIN .. LEFT
JOIN ..) and did not find any significant difference.

In another test, from a real program where I have a master-detail relation
between authors and their poems, the query forms a pool of not-yet-read
poems; here your new planner seems to show real improvements.

With the query

SELECT StihiAuthors.Id As AuthId, StihiAuthCandidates.Date as Date,
StihiAuthCandidates.Num as Num FROM StihiAuthors
  INNER JOIN StihiAuthCandidates ON
StihiAuthors.Id=StihiAuthCandidates.AuthorId
  LEFT JOIN StihiPoems ON Date=StihiPoems.PoemDate AND
 Num=StihiPoems.PoemNum
  WHERE StihiAuthors.IsFav=1 AND StihiPoems.rowid Is Null

sqlite3-amalgamation-3_6_23_1.dll
  reported 747 milliseconds returning 22,642 rows

sqlite3-20100415132938.dll
  reported 563 milliseconds

Both measurements were made several times, on a db that had been used at
least once before, in order to exclude the Windows system cache as a player.

Max Vlasov,
maxerist.net


Re: [sqlite] Sharing an SQLite database (using whole-file locking) over OpenAFS (Andrew File System)

2010-04-12 Thread Max Vlasov
On Tue, Apr 13, 2010 at 3:46 AM, Simon Slavin <slav...@bigfraud.org> wrote:

>
> Does SQLite actually use byte-range locking ?  I thought it always locked
> the entire file.
>
>
Simon, I think it's about the way sqlite implements the different locks; a
quote from os.h:

** A SHARED_LOCK is obtained by locking a single randomly-chosen
** byte out of a specific range of bytes. ...
** An EXCLUSIVE_LOCK is obtained by locking all bytes in the range.
** There can only be one writer.  A RESERVED_LOCK is obtained by locking
** a single byte of the file that is designated as the reserved lock byte.
** A PENDING_LOCK is obtained by locking a designated byte different from
** the RESERVED_LOCK byte.

That note is about Windows, but there's also the phrase "The same locking
strategy and byte ranges are used for Unix."

Max Vlasov,
maxerist.net


Re: [sqlite] fast acces to one memory located table from two threads

2010-03-24 Thread Max Vlasov
One process, two threads.
> I have found in documentation, that is not recommended to pass one
> connection
> from one thread to second one.
>

Yes, you're right, I forgot about this; the only exception is
sqlite3_interrupt, which can be called from another thread (otherwise its
existence would make no sense). But you can probably work around this with
your own mutexes, wrapping your calls in the corresponding acquire/release
calls, though I suppose some complexities could still exist. I didn't try it
myself, but I guess there are some operations that can be implemented safely
with this approach.

Max


Re: [sqlite] fast acces to one memory located table from two threads

2010-03-24 Thread Max Vlasov
> I need an application consisting two threads.
> In one of them i need to store incomming "messages" (one message is 1 to 8
> bytes of data) to temporary table existing only in memory.
> It needs to be fast, storing hundreds of messages per second.
> There i have a trigger deleting old rows and creating some sort of circular
> buffer.
> In separate thread there should be done some selects (read only) on this
> table
> sorting the informations and storing them into separate table(s) located on
> disk (jffs2 filesystem).
>
>
Are you talking about two processes? If not, why do these two threads have
to have separate connections? If that's not necessary, I'd open the disk db,
attach a memory database to it (see http://www.sqlite.org/inmemorydb.html)
and pass this db handle to both threads. One of them would then use only the
memory part of the base (fast appending); the other would do the necessary
selects and inserts into the disk part. After the db is closed, only the
disk part is left. I don't know the details of the locking logic in this
case; for performance reasons it would be better if, for example, the memory
part only locked its own part without affecting the disk part.
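
A minimal sketch of this layout (table names are made up):

ATTACH DATABASE ':memory:' AS mem;
CREATE TABLE mem.messages ([Id] INTEGER PRIMARY KEY, [Data] BLOB);  -- fast appends
CREATE TABLE main.archive ([Id] INTEGER PRIMARY KEY, [Data] BLOB);  -- durable copy
-- the second thread periodically drains the memory part to disk:
INSERT INTO main.archive ([Data]) SELECT [Data] FROM mem.messages;
DELETE FROM mem.messages;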

Max


Re: [sqlite] INTERSECT versus INNER JOIN with latitude, longitude queries

2010-03-22 Thread Max Vlasov
>
> Would you care to repeat those two SELECTs, but after making indices on the
> X and Y columns ?
>

Simon, Tim, I forgot to mention: there were also two indexes in the test db,
on X and on Y. Without them, 1.8 seconds and 1/10 of the data flow would not
be possible :))

On Mon, Mar 22, 2010 at 2:52 PM, Tim Romano  wrote:

> Another addendum:  apologies  --  I hope my
> discussion was clear enough despite the disconnect between my head and
> my fingers; I just noticed that I had typed "INNER JOIN" (yikes) rather
> than "INNER LOOP", by which I mean fetching the rowids using an index
> (on LATITUDE say) and then having to loop through those rowids in order
> to fetch rows from the base table to compare the LONGITUDE.
>

I think you got better results for the loop because of some specific
property of your real data. With two-dimensional data and without using an
R-Tree, I doubt we can get rid of either looping or pre-sorting. I just
guess that in general SELECT ... INTERSECT SELECT should be better, because
sqlite quickly finds two ranges, sorts the results, and performs a one-pass
synced scan, while your approach seeks every candidate from scratch.

Also, there are cases when SELECT ... INTERSECT SELECT will always be
better, for example when, together with your coordinates, you have large
additional data, like descriptions. Without such indexes, querying [Select Y
From table Where rowid = ] will also read extra data from every record even
if you don't need it, but if you have two compact indexes, on X only and on
Y only, and use INTERSECT, the data flow will be the same regardless of your
record size. In my test case, if I had added descriptions to the table and
every description was 1000 bytes, the final size would grow from 44M to 1G,
but the query would still read only 3M of data from the table.


Re: [sqlite] INTERSECT versus INNER JOIN with latitude, longitude queries

2010-03-22 Thread Max Vlasov
>
> Assuming a table where Latitude column and Longitude column each have
> their own index:
>
> perform select #1 which returns the rowids of rows whose latitude meets
> criteria
> INTERSECT
> perform select #2 which returns the rowids of rows whose longitude meets
> criteria
>
>

Ok, a quick test.

Created a base with a table

CREATE TABLE [TestTable] ([Id] INTEGER PRIMARY KEY AUTOINCREMENT,
[X] INTEGER,[Y] INTEGER)

Filled with 1,000,000 records:

INSERT INTO TestTable
(X, Y)
VALUES
((random() % 5) + 4, (random() % 5) + 4)

Final size: 44M

Simple query
SELECT rowid FROM TestTable WHERE (X BETWEEN 3 AND 4)
Time: 330 ms, 110,000 rows

Intersect query
SELECT rowid FROM TestTable WHERE (X BETWEEN 3 AND 4) INTERSECT
SELECT rowid FROM TestTable WHERE (Y BETWEEN 3 AND 4)
Time: 1800 ms, 10,000 rows

and my VFS stats show the latter reads about 3M of data from this 44M base
(so no full table scan).

You say that your INNER JOIN query is faster? You probably have a different
schema; maybe that's the reason, but please let us know in that case.

Max


Re: [sqlite] VACUUM & journal size

2010-03-17 Thread Max Vlasov
>  First, I'm sure it is nearly impossible to do this as a
>  guaranteed, atomic operation on most OSes and filesystems.  ...
>
>  Second, if this is meant to look like a cleanup operation on the
>  original file, the original file (including any filesystem meta-data)
>  should be kept in-tact.  ...
>
>  Last, file manipulations of this sort aren't supported by the current
>  VFS interface, and (IMHO) with good reason.  ...
> ...
>  However, it might be nice to have a "VACUUM TO " version of the
>  command.
>

Thanks, Jay (and Pavel too), you gave a very detail description of the
situation. Although unlikely your interesting suggestion (VACUUM TO) will be
implemented, I think one always make his own equivalent if it is necessary
(creating a subset of operations from sqlite3RunVacuum)

Max


Re: [sqlite] VACUUM & journal size

2010-03-16 Thread Max Vlasov
> This means that to VACUUM a SQLite database of size X, you need at
> least 2X of _additional_ free disk space available.  That seems rather
> wasteful, just looking at it as a SQLite user.  Although
> programmatically there may be reasons for it that I'm not aware of.
>
>

Hmm, I did some research; I think that VACUUM's requirements for free disk
space are too big.

When I read the comments it was obvious that the algorithm uses a very
simple approach: attach a blank database, copy all data, detach, rename.
Sure, I might be wrong in the details, but generally it looks like this.

With these actions, the journal file(s) for the new database should not
contain much data. So my guess is that one only needs, at most, free space
the size of the actual data from the source base plus a very tiny journal
file. But in fact (as everyone sees) it uses much more.

I just tried to perform VACUUM on a test base and then emulate its actions
with a manual attach/copy on the same base.

We have:
- truecrypt volume, size: 10M in order to see all the "disk full" errors.
- new connection, testdb.db3
- CREATE TABLE [TestTable] ([Id] INTEGER PRIMARY KEY AUTOINCREMENT,[Value]
Text)
- 100,000 times
INSERT INTO TestTable (Value) VALUES
("12345678901234567890123456789012345678901234567890")
Final size: 6M
- After we have completely filled the db, let's remove half of the records:
DELETE FROM TestTable WHERE Id < 50000

Now we have our 6M db (ready to be decreased to ~3) and about 4M of free
space.

So two scenarios:

 1. Simple vacuum

Trying to perform VACUUM:
Result: "database or disk is full" error.

 2. Emulation of Vacuum actions.

testdb_new.db3 is a new connection (a tiny file, 3k in size)

ATTACH 'Q:\testdb_new.db3' AS newdb

BEGIN TRANSACTION
 ; the following operations create two db-journal files so rollback is
possible.

CREATE TABLE newdb.TestTable ([Id] INTEGER PRIMARY KEY AUTOINCREMENT,[Value]
Text)
 ; actually sqlite3RunVacuum does some string formatting with results from
sqlite_master and performs the corresponding queries, so this CREATE TABLE
is probably the same query.

INSERT INTO newdb.TestTable SELECT * FROM main.TestTable
  ; the operation is ok, Windows reports 3M free, but this is probably due
to cached writes.

END TRANSACTION
  ; OK, testdb_new is 3M and 1M is free

=

So the question is: what is so special about sqlite3RunVacuum that it needs
more space than a simple emulation of its actions?

Thanks

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS3 (Version 3.6.23) - Is the boolean operator "AND" no more available?

2010-03-15 Thread Max Vlasov
> I wonder if the operator "AND" (in capitals letters) is yet available and
> different from the basic term "and" (in lower letters).
>
>
Make sure you compiled the sources with SQLITE_ENABLE_FTS3_PARENTHESIS,
since according to the docs:

SQLITE_ENABLE_FTS3_PARENTHESIS
This option modifies the query pattern parser in FTS3 such that it
supports operators AND and NOT (in addition to the usual OR and NEAR) and
also allows query expressions to contain nested parentheses.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Can default column values use max() ?

2010-03-15 Thread Max Vlasov
> When I enter a new book, I want the author to default to the last author in
> the database.  (For the sake of this example you can ignore tricks involving
> deleting authors and reusing ids.)
>


I'm not sure I understood you correctly, but the query

INSERT INTO Books
(Author, Title)
VALUES
((Select Max(rowid) FROM Authors), "test")

just worked (I tested it); of course, you have to make the id AUTOINCREMENT
to ensure that Max(rowid) is actually the last author.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] [sqlite-dev] Poor SQLite indexing performance

2010-03-15 Thread Max Vlasov
> Also it's quite known that
> creating index after inserting all rows is much faster than creating
> index before that. So it can be even beneficial in inserting huge
> amount of rows somewhere in the middle of the work: first delete all
> indexes, then insert rows, then create indexes once more.
>

Pavel, please add some reference (for example from the docs or from drh); my
tests showed that for large data there is almost no difference in speed. At
first I thought that the algorithm was really different; now I'm almost sure
that building an index from scratch is just enumerating the records and
building the tree. CMIIW

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] structure question

2010-03-14 Thread Max Vlasov
nci...@aquarelo.com> wrote:

> Thanks Dannis,
>
> The problem is a little bigger. I must have 2 instances of same table:
> original and latest. Then my problem is what is the best way to:
>  - transform 'original' with same data as 'latest'. This is 'save'.
>  - transform 'latest' with same data as 'original'. This is 'undo'.
>
> I must always have 2 tables. Client will read from original and
> Administrator will make changes. Then, if Administrator wants to apply
> those changes to Client a copy from 'latest' to 'original' must happen.
>
> Francisco A
>

Francisco, you probably want to find some easy solution, and I understand
that. But what about just adding some extra logic inside your own code? An
undo feature is just information about what should be done in order to
reverse changes. So you could create your own log table inside your sqlite
database, logging your changes. This works if all the changes are controlled
by you: before any DELETE, INSERT or UPDATE you can prepare this information
and write it into your log table. When you want to apply an undo, you read
this information back and perform the necessary reverse operations.
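
A minimal sketch of the idea (table and column names are illustrative, and
only an UPDATE is shown):

CREATE TABLE UndoLog ([Seq] INTEGER PRIMARY KEY AUTOINCREMENT, [Sql] TEXT);

-- before executing: UPDATE Books SET Title='New' WHERE Id=7;
-- record the inverse statement first:
INSERT INTO UndoLog (Sql)
SELECT 'UPDATE Books SET Title=' || quote(Title) || ' WHERE Id=7'
FROM Books WHERE Id=7;

To undo, read UndoLog ordered by Seq descending, execute each stored Sql and
delete the row.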

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Column types "safe" changes

2010-03-10 Thread Max Vlasov
As I recall, sqlite probably doesn't care about the string types included in
the CREATE TABLE field list, so TEXT, VARCHAR and VARCHAR(200) are identical
from its point of view. Today I needed to increase the number of chars in a
VARCHAR declaration, since the library that uses sqlite actually does care
about the value. So I did it with a temporary change of the writable_schema
pragma. Everything worked, but I'm just wondering: what kinds of changes are
safe if such an edit is absolutely necessary? I suppose the order of the
fields and the presence of every one of them must be preserved, but what
about type changes like moving from INTEGER to TEXT? Also, would it be
possible to add to sqlite a limited ALTER TABLE .. MODIFY support that could
wrap all known "safe" changes and raise errors on all "unsafe" ones?

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] non-blocking concurrent read questions

2010-03-09 Thread Max Vlasov
> The writer application must be failsafe, as much as possible (acoustic
> emission recording devices); I simply can not afford that a reader makes a
> select and because of a programming error the acquisition be blocked. I had
> this just by opening sqliteman.
>
> The recording rate is variable; using a test structure (~14 fields in 1
> table, all integers 32/64bit) I was able to achieve ~9 records /sec
> with
> sqlite API, which was not really good enough (there are is no jurnal; I
> don't need any rollback / consistency check).
>


If you still prefer sqlite for some reason, how about a three-part approach?
You have a raw data writer working at the speed of the I/O system and
producing segmented output in a very simple format, a lazy writer that
consumes non-active (completed) segments, writing them to the sqlite
database, and a general reader doing the necessary reads. In this case the
reader and writer don't have strict locking requirements (at least the reader
can take priority here). You can tune their relationship so that the raw
writer and the sqlite writer are usually synchronized, but if the reader is
more active there are temporarily more unconsumed segments than before.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] non-blocking concurrent read questions

2010-03-09 Thread Max Vlasov
> Then I tried in a loop with 2 programs to write / read in parallel and it
> seems to work without problems. Can anyone advise if this has any chance to
> work (or say it would definitely NOT work)?
>
> As a short summary: would it be interesting for anyone to enable read-only
> open with a special pragma to allow reading without locking (that means,
> shared locks being a noop)?
>
>

Although generally speaking such a method could be used in some situations, I
don't think it's good to allow it even with an "I know what I'm doing"
pragma. Any structured file (sqlite is an example) has internal dependencies.
One of the reasons to lock is to write different parts of structured data
together, without intervention from other parties, in order to keep the data
consistent. Imagine a write cache that kept changes for your writer: finally
it needs to flush the data, and at the same time your "anytime" reader starts
to perform some query in the middle of this multi-page writing process. I
can't predict whether the reader will end up with some valid result or return
a "database malformed" error.

Instead, consider changing your own logic. You wrote "without* any
possibility to be blocked". I suppose you already have a perfect writer that
fits your needs, but if you post some info about the nature of your writer
and reader (records per second and the like), it would help us to be more
specific with answers.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Maximum database size?

2010-03-03 Thread Max Vlasov
On Wed, Mar 3, 2010 at 8:57 AM, Collin Capano wrote:

> The databases in question don't seem to be
> corrupt; I can open them on the command line and in python programs
> (using pysqlite) and can read triggers from them just fine. It's just
> when I try to vacuum and create temp tables that I run into trouble.
>
>
Did you try to run PRAGMA integrity_check; on these databases? At least
you'll be sure that sqlite reads all the necessary data and considers it
correct. I don't know how long it will take for a 55G database (maybe an
hour, maybe a full night), but if you have enough time, I'd do this before
anything else.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A question about sqlite : How could i get the rows affected by current sql statement ?

2010-03-02 Thread Max Vlasov
On Tue, Mar 2, 2010 at 6:36 PM, Pavel Ivanov  wrote:

> OK, now I see the problem, but sqlite3_total_changes() will not help
> here too - it behaves the same way as sqlite3_changes(), it doesn't
> accumulate changes over several statements.
>

Hmm... are you sure about this?
A quote from the docs:

This function returns the number of row changes caused by INSERT, UPDATE or
> DELETE statements since the database connection was opened.
>

Either you're wrong or this sentence on the site should be changed (in its
final part).

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A question about sqlite : How could i get the rows affected by current sql statement ?

2010-03-02 Thread Max Vlasov
On Mon, Mar 1, 2010 at 6:31 PM, Pavel Ivanov  wrote:

> sqlite3_changes() is exactly what you should use in this case. And I
> didn't understand why did you find it unsuitable for you?
>
> Pavel
>


I think I understand his confusion. Imagine that for some reason you don't
know whether the last query was an INSERT, UPDATE or DELETE (for example, it
could be a SELECT). He probably wants some call or fragment of code that
returns the number of changes, or 0, for any recent operation including
SELECT. If he just relies on sqlite3_changes() after an INSERT with two rows
affected and a simple SELECT afterward, this call will still return 2. In
this case I'd recommend using the difference between consecutive
sqlite3_total_changes() values. For any read-only query this difference will
always be zero.
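
A minimal sketch of that technique (the helper name is mine; assumes db is an
open connection and omits error handling):

#include "sqlite3.h"

/* Returns the number of rows changed by exactly this statement;
   0 for any read-only query such as SELECT. */
static int exec_and_count(sqlite3 *db, const char *sql){
  int before = sqlite3_total_changes(db);
  sqlite3_exec(db, sql, 0, 0, 0);
  return sqlite3_total_changes(db) - before;
}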

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS & Doc Compression

2010-03-02 Thread Max Vlasov
On Tue, Mar 2, 2010 at 2:41 AM, Jason Lee  wrote:

> Hi all,
>
> I've been playing around with the FTS3 (via the amalgamation src) on a
> mobile device and it's working well. But my db file size is getting
> pretty big and I was looking for a way to compress it.
>


Jason, can you calculate the ratio between your text data and the fts3 data?
From my tests, fts doesn't eat that much. For example, I once tried the
English Wikipedia abstracts as a test file (a downloadable XML; I took the
title and abstract from it): 3M records, a 500M file without fts, and after
indexing the size changed to 1.5G. And I didn't even use stop-words, so with
proper stop-word usage the ratio can be even better.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BLOBs' affect on DB performance

2010-03-02 Thread Max Vlasov
On Tue, Mar 2, 2010 at 10:54 AM, Paul Vercellotti wrote:

> Now I'm guessing that storing all those blobs will slow down access to the
> main tables (assuming records are added gradually - most without associated
> blobs, some with), because records would be spread out over many more pages
> (more seeking / disk access) - is that correct?
>

It depends on your operations. For example, a simple SELECT rowid FROM table
actually means reading all the pages with the table data (but not the
overflow pages used when a record does not fit in one page), and for such a
query lowering the size of the record leads to fewer pages and lower
fragmentation, so faster access/reading. But everything changes if your query
involves indexes, since from then on performance depends not on the size of
the original record but on the size of the "index" record (the fields listed
in CREATE INDEX). This is because a sqlite index actually contains only the
data it needs plus the rowid, so to effectively filter on index data sqlite
does not need the main table data.

Ironically, that post and those tests allowed me to discover that creating a
separate index on the rowid alias (a column declared like [Id] INTEGER
PRIMARY KEY AUTOINCREMENT in CREATE TABLE) makes sense: the SELECT I
mentioned at the beginning, changed to SELECT Id FROM Table ORDER BY Id,
started to work faster, and the time no longer depended on the size of the
record. This was possible thanks to this new index, which is an absolute
waste of space (it duplicates the primary key), but gains an advantage from
the fact that it is packed into far fewer pages.

So my suggestion is to analyze your schema and future queries; if the
expensive parts of them involve just indexes (based on anything but your
blobs) and the blobs are accessed only in some final stage of the query, go
with blobs inside the sqlite database.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] accessing file change counter in api

2010-02-27 Thread Max Vlasov
I did some searching, and as far as I can see the file change counter (the
dbFileVers field of the header) is not directly or indirectly accessible for
reading. But sometimes one would like to have such a feature. For example,
every layer above sqlite likely has its own cache (the Delphi db engine, for
example), and the ability to read and compare such a counter, reloading the
cache based on this info, would likely have some benefits in a multi-client
environment. As far as I can see, adding some experimental function, called
let's say sqlite3_getchangecounter, intended only for reading, should not
affect any core functionality.

What do you think? Maybe such an addition is too specific?
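
In the meantime, a workaround sketch: the change counter is stored as a
4-byte big-endian value at offset 24 of the database file header, so it can
be read directly from the file (bypassing sqlite entirely, which is of course
only a stopgap):

#include <stdio.h>

/* Reads the 4-byte big-endian file change counter at offset 24
   of the SQLite database header. Returns -1 on error. */
long read_change_counter(const char *path){
  unsigned char b[4];
  FILE *f = fopen(path, "rb");
  if( f==0 ) return -1;
  if( fseek(f, 24, SEEK_SET)!=0 || fread(b, 1, 4, f)!=4 ){
    fclose(f);
    return -1;
  }
  fclose(f);
  return ((long)b[0]<<24) | ((long)b[1]<<16) | ((long)b[2]<<8) | (long)b[3];
}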

Thanks

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Prepared statements and locking

2010-02-26 Thread Max Vlasov
On Thu, Feb 25, 2010 at 6:53 PM, Pavel Ivanov  wrote:

> > So is such behavior documented and can be used or should I still avoid
> > keeping such "live" statements in db-shared enviroment?
>
> It's documented that sqlite3_finalize AND sqlite3_reset release all
> locks acquired by the statement. So after resetting the statement you
> can keep it as long as you want, it won't block out any other process
> from accessing the database.
>
>
Thanks, Pavel, but I could not find this in the docs; I only found that this
topic was discussed previously, when drh answered:
> You should be in the habit of calling sqlite3_reset() on each query
> as soon as that query finishes.  Otherwise the query *might* leave a
> read-lock on the database file and thus prevent subsequent write
> operations for working correctly.
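
In code, that habit is just this (a minimal sketch; assumes stmt is a
prepared statement on the shared database):

#include "sqlite3.h"

/* Step to completion, then reset immediately so the statement
   holds no read lock while it sits unused. */
static void step_all_and_release(sqlite3_stmt *stmt){
  while( sqlite3_step(stmt)==SQLITE_ROW ){
    /* consume sqlite3_column_* values here */
  }
  sqlite3_reset(stmt);  /* releases the read lock right away */
}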

I just hope this information will be included in the article about
sqlite3_reset some day.

Thanks

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Prepared statements and locking

2010-02-25 Thread Max Vlasov
Hi,

I thought that prepared statements in a non-finalized state weren't
compatible with the locking mechanism in a db-sharing environment, i.e. if I
don't want any of several processes to hang waiting for an EXCLUSIVE lock,
none of the others should use non-finalized prepared statements. But today I
found that they can coexist, as long as sqlite3_reset is called (even if I'm
not planning any new data reading). My simple test confirmed this: a writing
process waited indefinitely, getting SQLITE_BUSY results, and when another
process called sqlite3_reset (I'm sure this was the only sqlite api call
fired by the button press), the first one successfully finished writing.

So is such behavior documented and usable, or should I still avoid keeping
such "live" statements in a db-shared environment?

Platform: Windows

Thanks

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Interrupt first sqlite3_step in FTS3 query

2010-02-20 Thread Max Vlasov
> > The fts implementation does work in response to data gotten from
> > SQLite calls, and feeds data back out via SQLite calls, which should
> > all start throwing errors and causing things to unwind.  Most
> > expensive fts operations involve lots of subsidiary queries into the
> > SQLite core
>
> But do any of these subsidiary queries involve a call to
> CHECK_FOR_INTERRUPT?
>

Simon,
Scott meant that most of the job fts does actually consists of calls into the
sqlite core, which checks this state very often. I just did a Windows desktop
test with fts3; my second thread is ordered to call sqlite3_interrupt after a
1 second sleep, and different tests confirm his explanation (including phrase
search and mask search): everything works fine.
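
The second thread in my test boils down to this (shown as a POSIX sketch; my
actual test used a Windows thread doing the same, and the table and column
names in the usage comment are illustrative):

#include <pthread.h>
#include <unistd.h>
#include "sqlite3.h"

/* The interrupting thread: sqlite3_interrupt is safe to call
   from a different thread than the one running the query. */
static void *interrupter(void *db){
  sleep(1);
  sqlite3_interrupt((sqlite3*)db);
  return 0;
}

/* In the main thread:
     pthread_t t;
     pthread_create(&t, 0, interrupter, db);
     rc = sqlite3_exec(db,
         "SELECT docid FROM fts_table WHERE content MATCH 'word'", 0, 0, 0);
   rc becomes SQLITE_INTERRUPT once the core notices the flag. */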

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Interrupt first sqlite3_step in FTS3 query

2010-02-20 Thread Max Vlasov
Simon,

you mentioned full-text search.
I just tried to search for mentions of sqlite3_interrupt in the sqlite
sources. The main one is the implementation of the function itself, which
just sets the isInterrupted variable:

void sqlite3_interrupt(sqlite3 *db){
  db->u1.isInterrupted = 1;
}

but all of the fts_* sources lack any use of this variable. I heard that fts
is implemented somewhat separately from the main core code, so I suppose any
job working with internal fts structures (packed data inside blobs) can
currently skip checking this variable. Can someone more familiar with fts
correct or confirm this?

Max


On Fri, Feb 19, 2010 at 3:18 AM, Simon  wrote:

> I am using sqlite3 with an FTS3 index for a software running on an iPod
> Touch.
>
> I have a text field that launches a full text search query at every key
> press.
>
> There is a huge speed difference between these devices and what I got
> accustomed to on desktop computers...
>
> To mitigate that, I use an asynchronous query and send a sqlite3_interrupt
> before each new query when a new key is pressed.
>
> However, it seems that some process (that can take several tens of seconds)
> in the first sqlite3_step does not test for interrupt (resulting in
> simultaneous uninterrupted concurrent threads...)
>
> Is there some way to change that?
>
> Thanks in advance for your help,
>
> Best regards,
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Prohibitive indexing time with large tables

2010-02-16 Thread Max Vlasov
Hello, Jérôme

Nice to hear you finally joined us in this really interesting discussion )


>
> To Max Vlasov:
>
> > in sorted order to sqlite base other 5 minutes, so about 10 minutes it
> > total. First 5 minutes was possible since we exchange only offsets,
> > not data
> > and other 5 minutes since inserting sorted data into B -tree is really a
> > fast operation.
> .
> Nice solution (of the type I already fiddled around, actually, as you
> can imagine).
> This variant still poses 2 problems for me:
>
> 1) Its workability is RAM-limited, and therefore not necessarily robust
> to an
> increase in dataset size beyond a certain limit I am already close to
> (Win32-based
> processes are still bound to max. 2GB/process, unfortunately).
>
> 2) I need to create 2 indices on 2 different columns whose contents is
> totally
> uncorrelated with respect to sort order. Your solution would nicely
> reduce indexing time
> of the 1st column but what about the 2nd one ?...
>
>
>
You addressed real problems, and when I try to run my test on a system with
less RAM, the results confirm these observations. But at least we found some
way to increase the speed for some datasets and some hardware systems. Maybe
some other approaches can improve the solution. The suggestion from Ibrahim
about using a RAM drive, for example, was interesting; I'd also mention using
different hard drives together with a merge sort, but all these solutions
break the beauty of sqlite imho, and as a consequence its flexibility.

But the second one is really hard to solve; that's where sqlite could
internally take advantage of low-level data access, but I doubt this is an
easy task. I suppose implementing any special sort with direct file access
could even break the beauty of the vdbe, not to mention the danger of
changing the code significantly.

By the way, you didn't mention the size of your "fingerprints". Can you
calculate the average index record size, or the total index size, in the case
of your 20M records?

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Prohibitive indexing time with large tables

2010-02-16 Thread Max Vlasov
>   I'm really surprised you're not seeing something from an increased
>  cache size.  It has always made a very noticeable difference in my own
>  manipulations, but I think the largest table I've ever worked with
>  "only" had 6M rows.
>


Jay, increasing the cache size really helps, but only up to some size, as I
mentioned. And it really makes sense. I can guess what is going on: while
we're not out of cache, all this B-tree-structured data is completely inside
RAM and can be updated very quickly; but when the time for flushing comes, we
have many pages forming this B-tree, and the question is which pages should
be flushed, freeing that part of the cache, so that the next 1,000,000
records (with absolutely unpredictable content) use this tree most
effectively. Honestly, I'm not aware of such an algorithm. So every
subsequent mass insert produces many "collisions", moments when we need to
write to some page but it's not in the write cache.

But I guess there is something sqlite could do to use the existing cache more
effectively. I don't know the internals very well, but it seems that both a
read and a write cache exist in sqlite. The read cache is sometimes almost
unnecessary, since this data is already present in the system file cache. In
this case accessing some page in the cache and accessing it through the xRead
interface of the VFS will probably take the same or similar time (provided
there's no encryption or other VFS overhead). But I'm not sure about this
guess, since it's sometimes hard to separate the read and write caches, and
also, for B-tree inserts, most of the pages just read will probably be
written soon.

By the way, Alexey Pechnikov's recent post about his tests shows that the
page cache size does not always help and that you should take into account
not only the number of rows but also the total index size (a quote from
http://geomapx.blogspot.com/2009/11/degradation-of-indexing-speed.html: "As
we can see the index creating speed degradation is result of index size more
than SQLite page cache size.").

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Prohibitive indexing time with large tables

2010-02-14 Thread Max Vlasov
> This is appx. 500MB cache, why not trying with 2,000,000 cache size ? :-)
>
>
Hmm, I managed to increase it only to a 1,000,000 (x1024) size; larger values
eventually lead to an "Out of memory" error, and this value (1G) allows up to
6,000,000 fast records for an index on a 100-byte-per-record field. Still
good; such an extreme cache method can work in many cases, I think.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Prohibitive indexing time with large tables

2010-02-14 Thread Max Vlasov
Marcus,

although increasing the cache size is a good method, it may sometimes give
unpredictable results (in terms of performance).
I looked at the vdbe code of the index creation (EXPLAIN CREATE INDEX ...)
and it seems there is no special sorting algorithm (CMIIW please).
Excluding all run-once lines from the vdbe code leaves these loop lines:

"19""Rowid"
"20""Column"
"21""MakeRecord"
"22""IdxInsert"
"23""Next"  .. 19

which leads to the conclusion that either a single "indexed" insert or a step
of index creation is actually just an insert of data into the B-tree without
special preprocessing.

So we are just considering mass inserts into a B-tree. As I suppose, the cost
of such an insert is sometimes low, sometimes high, depending on different
factors including the occasional need for rebalancing. As long as all
operations are in memory and all pages are "dirty" (not saved to disk) the
results look very promising. I tried to change the cache size to 500,000
(x1024) for the same database (10M records, 100-byte record size) and adding
from scratch was very fast until about 3M records (taskman showing about
800MB of memory used at the moment), dropping significantly after that.
Whether this bad point is at 3M or not depends on the size of the data used
in the index. So if any search inside the tree is a walk through the cached
pages (either sqlite's or the underlying file system's), and any page write
involved in new tree operations hits dirty pages that will be written later,
there is no problem. Otherwise there is. The other problem is that when the
total data size exceeds the 32-bit address space, the cache stops giving a
significant improvement.

I know that the quicksort I suggested has disadvantages, but in some ways its
results are more predictable. For example, we postpone any sqlite write
operation until the data is ready. Unfortunately, for an index data size
bigger than the 32-bit address space the file mapping doesn't work, but it
can be changed to simple seek access, and at least the file system can do a
decent job keeping the (currently) frequently accessed data in its cache.

Marcus, I didn't understand your comment about frequent reloading. As I read
the initial post, he has a large chunk of unsorted data, 20M records, that
needs to be processed on a daily basis. So in his case any way that leads to
the data being placed inside a sqlite db, properly indexed, is ok. The
problem is that on a daily basis a couple of hours is a big price to pay.
Jerome can correct me, but he hasn't yet added anything new to this
discussion; I hope he will.

Nice approach with the hash and the collision resolution inside the query; I
will keep it in mind for future use )

Max

On Sun, Feb 14, 2010 at 4:03 PM, Marcus Grimm wrote:

> Just for my curiosity:
>
> Have you tried to increase the cache as already suggested ?
>
> I ran into a similar problem while playing with a artificial
> test database with appx. 10 Mio records and creating an index.
> Without drastically increasing the cache size sqlite appears
> not to be able to create an index on a certain field - it
> never stops within, say,  2 hours.
> Sounds quite dissapointing keeping in mind how fast sqlite usually
> operates, but it becomes clear when we consider that sqlite
> is told to use only a few MB of memory per default. ALso your
> quicksort mentioned below will be very slow if he needs
> reload data from disk all the time.
> So in my case it helps to tell sqlite to use appx 500MB memory
> via pragma cache_size.
>
> Please note that if you create an index on a text field sqlite
> will basically make a copy of the hole table in your case.
>
> Depending how you are using that Text field in a filter statement
> you may consider adding an integer hash (e.g. CRC32) entry in your
> table and create an index on that and slightly change your queries
> like:
> SELECT * From TestTable WHERE TextHas=12312 AND Text='Text to search';
> Unfortunately that works only for that simple form of "=" statements.
>
> Marcus
>
> > Jerome,
> >
> > It's an an interesting challenge, thanks for the post
> > I tried to research more and did some tests.
> > My test database contains a table with 10,000,000 records of the text 100
> > chars in length
> >
> > CREATE TABLE [TestTable] (
> > [Id] INTEGER PRIMARY KEY AUTOINCREMENT,
> > [Text] TEXT
> > )
> >
> > I suppose your data is different, but at least this one has more than 1M
> > records and not so small record size. My final base was about 1G in size.
> >
> > The default index creation
> >
> > CREATE INDEX [idx_TestTable] ON [TestTable] ([Text] )
> >
> > took very long time (about two hours or so).
> > Having the index before data insert didn't change anything, first records
> > had a good speed of append (about 10,000 records/sec significantly
> slowing
> > when the number of records exceeded 1-2M).
> >
> > So there was no way to ignore some external files approach and I did it
> > filing memory-mapped file with the contents of Text field, while filling
> > also array in memory 

Re: [sqlite] Prohibitive indexing time with large tables

2010-02-14 Thread Max Vlasov
Jerome,

It's an interesting challenge, thanks for the post.
I tried to research it further and did some tests.
My test database contains a table with 10,000,000 records of text 100 chars
in length:

CREATE TABLE [TestTable] (
[Id] INTEGER PRIMARY KEY AUTOINCREMENT,
[Text] TEXT
)

I suppose your data is different, but at least this one has more than 1M
records and a not-so-small record size. My final database was about 1G in
size.

The default index creation

CREATE INDEX [idx_TestTable] ON [TestTable] ([Text] )

took a very long time (about two hours or so).
Having the index in place before the data insert didn't change anything; the
first records had a good append speed (about 10,000 records/sec),
significantly slowing when the number of records exceeded 1-2M.

So there was no way to avoid some external-file approach, and I did it by
filling a memory-mapped file with the contents of the Text field while also
filling an array in memory that saves the offsets and lengths of the strings.
After that, a quicksort of that offset array took about 5 minutes, and
inserting the texts in sorted order into the sqlite database took another 5
minutes, so about 10 minutes in total. The first 5 minutes was possible since
we exchange only offsets, not data, and the other 5 minutes since inserting
sorted data into a B-tree is a really fast operation.

Although real-life data can be different, the things that worked might be the
same. So anyone can use this method, occupying no more than the size of the
sqlite file itself for temporary storage, and ending up with the data in the
necessary order inside the sqlite database. I know that there are many things
to take into account, like the memory size and the size of the actual data,
but it's just a proof of concept.
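
A compressed sketch of the sort step (names are mine; no error handling): all
Text values sit in one big buffer, an array of (offset, length) pairs is
quicksorted by comparing the strings the pairs point to, and the inserts then
run in that order inside a single transaction.

#include <string.h>
#include <stdlib.h>

typedef struct { long off; int len; } Slot;   /* one per Text value */
static const char *g_buf;                     /* the dumped Text contents */

/* Compare two slots by the strings they point into. */
static int cmp_slots(const void *a, const void *b){
  const Slot *x = a, *y = b;
  int n = x->len < y->len ? x->len : y->len;
  int c = memcmp(g_buf + x->off, g_buf + y->off, n);
  return c ? c : x->len - y->len;
}

/* qsort(slots, nRows, sizeof(Slot), cmp_slots);
   then BEGIN; INSERT each string in slot order; COMMIT;
   so the B-tree receives its keys already sorted. */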

Also, I think sqlite could use the same approach internally for creating an
index over existing data. The db is probably already exclusively locked while
CREATE INDEX is in progress, so keeping a temporary array that accesses and
stores, for example, the file offsets of particular records should not be a
problem.

Max

On Sat, Feb 13, 2010 at 5:00 PM, Jérôme Magnin wrote:

> Hi,
>
> This post is a question directed to D. Richard Hipp :
>
> I have been using SQLite for 3 years in a records linkage software
> package I have developed. My organization recently had to use the
> package to perform daily linkage of large administrative governmental
> registers (up to 20 million records each). During the linkage process,
> auxiliary tables containing records "fingerprints" must be created, and
> two columns be indexed in them.
>
> SQLite provides decent indexing times for such tables with up to 1M
> rows, but beyond this size the (already well-discussed) critical slowing
> down of indexing performance due to disk nonlocality kicks in. The only
> workaround I could imagine to ease the problem would be to duplicate the
> auxiliary table and load pre-sorted rows in it, with sort key being the
> column I intend to index on. This is unfortunately too costly in terms
> of disk space used.
>
> I therefore had to develop an alternate datasource type (based on flat
> files) in order for my system to be able to efficiently handle big
> files. Which is a pity since SQLite provides great features I still
> would like to be able to rely upon when dealing with large files.
>
> Now my question: in the "To do" section of SQLite's wiki, you mention
> "Develop a new sort implementation that does much less disk seeking. Use
> to improve indexing performance on large tables.". I have been seeing
> this entry for 3 years but nothing concrete seems to have happened on
> this front in the meantime. Do you have any idea about if (and when) you
> will work on this in the future ? Can I nourish reasonable hopes that
> the situation will improve on this aspect within the next 2 years ? This
> really has become a critical factor for me to decide on my future
> development strategy with this product.
>
> Thanks in advance for any useful information.
>
> Jerome
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQlite query performs 10 times slower than MS Access query

2010-01-29 Thread Max Vlasov
>
>
> The sqlite3_stmt_status() interface was designed for this purpose.
> http://www.sqlite.org/c3ref/stmt_status.html
>

Thanks for the info; I hope it will also be extended with other counters.

I did some tests with this Rowid/Id trick. A larger database, 22 MB, 100,000
records (220 bytes per record on average), and the difference for the "SELECT
rowid .." query was more noticeable (20 MB read without the Id index, 1 MB
with it).

But I also noticed that sometimes sqlite doesn't allow the explicit index to
be used with an ORDER BY clause, considering the two (the primary key and the
explicitly created index) equivalent. I think it's not a bug, but in terms of
performance maybe it would be more appropriate to obey the ORDER BY clause in
this case.

The table is
CREATE TABLE [MainData] ([Id] INTEGER PRIMARY KEY AUTOINCREMENT,...
The index is
CREATE INDEX [idx_ID] ON [MainData] ([ID] )

Different examples (the second line is EXPLAIN QUERY PLAN result):

SELECT rowid FROM MainData
 TABLE MainData
22 MB read (expected, primary index used)

SELECT rowid FROM MainData ORDER BY Id
 TABLE MainData WITH INDEX idx_ID ORDER BY
1.2 MB read (expected, idx_ID used)

SELECT rowid FROM MainData Where Id > 1 AND Id < 30
 TABLE MainData USING PRIMARY KEY
22 MB read (expected, primary index used)

SELECT rowid FROM MainData Where Id > 1 AND Id < 30 ORDER BY Id
 TABLE MainData USING PRIMARY KEY ORDER BY
22 MB read (not fully expected; ORDER BY points to the Id field, but it's
actually an alias)

SELECT rowid FROM MainData INDEXED BY idx_ID Where Id > 1 AND Id < 30
ORDER BY Id
 TABLE MainData WITH INDEX idx_ID ORDER BY
1.2 MB read (expected, but INDEXED BY is not recommended for query tuning)

So when the query contains a WHERE clause, the explicit index was used only
when INDEXED BY was added.

As for the usefulness of this trick, I think it really can be useful when the
application wants to load one long "virtual" list of data, actually loading
only rowids and optionally querying the full data record for the visible
rows. I think in this case 22 MB of disk access vs 1 MB makes a difference.
Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQlite query performs 10 times slower than MS Access query

2010-01-29 Thread Max Vlasov
On Fri, Jan 29, 2010 at 6:02 AM, Doyel5  wrote:

> I deleted all indexes, kept only the indexes on the temporary tables,
> namely
> - tempSimArgs and tempPartArgs, and seemingly my original query's running
> time decreased to around 1 sec (when I ran the query the second time). The
> execution time of the first time I run my query varies wildly, I dunno why.
> Any ideas?
>


You mentioned 800 MB as the size of your database. I think that although the
indexes are used effectively, the actual data can be spread across the whole
database, so the first query involves many hard disk seek operations, with
the disk cache doing unnecessary work (for example, sqlite needs only 1024
bytes at a particular offset but the OS reads much more than that). To be
sure that there's no full or almost-full table scan, I use VFS monitoring in
such cases: if you install your own functions as a file system but forward
all calls to the original VFS, you can measure the real data flow for your
query.
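
A minimal sketch of such a monitoring VFS (a complete shim must forward every
VFS and io method and track a method table per file; this version patches
xRead only and assumes a single open file):

#include "sqlite3.h"

static sqlite3_vfs counting_vfs;
static sqlite3_vfs *orig;
static sqlite3_io_methods counting_io;
static const sqlite3_io_methods *orig_io;
static sqlite3_int64 bytes_read = 0;

/* Count the bytes, then forward to the real xRead. */
static int countingRead(sqlite3_file *f, void *buf, int n, sqlite3_int64 off){
  bytes_read += n;
  return orig_io->xRead(f, buf, n, off);
}

/* Open through the real VFS, then swap in the patched method table. */
static int countingOpen(sqlite3_vfs *vfs, const char *name, sqlite3_file *f,
                        int flags, int *outFlags){
  int rc = orig->xOpen(orig, name, f, flags, outFlags);
  if( rc==SQLITE_OK && f->pMethods ){
    orig_io = f->pMethods;
    counting_io = *orig_io;
    counting_io.xRead = countingRead;
    f->pMethods = &counting_io;
  }
  return rc;
}

void install_counting_vfs(void){
  orig = sqlite3_vfs_find(0);             /* the default VFS */
  counting_vfs = *orig;
  counting_vfs.zName = "counting";
  counting_vfs.xOpen = countingOpen;
  sqlite3_vfs_register(&counting_vfs, 1); /* make it the default */
}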

As for Access and MS db engines like Jet, we can only guess how they work;
maybe when you first open your db, the engine starts some caching
immediately. We don't know; it's closed software.

Also, if you want your first join or select operations on a large database to
be faster, as a general rule I'd suggest minimizing the data record size for
any table involved in the join or select operations, or making the page size
lower.

I did some tests to explain why I think this would help. My explanation may
be questionable, so please correct me if I'm wrong. Sqlite databases work
with pages that, for example, can have a size of 1024 bytes; all data is
packed into pages and operated on by pages. As far as I know, the only
operations involving smaller sizes are header operations. Also, B-tree cells
always come with the data, so they're not separated inside the db file. The
only complication is overflow pages, used when one single record cannot fit
inside one page. The consequence of such a format is that if the data record
is smaller than 1024 bytes, visiting a record is always a full record read. I
just made a quick test with VFS monitoring:

CREATE TABLE [TestTable] ([Id] INTEGER PRIMARY KEY AUTOINCREMENT,[Text]
TEXT)
100,000 records
INSERT INTO TestTable (Text) VALUES ("1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ")

db size: 4,671,488

SELECT * FROM TestTable
VFS read flow: 4,669,456
SELECT rowid FROM TestTable
VFS read flow: 4,669,456

But if I insert records with large texts (I intentionally chose 2 KB so that
overflow pages would be present), SELECT rowid produced half the page reads
of SELECT *.

By the way, I just discovered a strange trick that may help in similar or
other cases.
Creating an index for this table on [Id] looks like a terrible idea that
wastes space (since rowids are already ordered), but actually, after creating
the index

CREATE INDEX [idx_ID] ON [TestTable] ([ID] )

changing the SELECT rowid query to

SELECT rowid FROM TestTable ORDER BY Id

led to only 1,312,000 bytes of data flow (compared to 4,669,456), and it
really makes sense because:
- indexes in sqlite contain the indexed data plus the rowid, so sqlite
doesn't need the main table to return the results;
- compared to the rowid B-tree, which contains the full data records, this
index B-tree is much more effectively packed, so fewer pages need to be
visited.
I don't know what one should do to apply this trick in complex queries, but I
hope it is possible.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Concurrency support for multiple process

2010-01-28 Thread Max Vlasov
>  I expect process B able to
> read the old data and displays appropriately. But, process B seems to be
> blocked. Why is this so as I thought SQLite handle concurrency as in this
> case where there are multiple reads and one single write ?
>

After Pavel's explanation I just reread
http://www.sqlite.org/lockingv3.html and can add that a PENDING lock
theoretically allows reading for an existing SHARED reader, but I suppose a
single SELECT operation acquires SHARED and releases it right at the end of
the query, so if your "display" process is not in the middle of some sqlite
operation, it probably holds no SHARED lock, so any new SELECT is considered
"new" and thus prohibited.

As far as I can see, the moment when the lock moves from RESERVED to PENDING
depends on the write cache size. I just made a test with two admin tools
accessing the same database, and for a fresh database without any pragma
tuning, several megabytes of inserts passed without changing the lock.

But the question is: why do you need this kind of concurrency at all? If your
insert is small and wrapped in a single transaction, your "display"
application should not show noticeable "hanging" (if it just waits for
SQLITE_BUSY to disappear). On the other hand, if you do some caching,
starting a transaction and ending it only after some time, you may consider
committing on a time basis, so as to give your "display" process a chance to
read the database. Anyway, I think holding a single transaction for a long
time is not a good idea, so consider optimizing your write operations.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Concurrency support for multiple process

2010-01-28 Thread Max Vlasov
>
> > SQLite allows multiple readers OR a single writer to access the database
> simultaneously.
> From the SQLite doc, as long as no transaction is pending, other process
> can
> read or obtain reserved lock for write.
>

the docs say: "Locks are not acquired until the first read or write
operation. The first read operation against a database creates a SHARED lock
and the first write operation creates a RESERVED lock.".

So your only chance of concurrent access from both processes is if the first
one starts a transaction and does something innocent like preparation with
selects, temp tables and so on. Any UPDATE or INSERT will lead to a RESERVED
lock and the suspension of process B's db access.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sqlite3 Optimization Question

2010-01-22 Thread Max Vlasov
> Maybe I'll try a binary search, or something else along those lines.
>

You can mix a binary search with sqlite as the storage. The idea is to place
your words ordered by text and use the rowids as array indexes.

So after creating a new table

CREATE TABLE Table2 AS SELECT * FROM Table ORDER BY Text

you can now access each row by "index", since the rowids are guaranteed to be
consecutive numbers. So the access step in your binary search is

SELECT Text FROM Table2 WHERE rowid=?

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] fastest way to get the min/max

2010-01-20 Thread Max Vlasov
>
>
>
> One of many ways would be to precompute the min/max into a separate
> table and then query that table whenever you need the min/max.
>

Only if one has full control over how the data changes, which with a format
as widespread as sqlite is almost never the case. I mean, without triggers
you won't be able to control the moment when someone does mass deletes or
inserts with an external admin tool, and those are the points when the actual
min/max may change. Even the original poster can sometimes forget that his
data has such a dependency and do something that affects it.

At first I thought that Pavel's suggestion of using an extra index was very
"expensive" in terms of megabytes on disk, but despite this drawback it's
actually the more robust approach.
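
For comparison, a minimal sketch of the trigger-based variant (table and
column names are illustrative; only inserts are handled, and deletes are
exactly its weak point):

CREATE TABLE Stats (MaxVal INTEGER);
INSERT INTO Stats VALUES (NULL);

CREATE TRIGGER trg_data_max AFTER INSERT ON Data
BEGIN
  UPDATE Stats SET MaxVal = MAX(COALESCE(MaxVal, NEW.Val), NEW.Val);
END;

-- a DELETE of the current maximum would force a full rescan,
-- which is exactly why the extra-index approach is more robust
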
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sqlite3_step()

2010-01-16 Thread Max Vlasov
Mike,
as the docs on the page http://www.sqlite.org/c3ref/stmt.html say:

1. sqlite3_prepare_v2()
2. sqlite3_bind_*()
3. sqlite3_step() one or more times
4. sqlite3_reset()
5. goto 2 or sqlite3_finalize().

As you may know, sqlite3_prepare_* compiles your SQL query into a VDBE
micro-program with interpreted instructions on how to execute the query. I
can see why sometimes bind_* can just affect the contents of one register in
this program and nothing more, but sometimes your new bindings can affect the
way the micro-program is executed.
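
In code, the documented cycle applied to your query looks like this (a
minimal sketch, error handling omitted):

#include "sqlite3.h"

/* Prepare once, then reset -> bind -> step for each new value. */
static void scan_above(sqlite3 *db){
  sqlite3_stmt *st;
  int thresholds[2] = { 50, 5000 };
  int i;
  sqlite3_prepare_v2(db,
      "SELECT sno, lname, fname FROM addressbook WHERE sno>?", -1, &st, 0);
  for( i=0; i<2; i++ ){
    sqlite3_reset(st);                      /* step 4 */
    sqlite3_bind_int(st, 1, thresholds[i]); /* step 2 */
    while( sqlite3_step(st)==SQLITE_ROW ){  /* step 3 */
      /* read sqlite3_column_* values */
    }
  }
  sqlite3_finalize(st);                     /* step 5 */
}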

I see an interesting kind of optimization in your question that might
theoretically be possible. For example, your query contains some expression
evaluating the rowid (like rowid > ?), and at some step the row you just read
can contain new information that lets you bypass some of the following steps:
initially you wanted to scan rows with rowid > 50, and now you're ready to
bypass some of them and go to > 5000. Changing the bound value in this case
looks like some kind of optimization, but the speed advantage, from my point
of view, is not very big, since after _reset sqlite will jump to that new
value almost as fast. And if your new value is lower than 50 (for example
rowid > 20), you will get a logic error, since the microcode has already
passed values bigger than 20 and there's no way for it to detect your new
requirements and make a conditional reset.

So just always follow the pattern of calls described earlier.

Max





On Sat, Jan 16, 2010 at 4:10 PM, Mike Johnston wrote:

> Using sqlite 3.6.22.  Here's my sql
> SELECT sno, lname, fname FROM addressbook where sno>?
> Suppose your stepping through a 100 row result set.  I'm calling
> sqlite3_step(), getting my values all good.  If I were to call
> sqlite3_bind_int() function with a new value for my where clause at row 50,
> does that cause the step to re-execute the entire SQL?  Do I have to do a
> reset, bind, step sequence ?
> When does sqlite3_step() evaluate the bound values in the where clause?
> Thank you,Mike
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLITE3_CANTOPEN

2010-01-15 Thread Max Vlasov
I think the code you were surprised to see was a fix for this issue:
http://www.mail-archive.com/sqlite-users@sqlite.org/msg35864.html

as Jeremy Spiegel in that post wrote

> when I try calling DeleteFileW on a file with a handle open on
> that file, I see DeleteFileW returning 1 (success) and I see
> GetFileAttributesW returning INVALID_FILE_ATTRIBUTES, I think because
> even though the file still exists it is in a "delete pending" state.

So I suppose your variant was no longer sufficient to address this problem.
As I see it, your code relies entirely on the DeleteFile return value.

I can just add that new problems with DeleteFile may have appeared recently
with the introduction of Windows 7. I'm not sure, but I once saw a strange
failure on a system with Windows 7 and Microsoft Security Essentials
installed (so at least one "me-too" candidate accessing any new file), where
a rollback failed, leaving the -journal file in place, and the first
sqlite3_prepare after this rollback got a disk I/O error. But I'm still not
sure it's related to winDelete.


Max


On Thu, Jan 14, 2010 at 9:04 PM, Dave Dyer  wrote:

>
> There is a known problem, at least with NTFS, that deleting the
> journal file fails unexpectedly, resulting in sqlite3_cantopen.
> I believe I'm the one who originally reported this problem,
> but the installed solution isn't quite what I recommended.
>
> I'm not sure what OS/File System you are using, but something
> analogous may be going on for you.
>
> In "winDelete", the delete code in the released version is
>
>  if( isNT() ){
>do{
>  DeleteFileW(zConverted);
>}while(   (   ((rc = GetFileAttributesW(zConverted)) !=
> INVALID_FILE_ATTRIBUTES)
>   || ((error = GetLastError()) == ERROR_ACCESS_DENIED))
>   && (++cnt < MX_DELETION_ATTEMPTS)
>   && (Sleep(100), 1) );
>
>
>
> The code I"m using is
>
>  if( isNT() ){
>do{
>  rc = DeleteFileW(zConverted);
>  if(rc==0)
>  { long attr = GetFileAttributesW(zConverted);
>if(attr==0x) { rc=1; }  // ok as long as
> sombody deleted it
>  }
>}while( rc==0
>&& (cnt++ < MX_DELETION_ATTEMPTS)
>&& (Sleep(100), 1) );
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Q. about core SQLite library

2010-01-14 Thread Max Vlasov
>
>  All you really need to do is splice your code between the VFS that
>  the SQLite engine sees and the native VFS layer that comes with the
>  distribution, adding a bit of extra code to xRead() and xWrite() to
>  munge the data.
>

I implemented this approach once and it worked; the VFS is also so flexible
that you can even offset your data, for example writing some unique seed at
the start of the file and shifting the actual sqlite data to some offset.
Although in this case xTruncate should also be adjusted.

The only probable disadvantage of using a VFS is that sqlite's temp files
(etilq... on Windows) are bypassed by the VFS, as far as I noticed, so if one
is ever left in the system, the contents of temporary tables will be exposed.
Although I also noticed some tricky method by which they (the temp files) are
created, so even if I terminate the process unexpectedly they still disappear
afterwards.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed regression after 3.6.17

2010-01-13 Thread Max Vlasov
>
> After much playing about, I have determined that it is necessary to violate
> the guidelines on not using INDEXED BY and NOT INDEXED in an attempt to
> optimise performance.


Although you're very determined about your conclusions, I see a
misunderstanding about INDEXED BY in your statements.
The docs say "The INDEXED BY clause is *not* intended for use in tuning the
performance of a query". From my point of view, one should think of these
extensions as sqlite's SQL equivalent of assertions in high-level languages.

For example, a quick test.
I have a table Contacts with an index idx_Contacts on the field CenterId.
If I try to execute

SELECT * FROM Contacts INDEXED BY idx_Contacts

sqlite says: "cannot use index: idx_Contacts".

Only when a change adding an explicit ORDER BY clause is applied

SELECT * FROM Contacts INDEXED BY idx_Contacts ORDER BY CenterId

... the query executes successfully.

If I now remove INDEXED BY from the latter statement, nothing will change in
how the query is executed; it would change _only_ if someone explicitly
deleted the index from the database.

So any of your statements arguing that adding or removing INDEXED BY affects
the way your queries are executed has to be double-checked.

If you're still sure that you found a major flaw in the most deployed SQL
database in the world, please narrow your tests and conclusions down to
something more easily reproducible. Everyone understands that you're solving
one particular problem of your own, but if you find the time to simplify
things, it will be to everyone's benefit.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How can I know which table is operated by one SQL?

2010-01-12 Thread Max Vlasov
On Tue, Jan 12, 2010 at 10:22 AM, zhangzhenggui wrote:

> Dear friend,
>When I execute a SQL with C/C++ API, I want to know which table is
> operated by this SQL. Is there any way to do this except parse the SQL?
>
>
No approach is perfect; here is another one.

Add EXPLAIN to the query you are examining and iterate through the results,
collecting the p2 field wherever the opcode field is "OpenRead" or
"OpenWrite" (you have to iterate, since you cannot use an EXPLAIN-prefixed
query as a subquery). Concatenate the values into a comma-separated string
like "3, 4, 7". These are the root pages of the tables used.

After that, use the query

SELECT tbl_name FROM sqlite_master WHERE type='table' AND rootpage IN (%s)

(replace %s with your comma-separated list). Now you have the list of tables
used as a query result.
I'm not highly certain this will always work as expected, but compare it with
the other "undocumented", "subject to change" variants and choose the best )
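
A minimal sketch of the whole technique (error handling omitted; assumes the
SQL and the root-page list fit the fixed buffers):

#include <stdio.h>
#include <string.h>
#include "sqlite3.h"

/* Prints the tables touched by `sql` using the EXPLAIN trick above. */
static void list_tables(sqlite3 *db, const char *sql){
  char explain[512], pages[256] = "", query[512];
  sqlite3_stmt *st;
  snprintf(explain, sizeof explain, "EXPLAIN %s", sql);
  sqlite3_prepare_v2(db, explain, -1, &st, 0);
  while( sqlite3_step(st)==SQLITE_ROW ){
    const char *op = (const char*)sqlite3_column_text(st, 1); /* opcode */
    if( strcmp(op, "OpenRead")==0 || strcmp(op, "OpenWrite")==0 ){
      size_t len = strlen(pages);
      snprintf(pages+len, sizeof pages - len, "%s%d",
               len ? "," : "", sqlite3_column_int(st, 3));    /* p2 */
    }
  }
  sqlite3_finalize(st);
  snprintf(query, sizeof query,
      "SELECT tbl_name FROM sqlite_master"
      " WHERE type='table' AND rootpage IN (%s)", pages);
  sqlite3_prepare_v2(db, query, -1, &st, 0);
  while( sqlite3_step(st)==SQLITE_ROW ){
    printf("%s\n", (const char*)sqlite3_column_text(st, 0));
  }
  sqlite3_finalize(st);
}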

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed regression after 3.6.17

2010-01-12 Thread Max Vlasov
On Tue, Jan 12, 2010 at 5:28 AM, ve3meo <holden_fam...@sympatico.ca> wrote:

> Max Vlasov <max.vla...@...> writes:
>
> For 3.5.4 and 3.6.17 which executed in ~240s:
> "order", "from", "detail"
>
> And for 3.6.20 which executed the same query in ~2500s:
> "order","from","detail"
>


If you remove the first numeric-value columns and compare the output of
EXPLAIN QUERY PLAN, you will see that the difference is just the names of the
indexes (idxFamilyFatherID vs idxFamilyMotherID). Are you sure you have
identical tests for your comparison? I mean, are these the same data sets
with the same schema? I'm asking because I doubt that different sqlite
versions just randomly choose Mother vs Father in the same namespace )) It
does matter to compare identical data, since we are analyzing minor changes
here, and such a difference in names makes the analysis more difficult.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed regression after 3.6.17

2010-01-11 Thread Max Vlasov
On Mon, Jan 11, 2010 at 2:33 AM, Tom Holden wrote:

> I have been trying out a number of SQLite managers, one test being the
> execution time for the same query on the same database on the same computer.
> The scattering of results was very surprising with a spread on the order of
> 10:1.
>


Is the message posted 11 hours ago about the same issue? (
http://www.mail-archive.com/sqlite-users@sqlite.org/msg49650.html)
Anyway, I still suggest the same (see that thread): compare the VDBE code
sequences.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Performace regression bug of sqlite 3.6.18

2010-01-11 Thread Max Vlasov
On Mon, Jan 11, 2010 at 4:17 AM, Hub Dog  wrote:

> I think I found a performance regression bug of sqlite 3.6.18. A sql will
> cost 1800 seconds to return the query result with sqlite 3.6.18 and with
> the
> previous version it only cost about 170 seconds.
>

I have a suggestion for you. If you're able to compare the queries in the
different versions, can you EXPLAIN them (I suppose you're aware of this
operator, which should be prepended to your query), export the results from
the different sqlite versions into two CSV files, and compare them as text
(MS Word, diff, whatever)? I think there won't be much difference in the VDBE
code. If there is a difference, post it here. But if the code sequences are
the same, there's probably a problem in your code or data files, not sqlite.
I think you follow the idea.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INTERSECT optimization, is it possible?

2010-01-10 Thread Max Vlasov
On Mon, Jan 11, 2010 at 12:56 AM, D. Richard Hipp <d...@hwaci.com> wrote:

>
> On Jan 10, 2010, at 4:50 AM, Max Vlasov wrote:
>
> > Documentation says that INTERSECT implemented with temporary tables
> > either
> > in memory or on disk. Is it always the case?
>
> No.
>
> If there is an ORDER BY clause, SQLite may run each subquery as a
> separate co-routine and merge the results.


Thanks a lot.
At first I tried to append ORDER BY to every sub-select, which led to the
"ORDER BY clause should come after intersect not before" error; I had not
read this message properly, and all I needed was to place the ORDER BY once
at the end of the whole query. The time improved dramatically, from 50
seconds to 6, and that was an extreme case, so the real-life queries dropped
to below a second.
Needless to say, I'm very impressed by how sqlite takes almost every aspect
into account.
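
For the record, the shape that works (table, column names and conditions are
illustrative):

SELECT id FROM t1 WHERE a > 10
INTERSECT
SELECT id FROM t2 WHERE b > 10
INTERSECT
SELECT id FROM t3 WHERE c > 10
ORDER BY id;
-- a single ORDER BY after the last SELECT; with it sqlite may run the
-- sub-selects as separate co-routines and merge them instead of building
-- temporary tables
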
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] INTERSECT optimization, is it possible?

2010-01-10 Thread Max Vlasov
The documentation says that INTERSECT is implemented with temporary tables,
either in memory or on disk. Is that always the case? The problem is that if
I have several selects (six, for example), where each produces thousands of
results and the intersection is only hundreds, the query takes about a minute
to execute. But all these selects are properly ordered to do the intersection
without any extra file usage. I know that I can do it in my own code by
splitting these selects into separate prepared statements and stepping them
synchronously, outputting a row only when all steps have equal value(s). If
there's no such optimization, would it be possible to implement it in the
sqlite engine? I know that with a very complex query containing many selects
it would be a hard task to recognize such a specific case, but maybe it is
easier than it seems.

Thanks,

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is there a way to "predict" autoincrement rowid range

2010-01-07 Thread Max Vlasov
Thanks for the answers. Initially I wanted to use rowid to save space
(since rowids always exist). After reading the replies I changed the
declaration of ID to one without AUTOINCREMENT and now manually fill in
consecutive values starting at the current max(rowid)+1. So rowids are
still used, but now they're under my full control.
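
A minimal sketch of the idea with hypothetical tables (without
AUTOINCREMENT a NULL id is assigned max(rowid)+1, so inside a transaction
the inserted range becomes predictable):

sqlite> CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT);
sqlite> BEGIN;
sqlite> SELECT COALESCE(MAX(id), 0) + 1 FROM t; -- left bound
sqlite> INSERT INTO t (id, payload) SELECT NULL, name FROM OtherTable;
sqlite> SELECT MAX(id) FROM t;                  -- right bound
sqlite> COMMIT;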

> I have a query "INSERT ... SELECT" and after it is performed I have to
> store the range of rowids (autoincrement) of the inserted rows. While
> max(rowid) seems ok for the right bound, assuming max(rowid)+1 for the
> left bound (taken before the query) depends on whether there were deletes
> from the table before. If there were deletes, my estimate will differ
> from the actual value. So is there a correct way to know the range of
> rowids in this case, or at least to know the "expected" rowid for sure in
> similar cases?
>
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is there a way to "predict" autoincrement rowid range

2010-01-07 Thread Max Vlasov
On Thu, Jan 7, 2010 at 3:56 PM, Igor Tandetnik <itandet...@mvps.org> wrote:

> Max Vlasov wrote:
> > I have a query "INSERT ... SELECT" and after it is performed I have to
>
> If by autoincrement you mean a column actually declared with the
> AUTOINCREMENT keyword, then the next ID is stored in SQLITE_SEQUENCE table.
> For details, see http://www.sqlite.org/autoinc.html
>
> Igor Tandetnik
>
>
I thought that direct access to that table is not a good idea. If its
format is not documented, it can change at any moment, or at least at the
next major version increment (sqlite 4, for example).
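
Still, for the record, the documented lookup goes like this (the table
name is hypothetical; seq holds the largest rowid the AUTOINCREMENT table
has used so far):

sqlite> SELECT seq FROM sqlite_sequence WHERE name = 'MyTable';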
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Is there a way to "predict" autoincrement rowid range

2010-01-07 Thread Max Vlasov
I have a query "INSERT ... SELECT" and after it is performed I have to
store the range of rowids (autoincrement) of the inserted rows. While
max(rowid) seems ok for the right bound, assuming max(rowid)+1 for the
left bound (taken before the query) depends on whether there were deletes
from the table before. If there were deletes, my estimate will differ from
the actual value. So is there a correct way to know the range of rowids in
this case, or at least to know the "expected" rowid for sure in similar
cases?

Thanks,

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] memory temp storage and general caching share the same memory pool?

2010-01-04 Thread Max Vlasov
I noticed that setting temp_store to DISK surprisingly improved the
performance in some cases compared to the same operation with it set to
MEMORY. Looking at the numbers I noticed that with MEMORY, one operation
that does a select spread over a large table always led to a big data
read, while with DISK the table apparently went completely into the cache
after the first iteration.

So do the memory temp_store and the general cache share the same memory
pool? I prefer to use the MEMORY temp_store for several reasons. If they
share the memory, what should my strategy be for choosing the right cache
size when the amount of data I plan to transfer through temp tables is
unpredictable? Should I always DELETE or DROP temp table data as soon as
possible, to increase the probability that the freed memory will be used
for caching in the next SELECT?
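
For context, the settings involved (a sketch; the pragma keyword for the
disk case is FILE, and the values shown are only illustrative defaults):

sqlite> PRAGMA temp_store = MEMORY;
sqlite> PRAGMA temp_store = FILE;
sqlite> PRAGMA cache_size = 2000;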

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Mozilla's method

2010-01-01 Thread Max Vlasov
On Fri, Jan 1, 2010 at 8:11 PM, Bert Nelsen wrote:

> So SQLite looks at both the database on the disk and in memory?
> Wouldn't that be difficult???
>


I don't think that only the memory can be used. Imagine you have a very
big transaction, 1,000,000 inserts. As far as I understand the
architecture, the original content of all affected pages is saved in the
.db3-journal file, so the db3 file itself always reflects the current
state of the database. For example, if you have a table with 100 records
and you're inserting 1,000,000 records, then at the final stage, right
before Commit, your db3 file contains all 1,000,100 records with the
indexes changed accordingly, while the corresponding db-journal consists
of the data needed to restore the db file to the state it was in before
the transaction began.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Mozilla's method

2010-01-01 Thread Max Vlasov
> Another question would be: When I use transactions, and I said
> ".BeginTrans" and insert new records and then, before saying
> ".CommitTrans", I query the records, they seem to be already saved. Can
> you tell me why this is so? Does a select command automatically trigger a
> ".CommitTrans"?
> Wishing you a happy new year!
>
>
I think it comes from the nature of a transaction itself: it is not an
isolated database inside the database, it's an "undo" feature. Otherwise
many operations would have unpredictable results.
So in the example below

sqlite> Create Table [TestTable] ([Value] INTEGER);
sqlite> Begin transaction;
sqlite> INSERT INTO TestTable (Value) VALUES (11);
sqlite> SELECT Count(*) FROM TestTable;
1
sqlite> SELECT Max(rowid) FROM TestTable;
1
sqlite> SELECT * FROM TestTable;
11
sqlite> End transaction;

.. all three selects should return the same results regardless of whether
the begin/end transaction commands are present in the sequence, and that
makes sense, I think.

Max


> On Fri, Jan 1, 2010 at 9:11 AM, Olaf Schmidt  wrote:
>
> >
> > "Bert Nelsen"  schrieb
> > im Newsbeitrag
> > news:a5ffd530912311004p26a7cc5k1f1bf6f671bef...@mail.gmail.com...
> >
> > > Your .Sychronous = False property does everything
> > > as fast as I want, and I am not afraid of losing some
> > > user data (it's not a critical application) but
> > > I am very much afraid of having a corrupted db.
> > > Can anybody please confirm
> > > that there is no chance of getting my db corrupted?
> >
> > Ah, I see now, where the "confusion" came from.
> > The wrappers Synchronous-Property has nothing to do
> > with the (relative new) async-writer-feature of SQLite -
> > instead it maps to SQLites Synchronous PRAGMA
> > (as a "convenience property").
> >
> > You can set all the Pragmas alternatively also per
> > Cnn.Execute "PRAGMA pragma_name ..."
> >
> > or read out a current Pragma-Value with:
> > Cnn.OpenRecordset("PRAGMA pragma_name")(0).Value
> >
> > Please read about SQLite-Pragmas here:
> > http://www.sqlite.org/pragma.html
> > ... and what's written there about the Synchronous-Pragma-
> > Settings. With the Synchronous-Pragma at 'Off' or '0', you're
> > risking DB-corruption.
> >
> > So, I would not touch the Synchronous-Property in your
> > case (leaving it at its default FULL(2)) - instead you should
> > wrap larger insert- or update-actions within a transaction -
> > that works fast as well.
> > Also consider using the binding-support of the wrapper
> > (the Command-Objects), to achieve faster (and more
> > typesafe) operations in "write direction".
> >
> > Olaf Schmidt
> >
> >
> >
> >
> > ___
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
>
>
>
> --
> Regards,
> Bert
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Requirements for index-aware INSERT SELECT

2009-12-29 Thread Max Vlasov
The code I use can calculate the data flow of sql queries (by summing
xRead iAmt in the VFS) and I noticed that many variations of INSERT SELECT
led to a very big data flow (a multiple of the db size). I thought that
such queries could be optimized if both tables were indexed accordingly,
and finally the following query

INSERT OR IGNORE INTO Table (Field) SELECT SomeOtherField FROM OtherTable
ORDER BY SomeOtherField

produced a significant reduction in data flow (Field is indexed in Table).
I don't think the difference is related to caching, since the variant
without ORDER BY shows a 50 MB data transfer for a 17 MB base, while
adding ORDER BY reduces it to 1 MB (a 1:50 ratio so far).

But my other query uses a more complex schema (with a LEFT JOIN and
several fields, although indexed together). The problem is that I could
not optimize the query to reduce the data flow in this case.
Are there any specific requirements for inserts like the first one? I
could not find information about this in the documentation.

Thanks

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sqlite3_prepare_v2

2009-12-25 Thread Max Vlasov
It's an interesting topic. I tried to search the web, and at least two
well-known programs, Picasa and iTunes, apparently expect sqlite3.dll to
be located somewhere in a shared place (probably the system32 folder), and
some other software silently replaced this dll with its own copy that had
this entry (sqlite3_prepare_v2) absent, so after that both programs
refused to start.

iTunes
http://www.seasonsecurity.com/how-do-i-download-sqlite3dll-89859
Picasa
http://www.mail-archive.com/sqlite-users@sqlite.org/msg46025.html

First, I'd pass D. Richard Hipp's comment about static linking to these
vendors as well :) But this may also come from the fact that every
sqlite3.dll I saw had no Windows version resource, only version reporting
through the corresponding function. That's not a problem when one writes
one's own installer, but I suppose the well-known installer-building tools
like InstallShield rely mostly on resource version comparison in their
scripts. So can it be that, doing everything by default, a software
developer ends up with a script that finds no version information on
either side (in the installer and in the Windows folder) and prefers to
"upgrade" anyway? Also, is it hard to compile the current version of
sqlite3.c into a dll with version information in the Windows format,
without having to duplicate this information manually?

Max

On Fri, Dec 25, 2009 at 2:26 AM, Dr. Robert N. Cleaves wrote:

> Thank you very much for your help. The problem was in iTUNES. I removed it
> and no more problem. I then downloaded a new free version and the problem
> was solved.
>
> Dr. Robert N. Cleaves
>
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT from fts3 tokens, is it possible?

2009-12-24 Thread Max Vlasov
Thanks, Scott,

I see problems with regular tables, but to discuss the details I have to
make some tests comparing the approaches.

I read the info about fts3 internals once again and would like to share
new thoughts. Let's forget about the "hit count". Although I mentioned
google suggestions, it was mainly to evoke the familiar popup hint list;
my primary interest is just in getting the list of tokens that start with
the entered string.

So if fts3 data is stored in b-trees where a token is the key, there is a
similarity between how sqlite searches a range of indexed varchars and how
fts3 searches for tokens. You mentioned that a problem exists with merging
results, but a general full-text search already has to merge them for the
found doclists, so the phrase "..Of course, this scheme makes queries more
expensive (as the FTS3 code may have to look up individual terms in more
than one b-tree and merge the results).." applies to tokens as well.
Moreover, I suppose that the code in fts3 that handles queries with
wildcards like Match("some*") could also be used for querying tokens
matching the same mask; the difference is probably that the former has to
merge doclists (from different segments/nodes) while the latter would
merge the tokens themselves.

One of the problems (if such queries can be implemented technically) is
the proper language syntax. As the tokens are not columns of the virtual
table, this syntax would have to be different from a general Select.
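
For comparison, the prefix machinery that already exists (real FTS3
syntax; the table here is a hypothetical example):

sqlite> CREATE VIRTUAL TABLE docs USING fts3(body);
sqlite> SELECT * FROM docs WHERE body MATCH 'some*';

What this thread asks for is the analogous operation over the tokens
themselves rather than over the documents containing them.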

On Wed, Dec 23, 2009 at 8:51 PM, Scott Hess <sh...@google.com> wrote:

> Doing this in the tokenizer is probably not great because you can't
> determine which of the tokens actually are in the database after
> accounting for deletes and updates.  When I was last thinking about
> this problem, I think the best option looked like keeping a distinct
> token index with token_id stored in the fts index.  Another
> alternative was to rearrange how segments are stored and merged so
> that the token part of the index would float above the doclists.
>
> If you wanted to hack something in right now, you could keep a set of
> parallel segments mapping tokens to counts of the number of hits for
> that token.  A document insert would generate a bunch of positive
> counts, a document delete a bunch of negative counts, merges would net
> them out, and if a count went to 0 that token would be dropped from
> that segment.  Queries would still be somewhat costly, but this index
> could probably be merged more aggressively than the fts index (it
> should be much smaller).
>
> If your database isn't too large, you could maybe even keep this in a
> regular SQLite-style table.  Maybe, maybe not, fts1 did something like
> that and it got terribly slow once the fts table had a few tens of
> thousands of documents.  The problem was that the tokens were
> distributed across a large portion of the index, so data locality went
> down the drain and every update was like a full table scan.
>
> -scott
>
>
> On Wed, Dec 23, 2009 at 7:21 AM, Max Vlasov <max.vla...@gmail.com> wrote:
> > Scott, thanks,
> >
> > I see that the hit counts are the next level of complexity.
> > As for tokens, I think that if one really needs to implement this
> > feature, a possible solution would be to allow the program to "mirror"
> > the tokenized data. Right now, as I see it, it would be possible (if a
> > custom tokenizer is implemented) to collect such data inside the xNext
> > function, but there are problems. For example, we should know we're not
> > parsing a Match query. Is there a way to distinguish whether xNext works
> > with MATCH operator content or with content from UPDATE or INSERT INTO?
> > Because it makes no sense to collect data from MATCH, because it may
> > contain irrelevant, non-existing words (but ironically it would be
> > helpful for collecting "hit" data). If we know inside xNext that the
> > call comes from real data being appended, then at least we will be able
> > to maintain a dictionary of words used at least once.
> >
> > Max
> >
> > On Wed, Dec 23, 2009 at 5:54 PM, Scott Hess <sh...@google.com> wrote:
> >
> >> The tokens are all there, so it is "theoretically possible".  But the
> >> performance would be very bad because it would require iterating over
> >> all the segment data, basically a worst-case query.  Even then, you
> >> wouldn't have information about hit counts.  To do it reasonably would
> >> require redesigning the data format to take this use-case into
> >> consideration.
> >>
> >> -scott
> >>
> >> On Wed, Dec 23, 2009 at 3:56 AM, Max Vlasov <max.vla...@gmail.com> wrote:
> >> > Is it theoretically possible to allow some kind of sql SELECT query
> >> > operations with tokens?

Re: [sqlite] SELECT from fts3 tokens, is it possible?

2009-12-23 Thread Max Vlasov
Scott, thanks,

I see that the hit counts are the next level of complexity.
As for tokens, I think that if one really needs to implement this feature,
a possible solution would be to allow the program to "mirror" the
tokenized data. Right now, as I see it, it would be possible (if a custom
tokenizer is implemented) to collect such data inside the xNext function,
but there are problems. For example, we should know we're not parsing a
Match query. Is there a way to distinguish whether xNext works with MATCH
operator content or with content from UPDATE or INSERT INTO? It makes no
sense to collect data from MATCH, because it may contain irrelevant,
non-existing words (but ironically it would be helpful for collecting
"hit" data). If we know inside xNext that the call comes from real data
being appended, then at least we will be able to maintain a dictionary of
words used at least once.

Max

On Wed, Dec 23, 2009 at 5:54 PM, Scott Hess <sh...@google.com> wrote:

> The tokens are all there, so it is "theoretically possible".  But the
> performance would be very bad because it would require iterating over
> all the segment data, basically a worst-case query.  Even then, you
> wouldn't have information about hit counts.  To do it reasonably would
> require redesigning the data format to take this use-case into
> consideration.
>
> -scott
>
> On Wed, Dec 23, 2009 at 3:56 AM, Max Vlasov <max.vla...@gmail.com> wrote:
> > Is it theoretically possible to allow some kind of sql SELECT query
> > operations with tokens?
> >
> > In particular, one could implement suggestions similar to Google
> > suggestions in Web search: when for example "some" is typed in a search
> > box, a listbox pops up showing this database's tokens starting with
> > these letters ("someone", "something" etc).
> > Having some experience with full-text search, I once did my own simple
> > implementation of full-text for sqlite, where all tokens were saved in
> > a separate table with a text index, so in that case SELECT Title FROM
> > Tokens WHERE Title LIKE "some%" works as expected (with the
> > case-sensitivity pragma effectively set). I tried to read the technical
> > part of http://www.sqlite.org/fts3.html, but could not figure out
> > whether this is possible in the current implementation of fts3
> >
> > Thanks
> >
> > Max
> > ___
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SELECT from fts3 tokens, is it possible?

2009-12-23 Thread Max Vlasov
Is it theoretically possible to allow some kind of sql SELECT query
operations with tokens?

In particular, one could implement suggestions similar to Google
suggestions in Web search: when for example "some" is typed in a search
box, a listbox pops up showing this database's tokens starting with these
letters ("someone", "something" etc).
Having some experience with full-text search, I once did my own simple
implementation of full-text for sqlite, where all tokens were saved in a
separate table with a text index, so in that case SELECT Title FROM Tokens
WHERE Title LIKE "some%" works as expected (with the case-sensitivity
pragma effectively set). I tried to read the technical part of
http://www.sqlite.org/fts3.html, but could not figure out whether this is
possible in the current implementation of fts3.

Thanks

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] db corruption with zero bytes

2009-12-22 Thread Max Vlasov
On Tue, Dec 22, 2009 at 1:22 PM, Evilsmile  wrote:

> Hello,
>
> My sqlite version is 3.5.1 and there is a lot of db corruption in my
> system.
>
>

Please let us know more about your language/platform.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Corrupted database (image is malformed): always the same size

2009-12-18 Thread Max Vlasov
On Fri, Dec 18, 2009 at 6:27 PM, Gianandrea Gobbo  wrote:

> I'm using sqlite (2.8) on an embedded product, running a Linux kernel.
> I'm experiencing sometimes a database corruption, and listing some
> tables contents gets me a "SQL error: database disk image is malformed".
> Ok, there can be many reasons for this, but the strange thing that
> puzzles me is that the database file has always the same size: 409Kb
> (418816 bytes).
>

Gianandrea, I once had the experience of fixing modified code that led to
database disk image malformation. If you can reproduce the bug with a
comparatively small number of steps, I'd recommend running the program
with PRAGMA integrity_check after almost every step. It costs a little in
terms of performance, but in my case it helped: a report about invalid
index entries appeared way ahead of this particular error, which helped
narrow the problem down to several queries and finally to the invalid code
fragment.
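
In the shell the check is just:

sqlite> PRAGMA integrity_check;
ok

Anything other than a single 'ok' row lists the problems found, often well
before the "image is malformed" error shows up.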
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BACK API Questions

2009-12-16 Thread Max Vlasov
>
> For example, if I run 200,000 inserts, the first 20,000 inserts are done
> in 9 secs, but the last 20,000 inserts (from the 180,000th to the
> 200,000th) take almost 110 secs. That is more than 10 times the initial
> figure. These results were consistent across all iterations of the
> simulation I did.
>
>
I have several observations about your results:

- As far as I know, the rowid is always indexed, so there is always at
least one index for any table, implemented with a B-tree. So the bigger
the table, the slower it is to append a record. The dependency is not
linear, but it exists.
- When you're inside a transaction, sqlite "delegates" the writing logic
(cached/not cached) to the OS, so if the OS decides to cache one sector
and not to cache another, there's little we can do about it. It can be
related to your RAM size, file cache size, and hard-disk characteristics.
- You probably don't want to use sqlite if you plan to develop, for
example, a billing system for a big mobile carrier :). There's a page at
sqlite.org with recommendations for sqlite usage. Can you know in advance
what speed results you'll consider good?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] BACK API Questions

2009-12-16 Thread Max Vlasov
On Wed, Dec 16, 2009 at 9:30 AM, Raghavendra Thodime wrote:

>  I did try using batches of transactions with the synchronous=OFF PRAGMA
> set. The performance improved slightly. But as the db file started to
> grow larger and larger, the performance degraded considerably. Is that
> expected? Or is there a workaround for this?
>

If you did everything according to this topic from the FAQ:

(19) INSERT is really slow - I can only do few dozen INSERTs per second
(http://www.sqlite.org/faq.html#q19)

then you probably have extra overhead in your development
language/library. Also be aware that you will probably never get the same
speed as plain copying of the same amount of data with the file system
routines, since the db engine has to deal with indexes and other extra
work. So whether the performance is high or low may be a subjective
estimate.
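
The pattern that FAQ entry recommends, sketched with a hypothetical table:

sqlite> BEGIN;
sqlite> INSERT INTO log VALUES (1, 'a');
sqlite> INSERT INTO log VALUES (2, 'b');
sqlite> -- ... the rest of the batch ...
sqlite> COMMIT;

One COMMIT per batch means one disk synchronization for thousands of rows
instead of one per INSERT.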
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

