Re: [sqlite] Scrolling through results of select

2011-02-21 Thread Roger Binns

>>> 1. Is there any significant overhead on SQLite from my selecting from a 
>>> view representing the original arbitrary select? That is, will SQLite still 
>>> use any indexes etc correctly? Or do I need to dissect/parse the original 
>>> select statement, changing the where statement etc?
>>
>> You can work out the answer yourself by using EXPLAIN and EXPLAIN QUERY PLAN 
>> of some representative examples.
> 
> Yes, but I wondered if there was some overriding logic that SQLite uses that 
> would provide a theoretical/logical rather than experimental guide.

There isn't, as trying EXPLAIN would show you.

>> The rest of your questions assume a particular solution.  The only thing 
>> that will reliably work is to reissue the query using skip and limit
> 
> By "skip" do you mean select where rowid > , or offset 
> or something else?

No, I mean skipping the first N results.  You can't use rowid since it won't
exist in many cases.  For example 'select 3+4' or 'select x+y from a,b where
...'.

> Hmm, true. I hadn't thought of user defined function side effects. I don't 
> have to allow for that at the moment, but I'll keep it in mind. Is it common 
> or even good practice for a user function (used in a select statement) to 
> modify the table from which it's selecting? That seems like bad practice to 
> me, and I can't see why you'd do that rather than use update, insert or 
> delete to make changes.

The UDF could take a filename as a parameter and return the size or last
access time.  By changing when bits of the query execute you'll get
different answers (eg the file size changes between page scrolls in the query).

>> then the solution is to 'CREATE TEMP TABLE results AS ...select...'.  This 
>> will also work if someone uses
>> "ORDER BY random()" or any other udf that depends on more than its arguments.
> 
> Hmm, good thinking. I'll consider that. The downside is that creating a 
> temporary table would require SQLite to process every row in the select, 
> whereas prepare/step only processes the rows as they are shown. This would 
> make a big difference for very large data sets or for a view/select 
> containing a complex calculation for each row.

I think you are overthinking the problem.  No one is going to scroll through
100,000 results so there is no need to save 100,000 of them.  Pick an
arbitrary number (eg 1000), use the CREATE TEMP TABLE ...  AS .. select ...
approach, and add a 'LIMIT 1000' on the end.

This will work with any query and work reliably no matter what else happens
to the database (eg other processes modifying it).

If you want to refine things then there are several callbacks you can use.
For example if the database is coming back with one row per second then you
don't really want to wait 1000 seconds.  You can have a monitoring thread
and call sqlite3_interrupt to abort the query.  If you don't want to use
another thread then you can register a progress callback which knows when
the query started.
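A sketch of the single-threaded variant with a progress callback; Python's sqlite3 exposes this as set_progress_handler, and the 50 ms budget here is an arbitrary choice:

```python
import sqlite3
import time

con = sqlite3.connect(":memory:")
deadline = time.monotonic() + 0.05  # 50 ms budget, an arbitrary choice

def over_budget():
    # A nonzero return makes SQLite abort the running statement, which
    # surfaces as an "interrupted" OperationalError.
    return 1 if time.monotonic() > deadline else 0

con.set_progress_handler(over_budget, 1000)  # called every 1000 VM ops

try:
    # A deliberately slow query: count a 10,000 x 10,000 cross join.
    con.execute("""
        WITH RECURSIVE n(x) AS (
            SELECT 1 UNION ALL SELECT x + 1 FROM n LIMIT 10000)
        SELECT count(*) FROM n AS a, n AS b
    """).fetchone()
    aborted = False
except sqlite3.OperationalError:
    aborted = True
```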

Roger
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Stephen Oberholtzer
On Mon, Feb 21, 2011 at 11:35 AM, Jim Wilcoxson  wrote:
> On Mon, Feb 21, 2011 at 11:05 AM, Sven L  wrote:
>>
>> Thank you for your detailed explanation!
>> First, can you please tell me how to purge the cache in Windows 7? This 
>> could be very useful for my tests!
>
> Sorry, dunno for Windows.  On Mac OSX it is the purge command, in the
> development tools.  On Linux, you do: echo 3 >
> /proc/sys/vm/drop_caches

Just make sure you either (a) quote the 3 (echo '3' >
/proc/sys/vm/drop_caches) or (b) put a space between the 3 and the >.
If you don't quote it, and you don't put the space in (echo
3>/proc/sys/vm/drop_caches) it doesn't do anything, for reasons I
won't go into.

-- 
-- Stevie-O
Real programmers use COPY CON PROGRAM.EXE


Re: [sqlite] SQLite server using execnet ?

2011-02-21 Thread Roger Binns

On 02/21/2011 12:37 PM, Jay A. Kreibich wrote:
>   Yes, but in something like memcached, the database is not aware of
>   that structure, and can't take advantage of it.

Memcached does understand some types and has atomic increment/decrement etc
but does not understand structure.

>   When storing serialized objects, it is all too common to see code that
>   fetches an object, un-marshals it, alters some simple value, re-marshals
>   the whole object, and then writes the whole thing back.

This all circles back to what you are doing and in particular what the OP
was doing.  If you need to do queries, modifications and durability then
what you really need falls under the category of 'database'.  On the other
hand if there is no need for querying or changes then something like
memcached is a great way for a bunch of machines/processes to get the data.

>    Redis 

Redis is a database :-)

>   Yes and no.  Redis, like memcached, is essentially an always in-memory
>   key/value store.

*All* databases are in-memory for practical purposes.  Their working set
will need to be in memory either explicitly due to their implementation, or
implicitly via the operating system or through administration (eg indices).
 If accesses to the working set of data require disk accesses then the
performance will be dismal.  (Some exceptions for data only accessed
sequentially.)

>   Its main selling point is memcached-like speed,

Incidentally MongoDB claims the same thing :-)  The places I use memcached
are where I do not want disk touched.

>... but it is a good fit ...

It looks like we are seeing a repeat of what happened with the first
generation of DVCSes.  Relational representation is being relaxed to be less
constrained/structured.  There are numerous "databases" with varying and
overlapping sweet spots in terms of querying, persistence, performance,
distribution etc.  I expect we'll see a similar shakeout and end up with a
small number of strong products.

>   Like SQLite itself, I tend to do all my virtual table modules in
>   extremely vanilla C.

Brave :-)  My personal preference is to do the initial development in Python
and then reimplement in C if needed for portability/performance reasons.
The Python development is a lot quicker and then acts as a test suite for
the C implementation.

I wonder how many of the other bindings for SQLite have bothered to
implement virtual tables as that probably holds back usage of virtual tables
a lot.  (A 'hello world' virtual table in Python/APSW is about half a
screenful of code.  An example one I have that represents information about
files on disk is just under a screenful.)

>   I happen to think virtual tables are one of the more powerful features
>   of SQLite, but also one of the most under-utilized features.

Agreed.  Unfortunately it does require that the underlying data be
representable in a relational manner which is also very constraining.

>   Since a big part of writing these is to get them out for other people
>   to use them, 

Where do you publish them?  It is probably also worth trying to encourage a
'contrib' location for SQLite that is more active and in wider use than
http://www.sqlite.org/contrib

>  Working in C avoids adding complexity, like someone working
>  in Java wanting to use your MongoDB module.  I suppose it could be
>  done, but I wouldn't want to be the one trying to make it all work.

MongoDB is client server so this issue does not arise.  (Nor do they have
virtual tables.)  In order to perform "programming" on the server side such
as for map/reduce you have to use Javascript which is slowly becoming the
most popular language for that kind of thing including on the desktop.  (Eg
see node.js and Seed.)

Roger


Re: [sqlite] Scrolling through results of select

2011-02-21 Thread BareFeetWare
On 21/02/2011, at 12:41 PM, Roger Binns wrote:

>> How can I best scroll though the results of an arbitrary select query?
> 
> If you have infinite time and memory then there is no problem.

Memory and processor are limited (typically iPad/iPhone). Disk space would cope 
with creating temporary tables if necessary. In general, I am trying to cater 
for a table or view (ie arbitrary select), the results of which would not fit 
entirely into memory.

> You haven't stated your constraints or how arbitrary "arbitrary" really is.

By arbitrary, I mean that the user can type any select query that SQLite allows 
(or have a view in their schema), which might make use of order by, group by, 
where, union etc.

>> 1. Is there any significant overhead on SQLite from my selecting from a view 
>> representing the original arbitrary select? That is, will SQLite still use 
>> any indexes etc correctly? Or do I need to dissect/parse the original select 
>> statement, changing the where statement etc?
> 
> You can work out the answer yourself by using EXPLAIN and EXPLAIN QUERY PLAN 
> of some representative examples.

Yes, but I wondered if there was some overriding logic that SQLite uses that 
would provide a theoretical/logical rather than experimental guide.

>> 2. If the arbitrary select statement already contains an "order by" clause, 
>> then I obviously need to use the order by column(s) specified there as the 
>> keyColumn for the scrolling, rather than override the order by. Is there any 
>> way to get the sort order of an arbitrary select statement?
> 
> Results are returned in the order requested or randomly(*) if not.  Given you 
> can have subqueries with ordering, collations and all sorts of other things, 
> trying to extract the actual ordering is as difficult as implementing the 
> SQLite engine itself.
> 
> (*) In practice it is in btree iteration order but that is not something you 
> should depend on.

Yes, I understand that the sort order of results cannot be counted on as 
consistent if no order by clause is given. But if I am imposing a sort order (eg 
by rowid) I want to match the undefined sort order as closely as possible so 
the results look the same. Is sorting by rowid in a table as close as I could 
get to this? What order by sequence could I best use to match the results of a 
select from joined tables?

> You can even "ORDER BY random()".

Hmm, good point. I guess in that case (are there other cases?) I can't count on 
the results being the same from one select to the next, so preparing the 
statement, extracting some rows, closing, then preparing and extracting again 
when the user scrolls won't work, since the results will change. If there is a 
random() component then I would have to leave the query/prepare open, denying 
all other access to that database file, until there will definitely be no more 
scrolling. Correct?

> The rest of your questions assume a particular solution.  The only thing that 
> will reliably work is to reissue the query using skip and limit

By "skip" do you mean select where rowid > , or offset or 
something else?

> assuming no changes in between.

Yes, I can assume no changes in between (though the random() possibility above 
will make this approach fail I think).

> This is if you are trying to save memory/disk and there is no possibility of 
> changes between scrolling operations.

Yes.

> If you need to be resilient to that too (implied by "arbitrary" since user 
> defined functions could have side effects)

Hmm, true. I hadn't thought of user defined function side effects. I don't have 
to allow for that at the moment, but I'll keep it in mind. Is it common or even 
good practice for a user function (used in a select statement) to modify the 
table from which it's selecting? That seems like bad practice to me, and I 
can't see why you'd do that rather than use update, insert or delete to make 
changes.

> then the solution is to 'CREATE TEMP TABLE results AS ...select...'.  This 
> will also work if someone uses
> "ORDER BY random()" or any other udf that depends on more than its arguments.

Hmm, good thinking. I'll consider that. The downside is that creating a 
temporary table would require SQLite to process every row in the select, 
whereas prepare/step only processes the rows as they are shown. This would make 
a big difference for very large data sets or for a view/select containing a 
complex calculation for each row.

Thanks for taking the time to explore some possibilities for me.
Tom
BareFeetWare

 --
Comparison of SQLite GUI tools:
http://www.barefeetware.com/sqlite/compare/?ml



Re: [sqlite] Scrolling through results of select

2011-02-21 Thread BareFeetWare
On 21/02/2011, at 12:41 PM, Simon Slavin wrote:

>> How can I best scroll though the results of an arbitrary select query?
> 
> Suppose the results of the SELECT change between your original decision to do 
> the scrolling and the time the user decides to scroll.  Should what's shown 
> on the display reflect the data as it originally was, or up-to-date data ?  
> Or could there never be any such changes ?

Thanks for the reply Simon.

Good question. In most or all cases for now, the select results will not change 
mid scroll. The user interface will be clearly either scrolling or editing. In 
the longer term I will probably allow editing mid scroll, but that's fine, 
since I can flag within my app that results need to be re-selected.

>> 1. Is there any significant overhead on SQLite from my selecting from a view 
>> representing the original arbitrary select? That is, will SQLite still use 
>> any indexes etc correctly? Or do I need to dissect/parse the original select 
>> statement, changing the where statement etc?
> 
> A VIEW is a saved SELECT query -- the query, not the results, is saved.  So 
> I think you don't need to make the extra effort you describe.

Yes, I understand that a view doesn't save any results. I'm unclear, however, 
about how smart SQLite's query optimizer is: for instance, whether it realises 
that when it's selecting from a view, it can use the indexes of the source 
columns for the where filter.
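One quick way to check this empirically (a sketch using Python's stdlib sqlite3, with an invented schema): SQLite can flatten simple views into the outer query, in which case the index is still used, and EXPLAIN QUERY PLAN makes that visible.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(a INTEGER, b TEXT)")
con.execute("CREATE INDEX t_a ON t(a)")
con.execute("CREATE VIEW v AS SELECT a, b FROM t")

# If the view is flattened into the outer query, the plan shows a
# SEARCH using the index on t.a rather than a full table SCAN.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM v WHERE a = 5").fetchall()
details = " ".join(row[-1] for row in plan)
```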

>> 2. If the arbitrary select statement already contains an "order by" clause, 
>> then I obviously need to use the order by column(s) specified there as the 
>> keyColumn for the scrolling, rather than override the order by. Is there any 
>> way to get the sort order of an arbitrary select statement?
> 
> I don't know of any.

Hmm, OK. I guess I'll need to impose a sort order on otherwise unordered query 
results. For a table, I'll just sort by rowid. For a view, I'll probably just 
look for any indexed column.

>> 3. This method requires that keyColumn is defined as unique (or primary 
>> key), otherwise it can skip rows of data. Is there any way to allow for a 
>> non-unique keyColumn?
> 
> No, but instead of using just keyColumn you could use (keyColumn,rowid).  
> This would ensure your key was always unique, and will work on arbitrary 
> SQLite tables unless someone is intentionally messing with how SQLite works.

Yes, I was thinking along those lines (ie order by keyColumn, rowid). That will 
work for tables. I'll have to adapt it somehow for views.

>> 5. I understand that "Rule Number 1" is to "not leave queries open".
> 
> Correct.  Don't allow a user to create and close a query just by choosing 
> when they want to scroll through a list.
> 
>> So what's the best way to minimize the overhead of repeatedly running the 
>> same query but with a different where clause and limit (and order if 
>> reversing). I'm thinking I would be best to actually keep the query (ie 
>> prepared statement) open while live scrolling (eg flicking through rows on 
>> an iPhone/iPad), not using a limit clause at all, but instead just keep 
>> getting more rows as needed to fill the scrolling, until the user stops 
>> scrolling, then finalize, close etc. When they begin scrolling again, fire 
>> up a new prepare (with a new maxVisibleKeyValue) .
> 
> To get the following or previous line to one which is already being shown, 
> find the key for that row (which you should save in memory as you're 
> displaying the line) and use
> 
> SELECT  FROM  WHERE (keyColumn||rowid)>lastlineKey ORDER BY 
> keyColumn,rowid LIMIT 1
> 
> to get the following line or
> 
> SELECT  FROM  WHERE (keyColumn||rowid) < lastlineKey ORDER BY 
> keyColumn DESC, rowid DESC LIMIT 1

Yes, that's where I was headed, but I wondered if running that query 
continuously might have unnecessary overhead since it's constantly preparing a 
new query for each row. I wondered if I'd be better just having

SELECT  FROM  WHERE (keyColumn||rowid)>lastlineKey ORDER BY 
keyColumn,rowid

ie with no limit, leaving the query open and getting new rows as the scroll 
progresses, finally closing the query when the scrolling stops. Obviously I 
would need to guarantee that no changes are made to the database while the 
query is open and the rows are scrolling.
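For the re-prepare-per-page variant, comparing the key columns separately avoids the pitfalls of the string concatenation (keyColumn||rowid), which compares as text (so rowid 12 would sort before rowid 3). A sketch with Python's sqlite3 and an invented schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(id INTEGER PRIMARY KEY, k TEXT)")
con.executemany("INSERT INTO t(k) VALUES (?)", [(c,) for c in "bacbacbac"])

PAGE_SIZE = 3

def next_page(last_k, last_id):
    # Keyset pagination over the composite (k, rowid): fetch the rows
    # strictly after the last one shown, in (k, rowid) order.
    return con.execute(
        """SELECT k, id FROM t
           WHERE k > ? OR (k = ? AND id > ?)
           ORDER BY k, id LIMIT ?""",
        (last_k, last_k, last_id, PAGE_SIZE)).fetchall()

page1 = next_page("", 0)       # every key sorts after the empty string
page2 = next_page(*page1[-1])  # resume from the last row of page 1
```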

Thanks for your thoughts,
Tom
BareFeetWare




Re: [sqlite] SQLite GUI comparison

2011-02-21 Thread BareFeetWare
On 22/02/2011, at 4:31 AM, skywind mailing lists wrote:

> "Supports SQLite extension" would be an accurate feature description. And in 
> the cell (value) I suggest putting - if supported - FTS2, FTS3, RTree etc., 
> otherwise a "-". A yes or no is insufficient because some support RTree but 
> not FTS and vice versa.

OK, that sounds good. I'll probably use "no" or "none" if no extension is 
supported.

Can anyone please tell me what should go in this cell for any SQLite GUI app 
they know (ie either FTS2, FTS3, RTree or none). Or how can I easily test this 
feature?

Thanks,
Tom
BareFeetWare

--
Comparison of SQLite GUI tools:
http://www.barefeetware.com/sqlite/compare/?ml



Re: [sqlite] sqlite3_busy_handler

2011-02-21 Thread Simon Slavin

On 21 Feb 2011, at 9:46pm, Frank Chang wrote:

> I was wondering why other selects that only need to read from a sqlite 
> database need to use the sqlite3_busy_handlers. Is it because database 
> connections which are writing to the sqlite database have a higher priority 
> than database connections which are reading from the sqlite database?

Correct.  You cannot read from a database while something is making changes in 
it.  You might get part of a batch of changes.  For instance, you might get a 
new transaction, but not the updated account balance.

Simon.


Re: [sqlite] EXT :Re: sqlite3_busy_handler

2011-02-21 Thread Black, Michael (IS)
If I'm not mistaken only WAL mode supports simultaneous read/write.
For any other mode any write function will lock the database.
So...selects may run into a need to wait until a write finishes.


Michael D. Black
Senior Scientist
NG Information Systems
Advanced Analytics Directorate




From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Frank Chang [frank_chan...@hotmail.com]
Sent: Monday, February 21, 2011 3:46 PM
To: sqlite-users@sqlite.org
Subject: EXT :Re: [sqlite] sqlite3_busy_handler

Michael D. Black, I will try different batch sizes so that other processes can 
potentially do their thing. I was wondering why other selects that only need to 
read from a sqlite database need to use the sqlite3_busy_handlers. Is it 
because database connections which are writing to the sqlite database have a 
higher priority than database connections which are reading from the sqlite 
database? Thank you for your suggestions.



Re: [sqlite] sqlite3_busy_handler

2011-02-21 Thread Frank Chang

Michael D. Black, I will try different batch sizes so that other processes can 
potentially do their thing. I was wondering why other selects that only need to 
read from a sqlite database need to use the sqlite3_busy_handlers. Is it 
because database connections which are writing to the sqlite database have a 
higher priority than database connections which are reading from the sqlite 
database? Thank you for your suggestions.
  


Re: [sqlite] SQLite server using execnet ?

2011-02-21 Thread Petite Abeille

On Feb 21, 2011, at 9:37 PM, Jay A. Kreibich wrote:

> I was once forced to look at SOAP over SMTP

Ah, yes... double S!

The S stands for Simple
http://wanderingbarque.com/nonintersecting/2006/11/15/the-s-stands-for-simple/




Re: [sqlite] SQLite server using execnet ?

2011-02-21 Thread Jay A. Kreibich
On Sun, Feb 20, 2011 at 08:37:04PM -0800, Roger Binns scratched on the wall:
> On 02/20/2011 06:23 PM, Jay A. Kreibich wrote:
> > On Sun, Feb 20, 2011 at 05:23:09PM -0800, Roger Binns scratched on the wall:
> >> If you want to use SQL then use Postfix.
> > 
> >   I might suggest PostgreSQL instead.
> >   (Sorry, Roger, I couldn't resist.)
> 
> Yeah, long night :-)  However, technically SQL over SMTP is possible and
> would actually work.  And if anyone is insane enough to try that then using
> Postfix and Postgres is a good combination.

  I was once forced to look at SOAP over SMTP.  Thankfully we were able
  to talk them out of it.  As if SOAP wasn't bad enough, I sure didn't
  want to mix it with SMTP.
  
  "Any networking protocol with 'Simple' in the name isn't."

> >> If you need lots of processes on the network to access data quickly then
> >> consider memcached.
> > 
> >   More seriously, in this category you might also consider Redis.
> >   Redis allows your data to have some structure, 
> 
> The Python binding pylibmc does structure the data for you automagically.

  Yes, but in something like memcached, the database is not aware of
  that structure, and can't take advantage of it.
  
  When storing serialized objects, it is all too common to see code that
  fetches an object, un-marshals it, alters some simple value, re-marshals
  the whole object, and then writes the whole thing back.

  While this is fine for complex or less frequently accessed objects,
  where the ease of use outweighs the overhead, it seems like a
  lot of work for simpler items, such as lists and basic dictionaries
  or hashes.  Of course, these types of objects are also the most
  frequently modified data-types in many applications.  The best way
  to do key lists in memcached is practically a religious topic.

  Having the database aware of a few very simple and primitive structures 
  can provide significant improvements for these basic, most common
  operations.  For example, Redis can append an element to a list with
  a single O(1) database operation that is independent of list size,
  and the only payload data that crosses the network is the list key
  and the value to insert.  It has a few other tricks, like atomically
  incrementing counter values.

> >   plus it has the ability to persist the data to disk. 
> 
> The moment you talk about persistence you then have significant overlap with
> databases.  My personal favourite is MongoDB but there are loads of others
> such as Cassandra, HBase, Tokyo Cabinet etc.

  Yes and no.  Redis, like memcached, is essentially an always in-memory
  key/value store.  Functionally, it is much more like memcached than
  most of these other examples.  Its main selling point is memcached-like
  speed, without the "cache" aspects of memcached-- your data stays
  there until you get rid of it, even across restarts.  In many cases,
  that also allows you to get rid of your backing database.  Redis also
  provides just enough internal structure to be useful, without
  really getting in the way, in much the same way that a very basic
  container library provides building-block tools without defining a
  whole class tree.

  Those features definitely bring it closer into the domain of NoSQL
  databases, but I'd argue that's only because memcached is so far
  removed from the rest of these.  Thanks to its caching nature, it
  can't really be considered a "database" at all (useful tool, yes;
  database, not really).


  Each of these products has its place.  I don't mean to sound like
  such a Redis fan-boy, but I've been messing around with it a lot
  lately, and found it to be both extremely simple to setup and
  configure (something most of these other products cannot claim),
  while also extremely useful.  You're not going to use Redis to
  replace something like Cassandra or HBase, but it is a good fit
  for situations where memcached is a good fit, but you want more
  structure and a known life-cycle to your data.

> What programming language are you using to implement the virtual tables?

  Like SQLite itself, I tend to do all my virtual table modules in
  extremely vanilla C.
  
  I happen to think virtual tables are one of the more powerful features
  of SQLite, but also one of the most under-utilized features.  As a
  way to relax and explore, I sometimes write virtual table modules to
  bolt together odd data stores or storage formats.  Partly this is
  just for fun, but the process also helps develop a deep understanding
  of the native data model used by these different products, and that
  knowledge is useful for other work I do.  If it opens some eyes and
  helps promote virtual tables, and SQLite in general, that's also a
  great bonus.
 
  As such, much of the virtual table development work I do is somewhat
  isolated, outside of any environment or problem context.  My approach
  tends to be extremely general-purpose, since I'm often approaching
  the problem from a very 

Re: [sqlite] process monitor: wal file access

2011-02-21 Thread Pavel Ivanov
> I find especially the *-wal access attempt pretty strange as we do not have 
> WAL enabled for our database: it is set to the default journal mode (DELETE). 
> Is this normal behaviour? Or is this influenced by some setting I don't know 
> of?

I think when SQLite opens the database it can't trust its contents
right away, so it can't know yet that it's in DELETE journal mode.
SQLite must check for the presence of both files, *-wal and *-journal. If
neither of them exists (the failed QueryOpen) then the database is in a
complete and trusted state, so SQLite can read its contents and see
what journal mode it is in. That's what you see, and it can't be
changed.
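Once the file has been read, the mode is easy to confirm from the application side (a sketch with Python's sqlite3; the file name is made up):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE t(x)")
con.commit()

# In the default rollback-journal mode the -journal file exists only
# while a write transaction is active, so a fresh open has to probe
# for it (and for -wal) before trusting the main database file.
mode = con.execute("PRAGMA journal_mode").fetchone()[0]
```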


Pavel

On Mon, Feb 21, 2011 at 9:15 AM, Gert Corthout
 wrote:
>
>
>
>
> hello,
>
> we recently switched from SQLite version 3.5.4 to 3.7.5. When I run process 
> monitor on our machines I see a lot of failed file access attempts that 
> weren't there before (v 3.5.4).
> It is failed QueryOpen (NAME NOT FOUND) events for DBASE-NAME.db3-wal and 
> DBASE-NAME.db3-journal.
>
> I find especially the *-wal access attempt pretty strange as we do not have 
> WAL enabled for our database: it is set to the default journal mode (DELETE). 
> Is this normal behaviour? Or is this influenced by some setting I don't know 
> of?
>
> kind regards,
> Gert
>


Re: [sqlite] SQLite GUI comparison (was: ANN: Base 2.0, Mac SQLite GUI)

2011-02-21 Thread skywind mailing lists
Hi Tom,

"Supports SQLite extension" would be an accurate feature description. And in 
the cell (value) I suggest putting - if supported - FTS2, FTS3, RTree etc., 
otherwise a "-". A yes or no is insufficient because some support RTree but not 
FTS and vice versa.
Alternatively you may have a row for each type of extension, but I find this a 
bit too much.

Hartwig

Am 20.02.2011 um 21:48 schrieb BareFeetWare:

> On 21/02/2011, at 3:20 AM, skywind mailing lists wrote:
> 
>> in your comparison chart it would also be nice to see which software is able 
>> to support SQLite extension. A couple of them do not support the FTS nor 
>> RTree capabilities of SQLite.
> 
> Sure, I'd be happy to add that. How do you suggest that the feature is worded 
> in the table? Is "Supports SQLite extension" accurate? Please let me know 
> what value (eg yes or no) I should show for this feature for any apps you 
> know so I can add those entries.
> 
> Thanks,
> Tom
> BareFeetWare
> 
> 


Re: [sqlite] update of a blob

2011-02-21 Thread Robert Hairgrove
On Mon, 2011-02-21 at 15:13 +0100, Dietmar Hummel wrote:
>  std::string strStatement( "UPDATE persistence SET 
> name=\"blub\",expiration=\"2011-04-02\",value=\"?\" WHERE id=\"1\"" );

In addition to what Igor said, it isn't really proper (standard?) SQL to
put double quotes around the value literals because these should be
reserved for identifiers (e.g. schema, column or table names). I know
that MS-Access (and probably SQL Server) allows it; perhaps SQLite does,
too, but other databases won't -- you need to enclose them in single
quotes (but only if the value is a string literal, or a date-time value
formatted as a string). With some RDBMSs the character used to enclose
identifiers is optional or configurable, e.g. the backtick character (`)
used by MySQL.
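The usual fix is to leave the placeholder unquoted and bind the value; for example, in Python's sqlite3 (schema adapted from the original post):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE persistence("
            "id INTEGER PRIMARY KEY, name TEXT, expiration TEXT, value BLOB)")
con.execute("INSERT INTO persistence(id) VALUES (1)")

# The ? must appear bare: writing value='?' (or "?") makes it a
# one-character literal, not a parameter.  String literals take single
# quotes; double quotes are for identifiers.
blob = b"\x00\x01 arbitrary binary payload"
con.execute(
    "UPDATE persistence SET name='blub', expiration='2011-04-02', value=? "
    "WHERE id=1",
    (blob,))

stored = con.execute(
    "SELECT value FROM persistence WHERE id=1").fetchone()[0]
```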

Bob



Re: [sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Sven L

Thanks :)
 
This did the trick:
First make a copy of the database: copy HugeDatabase.db HugeDatabase_copy.db
Then for each run, replace the database with its copy. This is why I thought 
the COUNT operation was somehow written to the database after its first run... 
:P

> From: slav...@bigfraud.org
> Date: Mon, 21 Feb 2011 16:56:01 +
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] COUNT() extremely slow first time!
> 
> 
> On 21 Feb 2011, at 4:35pm, Jim Wilcoxson wrote:
> 
> > On Mon, Feb 21, 2011 at 11:05 AM, Sven L  wrote:
> >> 
> >> Thank you for your detailed explanation!
> >> First, can you please tell me how to purge the cache in Windows 7? This 
> >> could be very useful for my tests!
> > 
> > Sorry, dunno for Windows. On Mac OSX it is the purge command, in the
> > development tools.
> 
> In current versions it's just /usr/bin/purge, which should be in your path. 
> No idea why a non-programmer should need it, but there it is.
> 
> > On Linux, you do: echo 3 >
> > /proc/sys/vm/drop_caches
> 
> And in Windows it's ... almost impossible. You can sync to disk, using 
> fflush(), and there's no reason you can't do this from the command-line:
> 
> http://technet.microsoft.com/en-us/sysinternals/bb897438.aspx
> 
> But for a purge you have to invalidate the contents of cache, and there's no 
> way for an application to tell Windows to do this. For a start, each 
> application has its own user mode address cache, so if you run another 
> utility to flush cache, it just messes with its own space. Second, there's no 
> system call that allows you access to the cache because Microsoft considers 
> it private to the device level. I am not dissing Microsoft for this: there 
> are fair arguments that this is actually the Right Way to do it.
> 
> So the way to do it is to overrun the cache by yourself. If you know your 
> cache is 2Gig, find a 2Gig disk file that has nothing to do with your test 
> suite and read it. Or generate 2Gig of gibberish and write it to disk, then 
> delete that file. Pah.
> 
> Simon.


Re: [sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Simon Slavin

On 21 Feb 2011, at 4:35pm, Jim Wilcoxson wrote:

> On Mon, Feb 21, 2011 at 11:05 AM, Sven L  wrote:
>> 
>> Thank you for your detailed explanation!
>> First, can you please tell me how to purge the cache in Windows 7? This 
>> could be very useful for my tests!
> 
> Sorry, dunno for Windows.  On Mac OSX it is the purge command, in the
> development tools.

In current versions it's just /usr/bin/purge, which should be in your path.  
No idea why a non-programmer should need it, but there it is.

> On Linux, you do: echo 3 >
> /proc/sys/vm/drop_caches

And in Windows it's ... almost impossible.  You can sync to disk, using 
fflush(), and there's no reason you can't do this from the command-line:

http://technet.microsoft.com/en-us/sysinternals/bb897438.aspx

But for a purge you have to invalidate the contents of cache, and there's no 
way for an application to tell Windows to do this.  For a start, each 
application has its own user mode address cache, so if you run another utility 
to flush cache, it just messes with its own space.  Second, there's no system 
call that allows you access to the cache because Microsoft considers it private 
to the device level.  I am not dissing Microsoft for this: there are fair 
arguments that this is actually the Right Way to do it.

So the way to do it is to overrun the cache by yourself.  If you know your 
cache is 2Gig, find a 2Gig disk file that has nothing to do with your test 
suite and read it.  Or generate 2Gig of gibberish and write it to disk, then 
delete that file.  Pah.
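
The overrun approach described above can be scripted. A minimal sketch in Python (the sizes here are illustrative only; in practice you would write at least as much data as the machine has RAM):

```python
import os
import tempfile

def overrun_cache(size_bytes, chunk=1 << 20):
    """Write size_bytes of throwaway data to a temp file, then delete it,
    so the OS buffer cache evicts the pages we care about.
    Returns the number of bytes written."""
    junk = os.urandom(chunk)
    written = 0
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            while written < size_bytes:
                f.write(junk)
                written += len(junk)
            f.flush()
            os.fsync(f.fileno())  # push the data out so it really occupies cache
    finally:
        os.remove(path)
    return written

overrun_cache(8 << 20)  # 8 MB for demonstration; use roughly RAM size in practice
```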

Simon.


Re: [sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Jim Wilcoxson
On Mon, Feb 21, 2011 at 11:05 AM, Sven L  wrote:
>
> Thank you for your detailed explanation!
> First, can you please tell me how to purge the cache in Windows 7? This could 
> be very useful for my tests!

Sorry, dunno for Windows.  On Mac OSX it is the purge command, in the
development tools.  On Linux, you do: echo 3 >
/proc/sys/vm/drop_caches

>
> I'm quite sure my database itself is not fragmented, since I have only 
> inserted data.

If you insert the data in primary key order, or you don't have any key
(SQLite generates one) it's probably not very fragmented.  But keep in
mind that while to you, things are happening sequentially as you add
records, inside SQLite, things are happening quite differently.  For
example, let's say you have 4 indexes on this database.  So for every
insert you do, there are 5 database pages being affected (one for the
data row, one for each index).  These pages will fill up at different
rates, depending on the size of your keys, and will be written to disk
at different times.  So you are very likely to have data pages and
various index pages all intermixed in your SQLite file.  Also with
multiple indexes, it's unlikely that every index will be in sorted
order, based on the records you are inserting.  So IMO, there's no
practical way to avoid fragmentation within the SQLite file.

> The file system is in good shape too; Windows reports 0% fragmentation. 
> Perhaps there is some other bottleneck, like disk performance in general 
> (this is a 5400 rpm)? Or antivirus-related CPU holds? (I did turn real-time 
> scanning off though.)
> I have even turned Windows Search off (and got a 20% performance gain!). My 
> 32-bit application is running under Windows 7 (64-bit). Could WOW64 have 
> something to do with this performance issue?
>
> The size of the database is 1.98 GB, with 1.5M rows. Couldn't it be that the 
> table scan simply is darn slow for huge tables?

How much free RAM do you have?  Windows being the pig that it is, I'm
guessing not much, unless your system has > 4GB of RAM.  For
comparison, you could write a program that just read through the file,
4K at a time (I think that's the default page size for SQLite on
Windows).  Starting with a cold cache, this is near the best time you
will ever see for count(*) in SQLite.
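
That raw-scan comparison is easy to write. A rough sketch in Python, assuming the 4K page size mentioned above:

```python
import time

def raw_scan_seconds(path, chunk=4096):
    """Sequentially read the whole file in chunk-byte pieces and return
    the elapsed wall-clock time.  Starting with a cold cache, this
    approximates the best possible time for a full-table count(*)."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(chunk):
            pass
    return time.perf_counter() - start
```

Run it once cold and once warm to see the buffer-cache effect directly.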

>
> In an ideal world the COUNT() would be performed on the primary key in RAM. 
> That's not possible? :P

Sure, if all of the primary key records are in either the SQLite cache
or OS cache.  If they aren't, you have seek time.

I just did a small test to compare count(*) with count(primary key).
Here is count(*):

sqlite> explain select count(*) from logs;
addr  opcode         p1    p2    p3    p4                 p5  comment
----  -------------  ----  ----  ----  -----------------  --  -------
0     Trace          0     0     0                        00
1     Goto           0     8     0                        00
2     OpenRead       1     12    0     keyinfo(1,BINARY)  00
3     Count          1     1     0                        00
4     Close          1     0     0                        00
5     SCopy          1     2     0                        00
6     ResultRow      2     1     0                        00
7     Halt           0     0     0                        00
8     Transaction    0     0     0                        00
9     VerifyCookie   0     23    0                        00
10    TableLock      0     8     0     logs               00
11    Goto           0     2     0                        00

Here is count(primary key):

sqlite> explain select count(logid) from logs;
addr  opcode         p1    p2    p3    p4                 p5  comment
----  -------------  ----  ----  ----  -----------------  --  -------
0     Trace          0     0     0                        00
1     Null           0     2     0                        00
2     Null           0     1     0                        00
3     Goto           0     14    0                        00
4     OpenRead       0     8     0     0                  00
5     Rewind         0     9     0                        00
6     Rowid          0     3     0                        00
7     AggStep        0     3     1     count(1)           01
8     Next           0     6     0                        01
9     Close          0     0     0                        00
10    AggFinal       1     1     0     count(1)           00
11    SCopy          1     4     0                        00
12    ResultRow      4     1     0                        00
13    Halt           0     0     0                        00
14    Transaction    0     0     0                        00
15    VerifyCookie   0     23    0                        00
16    TableLock      0     8     0     logs               00
17    Goto           0     4     0                        00

It's not clear to me whether for count(*), SQLite is reading each data
page or reading the primary key index pages.  But when timed, count(*)
is 16x faster than count(logid) the first time, and 8x faster once all
pages are cached.  This is just 

Re: [sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Jay A. Kreibich
On Mon, Feb 21, 2011 at 03:37:50PM +0100, Sven L scratched on the wall:

> I've learnt that COUNT(*) is slower than COUNT(ID),

  That's usually not true.

> since * means the engine has to traverse all columns 

  Actually, count(*) is the one case when the engine does *not* need to
  traverse any columns.

> and it might even return another value if there are NULL-values...

  Yes, but that's how count() is defined to work.

  The expression "count(id)" only counts rows where id is not NULL. 
  This requires that the database engine retrieve the value of the id
  column from each row in order to test for NULL.
  
  The expression "count(*)" strictly counts rows, without concern over
  NULLs.  As such, there is no need to actually retrieve any row data,
  because there is no need to test any row values.  The count(*)
  function can scan the table tree, counting the number of rows,
  without actually loading or decoding the row data.
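
The NULL-handling difference is easy to confirm from Python's built-in sqlite3 module (a minimal sketch; the table and column names are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER)")
con.executemany("INSERT INTO t (id) VALUES (?)", [(1,), (2,), (None,)])

# count(*) counts rows; count(id) skips the row where id IS NULL.
print(con.execute("SELECT COUNT(*), COUNT(id) FROM t").fetchone())  # (3, 2)
```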

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson



Re: [sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Jim Wilcoxson
On Mon, Feb 21, 2011 at 10:38 AM, Puneet Kishor  wrote:
> On Mon, Feb 21, 2011 at 10:17:03AM -0500, Jim Wilcoxson wrote:
>> This is a common issue on the mailing list.  The first time you do
>> count(*), SQLite (actually your OS) has to load data into memory.
>> Most OS's will keep the file in a buffer cache, so the 2nd count(*)
>> doesn't have to read from disk.
>
>
> One question I have related to the above -- how long does that buffer
> cache remain filled with the data? I am assuming it is until new stuff
> to be cached pushes out old stuff, no?

For most OS's, the time data remains in the cache and the size of the
cache will vary as a function of available RAM.

>
> I was doing some R*Tree selects, and the first query was dog slow,
> although benchmarking showed that the actual CPU time was very small.
> Subsequent queries were lightning fast. I am assuming that the buffer is
> not getting filled with the results as much as it is getting filled with
> whatever part of the db that the program needs to open to do its work.

Right.  SQLite doesn't cache query results.  It does cache database
pages in its own cache, which by default is rather small: 2000 pages.
At the default page size of 1K, that's a 2MB cache.  To fetch records
not in its own cache, SQLite will use the OS.  If the page is in the
OS cache, there is no seek time and no read latency (for spinning
media).
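
The cache numbers can be inspected and tuned per connection. A sketch in Python (exact defaults vary by SQLite version and build, so query them rather than assume):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Inspect the current page size and page-cache size.
page_size = con.execute("PRAGMA page_size").fetchone()[0]
cache_size = con.execute("PRAGMA cache_size").fetchone()[0]
print(page_size, cache_size)

# A larger page cache can keep a hot table or index resident across queries.
con.execute("PRAGMA cache_size = 10000")  # 10000 pages
```

Note that in newer SQLite versions a negative cache_size value specifies the limit in kibibytes rather than pages.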

Jim

>
>
>>
>> Here's a timing from my own system, after a purge command to clear the
>> buffer cache:
>>
>> $ time sqlite3 hb.db 'select count(*) from logs'
>> -- Loading resources from /Users/jim/.sqliterc
>> count(*)
>> --
>> 734909
>>
>> real  0m0.580s
>> user  0m0.190s
>> sys   0m0.034s
>>
>> Same command again, with the file cached:
>>
>> $ time sqlite3 hb.db 'select count(*) from logs'
>> -- Loading resources from /Users/jim/.sqliterc
>> count(*)
>> --
>> 734909
>>
>> real  0m0.189s
>> user  0m0.165s
>> sys   0m0.019s
>>
>> This time is consistent no matter how many times I run it, because the
>> file is still cached.  Doing a purge command to clear the cache and
>> re-running the query, we get:
>>
>> $ purge
>> $ time sqlite3 hb.db 'select count(*) from logs'
>> -- Loading resources from /Users/jim/.sqliterc
>> count(*)
>> --
>> 734909
>>
>> real  0m0.427s
>> user  0m0.175s
>> sys   0m0.024s
>>
>> On my system, there is not a huge difference, but it is consistent.
>> Now, if you have a fragmented file system, you will see a much larger
>> difference.  There are many posts on the mailing list about both file
>> system fragmentation and logical fragmentation within the SQLite file
>> itself.  Your first count(*) is subject to these fragmentation
>> effects, while your 2nd usually is not, because the file is in memory.
>>
>> Some people on the list believe fragmentation is an unimportant detail
>> you shouldn't worry about, because you have little control over it.
>> That may be true, but it's useful to understand how it can affect
>> performance.  I think you are seeing this first hand.
>>
>> Jim
>> --
>> HashBackup: easy onsite and offsite Unix backup
>> http://www.hashbackup.com
>>
>>
>>
>> On Mon, Feb 21, 2011 at 9:37 AM, Sven L  wrote:
>> >
>> > Same result :(
>> > Note that I have compiled SQLite with the following switches:
>> > SQLITE_ENABLE_STAT2
>> > SQLITE_THREADSAFE=2
>> >
>> > I've learnt that COUNT(*) is slower than COUNT(ID), since * means the 
>> > engine has to traverse all columns and it might even return another value 
>> > if there are NULL-values...
>> >
>> > Also, this is quite interesting:
>> >
>> > sqlite> EXPLAIN QUERY PLAN SELECT COUNT(ItemID) FROM Items;
>> > 0|0|0|SCAN TABLE Items (~100 rows)
>> > sqlite> EXPLAIN QUERY PLAN SELECT COUNT(*) FROM Items;
>> > sqlite>
>> >
>> > I would expect an index scan on the first statement. The second statement 
>> > tells me nada?!
>> >
>> > Thanks for your help!
>> >
>> >
>> >> From: slav...@bigfraud.org
>> >> Date: Mon, 21 Feb 2011 14:24:50 +
>> >> To: sqlite-users@sqlite.org
>> >> Subject: Re: [sqlite] COUNT() extremely slow first time!
>> >>
>> >>
>> >> On 21 Feb 2011, at 2:23pm, Sven L wrote:
>> >>
>> >> > SELECT COUNT(ItemID) FROM Items;
>> >> >
>> >> > This takes around 40 seconds the first time! WHY?!
>> >>
>> >> Try again, doing everything identically except that instead of the above 
>> >> line use
>> >>
>> >> SELECT COUNT(*) FROM Items;
>> >>
>> >> Simon.

Re: [sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Sven L

Thank you for your detailed explanation!
First, can you please tell me how to purge the cache in Windows 7? This could 
be very useful for my tests!
 
I'm quite sure my database itself is not fragmented, since I have only inserted 
data. The file system is in good shape too; Windows reports 0% fragmentation. 
Perhaps there is some other bottleneck, like disk performance in general (this 
is a 5400 rpm drive)? Or antivirus-related CPU holds? (I did turn real-time scanning 
off though.)
I have even turned Windows Search off (and got a 20% performance gain!). My 
32-bit application is running under Windows 7 (64-bit). Could WOW64 have 
something to do with this performance issue?
 
The size of the database is 1.98 GB, with 1.5M rows. Couldn't it be that the 
table scan simply is darn slow for huge tables?
 
In an ideal world the COUNT() would be performed on the primary key in RAM. 
That's not possible? :P
 
Thanks again!
 
 
> Date: Mon, 21 Feb 2011 10:17:03 -0500
> From: pri...@gmail.com
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] COUNT() extremely slow first time!
> 
> This is a common issue on the mailing list. The first time you do
> count(*), SQLite (actually your OS) has to load data into memory.
> Most OS's will keep the file in a buffer cache, so the 2nd count(*)
> doesn't have to read from disk.
> 
> Here's a timing from my own system, after a purge command to clear the
> buffer cache:
> 
> $ time sqlite3 hb.db 'select count(*) from logs'
> -- Loading resources from /Users/jim/.sqliterc
> count(*)
> --
> 734909
> 
> real 0m0.580s
> user 0m0.190s
> sys 0m0.034s
> 
> Same command again, with the file cached:
> 
> $ time sqlite3 hb.db 'select count(*) from logs'
> -- Loading resources from /Users/jim/.sqliterc
> count(*)
> --
> 734909
> 
> real 0m0.189s
> user 0m0.165s
> sys 0m0.019s
> 
> This time is consistent no matter how many times I run it, because the
> file is still cached. Doing a purge command to clear the cache and
> re-running the query, we get:
> 
> $ purge
> $ time sqlite3 hb.db 'select count(*) from logs'
> -- Loading resources from /Users/jim/.sqliterc
> count(*)
> --
> 734909
> 
> real 0m0.427s
> user 0m0.175s
> sys 0m0.024s
> 
> On my system, there is not a huge difference, but it is consistent.
> Now, if you have a fragmented file system, you will see a much larger
> difference. There are many posts on the mailing list about both file
> system fragmentation and logical fragmentation within the SQLite file
> itself. Your first count(*) is subject to these fragmentation
> effects, while your 2nd usually is not, because the file is in memory.
> 
> Some people on the list believe fragmentation is an unimportant detail
> you shouldn't worry about, because you have little control over it.
> That may be true, but it's useful to understand how it can affect
> performance. I think you are seeing this first hand.
> 
> Jim
> --
> HashBackup: easy onsite and offsite Unix backup
> http://www.hashbackup.com
> 
> 
> 
> On Mon, Feb 21, 2011 at 9:37 AM, Sven L  wrote:
> >
> > Same result :(
> > Note that I have compiled SQLite with the following switches:
> > SQLITE_ENABLE_STAT2
> > SQLITE_THREADSAFE=2
> >
> > I've learnt that COUNT(*) is slower than COUNT(ID), since * means the 
> > engine has to traverse all columns and it might even return another value 
> > if there are NULL-values...
> >
> > Also, this is quite interesting:
> >
> > sqlite> EXPLAIN QUERY PLAN SELECT COUNT(ItemID) FROM Items;
> > 0|0|0|SCAN TABLE Items (~100 rows)
> > sqlite> EXPLAIN QUERY PLAN SELECT COUNT(*) FROM Items;
> > sqlite>
> >
> > I would expect an index scan on the first statement. The second statement 
> > tells me nada?!
> >
> > Thanks for your help!
> >
> >
> >> From: slav...@bigfraud.org
> >> Date: Mon, 21 Feb 2011 14:24:50 +
> >> To: sqlite-users@sqlite.org
> >> Subject: Re: [sqlite] COUNT() extremely slow first time!
> >>
> >>
> >> On 21 Feb 2011, at 2:23pm, Sven L wrote:
> >>
> >> > SELECT COUNT(ItemID) FROM Items;
> >> >
> >> > This takes around 40 seconds the first time! WHY?!
> >>
> >> Try again, doing everything identically except that instead of the above 
> >> line use
> >>
> >> SELECT COUNT(*) FROM Items;
> >>
> >> Simon.


Re: [sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Puneet Kishor
On Mon, Feb 21, 2011 at 10:17:03AM -0500, Jim Wilcoxson wrote:
> This is a common issue on the mailing list.  The first time you do
> count(*), SQLite (actually your OS) has to load data into memory.
> Most OS's will keep the file in a buffer cache, so the 2nd count(*)
> doesn't have to read from disk.


One question I have related to the above -- how long does that buffer
cache remain filled with the data? I am assuming it is until new stuff
to be cached pushes out old stuff, no?

I was doing some R*Tree selects, and the first query was dog slow,
although benchmarking showed that the actual CPU time was very small.
Subsequent queries were lightning fast. I am assuming that the buffer is
not getting filled with the results as much as it is getting filled with
whatever part of the db that the program needs to open to do its work.


> 
> Here's a timing from my own system, after a purge command to clear the
> buffer cache:
> 
> $ time sqlite3 hb.db 'select count(*) from logs'
> -- Loading resources from /Users/jim/.sqliterc
> count(*)
> --
> 734909
> 
> real  0m0.580s
> user  0m0.190s
> sys   0m0.034s
> 
> Same command again, with the file cached:
> 
> $ time sqlite3 hb.db 'select count(*) from logs'
> -- Loading resources from /Users/jim/.sqliterc
> count(*)
> --
> 734909
> 
> real  0m0.189s
> user  0m0.165s
> sys   0m0.019s
> 
> This time is consistent no matter how many times I run it, because the
> file is still cached.  Doing a purge command to clear the cache and
> re-running the query, we get:
> 
> $ purge
> $ time sqlite3 hb.db 'select count(*) from logs'
> -- Loading resources from /Users/jim/.sqliterc
> count(*)
> --
> 734909
> 
> real  0m0.427s
> user  0m0.175s
> sys   0m0.024s
> 
> On my system, there is not a huge difference, but it is consistent.
> Now, if you have a fragmented file system, you will see a much larger
> difference.  There are many posts on the mailing list about both file
> system fragmentation and logical fragmentation within the SQLite file
> itself.  Your first count(*) is subject to these fragmentation
> effects, while your 2nd usually is not, because the file is in memory.
> 
> Some people on the list believe fragmentation is an unimportant detail
> you shouldn't worry about, because you have little control over it.
> That may be true, but it's useful to understand how it can affect
> performance.  I think you are seeing this first hand.
> 
> Jim
> --
> HashBackup: easy onsite and offsite Unix backup
> http://www.hashbackup.com
> 
> 
> 
> On Mon, Feb 21, 2011 at 9:37 AM, Sven L  wrote:
> >
> > Same result :(
> > Note that I have compiled SQLite with the following switches:
> > SQLITE_ENABLE_STAT2
> > SQLITE_THREADSAFE=2
> >
> > I've learnt that COUNT(*) is slower than COUNT(ID), since * means the 
> > engine has to traverse all columns and it might even return another value 
> > if there are NULL-values...
> >
> > Also, this is quite interesting:
> >
> > sqlite> EXPLAIN QUERY PLAN SELECT COUNT(ItemID) FROM Items;
> > 0|0|0|SCAN TABLE Items (~100 rows)
> > sqlite> EXPLAIN QUERY PLAN SELECT COUNT(*) FROM Items;
> > sqlite>
> >
> > I would expect an index scan on the first statement. The second statement 
> > tells me nada?!
> >
> > Thanks for your help!
> >
> >
> >> From: slav...@bigfraud.org
> >> Date: Mon, 21 Feb 2011 14:24:50 +
> >> To: sqlite-users@sqlite.org
> >> Subject: Re: [sqlite] COUNT() extremely slow first time!
> >>
> >>
> >> On 21 Feb 2011, at 2:23pm, Sven L wrote:
> >>
> >> > SELECT COUNT(ItemID) FROM Items;
> >> >
> >> > This takes around 40 seconds the first time! WHY?!
> >>
> >> Try again, doing everything identically except that instead of the above 
> >> line use
> >>
> >> SELECT COUNT(*) FROM Items;
> >>
> >> Simon.


Re: [sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Jim Wilcoxson
This is a common issue on the mailing list.  The first time you do
count(*), SQLite (actually your OS) has to load data into memory.
Most OS's will keep the file in a buffer cache, so the 2nd count(*)
doesn't have to read from disk.

Here's a timing from my own system, after a purge command to clear the
buffer cache:

$ time sqlite3 hb.db 'select count(*) from logs'
-- Loading resources from /Users/jim/.sqliterc
count(*)
--
734909

real0m0.580s
user0m0.190s
sys 0m0.034s

Same command again, with the file cached:

$ time sqlite3 hb.db 'select count(*) from logs'
-- Loading resources from /Users/jim/.sqliterc
count(*)
--
734909

real0m0.189s
user0m0.165s
sys 0m0.019s

This time is consistent no matter how many times I run it, because the
file is still cached.  Doing a purge command to clear the cache and
re-running the query, we get:

$ purge
$ time sqlite3 hb.db 'select count(*) from logs'
-- Loading resources from /Users/jim/.sqliterc
count(*)
--
734909

real0m0.427s
user0m0.175s
sys 0m0.024s

On my system, there is not a huge difference, but it is consistent.
Now, if you have a fragmented file system, you will see a much larger
difference.  There are many posts on the mailing list about both file
system fragmentation and logical fragmentation within the SQLite file
itself.  Your first count(*) is subject to these fragmentation
effects, while your 2nd usually is not, because the file is in memory.

Some people on the list believe fragmentation is an unimportant detail
you shouldn't worry about, because you have little control over it.
That may be true, but it's useful to understand how it can affect
performance.  I think you are seeing this first hand.

Jim
--
HashBackup: easy onsite and offsite Unix backup
http://www.hashbackup.com



On Mon, Feb 21, 2011 at 9:37 AM, Sven L  wrote:
>
> Same result :(
> Note that I have compiled SQLite with the following switches:
> SQLITE_ENABLE_STAT2
> SQLITE_THREADSAFE=2
>
> I've learnt that COUNT(*) is slower than COUNT(ID), since * means the engine 
> has to traverse all columns and it might even return another value if there 
> are NULL-values...
>
> Also, this is quite interesting:
>
> sqlite> EXPLAIN QUERY PLAN SELECT COUNT(ItemID) FROM Items;
> 0|0|0|SCAN TABLE Items (~100 rows)
> sqlite> EXPLAIN QUERY PLAN SELECT COUNT(*) FROM Items;
> sqlite>
>
> I would expect an index scan on the first statement. The second statement 
> tells me nada?!
>
> Thanks for your help!
>
>
>> From: slav...@bigfraud.org
>> Date: Mon, 21 Feb 2011 14:24:50 +
>> To: sqlite-users@sqlite.org
>> Subject: Re: [sqlite] COUNT() extremely slow first time!
>>
>>
>> On 21 Feb 2011, at 2:23pm, Sven L wrote:
>>
>> > SELECT COUNT(ItemID) FROM Items;
>> >
>> > This takes around 40 seconds the first time! WHY?!
>>
>> Try again, doing everything identically except that instead of the above 
>> line use
>>
>> SELECT COUNT(*) FROM Items;
>>
>> Simon.


Re: [sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Sven L

Same result :(
Note that I have compiled SQLite with the following switches:
SQLITE_ENABLE_STAT2
SQLITE_THREADSAFE=2
 
I've learnt that COUNT(*) is slower than COUNT(ID), since * means the engine 
has to traverse all columns and it might even return another value if there are 
NULL-values...
 
Also, this is quite interesting:
 
sqlite> EXPLAIN QUERY PLAN SELECT COUNT(ItemID) FROM Items;
0|0|0|SCAN TABLE Items (~100 rows)
sqlite> EXPLAIN QUERY PLAN SELECT COUNT(*) FROM Items;
sqlite>

I would expect an index scan on the first statement. The second statement tells 
me nada?!
 
Thanks for your help!

 
> From: slav...@bigfraud.org
> Date: Mon, 21 Feb 2011 14:24:50 +
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] COUNT() extremely slow first time!
> 
> 
> On 21 Feb 2011, at 2:23pm, Sven L wrote:
> 
> > SELECT COUNT(ItemID) FROM Items;
> > 
> > This takes around 40 seconds the first time! WHY?!
> 
> Try again, doing everything identically except that instead of the above line 
> use
> 
> SELECT COUNT(*) FROM Items;
> 
> Simon.


Re: [sqlite] update of a blob

2011-02-21 Thread Igor Tandetnik
Dietmar Hummel  wrote:
> Maybe someone could help me with a little problem. I am trying to update
> an existing entry in the db
> with an update statement where one of the columns is a blob type. I have
> code that looks like this:
> 
> sqlite3_stmt* m_pStatement = NULL;
> 
> std::string strStatement( "UPDATE persistence SET
> name=\"blub\",expiration=\"2011-04-02\",value=\"?\" WHERE id=\"1\"" );

std::string strStatement( "UPDATE persistence SET 
name='blub',expiration='2011-04-02',value=? WHERE id=1" );

'?' (let alone "?") is not a parameter placeholder - it's a string literal 
consisting of a single character ?.
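
The corrected statement can be exercised end-to-end. A sketch using Python's built-in sqlite3 module (the table mirrors the one in the question):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE persistence (id INTEGER PRIMARY KEY, name TEXT, "
            "expiration TEXT, value BLOB)")
con.execute("INSERT INTO persistence (id, name, expiration, value) "
            "VALUES (1, 'old', '2011-01-01', X'00')")

# A bare ? is the placeholder; quoting it turns it into a string literal.
# (In the C API, bind indexes are 1-based, hence the "index out of range".)
blob = b"\x01\x02\x03"
con.execute("UPDATE persistence SET name='blub', expiration='2011-04-02', "
            "value=? WHERE id=1", (blob,))
print(con.execute("SELECT value FROM persistence WHERE id=1").fetchone()[0])
```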
-- 
Igor Tandetnik



Re: [sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Simon Slavin

On 21 Feb 2011, at 2:23pm, Sven L wrote:

> SELECT COUNT(ItemID) FROM Items;
> 
> This takes around 40 seconds the first time! WHY?!

Try again, doing everything identically except that instead of the above line 
use

SELECT COUNT(*) FROM Items;

Simon.


[sqlite] COUNT() extremely slow first time!

2011-02-21 Thread Sven L

SQLite 3.7.5.
 
I have a table (13 columns with 1 primary key, 1 index on a date column, 1 
UNIQUE-constraint on two columns).
I insert ~130 rows, 1000 at a time (transaction-wise).
I close the database, reopen it and immediately perform a COUNT-operation:
 
SELECT COUNT(ItemID) FROM Items;

This takes around 40 seconds the first time! WHY?!
I believe the value is somehow cached inside the database after the above call.
 
 
The table used:
 
CREATE TABLE IF NOT EXISTS Item
(
 ItemID INTEGER PRIMARY KEY,
 A INTEGER NOT NULL,
 B INTEGER NOT NULL,
 C INTEGER NOT NULL,
 D INTEGER NOT NULL,
 E INTEGER NOT NULL,
 F INTEGER NOT NULL,
 G INTEGER NOT NULL,
 H DATE NOT NULL,
 I CHAR(3) NOT NULL,
 J INTEGER NOT NULL,
 K INTEGER NOT NULL,
 L INTEGER NOT NULL,
 
 UNIQUE (B, A)
);
CREATE INDEX IF NOT EXISTS Item_H_Index ON Item (H);
  


[sqlite] process monitor: wal file access

2011-02-21 Thread Gert Corthout




hello,
 
we recently switched from SQLite version 3.5.4 to 3.7.5. When I run process 
monitor on our machines I see a lot of failed file access attempts that weren't 
there before (v 3.5.4).
These are failed QueryOpen (NAME NOT FOUND) events for DBASE-NAME.db3-wal and 
DBASE-NAME.db3-journal.
 
I find the *-wal access attempts especially strange, as we do not have WAL 
enabled for our database: it is set to the default journal mode (DELETE). Is 
this normal behaviour? Or is this influenced by some setting I don't know of?
 
kind regards,
Gert
  


[sqlite] update of a blob

2011-02-21 Thread Dietmar Hummel
Hi list!

Maybe someone could help me with a little problem. I am trying to update 
an existing entry in the db
with an update statement where one of the columns is a blob type. I have 
code that looks like this:

 sqlite3_stmt* m_pStatement = NULL;

 std::string strStatement( "UPDATE persistence SET 
name=\"blub\",expiration=\"2011-04-02\",value=\"?\" WHERE id=\"1\"" );

 int iResult = sqlite3_prepare_v2( m_pHandle, strStatement.c_str(), 
strStatement.length() + 1, &m_pStatement, NULL );

 /* error checking is done here */

!! the error is on the following line. Whatever I enter for the index 
counter variable, I always get the error: "bind or column index out of 
range"

 iResult = sqlite3_bind_blob( m_pStatement, 0, p_bsValue.data(), 
p_bsValue.length(), SQLITE_TRANSIENT );

 /* here again error checking */

 iResult = sqlite3_step( m_pStatement );

 /* here again error checking */

 //dh done
 sqlite3_finalize( m_pStatement );

Maybe someone could give me a hint?

Regards
Dietmar


Re: [sqlite] Upgrade from 3.5.8 -> 3.7.5 - increase memory usage

2011-02-21 Thread Todd Shutts
I am looking at the Private Bytes counter from Windows (MSDN description
below) for my process.  Is there another counter or stat I should be
looking at?

(from MSDN - Private Bytes: 
Displays the current number of bytes this process has allocated that
cannot be shared with other processes.)

I see similar behavior (although it does not climb as fast, because it
doesn't run as fast) with journal_mode=DELETE.  When using WAL, I set
wal_autocheckpoint to 250 and called WAL_CHECKPOINT, but saw no
measurable difference.

I have left the application running and the private bytes continue to
increase; I am at 150MB and counting.  My code has to be leaking; but why
not with the older version?

Thanks for all the help

Todd


-----Original Message-----
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Roger Binns
Sent: Sunday, February 20, 2011 7:48 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Upgrade from 3.5.8 -> 3.7.5 - increase memory
usage

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 02/20/2011 02:48 PM, Todd Shutts wrote:
> However; memory usage is growing like crazy. 

It is a very frequent occurrence that what people think is being
measured is
not what is actually being measured.  Make sure you understand exactly
what
the tool showing you memory consumption is measuring and especially
understand if it includes memory shared with other processes or memory
mapped files.  Additionally you need to understand the difference
between
memory as the kernel sees and reports it versus memory within particular
APIs such as C's malloc.

> The application
> never used more than 10MB and it is currently using 57+MB and
continues
> to climb.  

The single most likely explanation is this is WAL in action, the memory
is
from a memory mapped file and a WAL checkpoint will release it.

  http://www.sqlite.org/wal.html

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk1hxDYACgkQmOOfHg372QRikgCdHrEuzE5p71LTaiF+WRHfG6j2
9S0An100kCApkwZI74XGYR6zxczr2m7u
=d0xw
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Contributing to System.Data.Sqlite

2011-02-21 Thread Richard Hipp
On Sat, Feb 19, 2011 at 10:11 PM, Alexander Spence <
alexander.spe...@demandmedia.com> wrote:

> Who would I talk to if I wanted to contribute to the System.Data.Sqlite
> project?  We use this code in an enterprise environment and would like to
> contribute our changes to the community.
>

I need to work out a Contributor License Agreement for System.Data.SQLite -
probably something along the lines of
http://www.fossil-scm.org/fossil/doc/trunk/www/copyright-release.html used
for Fossil.  Let me work on that and get back to you.

Thanks for volunteering!





-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] owner/permissions on wal/shm and journal files

2011-02-21 Thread Philip Graham Willoughby
On 21 Feb 2011, at 09:36, Simon Slavin wrote:

> On 21 Feb 2011, at 9:35am, Philip Graham Willoughby wrote:
> 
>> On 20 Feb 2011, at 16:18, thilo wrote:
>> 
>>> My DB is owned by a different user (www) and I as root insert some data.
>> ...
>>> Are there any suggestions on how to make this workable?
>> 
>> Don't do that. Unless you can guarantee that no malicious user has been able 
>> to alter your database it isn't safe to interact with it as root if you're 
>> using a version of the sqlite3 shell that has the .load built-in. Use su or 
>> sudo to become www and interact with your data as the owning user.
> 
> Would it be feasible to copy whatever protection is on the database file to 
> any temporary files like journal files ?

On systems with traditional unix permissions if you have authority to create a 
file in a directory you also have authority to set the created file's 
permissions. You can also change the group of the file to any group of which 
you are a member. Conceptually, the relevant VFSs would need to do something 
like this when creating subsidiary files:

// Allow us to set whatever file mode we want
mode_t oldmode = umask(0);
struct stat dbstat;
// Retrieve database permissions etc
fstat(dbfh, &dbstat);
// Create file with the right permissions if necessary
int newfd = open(…,…|O_EXCL|O_CREAT,dbstat.st_mode);
// Match the group id - should usually work
if (fchown(newfd,-1,dbstat.st_gid))
{
  // Could not change gid to match db
  // log warning?
}
// Match the user id - should usually fail
if (fchown(newfd,dbstat.st_uid,-1))
{
  // Could not change uid to match db
  // log warning?
}
// restore umask
umask(oldmode);

For me, the mode-matching is definitely worth it - if you have a group-writable 
database you almost certainly want a group-writable log. Similarly the 
gid-matching code is worthwhile for the same reason.

The uid-matching code is only usable by root or processes given that privilege 
by some other mechanism and is therefore less worthwhile.

Best Regards,

Phil Willoughby
-- 
Managing Director, StrawberryCat Limited

StrawberryCat Limited is registered in England and Wales with Company No. 
7234809.

The registered office address of StrawberryCat Limited is:

107 Morgan Le Fay Drive
Eastleigh
SO53 4JH

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] owner/permissions on wal/shm and journal files

2011-02-21 Thread Simon Slavin

On 21 Feb 2011, at 9:35am, Philip Graham Willoughby wrote:

> On 20 Feb 2011, at 16:18, thilo wrote:
> 
>> My DB is owned by a different user (www) and I as root insert some data.
> ...
>> Are there any suggestions on how to make this workable?
> 
> Don't do that. Unless you can guarantee that no malicious user has been able 
> to alter your database it isn't safe to interact with it as root if you're 
> using a version of the sqlite3 shell that has the .load built-in. Use su or 
> sudo to become www and interact with your data as the owning user.

Would it be feasible to copy whatever protection is on the database file to any 
temporary files like journal files ?

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Upgrade from 3.5.8 -> 3.7.5 - increase memory usage

2011-02-21 Thread Simon Slavin

On 21 Feb 2011, at 5:56am, Dan Kennedy wrote:

> WAL file growth has (apparently) been a problem for people with
> really busy systems. But so far it has been the actual size of
> the file on disk that bothers them, not the memory mapped bit.

If that's what you think is raising concern, then fine: someone needs to write 
something about that.  But the actual posts to this list leave me with questions 
about the reported memory usage.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] owner/permissions on wal/shm and journal files

2011-02-21 Thread Philip Graham Willoughby
On 20 Feb 2011, at 16:18, thilo wrote:

> My DB is owned by a different user (www) and I as root insert some data.
...
> Are there any suggestions on how to make this workable?

Don't do that. Unless you can guarantee that no malicious user has been able to 
alter your database it isn't safe to interact with it as root if you're using a 
version of the sqlite3 shell that has the .load built-in. Use su or sudo to 
become www and interact with your data as the owning user.

Best Regards,

Phil Willoughby
-- 
Managing Director, StrawberryCat Limited

StrawberryCat Limited is registered in England and Wales with Company No. 
7234809.

The registered office address of StrawberryCat Limited is:

107 Morgan Le Fay Drive
Eastleigh
SO53 4JH

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users