Re: [sqlite] Replication

2018-10-11 Thread David Barrett
Incidentally, Bedrock is built on a blockchain as well -- though I agree
with the sentiment that blockchain isn't actually new at all, and not that
big of a deal.  More information is here:
http://bedrockdb.com/blockchain.html  Hope you enjoy it!

-david

On Thu, Oct 11, 2018 at 3:06 PM R Smith  wrote:

>
> > WARNING: the following sentence will be claimed to be controversial:
> >
> > No database based on SQL is truly relational.
>
> LOL - who would claim that to be controversial?
>
> It doesn't spur controversy...
>
> It's worthy of a shrug at best, perhaps a "So what?".
>
>
> It sounds like a deepity - much like any of these:
> - Nothing is ever really True...
> - Is reality even really real?
> - No ice-cream machine ever makes TRUE ice-cream.
>
> An SQL database is deemed "Relational" when it can communicate mildly
> relational data using mildly relational (but mathematically sound)
> methods. It doesn't need to be (nor claim to be) the Almighty keeper of
> all relationality, nor even simply conform to various specific
> interpretations of the word "Relation".
>
> In case the point still eludes: We call an SSL hand-shake such because
> it behaves by mutual agreement - much like a human hand-shake - but just
> because we call it so, doesn't bestow upon it a necessity to behave in
> every way like a literal hand-shake, lest some pipe-smoking mountain
> wisdom gazes far into the distance, slowly stroking a grey beard, and
> declares: "no SSL hand-shake is truly hand-shaky".
>
>
>
>
> PS: While I feel some ambivalence towards the subject, I was nodding in
> agreement with most of your post, till that line appeared.   :)
>
> PPS: Apologies for inventing some words there
>
> PPPS: Thumbs up for the Bedrock suggestion from another post - that
> system really rocks.
>
>
>


Re: [sqlite] Safe sqlite over remote filesystem?

2018-08-15 Thread David Barrett
Ah, to clarify, we're very, very actively developing on Bedrock every
single day.  In fact, we're about to roll it out to our new 3-datacenter,
6-server cluster of 384-core, 3TB RAM, 100Gbps fiber-connected machines!
All of Expensify is powered by it, so it's been battle-tested with over a
decade of real-world use at scale -- both very big and very small.
That said, we personally find it much easier to just build from source and
thus don't really maintain a public binary ready to go -- which I recognize
makes it a little less accessible, but Bedrock bliss is just a "make"
away.  Let me know how I can help!

-david

On Tue, Aug 14, 2018 at 11:13 PM Wout Mertens 
wrote:

> On Tue, Aug 14, 2018 at 6:28 PM Rowan Worth  wrote:
>
> > FWIW in the building I work in we have 20-30 users hitting around a dozen
> > SQLite DBs 8 hours a day 5 days a week, with all DBs served over nfs.
>
>
> Multiple writers? I presume you use WAL mode?
>
>
> > Erm, I got a bit carried away. My point is, it's not all doom and gloom.
> >
>
> Yey :) I think that might be good enough (see my previous email)
>
> And nfs on OSX seems to be a
> > non-starter; not sure what's going on with that client implementation but
> > the brief experimentation I've done with it suggested that corruption was
> > guaranteed.
> >
>
> Aww :( Dammit Apple. I'll have to experiment too.
>
>
> > We plan to migrate to an actual SQL server
> > for that reason, but please don't take it as a criticism of SQLite - I
> > think it does a marvelous job in a scenario it definitely wasn't designed
> > for.
> >
>
> Before you do that, did you see bedrockdb? http://bedrockdb.com/
>  (Although it looks like the github repo is being a little bit ignored by
> Expensify)
>
> Wout.


Re: [sqlite] BedrockDB interview on Floss Weekly

2017-10-29 Thread David Barrett
On Fri, Oct 27, 2017 at 4:34 AM, Warren Young  wrote:

> Before I get to the questions, I haven’t listened to the FLOSS episode
> yet, so please forgive me if these were asked and answered on the podcast.
> Just tell me if so, because I will eventually get to it.
>
> 1. I don’t see a C API.  If so, that means it’s up to the caller to handle
> quoting and escaping properly to avoid inadvertently creating SQL injection
> vulnerabilities.  Is that true, and if so, is there a plan to solve it?  I
> don’t mind running my queries through an HTTP[-ish] layer, but getting
> quoting and escaping right in hand-written code is a big source of errors
> I’d rather not return to.
>

There is a PHP binding (https://github.com/Expensify/Bedrock-PHP) but none
for C yet -- PRs welcome!  That said, because it uses standard HTTP escaping,
it should also be compatible with pretty much any HTTP C API -- or it's
very easy to hand-roll.  When it comes to escaping, for example, you can
generally put the content directly in the HTTP content body, so Bedrock
introduces no "extra" escaping (beyond what is normally required by SQL, of
course).



> 2. No DBD::BedrockDB yet, I see.  When? :)
>

Heh, we don't have much Perl in our environment, but again, this would be
straightforward to add.



> 3. What happens if a local program attaches to the SQLite DB and queries
> it in the face of active replication?
>

SQLite is crazy hardened and reliable in this scenario.  You are free to
access the underlying SQLite database via the command line if you like, but
the locking behavior will be more restrictive.  BedrockDB not only supports
concurrent reads (via SQLite's standard WAL functionality), but concurrent
*writes* as well (via an experimental "BEGIN CONCURRENT" branch developed
by the SQLite team that we're hammering the kinks out of before it's merged
into the mainline).  So use the Bedrock plugin architecture where possible
to get maximum concurrency, and never do writes from the command line as
it'll screw up our replication journal, but you are free to do reads via
the command line when more convenient (and we do it all the time).


> 4. What happens if a local program *inserts* data via the SQLite
> interface?  Does the data get replicated, or does that happen only if you
> insert via port ?
>

Correct, only insertions done via port  will get replicated, so inserting
directly via the SQLite interface is not recommended (but always possible for
extreme maintenance scenarios -- about as often as you'd directly modify a
local MySQL file with a hex editor -- meaning, almost never).



> 5. I think your web pages answer this one implicitly, but let me ask just
> to be sure I’m guessing right: Does the MySQL API smooth out any of the
> semantic and language differences between MySQL and SQLite, or must you
> simply give SQLite-compatible queries over the MySQL interface?  (Expected
> answer: “No.”)
>

No -- we only support SQLite syntax (which is really modeled more on
Postgres), but the various MySQL clients we've tried don't
actually care about the syntax -- only the protocol -- so it works well
enough in practice.


> 6. What MySQL API version are you compatible with?  As maintainer of
> MySQL++, I can tell you, there is not just one version. :)
>

Good question -- we don't have an exhaustive list, just the versions we use.
:)  To start: the most recent version of SequelPro, the MySQL command-line
client that ships with Ubuntu, and Mode Analytics.



> 7. Does using your MySQL API solve problem #1?  That is, if you use one of
> the available ways for constructing guaranteed-sound queries via
> libmysqlclient or one of the many libraries built atop it, does BedrockDB
> get the benefit?
>

I'm not sure what "guaranteed-sound" means, sorry!



> 8. Does your MySQL API solve problem #2?  That is, have you tested it
> against DBD::mysql?
>

I don't think we've tested against that as we don't use a lot of Perl, but
I'd be curious for your results!

-david
Founder and CEO of Expensify (and BedrockDB)


Re: [sqlite] BedrockDB interview on Floss Weekly

2017-10-29 Thread David Barrett
Thanks for the fantastic questions!  Answers inline:

I had never heard of you guys, this looks amazing! How can you only have
> 222 github stars?!
>

Heh, feel free to give it one more!


> Paxos, so needs at least 3 nodes?
>

It works with any number of nodes, but I recommend 6 as the ideal: two
nodes in each of three different datacenters.  More on how the
synchronization technology works is here:
http://bedrockdb.com/synchronization.html


> How do you use it from an application that normally uses sqlite? Is it a
> drop-in replacement? I use Node.JS…
>

BedrockDB has an extremely simple wire protocol based around the concept of
"plugins".  The "db" plugin is described here: http://bedrockdb.com/db.html,
but there is also a full featured "cache" and *extremely* full featured
"jobs" plugin as well.  The protocol is easy enough that you can "netcat"
in and use it effectively by hand, or you can use our PHP binding:
https://github.com/Expensify/Bedrock-PHP

Interesting that you emulate mysql, given that sqlite tries to be
> postgresql compatible…
>

Our main use for this was to accommodate users who prefer certain clients
like SequelPro or Mode Analytics.  More information on our MySQL emulation
is here: http://bedrockdb.com/mysql.html



> Any war stories around developing this? Unexpected issues due to a missing
> comma etc?
>

Omg, so many war stories.  This has been powering Expensify from day one,
so we have about 8 years of scars as a result.  However, Bedrock gains the
benefit of all that painfully earned experience, and is hardened against a
million real-world edge cases that you won't need to worry about.


> Would you recommend this for any workload? I'm particularly interested in
> replicating my append-only event log db.


In general I recommend Bedrock for more or less any workload that you would
give to a database (which, in my experience, is about any workload involving
persistent data).  I've long wanted to build a "logs" plugin for Bedrock
that would provide full-text indexing a la Elasticsearch (which we use at
Expensify, but have pretty steady problems with), but it hasn't gotten bad
enough yet to worry about.  Thanks for asking!

-david
Founder and CEO of Expensify (and BedrockDB)



On Thu, Oct 26, 2017 at 10:19 PM, Wout Mertens <wout.mert...@gmail.com>
wrote:

> I had never heard of you guys, this looks amazing! How can you only have
> 222 github stars?!
>
> Paxos, so needs at least 3 nodes?
>
> How do you use it from an application that normally uses sqlite? Is it a
> drop-in replacement? I use Node.JS…
>
> Interesting that you emulate mysql, given that sqlite tries to be
> postgresql compatible…
>
> Any war stories around developing this? Unexpected issues due to a missing
> comma etc?
>
> Would you recommend this for any workload? I'm particularly interested in
> replicating my append-only event log db.
>
> Wout.
>
> On Thu, Oct 26, 2017, 8:15 AM David Barrett <dbarr...@expensify.com>
> wrote:
>
> > I'm glad you liked it!  I'd be happy to answer any questions you have
> about
> > http://BedrockDB.com, our use of sqlite, or anything else.  Thanks for
> > listening!
> >
> > -david
> >
> > On Wed, Oct 25, 2017 at 4:19 PM, jungle Boogie <jungleboog...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > Pardon the usual interruption of complex sqlite questions...
> > >
> > > David Barrett was interviewed on Floss Weekly today and gave a rave
> > > review of his project, which is based on the wonderful sqlite3
> > > database.
> > >
> > > I'm only 10 minutes into the interview and really love it already!
> > > https://twit.tv/shows/floss-weekly/episodes/456
> > >
> > > Thanks to David for appearing on the show and of course to the Sqlite3
> > > team for their amazing efforts to make, and maintain the most widely
> > > deployed database engine in the world - maybe even in the galaxy.
> > >
> > > Thanks,
> > > j.b.
> > >
> > > --
> > > ---
> > > inum: 883510009027723
> > > sip: jungleboo...@sip2sip.info


Re: [sqlite] BedrockDB interview on Floss Weekly

2017-10-26 Thread David Barrett
I'm glad you liked it!  I'd be happy to answer any questions you have about
http://BedrockDB.com, our use of sqlite, or anything else.  Thanks for
listening!

-david

On Wed, Oct 25, 2017 at 4:19 PM, jungle Boogie <jungleboog...@gmail.com>
wrote:

> Hi All,
>
> Pardon the usual interruption of complex sqlite questions...
>
> David Barrett was interviewed on Floss Weekly today and gave a rave
> review of his project, which is based on the wonderful sqlite3
> database.
>
> I'm only 10 minutes into the interview and really love it already!
> https://twit.tv/shows/floss-weekly/episodes/456
>
> Thanks to David for appearing on the show and of course to the Sqlite3
> team for their amazing efforts to make, and maintain the most widely
> deployed database engine in the world - maybe even in the galaxy.
>
> Thanks,
> j.b.
>
> --
> ---
> inum: 883510009027723
> sip: jungleboo...@sip2sip.info


Re: [sqlite] Introducing Bedrock: SQLite for the Enterprise

2016-10-20 Thread David Barrett
Hm, that's an interesting case that I don't quite follow.  But you could
definitely use Bedrock on 2 nodes, and Bedrock would handle realtime
replication from one to the other.  This would provide an instant offsite
backup.  Granted, if there were only 2 nodes, you couldn't take one
offline, as the other one wouldn't have quorum.  If your goal is to not
just have a backup, but also copy that backup to tape, I actually recommend
a 3 node deployment.  This way you can take one node offline and the
remaining two still have quorum.

Now, another crazy idea would be to have a 2 node cluster, where one of the
nodes is a "permaslave".  Permaslaves don't participate in quorum, so
taking it offline wouldn't affect the master -- which would go forward
exactly as if it were alone (because from a quorum perspective, it is, and
thus it always represents its own quorum).  This *might* "just work" out of
the box -- I haven't tried it.  But it's an interesting case!

-david

On Wed, Oct 19, 2016 at 6:17 AM, Stephen Chrzanowski <pontia...@gmail.com>
wrote:

> So am I to understand you just "server-ized" SQLite with Bedrock?
>
> Looks rather interesting.  I was just talking to my IT manager about how I
> can take something like a SQLite backup and put it somewhere else so it'll
> eventually get to tape, but, if I run a node on my local machine, run one
> on the 'primary' and another on the 'backup' of the primary, it'd satisfy
> quorum, and I'd be thinking less about backups.  (But it's still a thought.)
>
> Is there a way to disable the check for quorum and either let the split
> brain happen, or at least make the executive decision as to which is the
> primary data source at all times?  In the tool I'll be writing, it COULD
> happen that only one node would be available.  We typically run our servers
> here at work in primary/secondary fashion, no tertiary, so if the primary
> goes away, and it's only the secondary, then my software would go down,
> which is something I obviously want to avoid.  We also do typically one-way
> replication.
>
> Is there a mechanism that will allow me to run the Backup API to dump the
> database on a particular node?
>
>
> On Tue, Oct 18, 2016 at 8:45 PM, David Barrett <dbarr...@expensify.com>
> wrote:
>
> > Love SQLite?  Wish you could use it to power your enterprise SaaS or web
> > service? Now you can!  Check out Expensify Bedrock, our distributed
> > transaction layer built atop SQLite, powering Expensify's millions of
> > users.  More information is here:
> >
> > http://bedrockdb.com
> >
> >
> > Keep all the power and simplicity of SQLite, but wrapped in a package
> that
> > provides network accessibility, WAN-optimized replication, and
> distributed
> > ACID transactions.  Under continuous development and operation for the
> past
> > 8 years, now it's open sourced and ready for your production use.
> >
> > Thank you to the SQLite team for not only producing such an incredible
> > database, but helping with our countless questions and demanding
> > requirements.  I'm ecstatic to share this with you, and I hope you enjoy
> it
> > too!
> >
> > -david
> > Founder and CEO of Expensify


[sqlite] Introducing Bedrock: SQLite for the Enterprise

2016-10-18 Thread David Barrett
Love SQLite?  Wish you could use it to power your enterprise SaaS or web
service? Now you can!  Check out Expensify Bedrock, our distributed
transaction layer built atop SQLite, powering Expensify's millions of
users.  More information is here:

http://bedrockdb.com


Keep all the power and simplicity of SQLite, but wrapped in a package that
provides network accessibility, WAN-optimized replication, and distributed
ACID transactions.  Under continuous development and operation for the past
8 years, it's now open source and ready for your production use.

Thank you to the SQLite team for not only producing such an incredible
database, but helping with our countless questions and demanding
requirements.  I'm ecstatic to share this with you, and I hope you enjoy it
too!

-david
Founder and CEO of Expensify


[sqlite] Wish list: allow developers use the power of sqliteparser

2016-01-18 Thread David Barrett
Whoaa!!  This is amazing!

https://www.sqlite.org/c3ref/c_alter_table.html

Wow, thanks!

-david

On Mon, Jan 18, 2016 at 10:54 PM, Scott Hess  wrote:

> On Mon, Jan 18, 2016 at 10:27 PM, David Barrett 
> wrote:
>
> > One use of this I would like is to create a security framework around
> > arbitrary SQL queries from the user.  So, for example, I'd love to
> > determine which tables (and which columns of those tables) a particular
> > query is going to access, and then compare that list against a whitelist
> of
> > columns the user is authorized to access.
>
>
> You shouldn't use the parser interface for access control; you should use
> the authorizer interface.
>
> -scott
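
For reference, a minimal sketch of the authorizer approach described above
(the table, column, and policy here are made up for illustration; the real
whitelist logic would live in the callback):

#include <sqlite3.h>
#include <string.h>

/* Called by SQLite for each action a statement wants to perform.  For
   SQLITE_READ, arg1 is the table name and arg2 the column name. */
static int auth_cb(void *unused, int action,
                   const char *arg1, const char *arg2,
                   const char *arg3, const char *arg4)
{
    (void)unused; (void)arg3; (void)arg4;
    if (action == SQLITE_READ && arg1 && arg2 &&
        strcmp(arg1, "users") == 0 && strcmp(arg2, "password") == 0) {
        return SQLITE_IGNORE;   /* reads of this column come back as NULL */
    }
    if (action == SQLITE_DELETE && arg1 && strcmp(arg1, "users") == 0) {
        return SQLITE_DENY;     /* the statement is rejected with SQLITE_AUTH */
    }
    return SQLITE_OK;
}

/* Register once per connection, before preparing user-supplied SQL:
       sqlite3_set_authorizer(db, auth_cb, NULL);                        */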


[sqlite] Wish list: allow developers use the power of sqliteparser

2016-01-18 Thread David Barrett
One use of this I would like is to create a security framework around
arbitrary SQL queries from the user.  So, for example, I'd love to
determine which tables (and which columns of those tables) a particular
query is going to access, and then compare that list against a whitelist of
columns the user is authorized to access.  I'm not confident enough in my
own parsing skills to make something foolproof, but if I were using the
same exact parser as sqlite, then it would be impossible to "trick".

Any suggestions on how to use the private Lemon parser methods to
accomplish this?  Thanks!

-david



On Mon, Jan 18, 2016 at 7:17 AM, Domingo Alvarez Duarte <
sqlite-mail at dev.dadbiz.es> wrote:

> Is this something crazy to ask as a developer ?
>
> I think that even for the author something like this would make several
> tasks
> easier.
>
> Ideally I would like to feed the parser an SQL string, get its syntax
> tree, maybe do some rewriting, and then execute it; this possibility can
> open the door to amazing things.
>
>
>
> Thanks for all answers so far, I still want to hear any other idea that can
> lead to achieve the original request !
>
>
>
> Cheers !
>


[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-09 Thread David Barrett
Thanks Richard, this is exactly what I was thinking.  One question on this:

On Tue, Oct 6, 2015 at 10:22 PM, Richard Hipp  wrote:

> (2) For the source database connection of the backup, use the same
> database connection that is used for writing to the database.  That
> means that when changes are made to the source database, the backup
> mechanism can go back and resend only those pages that actually
> changed.  If the database is modified by any database connection other
> than the one used for backup, then the backup has no way of knowing
> which pages changed, and so it has to start over again at the
> beginning and rewrite every page.
>

Hm, that's interesting.  I was thinking that when the backup started, all
new writes would just accumulate in the WAL file -- and then be
checkpointed after the backup completed.  Thus the backup would be of the
state of the database at the moment the backup was *started*.  However,
based on your description it sounds like the backup API will copy over
changes that have occurred throughout the backup process itself. Our
database is modified about 40 times per second, so my fear is the backup
API would never actually complete because it would be continuously
re-backing up the same pages.  Am I understanding this correctly?  Is there
any way to obtain the behavior I initially assumed?

Thanks!

-david


[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-06 Thread David Barrett
On Tue, Oct 6, 2015 at 8:36 PM, Simon Slavin  wrote:

> Or copy an existing /open/ database file to the new server using the
> SQLite Backup API, [requires other connections to stop modifying the
> database for long enough for the copy to be made]
>

Well, the backup API works with WAL mode [1], so the database can still be
modified while being backed up (though I imagine the WAL file will get huge
during the backup, but that's fine).

Regardless, all those solutions require me to wait for the entire backup to
complete before sending the file to the remote host -- my goal is to send
it one page at a time (e.g., send the pages as the backup API processes them)
so as to avoid the large delay and disk space of the backup itself.

-david

[1] http://sqlite.1065341.n5.nabble.com/WAL-mode-and-backup-API-td40107.html


[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-06 Thread David Barrett
On Tue, Oct 6, 2015 at 2:57 PM, Clemens Ladisch  wrote:

> It backs up to any disk that you can access.
> Do you have a network file system?
>

Well yes, but I'd like to handle it at the application layer.  Basically,
we operate a custom replication layer atop sqlite.  It replicates
individual transactions well with 2-phase commit, but right now you need
to manually "bootstrap" a new server by copying the database from a
different server.  I'd like to auto-bootstrap a new node by just starting
it: it would connect to a peer and then download the entire database.


> > how to use this API to do an incremental backup
>
> This API is not incremental; it always copies the entire database.
>

Agreed it will copy the entire database, but incrementally -- one bit at a
time.  Each call to sqlite3_backup_step() backs up a bit more to the target
file on disk.  My goal is to instead have sqlite3_backup_step() copy some
pages to some block of RAM, so I can send that data over the wire and write
it to the remote disk.

-david


[sqlite] Ideas on how to use backup API for remote, incremental backup?

2015-10-06 Thread David Barrett
sqlite has a cool "online backup" API: https://www.sqlite.org/backup.html
However, it only backs up to a local disk.  I'm wondering if anybody can
think of how to use this API to do an incremental backup over a network
connection to a remote host?  The networking part is easy.  But I can't
figure out how to programmatically get the binary of the database file into
memory such that I can send it over the network (other than waiting for the
whole thing to back up to disk and then just doing a remote disk copy,
obviously, but my file is so large I don't want to wait that long, nor do I
have enough disk to do that).  I'm not really familiar with the VFS layer
-- would there be some way to do a custom VFS that the online backup API
writes to, and then I just copy the blocks over from that?  Thanks for any
creative ideas!

-david
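
For context, the standard online-backup loop being discussed looks roughly
like the sketch below.  It is only a sketch: the destination here is just
another local database handle, and intercepting the written pages for
network transfer would still need something like the custom-VFS idea above.

#include <sqlite3.h>

/* Copy the "main" database of pSrc into pDst a chunk at a time, letting
   writers through between chunks.  Error handling is omitted. */
int backup_in_chunks(sqlite3 *pSrc, sqlite3 *pDst)
{
    sqlite3_backup *b = sqlite3_backup_init(pDst, "main", pSrc, "main");
    int rc;
    if (!b) return sqlite3_errcode(pDst);
    do {
        rc = sqlite3_backup_step(b, 100);            /* up to 100 pages */
        if (rc == SQLITE_OK || rc == SQLITE_BUSY || rc == SQLITE_LOCKED)
            sqlite3_sleep(10);                       /* yield to writers */
    } while (rc == SQLITE_OK || rc == SQLITE_BUSY || rc == SQLITE_LOCKED);
    return sqlite3_backup_finish(b);                 /* SQLITE_OK on success */
}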


[sqlite] The Simplest SQLite Common Table Expression Tutorial

2015-09-25 Thread David Barrett
Hey all, just wanted to share this in case anybody is also looking for a
very simple tutorial for CTE's in sqlite:

http://blog.expensify.com/2015/09/25/the-simplest-sqlite-common-table-expression-tutorial/

The Simplest SQLite Common Table Expression Tutorial

I've been trying to wrap my head around Common Table Expressions for a
while, and all the tutorials I've read started out with "simple" examples
that were way too advanced for me to follow. Here's my attempt to write a
tutorial that starts as simple as possible.

First, let's start with the simplest query:

sqlite> SELECT 1;
1
sqlite>

All this does is return a result set containing a row. Next, consider the
simplest subquery:

sqlite> SELECT * FROM ( SELECT 1 );
1
sqlite>

This just selects all the results from the subquery -- which in this case
is just a single row. A "Common Table Expression" is basically the same as
a subquery, except assigned a name and defined prior to the query in which
it's referenced. Accordingly, the simplest CTE version of the above query
would look like this:

sqlite> WITH one AS ( SELECT 1 )
SELECT * FROM one;
1
sqlite>

Breaking that down a bit further:

   - We've defined a common table expression named "one"
   - We've "filled" it with the output of SELECT 1, which is just 1 row
   - Then we selected everything from "one"
   - Such that the final result is a single value: 1

But a CTE can have multiple columns, too, and those columns can be assigned
names:

sqlite> WITH twoCol( a, b ) AS ( SELECT 1, 2 )
SELECT a, b FROM twoCol;
1|2
sqlite>

Similarly, a CTE can query other tables:

sqlite> CREATE TABLE foo ( bar INTEGER );
sqlite> INSERT INTO foo VALUES(1);
sqlite> INSERT INTO foo VALUES(2);
sqlite> SELECT * FROM foo;
1
2
sqlite> WITH fooCTE AS (SELECT * FROM foo)
SELECT * FROM fooCTE;
1
2
sqlite>

Additionally, you can define as many CTEs as you want in a single query:

sqlite> WITH aCTE AS (SELECT 'a'),
 bCTE AS (SELECT 'b')
SELECT * FROM aCTE, bCTE;
a|b
sqlite>

So, common table expressions can be used to restructure a query to make it
more readable, by moving the subqueries out in front. But the real power of
common table expressions is when you define an expression that recursively
selects itself. The key to this is using a "Compound Select Statement",
such as the UNION ALL operator. This just combines two result sets into one
(so long as they have the same number of columns):

sqlite> SELECT 1, 2
UNION ALL
SELECT 3, 4;
1|2
3|4
sqlite>

Take this example:

sqlite> WITH RECURSIVE infinite AS (
SELECT 1
UNION ALL
SELECT * FROM infinite
)
SELECT * FROM infinite;
^CError: interrupted
sqlite>

Let's break down why that query will never finish:

   - "WITH RECURSIVE infinite" defines a common table expression named
   "infinite"
   - "SELECT 1" seeds that CTE's output with a single row -- containing "1"
   - Next the "UNION ALL" says "combine the output of what's on the left
   with the output of what's on the right"
   - And on the right we do "SELECT * FROM infinite" -- meaning, select
   everything currently in the table.
   - The result is we're defining a common table expression named
   "infinite" to be the union of "a single row" and "all other rows".
   - Because no "cap" has been placed on this (via a WHERE or LIMIT), this
   means we've defined an infinitely recurring CTE. Fun!

So we can "cap" that CTE by writing a query like:

sqlite> WITH RECURSIVE finite AS (
SELECT 1
UNION ALL
SELECT * FROM finite LIMIT 2
)
SELECT * FROM finite;
1
1
sqlite>

This does the same basic thing, but we've limited the number of possible
results to only be 2. Ok, so that's all well and good, but what is this
good for? It turns out, a lot. Say you wanted to generate a table on the
fly containing the numbers one through ten:

sqlite> WITH RECURSIVE ten(x) AS (
SELECT 1
UNION ALL
SELECT x+1 FROM ten WHERE x<10
)
SELECT * FROM ten;
1
2
3
4
5
6
7
8
9
10
sqlite>

To do this, we've defined a CTE named "ten", with a single column named "x"
(the column name is optional, but in this case we need it so we can refer
to it later). Then in the recursive UNION ALL, we keep adding one more row
to the result set -- each one larger than the row before -- until we reach
a limit of 10.

So CTEs can be used to generate a wide array of different types of data --
such as date ranges, perhaps to join against when doing a historical
analysis against a sparse dataset (where some months have no data, so a
simple group-by won't suffice):

sqlite> WITH RECURSIVE dates(x) AS (
SELECT '2015-01-01'
UNION ALL
SELECT DATE(x, '+1 MONTHS') FROM dates WHERE x<'2016-01-01'
)
SELECT * FROM dates;
2015-01-01
2015-02-01
2015-03-01
2015-04-01
2015-05-01
2015-06-01
2015-07-01
2015-08-01
2015-09-01
2015-10-01
2015-11-01
2015-12-01
2016-01-01
sqlite>

[sqlite] What is the best page cache size when the database is larger than system RAM?

2015-09-13 Thread David Barrett
Hello!  If I have a database that is larger than the system's physical RAM,
am I correct in thinking I should actually set a very *small* page cache so
as to avoid "double caching" the same pages in both sqlite and the file
cache?

To clarify, if the database is *smaller* than system RAM, I imagine the
best case scenario (for read performance at least) would be to set a page
cache size equal to or larger than the database size.  This would ensure
there is always enough RAM set aside to store up to the entire database in
the page cache.  (Some subset of the database would also be stored in the
file cache, but once the page cache is warmed up, you would never touch the
file cache.)

However, if the database is *larger* than system RAM, even assigning all of
that RAM to the page cache wouldn't let it hold the whole database,
no matter how "warmed up" it gets.  Furthermore, every time you "miss" the
page cache, you would always *also* miss the file system cache (because the
only way a page makes it into the page cache is by first reading it from
the file system).  So every page read into the page cache requires not just
that memory, but the same amount of memory in the file system cache.  So
the larger the ratio between database size and system RAM, the more likely
you are to spend 2x the RAM for each page in the page cache.

Accordingly, it would seem that the best configuration for a database
significantly larger than the system RAM (assuming a perfectly uniform
access pattern where every page is accessed with equal probability) would
be to disable the page cache entirely -- and thereby preserve as much RAM
as possible for the file cache.  Yes, it would be better to have it in the
page cache than the file cache, but I imagine it's vastly better to read
from the file cache than to go to the disk.  Is this right?

Sorry for the long explanation; I'm not 100% sure how to describe it and
want to make sure I'm asking the question right.  Thanks!

-david
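
For reference, the knob in question is PRAGMA cache_size (a negative value
is interpreted as KiB).  A sketch of setting it small per connection follows;
whether a tiny cache actually wins for a given workload is something you'd
have to measure, and the file name here is hypothetical:

#include <sqlite3.h>

int main(void)
{
    sqlite3 *db;
    sqlite3_open("big.db", &db);

    /* Keep SQLite's own page cache small (~2 MB here; negative = KiB)
       and leave the remaining RAM to the OS file cache. */
    sqlite3_exec(db, "PRAGMA cache_size = -2048;", 0, 0, 0);

    /* ... run the read-heavy workload ... */

    sqlite3_close(db);
    return 0;
}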


[sqlite] json_* functions in sqlite

2015-06-07 Thread David Barrett
For what it's worth, I'd also love some official JSON support, with JSON
indexes (e.g., a function index that pulls a JSON value).  However, I agree
it doesn't make sense to add it to the main codebase -- I was thinking more
of an official plugin (so we don't just keep writing our own over and over).

David
On Apr 22, 2015 10:57 PM, "James K. Lowden" 
wrote:

> On Tue, 21 Apr 2015 18:09:33 -0700
> Ashish Sharma  wrote:
>
> > Many times I store JSON data in sqlite. It will be useful if sqlite
> > came with functions which understand JSON. Presto has a nice set
> > https://prestodb.io/docs/current/functions/json.html
>
> In case you don't know, you could implement functions such as Presto
> provides yourself with a set of user-defined functions, without any
> help from the SQLite project.
>
> As to whether JSON should be a supported datatype in SQLite, the answer
> is clearly No.  SQLite, let us note, barely recognizes datatypes at
> all, and lacks even a date type.  But more than that, JSON as "native"
> datatype has two fundamental problems: definition and complexity.
>
> Definitionally, we should require any datatype have meaningful
> operations comparable to those of numbers and strings.  What does it
> mean to "add" two JSON objects?  When is one less than another?  Do you
> seriously want to propose as a *type* a thing that can't be a primary
> key?
>
> The problem of complexity is that everything in the JSON blob can,
> fundamentally, be represented as tables in SQLite.  I realize "modern"
> tools read/write JSON, that it's the OODBMS of the Javascript set.  But
> that doesn't change the fact that the JSON tree is a graph, and we know
> every graph can be represented with tables.
>
> Why does that matter?  Because a tree-as-datatype would add a whole new
> theoretical structure (graph theory) that is 100% redundant to the
> relational model embodied in SQLite.  You get a bunch of new functions
> and ways to get at the data.  What you do *not* get is additional query
> power.  In fact you get less, because graph theory gives you less: no
> subsets and no joins, to name just two.
>
> That's not to say there should be some rule preventing you from storing
> JSON in your SQLite database.  You may find it convenient, especially if
> supported with some functions that make it possible to compare (or
> perhaps update) components of it, because it represents some giant
> state-property that for most purposes can be treated as an integral
> unit.  It is to say that every JSON-specific feature you add duplicates
> one already present (in a different form) in SQLite.  Add enough of
> them and you'll replace the DBMS with itself, if you see what I mean.
>
> --jkl


[sqlite] INSERT OR REPLACE and INSERT/DELETE/UPDATE triggers

2015-05-31 Thread David Barrett
Ah, naturally I found the answer right after sending this.  The secret:

PRAGMA recursive_triggers = 1;

As stated here: https://www.sqlite.org/lang_conflict.html

"When the REPLACE conflict resolution strategy deletes rows in order to
satisfy a constraint, delete triggers fire if and only if recursive
triggers are enabled."

Re-running my test with this shows it causes the delete trigger to fire as
expected:

sqlite> PRAGMA recursive_triggers = 1;
sqlite> insert or replace into nvps values ( 'name1', 'value4' );
sqlite> select * from nvps;
name1|value4
sqlite> select * from actions;
inserted "name1", "value1"
inserted "name2", "value2"
updated "name2" from "value2" to "value2.1"
deleted "name2", "value2.1"
deleted "name1", "value1"
inserted "name1", "value4"
sqlite>

Awesome.  Go sqlite!

-david

On Sat, May 30, 2015 at 11:33 PM, David Barrett 
wrote:

> I fear I already know the answer to this, but I want to ask to make sure:
> is there ANY way to craft a trigger that can detect when an INSERT OR
> REPLACE trigger does a REPLACE versus an INSERT?  I found this thread which
> suggested it wasn't possible:
>
>
> http://sqlite.1065341.n5.nabble.com/Trigger-logic-with-INSERT-OR-REPLACE-td65541.html
>
> But it wasn't totally clear to me, so I tried to construct a test case
> myself -- and to my dismay, it confirmed (so far as I can tell) that a
> REPLACE looks to the trigger exactly like an INSERT.  Is there any trick
> I'm overlooking, or is that just a current limitation of sqlite?
>
> The test I ran follows:
>
> Davids-MacBook-Air:~ dbarrett$ sqlite3
> SQLite version 3.8.5 2014-08-15 22:37:57
> Enter ".help" for usage hints.
> Connected to a transient in-memory database.
> Use ".open FILENAME" to reopen on a persistent database.
> sqlite> /* -
>...>   Create a simple name/value pair table
>...>   -- */
> sqlite> create table nvps ( name primary key, value );
> sqlite>
> sqlite> /* -
>...>   Create a table of "actions" performed on
>...>   the NVP table
>...>   -- */
> sqlite> create table actions ( actions );
> sqlite>
> sqlite> /* -
>...>Create the triggers to log actions
>...>   -- */
> sqlite> create trigger onInsert after insert on nvps
>...> begin
>...> insert into actions values ('inserted "' || new.name || '", "' ||
> new.value || '"' );
>...> end;
> sqlite>
> sqlite> create trigger onUpdate after update on nvps
>...> begin
>...> insert into actions values ('updated "' || new.name || '" from "'
> || old.value || '" to "' || new.value || '"' );
>...> end;
> sqlite>
> sqlite> create trigger onDelete after delete on nvps
>...> begin
>...> insert into actions values ('deleted "' || old.name || '", "' ||
> old.value || '"' );
>...> end;
> sqlite>
> sqlite> /* -
>...>   Insert a couple values; note that the
>...>   triggers work as expected in all cases.
>...>   -- */
> sqlite> insert into nvps values ( 'name1', 'value1' );
> sqlite> insert into nvps values ( 'name2', 'value2' );
> sqlite> update nvps set value='value2.1' where name='name2';
> sqlite> delete from nvps where name='name2';
> sqlite> select * from nvps;
> name1|value1
> sqlite> select * from actions;
> inserted "name1", "value1"
> inserted "name2", "value2"
> updated "name2" from "value2" to "value2.1"
> deleted "name2", "value2.1"
> sqlite>
> sqlite> /* -
>...>   Attempt to insert a non-unique value into
>...>   a column with a unique constraint -- note
>...>   how it fails.
>...>   -- */
> sqlite> insert into nvps values ( 'name1', 'value3' );
> Error: UNIQUE constraint failed: nvps.name
> sqlite>
> sqlite> /* -
>...>   Finally, attempt the same insert, but with
>...>   an "or replace".  Note how it works, and
>...>   correctly logs the insert, but doesn't
>...>   also log a delete.  In other words, no
>...>   trigger fired that indicates we replaced a
>...>   value.
>...>   -- */
> sqlite> insert or replace into nvps values ( 'name1', 'value4' );
> sqlite> select * from nvps;
> name1|value4
> sqlite> select * from actions;
> inserted "name1", "value1"
> inserted "name2", "value2"
> updated "name2" from "value2" to "value2.1"
> deleted "name2", "value2.1"
> inserted "name1", "value4"
> sqlite>
>
> Any suggestions?  Thanks!
>
> -david
>


[sqlite] INSERT OR REPLACE and INSERT/DELETE/UPDATE triggers

2015-05-31 Thread David Barrett
I fear I already know the answer to this, but I want to ask to make sure:
is there ANY way to craft a trigger that can detect when an INSERT OR
REPLACE statement does a REPLACE versus an INSERT?  I found this thread which
suggested it wasn't possible:

http://sqlite.1065341.n5.nabble.com/Trigger-logic-with-INSERT-OR-REPLACE-td65541.html

But it wasn't totally clear to me, so I tried to construct a test case
myself -- and to my dismay, it confirmed (so far as I can tell) that a
REPLACE looks to the trigger exactly like an INSERT.  Is there any trick
I'm overlooking, or is that just a current limitation of sqlite?

The test I ran follows:

Davids-MacBook-Air:~ dbarrett$ sqlite3
SQLite version 3.8.5 2014-08-15 22:37:57
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> /* -
   ...>   Create a simple name/value pair table
   ...>   -- */
sqlite> create table nvps ( name primary key, value );
sqlite>
sqlite> /* -
   ...>   Create a table of "actions" performed on
   ...>   the NVP table
   ...>   -- */
sqlite> create table actions ( actions );
sqlite>
sqlite> /* -
   ...>Create the triggers to log actions
   ...>   -- */
sqlite> create trigger onInsert after insert on nvps
   ...> begin
   ...> insert into actions values ('inserted "' || new.name || '", "' ||
new.value || '"' );
   ...> end;
sqlite>
sqlite> create trigger onUpdate after update on nvps
   ...> begin
   ...> insert into actions values ('updated "' || new.name || '" from "'
|| old.value || '" to "' || new.value || '"' );
   ...> end;
sqlite>
sqlite> create trigger onDelete after delete on nvps
   ...> begin
   ...> insert into actions values ('deleted "' || old.name || '", "' ||
old.value || '"' );
   ...> end;
sqlite>
sqlite> /* -
   ...>   Insert a couple values; note that the
   ...>   triggers work as expected in all cases.
   ...>   -- */
sqlite> insert into nvps values ( 'name1', 'value1' );
sqlite> insert into nvps values ( 'name2', 'value2' );
sqlite> update nvps set value='value2.1' where name='name2';
sqlite> delete from nvps where name='name2';
sqlite> select * from nvps;
name1|value1
sqlite> select * from actions;
inserted "name1", "value1"
inserted "name2", "value2"
updated "name2" from "value2" to "value2.1"
deleted "name2", "value2.1"
sqlite>
sqlite> /* -
   ...>   Attempt to insert a non-unique value into
   ...>   a column with a unique constraint -- note
   ...>   how it fails.
   ...>   -- */
sqlite> insert into nvps values ( 'name1', 'value3' );
Error: UNIQUE constraint failed: nvps.name
sqlite>
sqlite> /* -
   ...>   Finally, attempt the same insert, but with
   ...>   an "or replace".  Note how it works, and
   ...>   correctly logs the insert, but doesn't
   ...>   also log a delete.  In other words, no
   ...>   trigger fired that indicates we replaced a
   ...>   value.
   ...>   -- */
sqlite> insert or replace into nvps values ( 'name1', 'value4' );
sqlite> select * from nvps;
name1|value4
sqlite> select * from actions;
inserted "name1", "value1"
inserted "name2", "value2"
updated "name2" from "value2" to "value2.1"
deleted "name2", "value2.1"
inserted "name1", "value4"
sqlite>

Any suggestions?  Thanks!

-david


[sqlite] Does length() use strlen() on blobs?

2015-03-01 Thread David Barrett
Thank you everybody; casting to BLOB on insert is the solution I was looking
for.  This will ensure length() returns the number of bytes, which is
exactly what I want.  Thanks!

-david
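
A minimal sketch of the bind_blob/bind_text distinction Keith describes
below, using the same table from his example (error handling omitted):

#include <sqlite3.h>
#include <string.h>

int main(void)
{
    sqlite3 *db;
    sqlite3_stmt *st;
    const char *s = "hello";

    sqlite3_open(":memory:", &db);
    sqlite3_exec(db, "CREATE TABLE x(x BLOB);", 0, 0, 0);
    sqlite3_prepare_v2(db, "INSERT INTO x VALUES(?)", -1, &st, 0);

    /* Stored as TEXT: length(x) counts characters up to the first NUL. */
    sqlite3_bind_text(st, 1, s, -1, SQLITE_TRANSIENT);
    sqlite3_step(st);
    sqlite3_reset(st);

    /* Stored as BLOB: length(x) is the byte count kept in the record. */
    sqlite3_bind_blob(st, 1, s, (int)strlen(s), SQLITE_TRANSIENT);
    sqlite3_step(st);

    sqlite3_finalize(st);
    /* SELECT typeof(x), length(x) FROM x;  -->  text|5 then blob|5 */
    sqlite3_close(db);
    return 0;
}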

On Sun, Mar 1, 2015 at 1:05 PM, Keith Medcalf  wrote:

>
> length() on a blob returns the length of the blob without examining the
> blob itself.  The length is encoded in the data structure.
> length() on text returns the number of characters in the text.  This
> requires retrieving the data and counting.
>
> Whether sqlite stores a blob or text in the column depends on what you
> pass it.  For example:
>
> sqlite> create table x(x blob);
> sqlite> insert into x values ('x');
> sqlite> insert into x values (cast('x' as blob));
> sqlite> select * from x;
> x
> x
> sqlite> select typeof(x) from x;
> text
> blob
>
> So, if you insert text into a column, text is stored.  If you insert a
> blob into a column, a blob is stored.  Column affinity is (as in all cases)
> practically irrelevant.  What you put will depend on how you put it (if
> inline as in the above example), or whether you use bind_blob or bind_text
> when storing the data -- just as what you get out depends on the type you
> ask for when retrieving the data.
>
> ---
> Theory is when you know everything but nothing works.  Practice is when
> everything works but no one knows why.  Sometimes theory and practice are
> combined:  nothing works and no one knows why.
>
>
> >-Original Message-
> >From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:sqlite-users-
> >bounces at mailinglists.sqlite.org] On Behalf Of David Barrett
> >Sent: Sunday, 1 March, 2015 11:41
> >To: General Discussion of SQLite Database
> >Subject: [sqlite] Does length() use strlen() on blobs?
> >
> >Hi!  If I define a column as BLOB type, will length() run strlen() on the
> >data, or will it consult some kind of internal value to determine the
> >length?  The documentation says:
> >
> >"For a string value X, the length(X) function returns the number of
> >characters (not bytes) in X prior to the first NUL character.  ... For a
> >blob value X, length(X) returns the number of bytes in the blob."
> >
> >This suggests to me it uses some internal length value (eg, not strlen),
> >but given that sqlite doesn't use strong typing, I'm wondering if I
> >insert
> >a string into that blob if it'll return the blob length or the string
> >length.
> >
> >I ask because I'm going to be using sqlite on a performance-sensitive
> >application to store small strings (as blobs), and one of the most
> >frequent
> >queries will be to get the length of the blob.  I'm wondering if I should
> >create a separate length column, or if I should just use length().
> >
> >Thanks!
> >
> >-david


[sqlite] Does length() use strlen() on blobs?

2015-03-01 Thread David Barrett
Hi!  If I define a column as BLOB type, will length() run strlen() on the
data, or will it consult some kind of internal value to determine the
length?  The documentation says:

"For a string value X, the length(X) function returns the number of
characters (not bytes) in X prior to the first NUL character.  ... For a
blob value X, length(X) returns the number of bytes in the blob."

This suggests to me it uses some internal length value (i.e., not strlen),
but given that sqlite doesn't use strong typing, I'm wondering, if I insert
a string into that blob column, whether it'll return the blob length or the
string length.

I ask because I'm going to be using sqlite on a performance-sensitive
application to store small strings (as blobs), and one of the most frequent
queries will be to get the length of the blob.  I'm wondering if I should
create a separate length column, or if I should just use length().

Thanks!

-david


Re: [sqlite] Custom functions, variable parameters, and SELECT triggers

2015-02-01 Thread David Barrett
Thank you both for the thoughtful responses.  Comments:

Re: NEW.* -- Thanks Igor for pointing out that my assumptions were totally
off base.  Assumptions are tricky things!

Re: "Why can't the trigger call myFunc(new.colA, new.colB)" -- that would
definitely work, but I'd like to make the trigger independent from the
table schema, such that I could add this trigger to a table without even
knowing its schema.  The ultimate goal is to just automatically add it to
every table in the database, and thereby monitor all changes to all rows on
all tables -- without needing to hand-roll a bunch of triggers for each.

PRAGMA table_info() - Hm, that's a very clever technique.  My original goal
was to create a reusable trigger query that can be applied verbatim to any
table, that will pass all columns to a function for each row change.  It's
sounding like this isn't possible, unfortunately.  But with your trick, I
could at least programmatically generate the appropriate trigger query as
follows:

1) Get a list of all tables
2) For each table, get a list of all columns with PRAGMA table_info()
3) Assemble and execute the appropriate query for each table to create a
comprehensive trigger.

Cool, thanks for the helpful tips!

-david
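
For what it's worth, a rough sketch of steps 2 and 3 above in C.  This is
only a sketch: it assumes myFunc() has already been registered as a variadic
function on this connection, and it skips error handling entirely.

#include <sqlite3.h>

/* Build and install an AFTER INSERT trigger on `table` that passes every
   NEW.<column> to myFunc(), using PRAGMA table_info() to list columns. */
static void add_insert_trigger(sqlite3 *db, const char *table)
{
    sqlite3_stmt *st;
    char *pragma = sqlite3_mprintf("PRAGMA table_info(\"%w\");", table);
    char *cols = sqlite3_mprintf("%s", "");

    sqlite3_prepare_v2(db, pragma, -1, &st, 0);
    while (sqlite3_step(st) == SQLITE_ROW) {
        const char *name = (const char *)sqlite3_column_text(st, 1);
        char *next = sqlite3_mprintf("%s%sNEW.\"%w\"",
                                     cols, *cols ? ", " : "", name);
        sqlite3_free(cols);
        cols = next;
    }
    sqlite3_finalize(st);

    char *create = sqlite3_mprintf(
        "CREATE TRIGGER \"%w_onInsert\" AFTER INSERT ON \"%w\" "
        "BEGIN SELECT myFunc(%s); END;", table, table, cols);
    sqlite3_exec(db, create, 0, 0, 0);

    sqlite3_free(create);
    sqlite3_free(cols);
    sqlite3_free(pragma);
}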

On Fri, Jan 30, 2015 at 9:31 PM, Tristan Van Berkom <
tris...@upstairslabs.com> wrote:

> On Sat, 2015-01-31 at 00:04 -0500, Igor Tandetnik wrote:
> > On 1/30/2015 10:44 PM, David Barrett wrote:
> > > Is it possible to create a trigger that calls a custom function and
> passes
> > > in NEW.*?
> >
> > Not literally NEW.* . You'll have to spell out individual columns as
> > parameters.
> >
> > > 2) I'm *assuming* if you pass a "*" into that function, it'll just call
> > > that function with however many columns are available.
> >
> > Your assumption is incorrect. If I recall correctly, the syntax
> > "myFunc(*)" is accepted, and is equivalent to "myFunc()" - that is, no
> > parameters are passed. Pretty much the only reason to allow this syntax
> > is to accept "count(*)".
> >
> > > 3) It seems that there is a way to create a custom trigger that has no
> > > byproduct but to call SELECT.  The only reason I can imagine you'd
> want to
> > > do that is to call a custom function.  But can I call that function
> with
> > > all columns using a *?  (I can't quite figure this out from the docs
> alone.)
> >
> > Well, you could have tested it, and discovered it didn't work. You don't
> > even need a custom function, you could have used a built-in one.
> >
> > >  SELECT myFunc( NEW.* );
> >
> > That would produce a syntax error.
> >
> > > Are these assumptions correct, and should the above generally work?
> >
> > No, and no.
> >
> > > My
> > > goal is to execute myFunc() every time there's an INSERT/UPDATE/DELETE
> on a
> > > given table -- but I want myFunc() to be reusable and not need to know
> the
> > > structure of the table it's being called on.
> >
> > I'm not sure how the necessity of myFunc(NEW.*) syntax follows from
> > this. Why can't the trigger call myFunc(new.colA, new.colB)?
> >
> > You can write a variadic custom function (one that can be called with an
> > arbitrary number of arguments), if that's what you are asking.
>
> Additional note,
>
> In order to generate queries on tables for which you dont know their
> structure (I've found this particularly useful in dynamic schema
> upgrades), you might find this useful:
>
>   PRAGMA table_info ('table_name')
>
> This part should help you to generate a query which passes all the
> columns of a given table to myFunc()
>
> Cheers,
> -Tristan
>
>


[sqlite] Custom functions, variable parameters, and SELECT triggers

2015-01-30 Thread David Barrett
Is it possible to create a trigger that calls a custom function and passes
in NEW.*?  To break that question down:

1) I know it's possible to create custom functions that take a variable
number of parameters.

2) I'm *assuming* if you pass a "*" into that function, it'll just call
that function with however many columns are available.  For example, this
would call myFunc() with two parameters:

CREATE TABLE foo ( colA INTEGER, colB TEXT );
SELECT myFunc( * ) FROM foo;

3) It seems that there is a way to create a trigger whose body does nothing
but call SELECT.  The only reason I can imagine you'd want to
do that is to call a custom function.  But can I call that function with
all columns using a *?  (I can't quite figure this out from the docs alone.)

CREATE TRIGGER myFuncOnFooInsert AFTER INSERT ON foo
BEGIN
SELECT myFunc( NEW.* );
END

Are these assumptions correct, and should the above generally work?  My
goal is to execute myFunc() every time there's an INSERT/UPDATE/DELETE on a
given table -- but I want myFunc() to be reusable and not need to know the
structure of the table it's being called on.  Thanks, as always, I
appreciate your help!

-david
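
For reference, a minimal sketch of registering such a variadic function in
C: passing -1 for the argument count means "any number of arguments", and
the body here just prints whatever it receives.

#include <sqlite3.h>
#include <stdio.h>

/* Called with however many arguments the SQL passed in. */
static void myFunc(sqlite3_context *ctx, int argc, sqlite3_value **argv)
{
    for (int i = 0; i < argc; i++) {
        const unsigned char *v = sqlite3_value_text(argv[i]);
        printf("arg %d = %s\n", i, v ? (const char *)v : "NULL");
    }
    sqlite3_result_null(ctx);   /* the trigger ignores the result anyway */
}

/* Register once per connection; -1 = accept any number of arguments:
       sqlite3_create_function(db, "myFunc", -1, SQLITE_UTF8, NULL,
                               myFunc, NULL, NULL);                      */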


Re: [sqlite] Whish List for 2015

2015-01-18 Thread David Barrett
Got it, so this should work fine, so long as I'm careful to always re-add
the collate functions to each database handle before accessing the table
with that index.  Cool, thanks!

-david

On Sun, Jan 18, 2015 at 1:16 PM, Simon Slavin <slav...@bigfraud.org> wrote:

>
> On 18 Jan 2015, at 8:29pm, David Barrett <dbarr...@expensify.com> wrote:
>
> >SELECT createNewJSONCollationFunction( "FooCollationSequence", "foo"
> );
> >SELECT createNewJSONCollationFunction( "BarCollationSequence", "bar"
> );
>
> Those two might be okay depending on what the quoted values are meant to
> be.
>
> >CREATE INDEX ON testDataFoo ( "json" FooCollationSequence );
> >CREATE INDEX ON testDataBar ( "json" BarCollationSequence );
>
> Syntax wrong.
>
> CREATE INDEX indexname ON tablename (columnname COLLATE collatefunction)
>
> Indexes need a name each.  Inside the brackets goes the name of the column
> you're indexing on, then COLLATE, then the collation function to use (which
> can be your own).  You cannot make the  collation function take a parameter
> at this point.
>
> > Before giving that a shot, however, I'm hoping someone here can give me a
> > sense of whether that'll work?  Specifically, if I'm generating these
> > dynamic collation sequences on the fly, how does sqlite store the index
> on
> > disk?
>
> SQLite uses the simplest, most obvious way to do it: the index entries are
> ordered according to your COLLATE function at the time each row is added to
> the table.  If your COLLATE function changes after rows have been added to
> the index, SQLite does not detect the problem and can corrupt the index as
> it modifies it assuming that the old ordering is still correct.
>
> Simon.


Re: [sqlite] Whish List for 2015

2015-01-18 Thread David Barrett
Incidentally, I just added this last night to a project I'm working on --
my first custom function!  Once again I'm impressed with the power of
sqlite as I went from concept to working implementation in just 2 hours.


I'm using a JavaScript-style query syntax, where you just directly
dereference the object attributes and index into arrays with brackets.  So a
query on a simple object looks like this:

Query: SELECT extractJSONValue( '{"foo":"bar"}', '.foo' );
bar

This works for nested objects and arrays too:

Query: SELECT extractJSONValue( '{"foo":[1,{"bar":2},3]}',
'.foo[1].bar' );
2

And it works inside aggregate functions:

Query: CREATE TABLE testData ( json TEXT );
Query: INSERT INTO testData VALUES ( '{"foo":1}' );
Query: INSERT INTO testData VALUES ( '{"foo":2}' );
Query: INSERT INTO testData VALUES ( '{"foo":3}' );
Query: SELECT SUM( extractJSONValue( json, '.foo' ) ) FROM testData;
6

Anyway, next up I'd like to start indexing on JSON data.  It looks like I
could do this with a custom COLLATE sequence [1] and then use that function
in an index [2].  Granted, I'll need to create a new collation sequence for
each item I'd like to index (eg, if I have an object {"foo":1,"bar":2} and
want to index on both attributes, I'll need to create two separate
collation sequences).  But I'm thinking I can just create a custom function
that creates new collation functions on the fly:

SELECT createNewJSONCollationFunction( "FooCollationSequence", "foo" );
SELECT createNewJSONCollationFunction( "BarCollationSequence", "bar" );
CREATE INDEX ON testDataFoo ( "json" FooCollationSequence );
CREATE INDEX ON testDataBar ( "json" BarCollationSequence );
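
For reference, the C-side registration behind an idea like that might look
roughly like the sketch below.  The comparison shown is just a byte-wise
compare; a real "JSON attribute" collation would parse both values and
compare the extracted field, and (per the question below) it has to be
re-registered on every new connection before the index is touched.

#include <sqlite3.h>
#include <string.h>

/* Compare two column values; a real implementation would extract the JSON
   attribute of interest from each side before comparing. */
static int jsonAttrCompare(void *pArg, int nA, const void *a,
                                       int nB, const void *b)
{
    (void)pArg;                        /* could carry the attribute name */
    int n = nA < nB ? nA : nB;
    int c = memcmp(a, b, n);
    return c ? c : nA - nB;
}

/* Register under the name used in CREATE INDEX ... COLLATE ...:
       sqlite3_create_collation(db, "FooCollationSequence", SQLITE_UTF8,
                                NULL, jsonAttrCompare);                   */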

Before giving that a shot, however, I'm hoping someone here can give me a
sense of whether that'll work?  Specifically, if I'm generating these
dynamic collation sequences on the fly, how does sqlite store the index on
disk?  For example, after creating the index and shutting down the database
handle, how does that index work the next time I load the database --
before dynamically re-initializing the collation sequence?

Anyway, any thoughts on this would be appreciated.  Thanks!

-david

[1] https://www.sqlite.org/c3ref/create_collation.html
[2]
http://stackoverflow.com/questions/20942566/indexes-with-custom-collations-in-sqlite


On Fri, Jan 2, 2015 at 1:47 PM, Philippe Riand  wrote:

> We’ll probably look at providing a set of functions for handling JSON in
> SQLite, similarly to what POSTGRESQL is doing. But, to make it efficient,
> we need to index the JSON content. I suggested earlier this year to get
> expression based indexes, so one can index the result of a function like
> json_path(“my column”,”a.b.c”) to speed up the search.
>
> That would be my own, single item, wish list :-) It can definitively open
> many doors.
>
> Happy new year, and I wish you guys a great 2015 year!
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?

2015-01-08 Thread David Barrett
Sorry for the slow response -- yes, this is great logic.  We're just
disabling vacuum.  Thanks!

-david

On Mon, Dec 8, 2014 at 6:18 PM, Simon Slavin <slav...@bigfraud.org> wrote:

>
> On 9 Dec 2014, at 1:36am, David Barrett <dbarr...@expensify.com> wrote:
>
> > *Re: Why VACUUM.*  We vacuum weekly.  This particular database is a
> > "rolling journal" -- we are constantly adding new rows to the end of the
> > table, and every week we truncate off the head of the journal to only
> keep
> > 3M rows at the "tail".  Given that we're truncating the "head", without
> > vacuuming we'd be inserting the new rows at the "front" of the database
> > with the old rows at the "end" -- and then each truncation would leave
> the
> > database more and more fragmented.  Granted, this is on SSDs so the
> > fragmentation doesn't matter a *ton*, but it just adds up and gets worse
> > over time.  Anyway, agreed it's not the most important thing to do, but
> all
> > things being equal I'd like to do it if I can to keep things clean and
> > snappy.
>
> Okay.  I have some great news for you.  You can completely ignore
> VACUUMing without any time or space drawbacks.  You're wasting your time
> and using up the life of your SSD for no advantage.
>
> Fragmentation ceases to become a problem when you move from rotating disks
> to SSD.  SSD is a truly random access medium.  It's no faster to access
> block b then block b+1 than it is block b then block b+1000.  Two
> contiguous blocks used to be faster in rotating disks only because there is
> a physical read/write head and it will already be in the right place.  SSDs
> have no read/write head.  It's all solid state and accessing one block is
> no faster than another.
>
> Delete old rows and you'll release space.  Insert new rows and they'll
> take up the space released.  Don't worry about the internal 'neatness' of
> the file.  Over a long series of operations you might see an extra block
> used from time to time.  But it will be either zero or one extra block per
> table/index.  No more than that.  A messy internal file structure might
> niggle the OCD side of your nature but that's the only disadvantage.
>
> Also, SSD drives wear out fast.  We don't have good figures yet for
> mass-produced drives (manufacturers introduce new models faster than the
> old ones wear out, so it's hard to gather stats) but typical figures show a
> drive failing in from 2,000 to 3,000 write cycles of each single block.
> Your drive does something called 'wear levelling' and it has a certain
> number of blocks spare and will automatically swap them in when the first
> blocks fail, but after that your drive is smoke.  And VACUUM /thrashes/ a
> drive, doing huge amounts of reading and writing as it rebuilds tables and
> indexes.  You don't want to do something like that on an SSD without a good
> reason.
>
> So maybe once every few years, or perhaps if you have another more
> complicated maintenance routine which already takes up lots of time, do a
> VACUUM then.  But it doesn't really matter if you never VACUUM.  (Which is
> entirely unlike my home, dammit.)
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?

2014-12-08 Thread David Barrett
Hi all, great questions:

*Re: Why VACUUM.*  We vacuum weekly.  This particular database is a
"rolling journal" -- we are constantly adding new rows to the end of the
table, and every week we truncate off the head of the journal to only keep
3M rows at the "tail".  Given that we're truncating the "head", without
vacuuming we'd be inserting the new rows at the "front" of the database
with the old rows at the "end" -- and then each truncation would leave the
database more and more fragmented.  Granted, this is on SSDs so the
fragmentation doesn't matter a *ton*, but it just adds up and gets worse
over time.  Anyway, agreed it's not the most important thing to do, but all
things being equal I'd like to do it if I can to keep things clean and
snappy.

*Re: "a simple way is to sleep in the progress callback"* -- Can you tell
me more about this?  Are you referring to the callback provided to
sqlite3_exec(), or something else?  (See the sketch below for what I have in mind.)
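
(My guess is the suggestion refers to sqlite3_progress_handler() rather than
the sqlite3_exec() row callback -- the progress handler fires every N VDBE
opcodes while a statement runs, which should also cover a long VACUUM.  A
rough sketch, assuming a POSIX usleep():)

#include <sqlite3.h>
#include <unistd.h>

/* Called roughly every N opcodes; sleeping here throttles the statement. */
static int throttleCallback(void *unused)
{
    (void)unused;
    usleep(1000);   /* ~1 ms pause per invocation; tune as needed */
    return 0;       /* returning non-zero would abort the statement */
}

void runNicedVacuum(sqlite3 *db)
{
    sqlite3_progress_handler(db, 10000, throttleCallback, NULL);
    sqlite3_exec(db, "VACUUM;", NULL, NULL, NULL);
    sqlite3_progress_handler(db, 0, NULL, NULL);   /* clear the handler */
}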

Thanks!

-david

On Mon, Dec 8, 2014 at 3:16 PM, Roger Binns  wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On 12/08/2014 01:35 PM, Max Vlasov wrote:
> > I wonder whether I/O "sleeping" possible in the first place.
>
> In this particular case the OP wants to vacuum while the machine is
> doing other I/O activity unrelated to the vacuum.  Having more
> sleeping during the vacuum will allow the other I/O a greater share.
>
> Roger
>
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1
>
> iEYEARECAAYFAlSGMTgACgkQmOOfHg372QRxMACgz3qZHBGcUrOyf4DkFR5Km1a4
> jm4AoL49txXLfzPQefbjlnGg9UZ4GtcP
> =9gAV
> -END PGP SIGNATURE-
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Artificially slow VACUUM by injecting a sleep() somewhere?

2014-12-07 Thread David Barrett
Hi!  I have a large database, and running the VACUUM command hammers disk IO
so badly that it dramatically affects performance on the box.  I'm not in a
hurry for the results, so I'm curious if you can think of a way using the
API (or any other way) to essentially "nice" the process by inserting a
short "sleep" into whatever loop runs inside the VACUUM command.

(I had initially thought of putting a sleep() into the callback to
sqlite3_exec() callback, but that's only called once for each result row --
I need something that's called consistently through the running of the
query.)

Other alternatives we're considering are to fork and nice the process, or
to call ioprio_set() directly, but I'm curious if there's a simpler way to
do it.  Thanks!

-david
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sqlite versus MySQL automatic defragmentation on SSDs?

2012-10-30 Thread David Barrett
Ah, to clarify, there is only one "master" at any point in time.  So this
isn't a "multi-master" scenario where each node keeps committing locally
and then somehow merging the results later.  Rather, each node knows if
it's the master or slave (or a variety of other states).  If it's a master,
it organizes the two-phase distributed commit.  If it's a slave, it
escalates to the master.  And if it's something else, then it just holds on
to the request and waits until it's either a slave or a master.
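
(Purely as an illustration of that routing rule -- hypothetical names, not
our actual code:)

#include <stdio.h>

typedef enum { NODE_MASTER, NODE_SLAVE, NODE_WAITING } NodeState;

/* Route a single write according to the node's current state. */
static void handleWrite(NodeState state, const char *sql)
{
    switch (state) {
    case NODE_MASTER:
        printf("master: coordinate two-phase commit for: %s\n", sql);
        break;
    case NODE_SLAVE:
        printf("slave: escalate to master: %s\n", sql);
        break;
    default:
        printf("other: hold %s until we become master or slave\n", sql);
        break;
    }
}

int main(void)
{
    handleWrite(NODE_SLAVE, "UPDATE accounts SET balance = balance - 100");
    return 0;
}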

-david


On Wed, Oct 31, 2012 at 2:09 AM, Chris Peachment <ch...@ononbb.com> wrote:

> On Wed, 2012-10-31 at 00:49 +0700, David Barrett wrote:
> > Thanks Alek!  Yes, we're definitely planning on it, just trying to
> > find the right time.  We don't want to go through the work to open
> > source it only to be greeted with silence.  Might you be interested in
> > using it in an actual deployed environment, or just studying it?
> >
> >
> Your proposal to open source the replication method used by Expensify
> has me interested. My application of interest is much smaller than
> yours, just a handful of remote clients that risk loss of connectivity
> but wish to continue with database updates during the downtime.
>
> Aside from the details of protocol usage and statement packaging, the
> concern for collisions during merge is a particular issue of interest.
>
> Chris
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sqlite versus MySQL automatic defragmentation on SSDs?

2012-10-30 Thread David Barrett
On Tue, Oct 30, 2012 at 1:00 AM, Alek Paunov <a...@declera.com> wrote:

> On 29.10.2012 11:58, David Barrett wrote:
>
>> Because in practice, as someone actually doing it (as opposed to
>> theorizing
>> about it), it works great.  The MySQL portions of our service are always
>> in
>> a semi-constant state of emergency, while our sqlite portions just hum
>> along.  And given that we're switching to SSDs, I expect they will hum
>> even
>> better.  What problems would you expect me to be seeing that I can happily
>> report I'm not, or what problems have I not yet encountered but will -- at
>> 100GB, or 1TB?
>>
>
> In your previous thread (2012-02), you have mentioned that you are about
> to open-source your replication method based on SQL statement distribution.
> Probably your work would be of interest for a huge number of sites managing
> data volumes around or below your current level, even if you switch to
> PostgreSQL at this point.
>
> IMHO, there might be a future for your replication model, because I think
> that SQLite can more easily (relative to other proven DB technologies, e.g.
> PostgreSQL) be turned into a DB engine for more query languages than SQL
> (thanks to its clever VM design).
>
> Furthermore, AFAIK, PostgreSQL replicates at WAL distribution level, most
> NoSQL databases at keys distribution level, whereas your method seems more
> efficient in terms of bandwidth.
>

Thanks Alek!  Yes, we're definitely planning on it, just trying to find the
right time.  We don't want to go through the work to open source it only to
be greeted with silence.  Might you be interested in using it in an actual
deployed environment, or just studying it?

As for the size this works up to, I should emphasize that Expensify uses
this for our main database -- and we have over a *million* users on it.
 That's not to say a million users is the biggest thing ever, but it's a
lot bigger than most websites (with far more complicated data structures),
and it works great.  Furthermore, we're in the process of upgrading all our
hardware and we feel that alone will get us at *least* an order of
magnitude improvement in capacity -- without any algorithmic changes.  And
we've got plenty of ideas how to improve the basic technology and/or
restructure our database to get even more capacity, should we need it.

The upshot is I don't see a specific reason why it couldn't scale up to a
5M, 10M, or larger service.  And if it starts to break down after that?
 Well that's a problem we should all love to have.

Additionally, I think people get so excited about big data that they
overlook the importance of *available* data.  With this technology,
everything is replicated offsite in realtime, ensuring that service can
continue uninterrupted even when a whole datacenter goes underwater (as is
happening to many datacenters at this very moment in NYC) or falls off the
map (as happens to various AWS zones with surprising regularity).  Our
technology seamlessly fails over when any node (even the master) disappears
(or reappears), without dropping a single transaction -- the web layer
doesn't even know if it's talking to a master or slave, or whether it was a slave
that became master mid-transaction.

This total confidence in the data layer is what allows us to sleep soundly
even when servers crash: similar to how Google only fixes broken servers
every quarter, any business in this day and age that stresses out when a
server dies is doing it wrong.  Indeed, I'm writing this from a hotel in
Bangkok because every year we take the whole company overseas for a month
to work from the beach -- something that would be inconceivable to an
organization that puts all its eggs in one datacenter.

As for SQL versus binary replication, it has its pros and cons -- it's
generally (though not always) more bandwidth efficient, but at a higher CPU
cost: slaves need to redo all the same work as the master.  But it's
fantastically simple, and I feel a simple design brings the most important
efficiency of all: easy to understand, easy to debug, easy to verify.

As for Postgres, MySQL, or any other database back end -- yes, it's designed
to be a layer above the database.  We're in the midst of making it
optionally backed by a MySQL store, but yes, it should be easy to put
anything behind it.

Finally, that's interesting about using this to replicate non-SQL languages
-- yes, it's definitely language agnostic.  Anything that has the notion of
an atomic transaction with ROLLBACK and COMMIT should work fine with it.

Thanks for the interest!

-david
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sqlite versus MySQL automatic defragmentation on SSDs?

2012-10-30 Thread David Barrett
On Mon, Oct 29, 2012 at 8:28 PM, Richard Hipp <d...@sqlite.org> wrote:

> On Mon, Oct 29, 2012 at 5:58 AM, David Barrett <dbarr...@expensify.com
> >wrote:
> > So what specifically do you feel is the problem with sqlite at scale?
> >
> > And to be clear, I'd like to pre-empt "well it doesn't do X, which you'll
> > probably also want to do" -- all those X's are already covered and
> working
> > great.


Generally the argument in favor of client/server databases versus SQLite
> comes down to (1) Concurrency, (2) Replication, and (3) Network access.
> The size of the database file shouldn't really be a factor.


Yes, thank you, I'm sorry I didn't make it clear -- concurrency,
replication, and network access are *not* problems for me.  Specifically:

1) Our server is single-threaded (all writes are serialized), so there are
no concurrency issues.
2) We have our own replication layer (which is better than MySQL and
Postgres replication)
3) We provide our own network access (on top of the replication layer)
4) The backup API works fine
5) Everything is on the local disk

So I agree entirely -- if you need something sqlite doesn't provide, then
sqlite isn't the right choice.  I'm just saying sqlite provides everything
I need, and does it incredibly well.  Similarly, I'm encouraged to hear
that there aren't any particular size restrictions or concerns.  We're
currently at 40GB and it's working great.

But back to the original question -- can anybody point me in a direction to
learn more about MySQL's handling of fragmentation and how it differs from
sqlite?  This isn't an obsession, it doesn't even really affect us in light
of our moving to SSDs.  It's just a gap in my knowledge I'm looking to
fill.  Thanks!

-david
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sqlite versus MySQL automatic defragmentation on SSDs?

2012-10-29 Thread David Barrett
On Mon, Oct 29, 2012 at 2:16 AM, Richard Hipp <d...@sqlite.org> wrote:

> It sounds like you are pushing SQLite well beyond what it was intended to
> do.  Remember the motto:  SQLite is not intended to replace Oracle, it is
> intended to replace fopen().  SQLite does a great job for roles such as
> data storage for a desktop application, or for databases in cellphones or
> other gadgets, or as a local cache to a enterprise network database.  But
> SQLite was never designed or intended to replace an enterprise-level
> RDBMS.  And I think you probably need an enterprise-level RDBMS at this
> point.
>

So what specifically do you feel is the problem with sqlite at scale?

And to be clear, I'd like to pre-empt "well it doesn't do X, which you'll
probably also want to do" -- all those X's are already covered and working
great.  I just mean, what in particular do you feel about sqlite works
great for a 50MB database, but doesn't work at a 50GB database?  I'm very
open to being convinced -- and you'd be the person to convince me.  But I
don't like to make decisions based on vague fears.  "Best practices" often
aren't.

Because in practice, as someone actually doing it (as opposed to theorizing
about it), it works great.  The MySQL portions of our service are always in
a semi-constant state of emergency, while our sqlite portions just hum
along.  And given that we're switching to SSDs, I expect they will hum even
better.  What problems would you expect me to be seeing that I can happily
report I'm not, or what problems have I not yet encountered but will -- at
100GB, or 1TB?

Thanks!

-david

PS: Also, if anybody does know anything about MySQL fragmentation, I'd
still love some pointers.  I'm not sure how my casual request became
interpreted as an obsession, but either way, I'd still love the benefit of
your knowledge.





>
> MySQL is a good choice.  But here is another data point to consider:  When
> we were writing the SqlLogicTest test suite for SQLite, we ran the test
> vectors on a wide variety of server-class database engines in addition to
> SQLite.  And in every case (including SQLite) we found cases that would
> crash the server.  Every case, that is, except one.  We were never able to
> crash PostgreSQL, nor find a case where PostgreSQL gave the wrong answer.
>
> Furthermore, whenever there is a question about what the behavior of some
> obscure SQL construct ought to be and whether or not SQLite is doing it
> right, usually the first thing we check is how PostgreSQL responds to the
> same query.  When in doubt, we try to get SQLite to do the same thing as
> PostgreSQL.
>
> Far be it from me to recommend one client/server database engine over
> another.  But in my experience.....  well, you can fill in the rest,
> probably...
>
> On Sun, Oct 28, 2012 at 10:48 AM, David Barrett <dbarr...@expensify.com
> >wrote:
>
> > Wow, I didn't realize this was such a controversial question.
> >
> > I'm a huge sqlite fan.  Expensify is built on sqlite.  We have a 40GB
> > database, replicated using our custom distributed transaction layer
> across
> > 5 servers in three different datacenters.[1]  It's been powering all of
> > Expensify (including our direct deposit reimbursement engine and credit
> > card import layer -- both of which contain incredibly sensitive
> > information, with mistakes causing millions of dollars to move in the
> wrong
> > direction).  On the back of sqlite, we've grown to over a million users,
> > processing millions of dollars in expense reports every day.
> >
> > However, we're starting to see problems.  There is so much activity on
> some
> > servers that there is never a chance for our checkpointing thread to do
> its
> > thing, so our WAL file often balloons up to 30GB or more.  This makes
> query
> > performance plummet.  We regularly checkpoint manually, and often vacuum, all
> in
> > an effort to keep queries moving quick.  We also do things to trick out
> our
> > indexes in order to ensure proper disk ordering, pay particular attention
> > to block and cache amounts, etc.  This isn't premature optimization for
> the
> > sake of having fun, these are in response to real performance problems
> > affecting our product.
> >
> > In light of that, there is a contingent pushing to drop sqlite in favor
> of
> > MySQL.  There are a wide range of reasons -- it has its own replication,
> > better write concurrency, clustered indexes, and better edge-case data
> > integrity (because we use 2 DBs and WAL, ATTACH doesn't provide atomic
> > commit advantages).  And for each I have a corresponding answer --
> MySQL's
> > replication isn't as good as ours, concurrency doesn't matter because we

Re: [sqlite] sqlite versus MySQL automatic defragmentation on SSDs?

2012-10-28 Thread David Barrett
Wow, I didn't realize this was such a controversial question.

I'm a huge sqlite fan.  Expensify is built on sqlite.  We have a 40GB
database, replicated using our custom distributed transaction layer across
5 servers in three different datacenters.[1]  It's been powering all of
Expensify (including our direct deposit reimbursement engine and credit
card import layer -- both of which contain incredibly sensitive
information, with mistakes causing millions of dollars to move in the wrong
direction).  On the back of sqlite, we've grown to over a million users,
processing millions of dollars in expense reports every day.

However, we're starting to see problems.  There is so much activity on some
servers that there is never a chance for our checkpointing thread to do its
thing, so our WAL file often balloons up to 30GB or more.  This makes query
performance plummet.  We regularly checkpoint manually, and often vacuum, all in
an effort to keep queries moving quick.  We also do things to trick out our
indexes in order to ensure proper disk ordering, pay particular attention
to block and cache amounts, etc.  This isn't premature optimization for the
sake of having fun, these are in response to real performance problems
affecting our product.

In light of that, there is a contingent pushing to drop sqlite in favor of
MySQL.  There are a wide range of reasons -- it has its own replication,
better write concurrency, clustered indexes, and better edge-case data
integrity (because we use 2 DBs and WAL, ATTACH doesn't provide atomic
commit advantages).  And for each I have a corresponding answer -- MySQL's
replication isn't as good as ours, concurrency doesn't matter because we
serialize writes and have a single threaded server anyway, clustered
indexes would be nice but we can get close enough with custom ROWIDs, and
the extremely rare situation where there's a cross-database integrity
problem, we can detect and recover from any of the other slaves.  And I
also add in that sqlite can never crash because it's built into the server;
its performance is fantastic because it runs in the same process; in years
of operation we've never once seen it corrupt data; it's so easy to use;
etc etc.

But there's an argument I've heard come up to which I don't have a
response: MySQL handles fragmentation better, and by extension would give
us better performance on the same hardware.   I'd like to know more about
it, which is why I've asked.  Thanks!

-david

[1] http://permalink.gmane.org/gmane.comp.db.sqlite.general/71868


On Sun, Oct 28, 2012 at 2:12 PM, Pavel Ivanov <paiva...@gmail.com> wrote:

> OK. Curiosity is a good thing in certain situations. But could you
> kindly tell me what will you do with this information (assuming it's
> possible to obtain it of course)?
>
> Pavel
>
> On Sat, Oct 27, 2012 at 11:54 PM, David Barrett <dbarr...@expensify.com>
> wrote:
> > I completely understand the wide and varied differences.  I'm just *also*
> > interested in this very specific issue.
> >
> > -david
> >
> > On Sun, Oct 28, 2012 at 12:38 PM, Pavel Ivanov <paiva...@gmail.com>
> wrote:
> >
> >> > That said, I'd still welcome any quick summary of the differences
> between
> >> > sqlite and mysql when it comes to fragmentation.
> >>
> >> This is far from main differences between sqlite and mysql that you
> >> should consider if you want to choose between them unless of course
> >> your question is just about an academic interest. As you are talking
> >> about employees I guess you are not in some purely academic exercise.
> >> In this case think more about in-process code vs pumping through
> >> TCP/IP stack, designed mostly to be accessible from machine-local
> >> processes only vs accessible to anyone on the network, plain access to
> >> everything vs versatile and complicated authorization and
> >> authentication mechanisms, and so on and so forth. Database format is
> >> never a part of the decision which DBMS you want to use.
> >>
> >> Pavel
> >>
> >>
> >> On Sat, Oct 27, 2012 at 9:32 PM, David Barrett <dbarr...@expensify.com>
> >> wrote:
> >> > Thanks, this is really helpful!  (And I lecture my employees about the
> >> > evils of premature optimization all the time.  In fact, I'll lecture
> >> anyone
> >> > in earshot, so frequently that it's the butt of jokes.)
> >> >
> >> > That said, I'd still welcome any quick summary of the differences
> between
> >> > sqlite and mysql when it comes to fragmentation.  I often get in
> debates
> >> > about sqlite versus other datbases, and I'm always eager to be
> informed.
> >> >  Thanks!
> >> >
> >> 

Re: [sqlite] sqlite versus MySQL automatic defragmentation on SSDs?

2012-10-28 Thread David Barrett
I completely understand the wide and varied differences.  I'm just *also*
interested in this very specific issue.

-david

On Sun, Oct 28, 2012 at 12:38 PM, Pavel Ivanov <paiva...@gmail.com> wrote:

> > That said, I'd still welcome any quick summary of the differences between
> > sqlite and mysql when it comes to fragmentation.
>
> This is far from the main differences between sqlite and mysql that you
> should consider if you want to choose between them unless of course
> your question is just about an academic interest. As you are talking
> about employees I guess you are not in some purely academic exercise.
> In this case think more about in-process code vs pumping through
> TCP/IP stack, designed mostly to be accessible from machine-local
> processes only vs accessible to anyone on the network, plain access to
> everything vs versatile and complicated authorization and
> authentication mechanisms, and so on and so forth. Database format is
> never part of the decision about which DBMS you want to use.
>
> Pavel
>
>
> On Sat, Oct 27, 2012 at 9:32 PM, David Barrett <dbarr...@expensify.com>
> wrote:
> > Thanks, this is really helpful!  (And I lecture my employees about the
> > evils of premature optimization all the time.  In fact, I'll lecture
> anyone
> > in earshot, so frequently that it's the butt of jokes.)
> >
> > That said, I'd still welcome any quick summary of the differences between
> > sqlite and mysql when it comes to fragmentation.  I often get in debates
> > about sqlite versus other databases, and I'm always eager to be informed.
> >  Thanks!
> >
> > -david
> >
> > On Sat, Oct 27, 2012 at 11:42 PM, Simon Slavin <slav...@bigfraud.org>
> wrote:
> >
> >>
> >> On 27 Oct 2012, at 11:38am, David Barrett <dbarr...@expensify.com>
> wrote:
> >>
> >> > I'm trying to learn more about MySQL versus sqlite when it comes to
> >> > vacuuming and fragmentation, especially as it relates to SSD storage.
> >>
> >>
> >> Rather than answer your questions point-by-point, I'm going to give you
> >> the current state of play.  Your understanding of how the various DBMSes
> >> work is right, but your excellent question "in a world with SSDs and a
> >> tremendous amount of RAM, does vacuuming matter nearly as much as on a
> >> spinning disk with constrained RAM?" cuts to the heart of all your other
> >> points.  The following involves a little simplification and handwaving
> >> because otherwise it would be two chapters long and you'd have to do
> >> homework.
> >>
> >> Fragmentation made a big difference to apps running on Windows, but very
> >> little on any other platform.  This is because Windows does something
> >> called 'read-ahead caching' which assumes that if you read block B,
> you're
> >> soon going to want to read block B+1, so at quite a low level it
> helpfully
> >> pre-reads it for you.  Other operating systems don't make this
> assumption.
> >>  This is why Windows users talk about defragmentation so much, but Unix
> >> users don't care about it.
> >>
> >> SSDs negate the whole point of defragmentation.  On a rotational disk
> it's
> >> faster to read blocks B, B+1, B+2, B+4, B+5 than it is to read five
> random
> >> blocks from the disk, because the read heads are already positioned in
> the
> >> right place, and the disk is going to rotate to show those five blocks
> in
> >> order.  SSDs are just like RAM: they're Random Access Memory.  Reading
> any
> >> five blocks in any order takes roughly the same time.  So nobody cares
> >> about fragmentation on an SSD.  Read whatever blocks you want in
> whatever
> >> order you want.
> >>
> >> As to clever management of disk block alignment with respect to rows and
> >> columns, this is rarely worth attention these days.  The amount of
> >> programming and debugging time it takes to get this right, and the
> amount
> >> of extra processing and disk access you need to do, give you less
> return on
> >> investment than if you spent the same money on buying a faster hard
> disk.
> >>  It's "premature optimization" (look up the term) except for two cases:
> >> overnight runs and realtime 3D graphics.  If your overnight run takes
> more
> >> than one night, you have a problem.  If you're programming realtime 3D
> >> graphics and they're jerky, your users won't enjoy your simulation.  But
> >> you wouldn't be using a SQ

Re: [sqlite] sqlite versus MySQL automatic defragmentation on SSDs?

2012-10-27 Thread David Barrett
Thanks, this is really helpful!  (And I lecture my employees about the
evils of premature optimization all the time.  In fact, I'll lecture anyone
in earshot, so frequently that it's the butt of jokes.)

That said, I'd still welcome any quick summary of the differences between
sqlite and mysql when it comes to fragmentation.  I often get in debates
about sqlite versus other databases, and I'm always eager to be informed.
 Thanks!

-david

On Sat, Oct 27, 2012 at 11:42 PM, Simon Slavin <slav...@bigfraud.org> wrote:

>
> On 27 Oct 2012, at 11:38am, David Barrett <dbarr...@expensify.com> wrote:
>
> > I'm trying to learn more about MySQL versus sqlite when it comes to
> > vacuuming and fragmentation, especially as it relates to SSD storage.
>
>
> Rather than answer your questions point-by-point, I'm going to give you
> the current state of play.  Your understanding of how the various DBMSes
> work is right, but your excellent question "in a world with SSDs and a
> tremendous amount of RAM, does vacuuming matter nearly as much as on a
> spinning disk with constrained RAM?" cuts to the heart of all your other
> points.  The following involves a little simplification and handwaving
> because otherwise it would be two chapters long and you'd have to do
> homework.
>
> Fragmentation made a big difference to apps running on Windows, but very
> little on any other platform.  This is because Windows does something
> called 'read-ahead caching' which assumes that if you read block B, you're
> soon going to want to read block B+1, so at quite a low level it helpfully
> pre-reads it for you.  Other operating systems don't make this assumption.
>  This is why Windows users talk about defragmentation so much, but Unix
> users don't care about it.
>
> SSDs negate the whole point of defragmentation.  On a rotational disk it's
> faster to read blocks B, B+1, B+2, B+4, B+5 than it is to read five random
> blocks from the disk, because the read heads are already positioned in the
> right place, and the disk is going to rotate to show those five blocks in
> order.  SSDs are just like RAM: they're Random Access Memory.  Reading any
> five blocks in any order takes roughly the same time.  So nobody cares
> about fragmentation on an SSD.  Read whatever blocks you want in whatever
> order you want.
>
> As to clever management of disk block alignment with respect to rows and
> columns, this is rarely worth attention these days.  The amount of
> programming and debugging time it takes to get this right, and the amount
> of extra processing and disk access you need to do, give you less return on
> investment than if you spent the same money on buying a faster hard disk.
>  It's "premature optimization" (look up the term) except for two cases:
> overnight runs and realtime 3D graphics.  If your overnight run takes more
> than one night, you have a problem.  If you're programming realtime 3D
> graphics and they're jerky, your users won't enjoy your simulation.  But
> you wouldn't be using a SQL engine for 3D graphics anyway.
>
> The matters you mentioned were all worth attention back in the 1980s when
> storage and bandwidth were expensive.  As you pointed out near the end of
> your post, these things matter less now.
>
> Simon.
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] sqlite versus MySQL automatic defragmentation on SSDs?

2012-10-27 Thread David Barrett
I'm trying to learn more about MySQL versus sqlite when it comes to
vacuuming and fragmentation, especially as it relates to SSD storage.  Can
anybody summarize for me the differences?

1) My understanding is sqlite, in general, has no automatic
defragmentation: there is no process to gradually and incrementally
reposition data on disk in index order.  (Even auto-VACUUM just reclaims
deleted data by inserting new transactions in the holes left behind by
previous deletes, which the documentation says can actually worsen
fragmentation.)  The only defragmentation option is a full VACUUM, which
rebuilds the entire database from scratch in index order.  During this
period, the whole database is locked and cannot be used.  Is this roughly
right?

2) My understanding of MySQL is that the equivalent function is to OPTIMIZE
TABLE.  Unlike sqlite, which works on a full database, this works on a
single table, but is otherwise the same -- rebuilds the entire table (and
thus needs 2x the disk space), locks the table during the operation, is not
incremental (either works fully or not at all).  Is this about right?

3) I've heard mention that in some configurations (perhaps some database
engines?) MySQL doesn't need vacuuming for some reason -- somehow it just
doesn't fragment, or resolves fragmentation as it goes -- but I can't track
down an explanation of why.  Any ideas?

4) Does MySQL (or sqlite?) have any fancy defragmentation capabilities,
such as (for example) using a clustered primary index where no two
secondary rows are put into the same block?  (For example, imagine a table
containing historical credit card transactions, where transactions are
grouped first by "cardID" and then each card has a sequential list of rows
identified by "transactionID".  Every night, new transactions are added to
each card, meaning new data would ideally be inserted throughout the entire
table, as opposed to just adding at the very end. I could imagine a
clustered index ordering all transactions for a single card back-to-back on
disk, except skipping to the next full database block before adding
transactions for the next card.  This would intentionally leave space free
in the database for additional transactions to be added to each card in
disk-sequential-order, explicitly avoiding transactions from multiple cards
being stored in the same block.  This wouldn't be a complete solution --
the database blocks for a particular card wouldn't necessarily be
sequential -- but the transactions inside each block *would* be sequential,
and no two blocks would contain transactions from different cards.  Does
this make sense?  Does anything like this exist?)

5) My understanding is disk order matters a lot for spinning disks, as it
reduces seek times when dealing with data in neighboring primary keys.
 However, am I correct in assuming that the necessity for this is
dramatically reduced by SSDs, which allow random access?  Granted,
vacuuming should always help to a degree: if your rows are smaller than the
block size, then it's better to have neighboring rows in the same block (so
you needn't read multiple blocks to get the same data).  And it's always
good to ensure your database block size is the same as the disk block size
(whether SSD or spinning) so you only need to read a single disk block per
database block.  But in a world with SSDs and a tremendous amount of RAM,
does vacuuming matter nearly as much as on a spinning disk with constrained
RAM?

Thanks, I really appreciate your thoughts and links to reading material!

-david
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How much disk space is required to checkpoint?

2012-09-29 Thread David Barrett
On Fri, Sep 28, 2012 at 10:49 AM, Jay A. Kreibich  wrote:

>   The standard C I/O library includes the tmpfile() call, which performs
>   the whole process of generating a random temporary file name, opening
>   the file, and then unlinking the file.


Wow, very interesting, thank you.  I stand corrected!
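
(For anyone else who hadn't seen the trick, a minimal sketch of the pattern
Jay describes, assuming a POSIX system; mkstemp() stands in here for however
the library actually creates the file:)

#include <stdlib.h>
#include <unistd.h>

/* Create a temp file, then unlink it so it has no directory entry but
** keeps consuming disk space until the descriptor is closed. */
int openAnonymousTemp(void)
{
    char path[] = "/var/tmp/scratch-XXXXXX";
    int fd = mkstemp(path);     /* creates and opens a uniquely named file */
    if (fd < 0) return -1;
    unlink(path);               /* the name disappears; the open fd stays valid */
    return fd;
}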

-david
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How much disk space is required to checkpoint?

2012-09-27 Thread David Barrett
Incidentally, I agree with Simon.  It exchanges the extremely easy problem
of temp namespace collision for the very hard problem of invisible files
filling up your hard drive.  Indeed, it was diagnosing this hard problem --
at great cost in time and confusion -- that caused us to discover it in the
first place.  (Though to clarify, it seems to be in /var/tmp for extra
confusion.)

Also, I disagree it's a "standard, time tested" solution.  I've never even
heard of this trick before, and despite its passionate defense, I'd
encourage everybody (including the SQLite maintainers) to stop using it.

(No need to defend it further Jay, I'm done.)

-david
On Sep 26, 2012 7:21 AM, "Simon Slavin"  wrote:

>
> On 26 Sep 2012, at 3:07pm, Jay A. Kreibich  wrote:
>
> >  Because it is the standard, time-tested way of doing this kind of
> >  thing on Unix-- for a lot of very good reasons.
>
> Okay.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] database disk image is malformed - Error

2012-09-27 Thread David Barrett
In my experience, retrying does often work.

-david

On Tue, Sep 25, 2012 at 7:31 PM, Rittick Gupta  wrote:

> Do you think a retry of the query would help resolve this issue ? Do I
> have to close & reopen the database.
>
> Thanks for your help.
>
> regards,
>
> Rittick
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] database disk image is malformed - Error

2012-09-25 Thread David Barrett
Yes, still having that problem. We've moved to new servers with SSDs and a
ton of RAM, and that seems to have helped matters -- though I'm not sure why,
since I don't know why it was happening in the first place.  (I'm *guessing* the
issue is due to two conflicting queries happening at the same time, so if
the queries happen faster the probability of collision goes down?)

-david

On Tue, Sep 25, 2012 at 6:39 PM, Rittick Gupta  wrote:

> David,
>   Thanks. Do you still have the problem ? Did you find a workaround to
> avoid the problem - appreciate your response.
>
> regards,
>
> Rittick
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] How much disk space is required to checkpoint?

2012-09-25 Thread David Barrett
If my database is X GB, and the WAL file is Y GB, how much total disk space
is required to:

1) Checkpoint the database

2) Vacuum the database

Additionally, if there are any temporary files created, where do they
exist?  We've done some poking around and have a crazy theory
about a temporary file being created inside /tmp, but in a *deleted* state.
 I'm guessing we're wrong, so I figured it would be easiest just to ask.
 Thanks!

-david
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] database disk image is malformed - Error

2012-09-25 Thread David Barrett
You might be interested in this thread:

http://www.theusenetarchive.com/usenet-message-sqlite-does-disabling-synchronous-and-shared-cache-cause-%22error-database-disk-image-is-malformed%22-20780199.htm

No conclusion was ever obtained, but the discussion was good.

-david

On Mon, Sep 24, 2012 at 4:55 PM, Rittick Gupta  wrote:

> I have a mult-threaded application. Each thread uses its own database
> handle. One thread got a return value of 11 (disk image malformed) - with
> the sqlite3_step statement (select statement). When we restarted the
> application after the failure the database recovered and did not give any
> error.
>
> Why did the "sqlite3_step" statement return a "disk malformed" error
> when the disk is not corrupt ?
>
> Any help will be appreciated.
>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Error: database disk image is malformed

2012-08-16 Thread David Barrett

On 08/16/2012 11:19 AM, Richard Hipp wrote:

On Thu, Aug 16, 2012 at 1:52 PM, Tal Tabakman wrote:


Hi.
we are writing an application that produces a sqlite database at runtime.

we get the following error: "Error: database disk image is malformed"
(Extended Error code: 11)

what can cause this error? how can I recover from it?



Have you read the http://www.sqlite.org/howtocorrupt.html document?

Have you tried running your application using valgrind?


Also, you might read this thread:

http://comments.gmane.org/gmane.comp.db.sqlite.general/71856

I'm having this same issue on a very active 30GB sqlite database.  The 
main process has no trouble whatsoever, but when I attempt to 
simultaneously access the database via the sqlite3 command-line tool, it 
occasionally reports a corrupt database.  Often re-running the command 
works, and it always works when I shut down the main process. 
Similarly, analyzing an idle database shows it's never actually corrupt.


-david
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Does disabling synchronous AND shared cache cause "Error: database disk image is malformed"?

2012-02-07 Thread David Barrett

On 02/07/2012 05:27 PM, Richard Hipp wrote:

On Tue, Feb 7, 2012 at 8:13 PM, David Barrett<dbarr...@expensify.com>wrote:
My best guess still is that the command-line shell is somehow not seeing
the shared-memory file that is created when the database is running in WAL
mode.  Or, perhaps the posix advisory locks are not working on that
shared-memory file.  The shared-memory file is the one with the "-shm"
suffix.  Of course, the command-line tool and the server process must both
be running on the same machine or else they won't be able to see each
others shared memory, and stuff like this will happen.  You are running
both processes on the same machine, right?


Correct: same machine, same non-networked disk. ext2 file system.  I 
think your theory is a good one; I'm just not sure how to solve it.
We've seen it for a long time (years now), but I agree we might get lucky by
upgrading to the latest version.   Bringing in Simon's questions:




Is your shared access to the database multi-thread, multi-process, or 
multi-computer ?  If multi-computer, how is each computer accessing the 
database file: which sort of network access are you using ?  If there's a local 
computer, what format is the hard disk formatted in ?


Multi-threaded (main thread and a checkpointing thread).  The server is 
single-process, but the client operates in a separate process so I guess 
this particular issue is in fact "multi-process".  It's only 
multi-computer insofar as we have a replication layer hovering high above.




The result codes indicating 'malformed database' are transient, right ?  After 
getting that result code, you can try the operation again immediately and 
perhaps not get that result code ?


The result only happens when we access via the client -- the API never 
gives this issue.  But yes, it's transient in that sometimes when you 
re-run the query it works.  My theory is it works if perhaps the server 
doesn't change the database at the same time, which is increasingly 
infrequent.




Have you ever seen them when you have only one thread, one process working on 
the database ?


No, this only happens when accessing the database with the command line 
client while it is actively being accessed by the server.




Do they happen when you have more than one thread/process reading the database ?


Uncertain.  I don't know precisely what the server is doing at the exact 
moment that the command line app accesses the database.




Do they happen when you have more than one thread/process writing the database ?


We only have a writer thread and a checkpointing thread.  We checkpoint 
every 5s; not sure if this affects it.




Are you, perhaps, using -DSQLITE_SHM_DIRECTORY when compiling the servers,
but failing to use the same -D when compiling the command-line tool?


This is possible -- we haven't been hand-compiling the sqlite3 
command-line binary at the same rate that we've been upgrading the 
library.  (I didn't realize this was necessary: I thought old binaries 
were compatible with newer databases?  Ah, but perhaps not when they're 
actively being accessed by newer code?)  When we upgrade to the latest 
version, we'll hand-compile the binary to the same version and try again.


Thanks!

-david
Founder and CEO of Expensify
Follow us at http://twitter.com/expensify
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Does disabling synchronous AND shared cache cause "Error: database disk image is malformed"?

2012-02-07 Thread David Barrett

On 02/07/2012 05:08 PM, Richard Hipp wrote:

There are no known limitations on the use of PRAGMA synchronous=OFF, WAL
mode, and shared cache together.


Ah, I guess that blows that theory.  But thanks for the quick and 
thorough response nonetheless.  I'm a huge sqlite fan.  I just wish I 
could figure out this (non)malformed database issue!


-david
Founder and CEO of Expensify
Follow us at http://twitter.com/expensify
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Does disabling synchronous AND shared cache cause "Error: database disk image is malformed"?

2012-02-07 Thread David Barrett

On 02/07/2012 04:05 PM, Richard Hipp wrote:

This tells me that the error is occurring at
http://www.sqlite.org/src/artifact/5047fb303cdf6?ln=1362 which occurs right
as SQLite is first starting to decode a page that it as loaded from the
disk.  The error indicates that the shell really is seeing a malformed
database file.

Can you tell me more about your "custom distributed transaction layer"?
Might that have something to do with this?  Are you using a custom VFS?
Are you bypassing the built-in locking mechanisms of SQLite and doing some
kind of custom locking?  Are you running this on a network filesystem?


Our custom distributed transaction layer happens at a higher layer -- it 
doesn't require any modifications to sqlite itself, and uses the 
standard sqlite API.  So far as each individual sqlite database is 
concerned, it should have no awareness of the existence of cloned copies 
of itself on other servers.


(I'm not sure if it's helpful, but to give some more detail on this.)

Each server starts up and creates two sqlite databases: "main.db" and 
"journal.db".  Journal contains a simple three column table: ID, query, 
and hash.  ID is an auto-incrementing counter, query is the raw SQL of a 
single read/write query (eg, ignoring SELECTs), and hash is an 
incremental SHA1 hash of all read/write SQL queries ever performed.  So 
each time you do a query on "main.db", the same query is also logged in 
"journal.db".


We operate five realtime replicated servers spread across georedundant 
datacenters, where any slave server can perform read operations on its 
local database, but all writes are escalated to the master.  The master 
processes one read/write query at a time, each time negotiating with a 
majority of slaves in a semi-synchronous two-phase commit manner (where 
the master commits once a majority of slaves approve).


That's all pretty easy and straightforward, but the tricky part is in 
bringing new nodes into the system (where the necessary queries are 
replayed in order to rebuild the database), or handling master/slave 
failovers (where the slaves elect a new master, synchronize, and 
continue -- without dropping any of the original queries).


The upshot of all of this is it maintains linear read-scalability and 
continuous uptime in a WAN-replicated manner, in the face of temporary 
network hiccups or permanent hardware failure.  (A lesser benefit is it 
allows seamless upgrading without needing to reconfigure anything, as 
everything automatically detects when the node goes down and repairs 
itself when it comes back up.)


Anyway, it's neat stuff that I love to talk about -- we've considered 
open sourcing the whole thing as it's actually quite nice and powerful. 
 Even MySQL replication only just got semi-synchronous replication, and 
*still* doesn't solve the failover cases.  But I think all this is at a 
higher level than is really relevant to this particular issue.




Or, perhaps you are running the command-line tool on a
different machine where it is not able to access the WAL's database-shm
file in shared memory. So the command-line tool reads a one page of the
database which indicates the the content it is seeking is found on some
other page X.  But by the time the command-line tool has loaded page X, the
server has already shifted the content to someplace else.  The page that
the command-line tool loaded is no longer formatted as the command-line
tool expects it to be, causing exactly the error shown above.


I think this is a very likely candidate.  There is nothing particularly 
fancy about the deployment: no VFS, no custom locking, etc.  But we use 
sqlite3_enable_shared_cache() to disable the cache (even though it's 
disabled by default) -- combined with "PRAGMA synchronous=OFF" (and add 
in WAL just to make it more complex).  Could that explain the behavior 
we're seeing?


Basically, I'm curious if it's a known issue that the combination of the 
above settings creates a situation where the command-line client can't 
be relied upon -- and that the client will incorrectly report a 
malformed database.


Thanks again for all your help!

-david
Founder and CEO of Expensify
Follow us at http://twitter.com/expensify
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Does disabling synchronous AND shared cache cause "Error: database disk image is malformed"?

2012-02-07 Thread David Barrett

On 02/07/2012 03:00 PM, Richard Hipp wrote:

On Tue, Feb 7, 2012 at 5:19 PM, David Barrett<dbarr...@expensify.com>wrote:

2) However, we get erratic behavior when using the sqlite3 command-line
tool to just do a basic select on the database: sometimes it works,
sometimes it returns "Error: database disk image malformed".  Sometimes we
just run the same command many times until it works.


As the very first thing you do in the command-line tool, enter this command:

  .log stdout

That will cause additional error diagnostics to appear on standard output.
Then do your commands that provoke the malformed error, and let us know
what you see as output.


Great idea.  Here's the output:

SQLite version 3.7.2
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .log stdout
sqlite> select count(*) from **redacted**;
(11) database corruption at line 45894 of [42537b6056]
(11) database corruption at line 45932 of [42537b6056]
(11) statement aborts at 16: [select count(*) from **redacted**;] 
database disk image is malformed

Error: database disk image is malformed
sqlite>

It happens very erratically, and each time we've run "PRAGMA 
integrity_check;" after seeing the problem (which requires several hours 
of downtime for that server, so I didn't do it for the above query), it 
comes up clean every single time.


Thanks for your help!

-david

PS: I apologize for redacting the query -- let me know if that would be 
particularly helpful, otherwise I'd like to keep it private.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Does disabling synchronous AND shared cache cause "Error: database disk image is malformed"?

2012-02-07 Thread David Barrett
Hi!  We're seeing the dreaded "Error: database disk image is malformed" 
message, even though the database is actually fine.  I've read a few 
other threads on it and I'd previously concluded that it was just an 
outstanding bug.  But I'm wondering if it's actually due to a 
combination of disabling synchronous writing AND disabling the shared 
cache, and then accessing it from a second process.  Have you heard of 
this?  More details follow:


1) First, we have a server with two threads: the primary thread doing 
the main activity, and a second thread doing checkpointing.


2) However, we get erratic behavior when using the sqlite3 command-line 
tool to just do a basic select on the database: sometimes it works, 
sometimes it returns "Error: database disk image malformed".  Sometimes 
we just run the same command many times until it works.


3) The main server thread opens the database like:
sqlite3_config( SQLITE_CONFIG_MULTITHREAD )
sqlite3_enable_shared_cache( 0 )
#define DB_OPEN_FLAGS SQLITE_OPEN_READWRITE | \
SQLITE_OPEN_CREATE | \
SQLITE_OPEN_NOMUTEX
sqlite3_open_v2( filename.c_str(), &_db, DB_OPEN_FLAGS, NULL )
PRAGMA legacy_file_format = OFF
PRAGMA synchronous = OFF
PRAGMA count_changes = OFF
PRAGMA temp_store = MEMORY
PRAGMA journal_mode = WAL
PRAGMA wal_autocheckpoint = 0

4) All this works great for server threads, and we have the same exact 
commits happening in the same exact order on 5 different servers (using 
a custom distributed transaction layer).


5) And this has worked great even with the command-line access for a 
very long time -- years now, in fact.  But as the database grows (>40GB) 
and activity increases, we've started to get the malformed error more 
and more frequently -- to the point that it's also impossible to query 
the database from the command line anymore.


6) I had just accepted this, but recently we did queries on two servers 
via the sqlite3 command-line, and found they were giving back *very* 
different results -- for several minutes.  Then I'm guessing both 
flushed their changes to disk, after which they gave the same result. 
This made me think "Hm, maybe the command line process isn't able to 
access the in-memory cached data of the server process, and maybe that's 
what's causing the malformed error?"


Anyway, that's the background -- has anybody encountered this before? 
I'd welcome any advice.  Thanks!


-david
Founder and CEO of Expensify
Follow us at http://twitter.com/expensify
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Meaning of sqlite3_wal_checkpoint_v2 parameters

2011-10-06 Thread David Barrett
Hi!  Can you help me understand more exactly what the output parameters 
are from sqlite3_wal_checkpoint_v2()?  Specifically:


1) What is the relationship between pnLog and pnCkpt: is pnLog>=pnCkpt 
always true?


2) Under what circumstances would pnLog>pnCkpt be true?


Also, can you confirm I understand the following correctly:

3) When using SQLITE_CHECKPOINT_PASSIVE, the WAL file will grow as 
needed, indefinitely, without ever shrinking.


4) When using a separate checkpointing thread, pnLog only goes down if 
the read/write thread has no active queries when the checkpoint runs. 
(Eg, if the checkpoint thread runs while the read/write thread has an 
active query, it'll still checkpoint, but it just won't reset the pnLog 
to 0.)



Finally (and I'm not sure if this is related): I've read in several 
threads that you can sometimes get the SQLITE_CORRUPT error even if the 
database isn't corrupt, and that the proper response is to just retry.


5) Is this considered to be a bug, or is it the result of some temporary 
hardware issue?


6) Is this aggravated by the use of WAL, or totally unrelated?


Some data about our setup is at the end of this email; I welcome your 
advice.  Thanks!


-david

--
We have a 3.7.7.1 database named "main.db", which is in WAL mode, with a 
corresponding "main.db-wal" file.  The main.db file is 13519440896 bytes 
(about 13GB) while the WAL file is 25439184 bytes (about 25MB).  Both 
threads open the database with "SQLITE_OPEN_READWRITE | 
SQLITE_OPEN_CREATE | SQLITE_OPEN_NOMUTEX" and SQLITE_CONFIG_MULTITHREAD, 
sqlite3_enable_shared_cache(0), and set "PRAGMA journal_mode = WAL;"  We 
checkpoint the main.db database every 15 seconds using 
SQLITE_CHECKPOINT_PASSIVE, outputting pnLog and pnCkpt each time, with 
the results looking something like:


 298  298
 263  263
 483  483
 943  943
1171 1171
1653 1653
1947 1947
 212  212
 436  436
1011 1011
1277 1277
1611 1611
2035 2035
2408 2408
2703 2703
3128 3128
 548  548
 858  858
1338 1338
1839 1839
 386  386
 493  493
 679  679
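
For reference, the checkpointer thread is essentially doing the following
every 15 seconds (a trimmed sketch; the real code has more error handling):

#include <sqlite3.h>
#include <stdio.h>

/* One pass of the checkpointer: passive checkpoint, then log the counters. */
static void checkpoint_once(sqlite3 *db)
{
    int nLog = -1;   /* total frames currently in the WAL (pnLog)    */
    int nCkpt = -1;  /* frames moved into the database file (pnCkpt) */
    int rc = sqlite3_wal_checkpoint_v2(db, "main", SQLITE_CHECKPOINT_PASSIVE,
                                       &nLog, &nCkpt);
    if (rc != SQLITE_OK) {
        fprintf(stderr, "checkpoint: %s\n", sqlite3_errmsg(db));
        return;
    }
    printf("%d\t%d\n", nLog, nCkpt);
}

void checkpoint_loop(sqlite3 *db)
{
    for (;;) {
        checkpoint_once(db);
        sqlite3_sleep(15000);  /* checkpoint every 15 seconds */
    }
}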


Re: [sqlite] Write-ahead logging and database locking

2010-08-10 Thread David Barrett
On 08/08/2010 10:09 PM, Dan Kennedy wrote:
>>> 2) However, one process cannot read from the database while another
>>> is writing -- WAL is irrelevant here.
>>
>> Unless shared-cache mode is turned on, multiple threads each using
>> their own sqlite3* connection should behave in the same way as
>> multiple processes do wrt to sqlite locking.
>
> I should be clearer: The above was meant to imply that (2)
> is not a true statement. The others are all correct.

Very interesting!  To confirm I understand, if shared-cache mode is 
enabled, then one process can read while another process is writing.

Also, I see in the documentation that when shared-cache mode is enabled, 
SQLite uses table-level locking (instead of the default file-locking). 
Taken all together, it suggests that you can get table-level locking 
*and* write-ahead logging *and* atomic multi-table commits -- all within 
a single file -- simply by enabling shared cache mode.

Am I reading this correctly, or does shared-cache table-level locking 
still require that each table be put in different files as described here:

http://www.sqlite.org/version3.html

(If so, then it means the only way to get table-locking with WAL is to 
put the tables in different database files, but then WAL "disadvantage 
#3" says you lose atomicity.)

Thanks Dan, I really appreciate your help!

-david


Re: [sqlite] Backup API, .backup command, and writing to handle live backups?

2010-08-10 Thread David Barrett
On 08/08/2010 10:00 PM, Dan Kennedy wrote:
>
> On Aug 9, 2010, at 11:17 AM, David Barrett wrote:
>> 3) When an application performs read/write queries on the database in
>> parallel to the .backup command being run on the database, will the
>> application occasionally get the SQLITE_LOCKED return value to the
>> sqlite3_exec() command?  I think the answer is "yes".
>
> The answer to (3) would be "No." with a WAL mode database.

Ah, got it.  But to clarify, the application will occasionally get 
SQLITE_BUSY if the database is not in WAL mode, right?  Thanks!

-david


[sqlite] Write-ahead logging and database locking

2010-08-08 Thread David Barrett
I'm reading up on the new write-ahead logging feature, and I'm unclear 
on one point: does WAL only help alleviate multi-threaded locking (by 
allowing other threads to continue reading while one is writing), or 
does WAL also help between multiple processes?

My understanding could be completely wrong, but I think the following 
statements are true -- can you help fact check me?  (I've read 
http://www.sqlite.org/lockingv3.html but it doesn't seem to have been 
updated for WAL -- assuming any updates are even needed.)

1) Two processes can read from the same database at the same time.

2) However, one process cannot read from the database while another is 
writing -- WAL is irrelevant here.

3) One process can have two threads that read from the database at the 
same time.

4) And thanks to WAL, now one process can have one thread writing and 
any number of threads reading at the same time.

5) However, WAL does not allow multiple threads to write at the same time.

Is this right?  Or am I totally off?  Are there any good pages outlining 
the intersection between multi-process database locking and WAL?  Thanks!

-david


[sqlite] Backup API, .backup command, and writing to handle live backups?

2010-08-08 Thread David Barrett
I *think* I know the answers these questions, but can you please confirm 
them for me?

1) Does the sqlite3 command-line app .backup command use the 
http://www.sqlite.org/backup.html API, and thus only read-lock the 
database for brief moments (rather than read-locking the database for 
the entire span of the backup)?  I think the answer is "yes".

2) When an application performs read-only selects on the database in 
parallel to the .backup command being run on the database, will the 
application occasionally get the SQLITE_LOCKED return value to the 
sqlite3_exec() command?  I think the answer is "no".

3) When an application performs read/write queries on the database in 
parallel to the .backup command being run on the database, will the 
application occasionally get the SQLITE_LOCKED return value to the 
sqlite3_exec() command?  I think the answer is "yes".

4) Is there any built-in way to identify what other process has locked 
the database when you get the SQLITE_LOCKED return value?  I think the 
answer is "no"

5) Are any of the above answers affected in any way by the new 
write-ahead logging (WAL) feature?  I think the answer is "no".

6) Is there anything an application can do to somehow coordinate with or 
otherwise signal to the .backup script so as to avoid or minimize 
locking conflicts?  I think the answer is "no".


Basically, I'm trying to figure out the best way to do live backups of a 
database that's in active read/write use.  Right now I'm just doing a 
copy of the live file and then running .dump on the copy, but that's 
obviously not ideal and occasionally has corruption issues (especially 
as load increases).  I'm considering switching to .backup, but I'd like 
to make sure the application is prepared to handle SQLITE_LOCKED 
correctly.  Currently it just retries after a random timeout, which 
isn't ideal either.  What's the best way to handle this case?  Thanks!
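
For reference, the backup-API loop I'm considering (modelled on the
example in backup.html) looks roughly like this -- a sketch, not yet
tested under our load:

#include <sqlite3.h>

/* Copy the "main" database of pSrc into the file zDest in small steps,
   sleeping between steps so writers on pSrc are not starved. */
int live_backup(sqlite3 *pSrc, const char *zDest)
{
    sqlite3 *pDst = 0;
    int rc = sqlite3_open(zDest, &pDst);
    if (rc == SQLITE_OK) {
        sqlite3_backup *p = sqlite3_backup_init(pDst, "main", pSrc, "main");
        if (p) {
            do {
                rc = sqlite3_backup_step(p, 100);   /* copy up to 100 pages */
                if (rc == SQLITE_OK || rc == SQLITE_BUSY || rc == SQLITE_LOCKED) {
                    sqlite3_sleep(250);             /* yield to other connections */
                }
            } while (rc == SQLITE_OK || rc == SQLITE_BUSY || rc == SQLITE_LOCKED);
            sqlite3_backup_finish(p);               /* frees the backup handle */
        }
        rc = sqlite3_errcode(pDst);
    }
    sqlite3_close(pDst);
    return rc;
}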

-david


Re: [sqlite] sqlite3_total_changes() doesn't include schema changes?

2008-10-30 Thread David Barrett
Ah, thanks.  I think I'll wait for the next stable release and go to 
that; in the meantime I'll use the WHERE 1 trick.  Thanks!

-david

Dan wrote:
> On Oct 30, 2008, at 3:10 PM, David Barrett wrote:
> 
>> Ok, getting close: now I test for changes in sqlite3_total_changes()  
>> and
>> PRAGMA schema_version and that works, except for one case:
>>
>>  DELETE FROM table;
>>
>> I see in the docs for sqlite_total_changes() that I can solve this  
>> by doing:
>>
>>  DELETE FROM table WHERE 1;
>>
>> Is there any other way to detect the change without losing that
>> optimization?  Thanks!
> 
> This has been changed in cvs. In version 3.6.5 sqlite3_change()
> and total_changes() will correctly report the number of rows deleted
> by "DELETE FROM table".
> 
> So upgrading to cvs head would work.
> 
> Dan.
> 
> 
> 
>> -david
>>
>> Igor Tandetnik wrote:
>>> David Barrett <[EMAIL PROTECTED]> wrote:
>>>> If so, is there any general way to determine -- given an arbitrary
>>>> query -- whether or not it changed the database?
>>> Run PRAGMA schema_version before and after.
>>>
>>> Igor Tandetnik
>>>
>>>
>>>


Re: [sqlite] sqlite3_total_changes() doesn't include schema changes?

2008-10-30 Thread David Barrett
Ok, getting close: now I test for changes in sqlite3_total_changes() and 
PRAGMA schema_version and that works, except for one case:

DELETE FROM table;

I see in the docs for sqlite_total_changes() that I can solve this by doing:

DELETE FROM table WHERE 1;

Is there any other way to detect the change without losing that 
optimization?  Thanks!
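
For reference, the check I'm wrapping around each query looks roughly
like this (a sketch):

#include <sqlite3.h>

/* Read "PRAGMA schema_version" (SQLite bumps it on any schema change). */
static int schema_version(sqlite3 *db)
{
    sqlite3_stmt *stmt = 0;
    int v = -1;
    if (sqlite3_prepare_v2(db, "PRAGMA schema_version;", -1, &stmt, 0) == SQLITE_OK
        && sqlite3_step(stmt) == SQLITE_ROW) {
        v = sqlite3_column_int(stmt, 0);
    }
    sqlite3_finalize(stmt);
    return v;
}

/* Execute zSql and report whether it modified either data or the schema. */
int query_changed_db(sqlite3 *db, const char *zSql)
{
    int changesBefore = sqlite3_total_changes(db);
    int schemaBefore = schema_version(db);
    sqlite3_exec(db, zSql, 0, 0, 0);
    return sqlite3_total_changes(db) != changesBefore
        || schema_version(db) != schemaBefore;
}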

-david

Igor Tandetnik wrote:
> David Barrett <[EMAIL PROTECTED]> wrote:
>> If so, is there any general way to determine -- given an arbitrary
>> query -- whether or not it changed the database?
> 
> Run PRAGMA schema_version before and after.
> 
> Igor Tandetnik 
> 
> 
> 


[sqlite] sqlite3_total_changes() doesn't include schema changes?

2008-10-28 Thread David Barrett
Am I correct in understanding sqlite3_total_changes() doesn't include 
changes to the "master" tables, and thus won't reflect schema changes?

If so, is there any general way to determine -- given an arbitrary query 
-- whether or not it changed the database?

Basically, I'm creating a replay log and would like to filter out 
everything that doesn't change the database.  sqlite3_total_changes() 
appears to work as expected for everything except schema changes.  Is 
there any more general way to detect changes that would also catch 
schema changes?

Checking the modified date of the database might be one way, but that 
feels like a hack.  Any other ideas?

Thanks!

-david



[sqlite] Replay logging best practices

2008-10-26 Thread David Barrett
What's the right way to use update/commit/rollback hooks to produce a 
replay log?

Currently I'm doing it at a high level by just recording all SQL 
statements into a replay log, and that works really well except that it fails in 
some cases like with the use of CURRENT_TIMESTAMP.  (Replaying that will 
insert at the time of replay, and not the time of original insert.)

I'm considering switching to a low-level replay log using the hook 
functions.  What's the recommended way to do this?

In particular:

1) How do I hook schema and index changes?  The docs say the update hook 
ignores changes to system tables.

2) Is there any way to determine which columns of a row changed?

3) Replaying INSERT/DELETE is straightforward, where the replay log 
would simply contain something like:

DELETE FROM <table> WHERE rowid=<rowid>;

And:
INSERT INTO <table> VALUES ( <value1>, ... );

But what's the best way to replay an UPDATE?  If I can't determine which 
columns were modified, is there any choice but to update everything?  If 
so, is there any more compact way to do this than:

UPDATE <table> SET <col1>=<value1>, <col2>=<value2>, ... WHERE rowid=<rowid>;

My tables are rather wide, so this will be pretty inefficient, 
especially if I were to update a single column of an entire table.
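
For context, the hook registration I'm experimenting with looks like this
(a sketch; the part that actually writes the replay log is elided):

#include <sqlite3.h>
#include <stdio.h>

/* Called once per row changed by INSERT/UPDATE/DELETE (system tables are
   skipped by SQLite, which is exactly the limitation from question 1). */
static void on_row_change(void *arg, int op, const char *zDb,
                          const char *zTable, sqlite3_int64 rowid)
{
    const char *what = (op == SQLITE_INSERT) ? "INSERT"
                     : (op == SQLITE_DELETE) ? "DELETE" : "UPDATE";
    /* The real code would append an entry to the replay log here. */
    printf("%s on %s.%s rowid=%lld\n", what, zDb, zTable, (long long)rowid);
    (void)arg;
}

void install_replay_hook(sqlite3 *db)
{
    sqlite3_update_hook(db, on_row_change, NULL);
}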

Thanks for your suggestions!

-david



Re: [sqlite] Confirming locking and transaction characteristics

2008-05-14 Thread David Barrett
Igor Tandetnik wrote:
> David Barrett <[EMAIL PROTECTED]> wrote:
>> There is a single table (bar) with a single column (foo) with a single
>> row containing the integer value "1".  Two processes (A and B) have
>> opened the database in READWRITE mode.  They both try to atomically
>> increment the value at the same time.  What happens?
>>
>> 1) [Process A] BEGIN TRANSACTION
>> 2) [Process B] BEGIN TRANSACTION
>> 3) [A] SELECT foo FROM bar
>> (A acquires a SHARED lock)
>> 4) [B] SELECT foo FROM bar
>> (B acquires a SHARED lock)
>> 5) [A] UPDATE bar SET foo=2
>> (A acquires the RESERVED lock)
>> 6) [B] UPDATE bar SET foo=2
>> (? query fails, B's transaction aborted, SHARED lock released?)
> 
> SQLite detects deadlock situation. The call trying to execute the update 
> statement fails immediately with SQLITE_BUSY error. Neither transaction 
> releases its locks: to make progress, one of them has to explicitly roll 
> back.

Aha, got it.  One more question: take the same scenario as before, but 
now have two distinct tables (barA and barB) inside the same database, 
and have each process only deal with its own table:

1) [Process A] BEGIN TRANSACTION
2) [Process B] BEGIN TRANSACTION
3) [A] SELECT foo FROM barA
(A acquires a SHARED lock)
4) [B] SELECT foo FROM barB
(B acquires a SHARED lock)
5) [A] UPDATE barA SET foo=2
(A acquires the RESERVED lock)
6) [B] UPDATE barB SET foo=2
(returns SQLITE_BUSY?)

Would the UPDATE in (6) still return SQLITE_BUSY, and would one of the 
processes still need to roll back for the other to continue?

I guess this is really a question of whether SQLite supports table 
locking or only database locking (and I'm guessing it's the latter, 
meaning the two disjoint tables should be put in different databases so 
they are independently locked).


> Why not simply execute "UPDATE bar SET foo=foo+1;"?

Heh, just looking for a simple example of a multi-query transaction.


Thanks for your help!

-david



[sqlite] Confirming locking and transaction characteristics

2008-05-14 Thread David Barrett
Can you please double-check the following to make sure it's accurate?

I've read the "File Locking and Concurrency" page and I'm trying to 
verify how SQLite works in the following scenario:

There is a single table (bar) with a single column (foo) with a single 
row containing the integer value "1".  Two processes (A and B) have 
opened the database in READWRITE mode.  They both try to atomically 
increment the value at the same time.  What happens?

1) [Process A] BEGIN TRANSACTION
2) [Process B] BEGIN TRANSACTION
3) [A] SELECT foo FROM bar
(A acquires a SHARED lock)
4) [B] SELECT foo FROM bar
(B acquires a SHARED lock)
5) [A] UPDATE bar SET foo=2
(A acquires the RESERVED lock)
6) [B] UPDATE bar SET foo=2
(? query fails, B's transaction aborted, SHARED lock released?)
7) [A] COMMIT
(A gets the EXCLUSIVE lock, writes, clears the lock)

Is this right?  In particular, in (6) does B's query fail because there 
is already a RESERVED lock held by A?  Or is there some other system 
that notices B's conflicting transaction and aborts it?

Or do I completely misunderstand this, and do both transactions succeed 
(leaving the value as "2" rather than "3")?

Just trying to sort this all out in my head.  Thanks!
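
One variation I'm considering, to sidestep the interleaving above
entirely, is to take the write lock up front with BEGIN IMMEDIATE and
rely on a busy timeout (a sketch; I'm not sure it's the idiomatic
pattern):

#include <sqlite3.h>

/* Increment bar.foo by taking the RESERVED lock up front (BEGIN IMMEDIATE),
   so two writers serialize instead of both holding SHARED and then failing
   on the lock upgrade. */
int increment_foo(sqlite3 *db)
{
    int rc;
    sqlite3_busy_timeout(db, 5000);  /* wait up to 5 seconds for the other writer */
    rc = sqlite3_exec(db, "BEGIN IMMEDIATE;", 0, 0, 0);
    if (rc != SQLITE_OK) return rc;
    rc = sqlite3_exec(db, "UPDATE bar SET foo = foo + 1;", 0, 0, 0);
    if (rc == SQLITE_OK) {
        rc = sqlite3_exec(db, "COMMIT;", 0, 0, 0);
    } else {
        sqlite3_exec(db, "ROLLBACK;", 0, 0, 0);
    }
    return rc;
}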

-david



Re: [sqlite] Undo logs and transactions

2008-05-13 Thread David Barrett
Yes, I completely agree.  But a single "undoable event" generates 
multiple changes to the database (which might as well be done inside a 
single transaction).  Thus one click on Undo results in multiple changes 
to undo the event (ie, undoing the entire event's transaction).  This 
implies that somehow the undo log either:

- Tags multiple rows in the undo log as comprising a single event, or
- Accumulates multiple queries into a single row in the undo log

I really liked the elegance of using triggers to automatically create 
the undo log, but I don't see how to accomplish either of the above 
strategies (tagging a group of undo rows, or concatenating multiple 
undos into a single row) in that design without some manual legwork.

So my question was whether there is some "COMMIT_COUNT()" SQL function 
or constant that represents "the number of committed transactions on 
this database", as then I could just stick that in the trigger query and 
insert it along with the undo query into the undo log as follows:

CREATE TEMP TRIGGER _ex1_it AFTER INSERT ON ex1 BEGIN
INSERT INTO undolog VALUES(NULL,COMMIT_COUNT(),'DELETE FROM ex1 WHERE 
rowid='||new.rowid);
END;

Then multiple INSERTs in a single transaction would generate multiple 
DELETEs in the undolog, all tagged with the same commit count.  Later, I 
could look up the set of queries necessary to undo that transaction with:

SELECT sql FROM undolog WHERE commitcount=XXX;

The upshot is I could roll back one transaction at a time, all the way 
back to the beginning (or as far as the undo log goes).

But lacking a COMMIT_COUNT() function, I might instead just create a 
simple table:

CREATE TABLE commitcount(num)

And then increment that at the start of each transaction:

UPDATE commitcount SET num=(SELECT num FROM commitcount)+1;

And then my trigger could be something like:

CREATE TEMP TRIGGER _ex1_it AFTER INSERT ON ex1 BEGIN
INSERT INTO undolog VALUES(NULL,(SELECT num FROM commitcount),'DELETE 
FROM ex1 WHERE rowid='||new.rowid);
END;

This'll do the same thing, but requires a subquery, an extra table, and 
manual incrementing of the commit count.  Works, but is yucky.

Another way might be to create a table that just keeps track of which 
range in the undolog corresponds to which distinct undoable event (aka 
transaction)... Perhaps a bit cleaner, but still not great.

Anyway, I was just curious if the COMMIT_COUNT() function existed, 
because then it'd be really clean and easy.  But it sounds it doesn't, 
and that's what I wanted to know.  Thanks!
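
One way I might fake it in application code is to register my own
COMMIT_COUNT() SQL function backed by a commit hook, so the triggers can
call it directly (a sketch, assuming a single write connection):

#include <sqlite3.h>

static sqlite3_int64 g_commits = 0;

/* SQL function COMMIT_COUNT(): the number of commits seen on this connection. */
static void commit_count_func(sqlite3_context *ctx, int argc, sqlite3_value **argv)
{
    (void)argc; (void)argv;
    sqlite3_result_int64(ctx, g_commits);
}

/* Commit hook: bump the counter every time a transaction commits.
   Returning 0 lets the commit proceed normally. */
static int on_commit(void *unused)
{
    (void)unused;
    g_commits++;
    return 0;
}

/* Must be called on the same connection that fires the undo-log triggers. */
int register_commit_count(sqlite3 *db)
{
    int rc = sqlite3_create_function(db, "COMMIT_COUNT", 0, SQLITE_UTF8, 0,
                                     commit_count_func, 0, 0);
    if (rc != SQLITE_OK) return rc;
    sqlite3_commit_hook(db, on_commit, 0);
    return SQLITE_OK;
}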

-david

D. Richard Hipp wrote:
> On May 13, 2008, at 6:21 PM, David Barrett wrote:
> 
>> True, but even an application would need to "undo" in transactions,  
>> I'd
>> think.  Like, if a user drags a widget from column A to B, it  
>> generates
>> an INSERT in one column and a DELETE in another.  Pressing Undo once
>> would leave it in both columns, which is probably unexpected.
>>
> 
> And undo/redo mechanism typically groups actions together by user  
> input event.  The application is typically event driven.  It sits idle  
> waiting for a user event, such as a mouse click.  That event might  
> trigger a cascade of related events, some of which involve database  
> changes.  All database changes associated with that one event are  
> undone together.  You know that you have reached the end of your event  
> cascade when the event loop goes idle again.
> 
> For processing gestures (drags, and other inputs that involve multiple  
> events spread out over time) you acquire a lock or a "grab" at the  
> beginning of the gesture, hold it throughout the gesture, then release  
> it when the gesture completes.  You do not finish the undo package  
> until the end of the gesture.  So the complete rule for when you stop  
> one undoable entry and start a new one is: you are idle (no pending  
> events) and you are not holding a grab.
> 
> D. Richard Hipp
> [EMAIL PROTECTED]
> 
> 
> 


Re: [sqlite] Undo logs and transactions

2008-05-13 Thread David Barrett
True, but even an application would need to "undo" in transactions, I'd 
think.  Like, if a user drags a widget from column A to B, it generates 
an INSERT in one column and a DELETE in another.  Pressing Undo once 
would leave it in both columns, which is probably unexpected.

Anyway, I was more thinking of a "transaction count" sort of variable, 
indicating how many total transactions have been committed to the 
database.  Just something to group undo log entries by transaction 
without maintaining a manual transaction identifier.

Thanks for considering it!

-david

Ken wrote:
> the Undo/Redo mechanism described is not really for transactions but rather a 
> "Button" within an application. So that the user can undo/redo changes.
> 
> Only one transaction can be active at a time. So to the best of my knowledge 
> there is no "transaction number" as there can be only 1. Now you may go ahead 
> and implement "pseudo transactions" and implement your own transaction 
> number.  etc...
> 
> HTH,
> Ken
> 
> 
> David Barrett <[EMAIL PROTECTED]> wrote: What's the best way to group undo 
> log entries by transaction?  Is there 
> a function that returns the current transaction number?  Or what's the 
> best way to set a global variable that is inserted into the undo log by 
> the trigger?
> 
> As background, the wiki has a great page on undo/redo:
> 
>  http://www.sqlite.org/cvstrac/wiki?p=UndoRedo
> 
> But unless I misunderstand, it seems that a single transaction will 
> create multiple entries in the log.  For example, a transaction 
> containing both an insert and a delete will generate two entries in the 
> undo log.  Therefore, to undo that transaction, I need to atomically 
> commit the "undo" of both the insert and delete in a single transaction.
> 
> That's fine, but the log created by the example doesn't seem to indicate 
> which group of undo rows correspond to a single transaction.  Indeed, 
> the word "transaction" doesn't even appear in the page.
> 
> Now, digging through the archives I see reference to a "transaction 
> number" here:
> 
> http://www.mail-archive.com/sqlite-users@sqlite.org/msg23962.html
> 
> Is there any explicit way to get the current transaction number?  I was 
> hoping to see some "transaction_number()" function here, but I don't:
> 
>  http://www.sqlite.org/lang_corefunc.html
> 
> If such a function did exist, then I could just update the triggers to 
> insert that along with the corresponding SQL.
> 
> Lacking that, what's the best way to manually to set some kind of global 
> variable at the start of each transaction?  One way is to maintain a 
> separate table with a single cell containing the latest transaction 
> number (which I manually increment at the start of each transaction) and 
> then just SELECT that in the trigger sub-query, but that seems yucky. 
> Is there an easier way I'm missing?
> 
> Thanks!
> 
> -david
> 
> 


Re: [sqlite] Distributed transaction best practices

2008-05-12 Thread David Barrett
Ah, it sounds like we're building essentially the same thing -- mine is 
also for a financial setting.  You're right, a mirror-replicating mode 
might be adequate, and is certainly simpler...  But part of me is afraid 
that it leaves you vulnerable to failure conditions that a full 
two-phase commit system wouldn't.

For example, I'd like the master to actively reject all transactions 
that aren't committed to at least one slave, else we're operating 
without a net -- if the master is lost during such a period, there is 
unrecoverable data loss.  (And even if the data eventually becomes 
available when the master comes back, until then the backup is 
incomplete and thus putting up a new server is risky.)  I'd rather take 
the service down than risk getting stuck with a huge bill due to an 
accounting error.

As I mentioned, I'm currently just using a serialized two-phase commit 
approach, and that's fine but has very limited throughput. 
Realistically, even that throughput is probably fine to start, but I'd 
like to at least understand the alternatives in case I want to go there 
in the future.

It seems like one alternative might involve undo/redo logs, but I'm not 
quite seeing it.

How does MySQL or Postgres do it?  Does SQLite simply lack some crucial 
feature that is required to do it for real?

-david

Virgilio Alexandre Fornazin wrote:
> The best you can do actually with SQLite is a 'mirror-replicating' mode
> engine
> that works like Microsoft Windows Active Directory Database, to build a kind
> of 
> High Availability / Load Balancing server.
> 
> You have a farm of servers (or workstations, etc) receiving SQL commands by
> a 
> channel (socket, pipe, whatever). They force an 'election' to decide what's
> the
> best servers to become master, then they agree on that and transactions (all
> 
> commands that modify the database) are first applied to the master server
> then
> 'mirrored' to slave servers. In this design, if a 'child' server cannot
> complete
> the transaction that are completed by master and any other server, there is
> a
> critical problem in that slave node, and you must consider it offline or
> some kind of state that you need to check if by hand or by one tool you
> might
> develop for this. In this scenario, you can do a 'load balance' in SELECTs,
> distributing queries among all servers, creating affinities for tables, 
> buffering most used tables in a memory database, etc.
> 
> I'm currently implementing services for financial stock exchange services
> that
> works in the way I told you, if you are planning something that we can have
> in the
> way SQLite is (not tied to any kind of restrictive license), we can share
> knowledge 
> and implement a solution like that. (PS: I don't have full time to work on
> it, but 
> I can help in free hours)
> 
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Griggs, Donald
> Sent: segunda-feira, 12 de maio de 2008 10:47
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Distributed transaction best practices
> 
> Hi David,
> 
> 
> Regarding:  "What are the recommended "best practices" around using
> SQLite in a distributed scenario?"  [two-phase commit, etc.]
> 
> I trust that someone with some actual relevant knowledge will reply to
> your query later, but I imagine that many would say the "recommended
> best practice" is *not* to use sqlite, since sqlite was designed to be
> an elegant embedded database -- without even one server -- let alone
> multiple synchronized ones.
> 
> I take it you have strong reasons for rejecting, say, Postgres, which
> now implements two-phase commit right out of the box?  
>  
> http://www.postgresql.org/docs/current/static/sql-prepare-transaction.ht
> ml
> 
> You may already know everything in articles such as this one
>  
> http://en.wikipedia.org/wiki/Two_phase_commit#Distributed_two-phase_comm
> it_protocol
> And its references (I don't claim to), but I'm listing it here just in
> case it's helpful to you.
> 
> On the other hand, if you *do* develop a solid "distributed sqlite"
> implementation, I'm sure others would be interested.
> 
> Regards,
>   Donald Griggs
> 
>  
> 
> 

[sqlite] Distributed transaction best practices

2008-05-12 Thread David Barrett
What are the recommended "best practices" around using SQLite in a 
distributed scenario?

(I saw Shailesh's email in March; I'm considering building a two-phase 
commit layer atop SQLite nodes and looking for advice.)

Clearly, this is a big topic including failover, recovery, 
load-balancing, and so on.  But right now I'm most curious about the 
transaction layer.

The most obvious approach is to just process all transactions with a 
basic serialized two-phase commit protocol:

1) Master begins transaction in the local database
2) Master sends query to all slaves
3) Slaves begin transaction in their local database
4) Slaves send approval to the master
5) Master commits to the local database
6) Master sends commit command to all slaves
7) Slaves commit the transaction
8) Goto 1

This works great and is what I have now.
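
In code, the master's side of that loop is roughly the following (a
sketch; send_to_slaves and wait_for_all_approvals are hypothetical
stand-ins for our messaging layer):

#include <sqlite3.h>

/* Placeholders for our real messaging layer (names are hypothetical). */
static int send_to_slaves(const char *type, const char *sql)
{
    (void)type; (void)sql;
    return 1;  /* pretend the broadcast succeeded */
}
static int wait_for_all_approvals(void)
{
    return 1;  /* pretend every slave approved */
}

/* One serialized two-phase-commit round on the master (no error recovery). */
int replicate_one(sqlite3 *db, const char *sql)
{
    sqlite3_exec(db, "BEGIN;", 0, 0, 0);        /* 1) master begins locally    */
    sqlite3_exec(db, sql, 0, 0, 0);
    send_to_slaves("QUERY", sql);               /* 2) send the query to slaves */
    if (!wait_for_all_approvals()) {            /* 3-4) slaves begin + approve */
        sqlite3_exec(db, "ROLLBACK;", 0, 0, 0);
        send_to_slaves("ROLLBACK", 0);
        return 0;
    }
    sqlite3_exec(db, "COMMIT;", 0, 0, 0);       /* 5) master commits           */
    send_to_slaves("COMMIT", 0);                /* 6-7) slaves commit          */
    return 1;
}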

But this can only process one transaction at a time, and the commit 
speed is no faster than the slowest slave latency.  I'm curious if you 
can recommend a better way.  (I'm sure this is well documented 
somewhere; any suggestions?)

For example, it seems that transactions could be "pipelined" to decouple 
throughput from latency: after all, barring edge conditions, every slave 
should approve every transaction.  (And any failed transaction would 
probably result in all future transactions failing as well -- eg disk 
fail/full/corrupt.)

So the master could send a constant stream of transactions, and the 
slave could send a constant stream of approvals, and everybody is happy. 
The only tricky part is the rare case where the slave actually does need 
to abort a transaction.

Assuming there are 10 transactions that haven't yet been committed, the 
slave could commit the first 3 and then abort the remaining 7.  This 
would look like:

1) Master begins 10 transaction in the local database
2) Master sends 10 queries to slave
3) Slave begins 3 transactions successfully, but the next 7 fail.
4) So the slave send 3 approval messages, and then 7 abort messages
5) Master commits the first 3, and then rolls back the remaining 7
6) Master sends commit messages for the first 3, and rollbacks for 7
7) Slave commits the first 3, and rolls back the last 7

Seems pretty straightforward in theory.  But my question is how to 
actually accomplish step 5 with SQLite.

I read a big discussion on nested transactions in the list archives from 
a year ago, and I understand they're not currently supported.  But I'm 
not sure what I want to do is actually nested transactions (else I'd 
nest infinitely deep before committing anything).

Rather, I'm thinking of some kind of... serial commits, or savepoints or 
something.  Basically, maintain a set of "savepoints" that I can revert 
the entire database to at any time.  Then I just commit transactions in 
order, and create a new "savepoint" after each.  Later, when I get the 
approval from the slave, I delete the associated savepoint (to save 
resources, presumably).  Alternatively, if I get a rollback from a 
slave, then I just revert to the savepoint before that transaction and 
-- in effect -- roll back every transaction made after it.
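
To make that concrete, the primitive I'm imagining would be used roughly
like this (a hypothetical sketch -- the SAVEPOINT / RELEASE / ROLLBACK TO
statements are illustrative syntax, not something I know today's SQLite
to provide):

#include <sqlite3.h>
#include <stdio.h>

/* Hypothetical sketch: one savepoint per replicated transaction, so the
   master can later release it (slave approved) or roll back to it (slave
   aborted, which also undoes every later transaction). */
void apply_transaction(sqlite3 *db, int txnId, const char *sql)
{
    char buf[64];
    snprintf(buf, sizeof(buf), "SAVEPOINT txn_%d;", txnId);
    sqlite3_exec(db, buf, 0, 0, 0);
    sqlite3_exec(db, sql, 0, 0, 0);
}

void slave_approved(sqlite3 *db, int txnId)
{
    char buf[64];
    snprintf(buf, sizeof(buf), "RELEASE SAVEPOINT txn_%d;", txnId);
    sqlite3_exec(db, buf, 0, 0, 0);   /* keep the work, discard the savepoint */
}

void slave_aborted(sqlite3 *db, int txnId)
{
    char buf[64];
    snprintf(buf, sizeof(buf), "ROLLBACK TO SAVEPOINT txn_%d;", txnId);
    sqlite3_exec(db, buf, 0, 0, 0);   /* revert this transaction and everything after it */
}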

Can you think of any good way to accomplish this?

One horrifically bad way would be to copy the database after each 
transaction.

One less horrible way would be to manually maintain an undo log (just 
create SQL statements that -- when applied in reverse -- gradually roll 
back the database to any previous state).

http://www.sqlite.org/cvstrac/wiki?p=UndoRedo

But I'm curious if there are any better ways to go about it.  I'm 
guessing this is a pretty well-explored area -- anybody who's gone ahead 
care to share their wisdom?

Thanks!

-david
