Re: [sqlite] Causal profiling

2019-12-30 Thread Jonathan Brandmeyer
On Wed, Dec 25, 2019 at 4:25 PM Richard Hipp  wrote:

>
> The video provided details on what they did.  I could not find any
> performance improvement by making mutexes direct calls instead of
> indirect calls.


This kind of optimization improves performance by reducing pressure on the
CPU's branch-target address caching and prediction.


> Maybe they are measuring performance differently than
> me.  (I use CPU cycle counts measured by valgrind.)
>

By default, valgrind doesn't model either branch predictors or
branch-target address caches.  Its model is somewhat primitive[0],
but it is available through the command-line option `--branch-sim`
[1,2].  When you performance-tested this change, did you enable that
option?  tool/run-speed-test.sh certainly doesn't.
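
For reference, enabling the branch simulation looks like this (a
sketch; the benchmark binary and its argument are placeholders for
whatever workload you measure):

```shell
# Cachegrind's branch-predictor model is off by default; enable it
# explicitly.  Callgrind accepts the same option.
valgrind --tool=cachegrind --branch-sim=yes ./speedtest1 test.db

# The summary then reports branch statistics alongside the usual
# instruction and cache counts:
#   Bc / Bcm -- conditional branches executed / mispredicted
#   Bi / Bim -- indirect branches executed / mispredicted
```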

Of course, since those structures are shared globally, the
performance benefit of de-virtualizing any given function call is
highly context-dependent.

Sincerely,
-- 
Jonathan Brandmeyer

[0]: https://valgrind.org/docs/manual/cg-manual.html#branch-sim
[1]:
https://www.valgrind.org/docs/manual/cl-manual.html#cl-manual.options.simulation
[2]: https://valgrind.org/docs/manual/cg-manual.html#cg-manual.cgopts
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Theoretical write performance for low-throughput devices

2019-10-22 Thread Jonathan Brandmeyer
On Mon, Oct 21, 2019 at 5:47 PM Richard Hipp  wrote:
>
> On 10/21/19, Jonathan Brandmeyer  wrote:
> >
> > No significant change.  The target filesystem only caches non-aligned
> > writes, so there usually isn't anything for it to do on fsync anyway.
> >
>
> OK.  I don't have any more ideas at the moment, and without access to
> your code, and your platform, I can't really debug it.  But do please
> know that you should only be getting less than 2x writes.  I suggest
> adding instrumentation and trying to come up with a simplified test
> case.

Should the vLogVfs be picking up writes to both the WAL and the
database proper?  I'm seeing a number of WRITEs in the log equal to
the final number of pages plus a small percentage.  For the case
where the WAL is much larger than the benchmark, the number of
WRITEs in the log is almost equal to the final number of pages in
the database.

Thanks,
--
Jonathan Brandmeyer
PlanetiQ


Re: [sqlite] Theoretical write performance for low-throughput devices

2019-10-22 Thread Jonathan Brandmeyer
On Tue, Oct 22, 2019 at 2:03 AM Wout Mertens  wrote:
>  This probably won't change a thing, but I wonder why you wouldn't set the
> sqlite page size to 2KB? Hopefully this means lots of aligned writes.

At one point, the row blobs were exactly 1024 bytes.  This isn't
great for 4 kB pages, but it is a particularly unfavorable row size
for 2 kB database pages.
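
A quick packing calculation shows why (the per-row and per-page
overhead numbers are rough guesses for illustration, not measured):

```python
row = 1024 + 20                  # blob plus a rough per-row overhead


def usable(page):                # rough page header and slack
    return page - 100


# One row per 2 kB page wastes roughly half the page, while a 4 kB
# page packs three rows at ~75% utilization:
assert usable(2048) // row == 1
assert usable(4096) // row == 3
```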

-- 
Jonathan Brandmeyer


Re: [sqlite] Theoretical write performance for low-throughput devices

2019-10-21 Thread Jonathan Brandmeyer
On Mon, Oct 21, 2019 at 5:00 PM Richard Hipp  wrote:
>
> On 10/21/19, Jonathan Brandmeyer  wrote:
> > On Mon, Oct 21, 2019 at 10:31 AM Richard Hipp  wrote:
> >>
> >> On 10/21/19, Jonathan Brandmeyer  wrote:
> >> > Or, how many times is each page written by SQLite for an insert-heavy
> >> > test?  The answer appears to be "4", but I can only account for two of
> >> > those four.
> >>
> >> Are you measuring writes at the OS-interface layer, or writes at the
> >> hardware layer?  SQLite should issue no more than 2x writes in WAL
> >> mode, and less if the same page is modified more than once.  But who
> >> knows how the OS might be multiplying this to accomplish its own
> >> filesystem consistency.
> >>
> >> Consider compiling the
> >> https://www.sqlite.org/src/file/ext/misc/vfslog.c extension into
> >> SQLite on a workstation and running your code there, to get more
> >> details about everything that SQLite is doing with storage.
> >
> > Thanks for the pointer.  I didn't do this exactly, but instead
> > instrumented SQLite's unixWrite() and my lowest-level NAND driver's
> > block write function on the target hardware to capture summary
> > counters of both events.  The filesystem's own consistency overhead
> > adds up to about 2% additional writes - not nearly enough to account
> > for the missing 2x.  The throughput implied by the low-level counters
> > is consistent with the benchmark results.
> >
>
> What happens if you set "PRAGMA synchronous=OFF".

No significant change.  The target filesystem only caches non-aligned
writes, so there usually isn't anything for it to do on fsync anyway.

> (I'm not suggesting
> that you do this in production - it is just an experiment to try to
> help figure out what is going on.)

Understood.

If it helps, we are doing a couple of other things that are dissimilar
to a typical workstation configuration.  We are using the built-in
"unix-none" VFS and the -DSQLITE_DEFAULT_WAL_SYNCHRONOUS=1 and
-DSQLITE_DEFAULT_LOCKING_MODE=1 compile-time options, along with a
design rule that there is zero or one database connection per file,
and that only one thread accesses that connection at a time.  The
target lacks mmap(), POSIX file locks, and multiple address spaces,
so these seemed like the right settings to use.
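
For concreteness, that configuration amounts to something like the
following (the cross-compiler name and database path are
placeholders; only the two -D options are the real settings
described above):

```shell
# Build sqlite3.c with the two compile-time defaults named above.
arm-none-eabi-gcc -c sqlite3.c \
    -DSQLITE_DEFAULT_WAL_SYNCHRONOUS=1 \
    -DSQLITE_DEFAULT_LOCKING_MODE=1

# The "unix-none" VFS (no file locking) is then requested at open
# time, along the lines of:
#   sqlite3_open_v2("data.db", &db,
#                   SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE,
#                   "unix-none");
```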

Thanks,
-- 
Jonathan Brandmeyer
PlanetiQ


Re: [sqlite] Theoretical write performance for low-throughput devices

2019-10-21 Thread Jonathan Brandmeyer
On Mon, Oct 21, 2019 at 10:31 AM Richard Hipp  wrote:
>
> On 10/21/19, Jonathan Brandmeyer  wrote:
> > Or, how many times is each page written by SQLite for an insert-heavy
> > test?  The answer appears to be "4", but I can only account for two of
> > those four.
>
> Are you measuring writes at the OS-interface layer, or writes at the
> hardware layer?  SQLite should issue no more than 2x writes in WAL
> mode, and less if the same page is modified more than once.  But who
> knows how the OS might be multiplying this to accomplish its own
> filesystem consistency.
>
> Consider compiling the
> https://www.sqlite.org/src/file/ext/misc/vfslog.c extension into
> SQLite on a workstation and running your code there, to get more
> details about everything that SQLite is doing with storage.

Thanks for the pointer.  I didn't do this exactly, but instead
instrumented SQLite's unixWrite() and my lowest-level NAND driver's
block write function on the target hardware to capture summary
counters of both events.  The filesystem's own consistency overhead
adds up to about 2% additional writes - not nearly enough to account
for the missing 2x.  The throughput implied by the low-level counters
is consistent with the benchmark results.

Thanks,
-- 
Jonathan Brandmeyer
PlanetiQ


[sqlite] Theoretical write performance for low-throughput devices

2019-10-21 Thread Jonathan Brandmeyer
Or, how many times is each page written by SQLite for an insert-heavy
test?  The answer appears to be "4", but I can only account for two of
those four.

I'm working on an embedded system that uses a log-structured
filesystem on raw NAND flash.  This is not your typical workstation's
managed flash (SATA/NVMe), or portable managed flash (SD/USB).  It's a
bare-nekkid ONFI-speaking chip.  All reads and writes are one 2kB page
at a time.  There is no readahead, and no write buffering by the
driver or filesystem for page-sized writes.

We got the following performance numbers out of the flash storage:

Streaming reads through the filesystem: 7.5 MB/s.
Streaming writes through the filesystem: 5.4 MB/s.
Single insert performance through SQLite: 0.2 MB/s.
Bulk insert performance through SQLite: 1.3 MB/s, asymptotic for very
large transactions.

I do expect the single-insert performance to be much lower than the
bulk insert performance.

We are using the WAL, and this benchmark includes the cost of a
checkpoint at the end.  Futzing with the WAL autocheckpoint size has
little impact for smaller WAL sizes.  My working assumption is that
using the WAL for an insertion-heavy workload would consume roughly
half of my available throughput due to checkpoint writes.  Indeed, if
the autocheckpoint level is raised high enough that the entire
benchmark fits in the WAL, then I do observe that the write throughput
asymptotically approaches 2.6 MB/s instead of 1.3 MB/s.

That leaves one more factor of two somewhere.
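
The arithmetic behind those factors, as a sanity check (all numbers
are the measurements quoted above):

```python
streaming_write = 5.4  # MB/s, raw filesystem streaming writes
bulk_insert     = 1.3  # MB/s, bulk inserts with WAL checkpointing
wal_only        = 2.6  # MB/s, benchmark fits entirely in the WAL

# Checkpointing writes every page twice (once to the WAL, once to
# the database proper), accounting for one factor of two:
assert abs(wal_only / bulk_insert - 2.0) < 0.1

# Even the WAL-only rate is about half the raw streaming rate,
# leaving a second, unexplained factor of two:
assert abs(streaming_write / wal_only - 2.0) < 0.1
```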

The table in question has the schema:

```
CREATE TABLE IF NOT EXISTS `chunks` (
  `target_id`  INTEGER NOT NULL,
  `chunk_num`  INTEGER NOT NULL,
  `chunk_blob` BLOB NOT NULL,
  PRIMARY KEY(`target_id`, `chunk_num`)
);
```

Other factors that might help understand our workload:
Blobs are a little less than 1 kB each, and we're using the default DB
page size (4 kB).  So I would expect that SQLite would pack about 3
rows per page, leaving some extra for primary keys, field delimiters
and other metadata.  I understand that the composite primary key
implies an index table that goes with the blobs, which implies some
inherent write amplification to account for the index.  Still, my
expectation is that the write throughput added by the index should be
close to the size of the key columns, not the blob columns.  So 2x
still seems too high.
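
A rough estimate of the expected index amplification (the overhead
figures are guesses for illustration, not measured values):

```python
page_size    = 4096  # default database page size, bytes
blob_size    = 1024  # row blobs are a little under this
row_overhead = 20    # guess: record header, varints, cell pointer

# About three rows should pack into each 4 kB table page:
assert (page_size - 100) // (blob_size + row_overhead) == 3

# The index b-tree holds only (target_id, chunk_num) pairs, so its
# write volume should be a few percent of the blob volume --
# nowhere near the missing 2x:
index_entry = 16     # guess: two small integer keys plus overhead
assert index_entry / blob_size < 0.05
```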

Any other ideas?

Thanks,
--
Jonathan Brandmeyer


Re: [sqlite] Should SQLite distinguish between +0.0 and -0.0 on output?

2019-06-12 Thread Jonathan Brandmeyer
Regarding additional uses for the sign of zero:

Branch Cuts for Complex Elementary Functions, or Much Ado About
Nothing's Sign Bit in The State of the Art in Numerical Analysis,
(eds. Iserles and Powell), Clarendon Press, Oxford, 1987.

https://people.freebsd.org/~das/kahan86branch.pdf



On Wed, Jun 12, 2019 at 4:40 PM Richard Hipp  wrote:
>
> On 6/12/19, James K. Lowden  wrote:
> > what kind of computation
> > would lead to a value in memory representing -0,0?
>
>   0.0/-1.0
>
> --
> D. Richard Hipp
> d...@sqlite.org



-- 
Jonathan Brandmeyer
Vice President of Software Engineering
PlanetiQ


Re: [sqlite] Should SQLite distinguish between +0.0 and -0.0 on output?

2019-06-12 Thread Jonathan Brandmeyer
On Wed, Jun 12, 2019 at 12:02 PM David Raymond  wrote:

> https://www.sqlite.org/fileformat2.html#record_format
>
> The storage type of each record is given by an integer. And in the current
> format, all non-negative integers are used.
>

Ouch.  Yes, an additional data type was closer to what I had in mind.


> -Original Message-
> From: sqlite-users  On
> Behalf Of Thomas Kurz
> Sent: Wednesday, June 12, 2019 1:05 PM
> To: SQLite mailing list 
> Subject: Re: [sqlite] Should SQLite distinguish between +0.0 and -0.0 on
> output?
>
> > It would also be a file format change, rendering about 1e12 existing
> > database files obsolete.
>


Re: [sqlite] Should SQLite distinguish between +0.0 and -0.0 on output?

2019-06-12 Thread Jonathan Brandmeyer
IMO, when acting as a storage engine, SQLite should be good to the last
bit.  The sign of zero, the least significant bit of any mantissa, and the
payload bits of NaN should all be returned exactly as they were stored.

The history of almost-but-not-quite-IEEE754 arithmetic has been
convergence toward fully-IEEE754 pretty much everywhere.  People who
are serious about floating-point rely on its features.  Even though
the community of serious users is quite a bit smaller than the
community of casual floating-point users, the burden of maintaining
two sets of semantics means that you're better off just supporting
IEEE754.

For an SQL engine, the next-best thing to strict binary IEEE754 is
not sloppy binary IEEE754; it's probably strict decimal IEEE754.


On Wed, Jun 12, 2019 at 7:35 AM Richard Hipp  wrote:

> IEEE754 floating point numbers have separate representations for +0.0
> and -0.0.  As currently implemented, SQLite always display both
> quantities as just "0.0".
>
> Question:  Should SQLite be enhanced to show -0.0 as "-0.0"?  Or,
> would that create unnecessary confusion?
>
> --
> D. Richard Hipp
> d...@sqlite.org


-- 
Jonathan Brandmeyer
Vice President of Software Engineering
PlanetiQ


Re: [sqlite] Please help me fix the SQLite Git mirror

2019-04-22 Thread Jonathan Brandmeyer
On Mon, Apr 22, 2019 at 12:22 PM Richard Hipp  wrote:

> But before I proceed, I would like to better understand how rewiring
> the refs this way constitutes "changing history".  The refs/heads
> entries are all ephemeral - they are constantly changing on their own,
> and no historical record of their past values is retained.  So if I
> modify the refs to synchronize with the canonical Fossil repository,
> how is that changing history, exactly?
>

Certainly no new SHAs were created, so this is a much less obvious
rewrite than if you had performed a rebase of some kind.
Nonetheless, I claim that this constitutes rewriting history because
it has a similar impact on downstream users.  Some user-visible
symptoms, after a user had already synchronized to the master which
was later abandoned:

- From a context of master, `git pull` alone would construct a merge
commit between the abandoned branch and the new master.  `git pull
--ff-only` would fail.

- From a context of a custom patch series, `git rebase master` has
unexpected effects, in that it also rebases the mistake you tried to
orphan.

- `git fetch` shows a forced update to origin/master.

- A user who was using a merge-based workflow and had merged to your
mistake branch would have a rough time following the change in branch
name.  One method would be to construct a new merge to the newly
corrected master and then rebase any of their subsequent changes onto
the new merge commit.  Their workflow is no longer strictly
merge-based, and they still have to deal with the impacts of rewriting
their history.  Alternatively, they could construct the inverse of the
mistake via `git revert` onto their own working branch and then merge
again against the new master.

These user-visible impacts and the recovery actions are almost the
same as what a Git user would see if you had initially constructed (A,
B, C, D) and rewritten it to be (A, C', D') instead via a rebase.

IMO, the proper corrective action after pushing the commit with a
mistake in it would have been to commit the inverse of the mistake and
then merge it to the alternate path.  Yes, it would have constructed a
merge commit in the history, which is unfortunate when you are trying
to maintain a clean and linear history.  But the impact to downstream
users would have been negligible.  `git pull --ff-only` would have
Just Worked, `git rebase master` from a patch series would have Just
Worked, and a merge-based workflow would have Just Worked, too.
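
The revert-based recovery described above, as a minimal throwaway
demonstration (repository contents, identity, and messages are made
up):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# Use a throwaway identity so the demo is self-contained.
gitc() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

echo good > file.txt
git add file.txt
gitc commit -qm "a good commit"
echo bad >> file.txt
gitc commit -qam "the mistake"

# Publish the inverse of the mistake instead of rewriting refs:
gitc revert --no-edit HEAD
cat file.txt   # prints: good
```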

--
Jonathan Brandmeyer


Re: [sqlite] Please help me fix the SQLite Git mirror

2019-04-22 Thread Jonathan Brandmeyer
```
# Construct the matching branch name
git branch mistake 9b888fc
# Push the name alone to the remote
git push -u origin mistake
# Move the name of master
git checkout master && git reset --hard 
# Push the new name of master
git push --force
```

`git reset --hard` moves the name of the current working branch to
another commit, which is why you need to first check out the branch
being moved: it's context-sensitive.  You are rewriting history,
though.  It shouldn't construct any new SHAs, but the impact on a
downstream user's workflow is rough.  Once the mistake was published
to the public Git mirror, the least impactful way forward would have
been to construct the inverse of the mistake and push that as its
own commit instead of orphaning it.  `git revert` does this in
git-land.

If I'm maintaining some patches against your master, then my normal
workflow might be to rebase them against the current master every
once in a while, with just `git rebase master`.  If I did that once
against the SHA that was then named `master`, and then invoke `git
rebase master` again after your change to history, the second rebase
will also attempt to rebase your mistake onto the corrected master.
Users would need to perform a one-time `git rebase --onto master
mistake ` instead.


-- 
Jonathan Brandmeyer


Re: [sqlite] Stack allocation upper bound with SQLITE_USE_ALLOCA

2019-04-03 Thread Jonathan Brandmeyer
On Wed, Apr 3, 2019 at 8:55 AM Richard Hipp  wrote:
>
> On 4/3/19, Jonathan Brandmeyer  wrote:
> > What is the upper bound for stack consumption under the
> > SQLITE_USE_ALLOCA compile-time option?  I see that there are a number
> > of configurable size limits available as compile-time and/or run-time
> > options.  Which ones affect the maximum alloca?
> >
>
> I think the maximum alloca() allocation will be 7x the page size for
> the database file.  So a little less than 0.5 MB assuming a maximum
> page size of 64K.
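
As a quick check on the quoted bound:

```python
page_max = 64 * 1024    # maximum SQLite page size, in bytes
bound = 7 * page_max    # worst-case alloca() per the answer above
assert bound == 458752  # ~448 KiB -- indeed a little under 0.5 MB
```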

Thanks!

-- 
Jonathan Brandmeyer


[sqlite] Stack allocation upper bound with SQLITE_USE_ALLOCA

2019-04-03 Thread Jonathan Brandmeyer
What is the upper bound for stack consumption under the
SQLITE_USE_ALLOCA compile-time option?  I see that there are a number
of configurable size limits available as compile-time and/or run-time
options.  Which ones affect the maximum alloca?

Thanks,
Jonathan Brandmeyer