Re: [sqlite] SLOW execution: Simple Query Uses More than 1 min

2019-12-09 Thread Keith Medcalf

On Monday, 9 December, 2019 20:02, Richard Damon  
wrote:

>On 12/9/19 4:25 PM, Keith Medcalf wrote:

>>> You could still have fast forking without overcommitting, you’d just
>>> pay the cost in unreachable RAM.

>>> If I have 4 GB of RAM in the system, and the kernel takes 1 GB of
>>> that, I start a 2.5 GB user space process, and my process forks 
>>> itself with the intent of starting an 0.1 GB process, that fork 
>>> would have to fail if overcommitting weren’t allowed.

>> No, it wouldn't, and there is no overcommitment.  You are creating a
>> second process that is using the same V:R mapping as the original process;
>> thus it is consuming no more virtual memory after the fork operation than
>> before (except for the bytes to track the new process).  You now have two
>> execution paths through the same mapping which may require more real
>> memory working set, but you have not increased the virtual memory size.
>> That is until one of the processes modifies a memory page in which case
>> an additional virtual page must be allocated to hold the modified page.

>> Overcommitment occurs at the R level, not at the second V in the V:V:R
>> mapping.

>> This is why shared libraries (and discontiguous saved segments) were
>> invented.  It permits the per-process mapping (the first V in V:V:R) to
>> use already existing virtual pages (the second V in V:V:R) without
>> increasing the count of Virtual Pages.  It is not overcommitment unless
>> the number of virtual pages (the second V in V:V:R) exceeds the number of
>> pages in R plus backing store.

> Delaying the conversion of shared to distinct (or at least delaying the
> reservation of backing store) is one form of over-committing. If we
> assume that over-committing has been removed, then the fact that the
> fork succeeded is the promise that both processes have the right to
> access all of their address space. Any page that is writable needs to
> have swap space reserved, or you have allowed over committing. The OS
> can delay actually creating the new pages, and thus save some work, but
> if you haven't reserved the space for the virtual page, you are allowing
> an over commit.

Yes, you are correct.  In order to disallow overcommit, all the writable pages 
would have to be backed by committed storage, even if they are presently mapped 
to read from an already existing virtual page (as in CoW).
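
To make the accounting concrete, here is a minimal C sketch (my illustration, 
assuming a POSIX system; the 64 MB size is arbitrary): after fork() the dirtied 
pages are shared copy-on-write, and the child's first write forces a private 
copy, which is why strict accounting has to reserve backing store for every 
writable page at fork() time even though most of the copies may never happen.

#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    size_t n = 64 * 1024 * 1024;       /* 64 MB of writable pages */
    char *buf = malloc(n);
    if (buf == NULL) return 1;
    memset(buf, 1, n);                 /* parent dirties the pages */

    pid_t pid = fork();                /* child shares buf copy-on-write */
    if (pid < 0) return 1;
    if (pid == 0) {
        buf[0] = 2;      /* first write: the kernel must materialize a
                            private copy of this one page for the child */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    free(buf);
    return 0;
}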

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SLOW execution: Simple Query Uses More than 1 min

2019-12-09 Thread Richard Damon
On 12/9/19 4:25 PM, Keith Medcalf wrote:
>> You could still have fast forking without overcommitting, you’d just pay
>> the cost in unreachable RAM.
>>
>> If I have 4 GB of RAM in the system, and the kernel takes 1 GB of that, I
>> start a 2.5 GB user space process, and my process forks itself with the
>> intent of starting an 0.1 GB process, that fork would have to fail if
>> overcommitting weren’t allowed.
> No, it wouldn't, and there is no overcommitment.  You are creating a second 
> process that is using the same V:R mapping as the original process; thus it is 
> consuming no more virtual memory after the fork operation than before (except 
> for the bytes to track the new process).  You now have two execution paths 
> through the same mapping which may require more real memory working set, but 
> you have not increased the virtual memory size.  That is until one of the 
> processes modifies a memory page in which case an additional virtual page 
> must be allocated to hold the modified page.
>
> Overcommitment occurs at the R level, not at the second V in the V:V:R 
> mapping.
>
> This is why shared libraries (and discontiguous saved segments) were 
> invented.  It permits the per-process mapping (the first V in V:V:R) to use 
> already existing virtual pages (the second V in V:V:R) without increasing the 
> count of Virtual Pages.  It is not overcommitment unless the number of 
> virtual pages (the second V in V:V:R) exceeds the number of pages in R plus 
> backing store.
>
Delaying the conversion of shared to distinct (or at least delaying the
reservation of backing store) is one form of over-committing. If we
assume that over-committing has been removed, then the fact that the
fork succeeded is the promise that both processes have the right to
access all of their address space. Any page that is writable needs to
have swap space reserved, or you have allowed over committing. The OS
can delay actually creating the new pages, and thus save some work, but
if you haven't reserved the space for the virtual page, you are allowing
an over commit.

-- 
Richard Damon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SLOW execution: Simple Query Uses More than 1 min

2019-12-09 Thread Keith Medcalf

>You could still have fast forking without overcommitting, you’d just pay
>the cost in unreachable RAM.
>
>If I have 4 GB of RAM in the system, and the kernel takes 1 GB of that, I
>start a 2.5 GB user space process, and my process forks itself with the
>intent of starting an 0.1 GB process, that fork would have to fail if
>overcommitting weren’t allowed.

No, it wouldn't, and there is no overcommitment.  You are creating a second 
process that is using the same V:R mapping as the original process; thus it is 
consuming no more virtual memory after the fork operation than before (except 
for the bytes to track the new process).  You now have two execution paths 
through the same mapping which may require more real memory working set, but 
you have not increased the virtual memory size.  That is until one of the 
processes modifies a memory page in which case an additional virtual page must 
be allocated to hold the modified page.

Overcommitment occurs at the R level, not at the second V in the V:V:R mapping.

This is why shared libraries (and discontiguous saved segments) were invented.  
It permits the per-process mapping (the first V in V:V:R) to use already 
existing virtual pages (the second V in V:V:R) without increasing the count of 
Virtual Pages.  It is not overcommitment unless the number of virtual pages 
(the second V in V:V:R) exceeds the number of pages in R plus backing store.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume. 



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SLOW execution: Simple Query Uses More than 1 min

2019-12-09 Thread Warren Young
On Dec 9, 2019, at 12:54 PM, Richard Damon  wrote:
> 
> But without virtual memory, many application combinations that work 
> acceptably now would just fail to run at all.

You don’t even have to get into swapping to find such cases.

I once ran a headless app on a small cloud VPS that would run the system out of 
memory because it was linked to some rather large GUI libraries, which provided 
some small but essential bits of functionality.  Because those libraries were 
written to assume they were running on multi-gigabyte desktop computers with 
VMM, they did things that were simply ill-considered on my 256 MB VPS, which 
was hosted on tech without VMM.

If my VPS provider had used a hosting technology that allowed for swap space, 
most of those libraries’ pages could have been swapped out, solving the problem.

Instead, I ended up having to upgrade to a 512 MB plan just to give that 
program some scratch space to play with.

> Virtual memory itself isn’t the issue.

Well, every layer of indirection has a cost.  The question then becomes, what’s 
the cost of the alternatives?

To take my GUI library case again, I could have rebuilt my app statically so 
that only the necessary library functions were linked into my program instead 
of having the whole .so mapped into my process's VM space, but that then means 
I need to relink the program every time the OS updates that library.

I can pay that cost at least three different ways:

1. Download the sources again on each library upgrade, build from source, 
install, and remove the build and source trees.

2. Keep the sources around, build, install, and “make clean” on each library 
upgrade, paying extra for the disk space to hold the sources, but saving some 
on bandwidth and disk I/O from not needing to repeatedly unpack tarballs.

3. Keep the objects around as well, paying more for disk to hold the built but 
unlinked binaries in order to save some CPU on each relink.

TANSTAAFL.  You don’t get to not pay, you only get to choose *where* you pay.

> the Linux system to my understanding doesn’t have an easy call to just start 
> up a brand new process with parameters from you

Whether that’s true depends on your definitions.  I’d say that system() and 
posix_spawn() are easy calls for starting brand new processes.

However, these calls may be implemented in terms of fork() or similar, so we 
must continue down the rabbit hole…
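
As a sketch of that "easy call" route (my illustration, assuming a POSIX.1-2001 
system; the spawned command is arbitrary), posix_spawn() starts a brand-new 
process in one step, with no forked copy of the caller to manage:

#include <spawn.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

int main(void) {
    pid_t pid;
    char *argv[] = { "echo", "hello from a brand new process", NULL };
    int rc = posix_spawnp(&pid, "echo", NULL, NULL, argv, environ);
    if (rc != 0) {                  /* posix_spawn reports errors as a
                                       return code, not through errno */
        fprintf(stderr, "posix_spawnp: %d\n", rc);
        return 1;
    }
    waitpid(pid, NULL, 0);          /* reap the new process */
    return 0;
}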

> a process will fork itself, creating two identical copies of itself, one will 
> continue, and the other will exec the new process, replacing itself with the 
> desired process. The act of forking SHOULD allocate all the virtual memory 
> for the copy of the process, but that will take a bit of time.

You’re describing fork() before the mid-1980s, roughly.

> Because most of the time, all that memory is just going to be released in a 
> couple of instructions, it made sense to just postpone the actual allocation 
> until it was actually used (which it likely wasn’t).

It’s better to describe what happens as copy-on-write rather than anything 
being “postponed.”  Modern fork() uses the more powerful VMM features of CPUs 
to mark the forked process’s pages as CoW so that they’re shared between the 
parent and child until one tries to change them.  At that point, each process gets 
an independent copy of the changed page.

In the case of the fork()/exec() pattern, most pages never do get copied, since 
they’re almost immediately released by the exec().  Thus, the cost is a bit of 
setup and tear-down that strictly speaking didn’t need to happen. It’s tiny.
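
A minimal sketch of that pattern (mine, not from the thread; assuming POSIX, 
with an arbitrary command): everything the child does between fork() and exec() 
touches only a handful of pages, so almost nothing is ever actually copied.

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();                  /* all pages now shared CoW */
    if (pid < 0) { perror("fork"); return 1; }
    if (pid == 0) {                      /* child */
        execlp("ls", "ls", "-l", (char *)NULL);  /* discards the CoW
                                                    mappings wholesale */
        perror("execlp");                /* reached only if exec fails */
        _exit(127);
    }
    waitpid(pid, NULL, 0);               /* parent waits */
    return 0;
}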

> the system allowed itself to overcommit memory

Which is fine as long as you don’t run the system into swapping and you keep a 
bit of swap space around.

You don’t run into serious problems under that condition until you run the 
system wholly out of swap, causing all of the bills to come due at once.

> If the system was changed to not allow over committing, then forking would be 
> slower which hits all of the standard system routines.  

You could still have fast forking without overcommitting, you’d just pay the 
cost in unreachable RAM.

If I have 4 GB of RAM in the system, and the kernel takes 1 GB of that, I start 
a 2.5 GB user space process, and my process forks itself with the intent of 
starting an 0.1 GB process, that fork would have to fail if overcommitting 
weren’t allowed.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SLOW execution: Simple Query Uses More than 1 min

2019-12-09 Thread Richard Damon
But without virtual memory, many application combinations that work acceptably 
now would just fail to run at all. Virtual memory itself isn’t the issue. Also, 
an OS could fairly easily be set up so that an application that starts to thrash 
its virtual memory is dropped in priority for getting memory, and even for 
getting pages swapped in, so that other applications have their operations only 
minimally impacted.

One of the issues is due to the Linux fork/exec model. If a process wants to 
start a process that runs in parallel to it, the Linux system to my 
understanding doesn’t have an easy call to just start up a brand new process 
with parameters from you, but instead a process will fork itself, creating two 
identical copies of itself, one will continue, and the other will exec the new 
process, replacing itself with the desired process. The act of forking SHOULD 
allocated all the virtual memory for the copy of the process, but that will 
take a bit of time. Because most of the time, all that memory is just going to 
be released in a couple of instructions, it made sense to just postpone the 
actual allocation until it was actually used (which it likely wasn’t). This 
‘optimization’ was so ‘complete’ that the system didn’t really keep track of 
how much memory had been promised to the various processes, so the system 
allowed itself to overcommit memory, and if it actually did run out, it didn’t 
have a good way to determine who was at fault, and no way to tell a process 
that memory promised to it earlier isn’t really available.

Fixing the issue is more of a political problem. With the current system, when 
a problem arises, you can normally find a user program or something the user 
did that was ‘bad’ and can be blamed for the problem. If the system was changed 
to not allow over committing, then forking would be slower which hits all of 
the standard system routines.  
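
To illustrate (my sketch, assuming a 64-bit Linux build; with the sysctl 
vm.overcommit_memory=1 the allocation below always succeeds, and under the 
default heuristic mode a large request usually does too): the kernel hands out 
address space cheerfully, and the bill only comes due when the pages are 
touched.

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    size_t n = (size_t)16 * 1024 * 1024 * 1024;   /* a 16 GB "promise" */
    char *p = malloc(n);
    if (p == NULL) {                /* strict accounting would fail here */
        perror("malloc");
        return 1;
    }
    printf("granted %zu bytes of address space\n", n);
    /* memset(p, 1, n); */          /* touching every page is what forces
                                       real allocation; on an overcommitted
                                       system this is when the OOM killer
                                       may pick a victim */
    free(p);
    return 0;
}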

> On Dec 9, 2019, at 8:39 AM, Digital Dog  wrote:
> 
> For reasons which you've described I'm a big fan of removing virtual memory
> from CPUs altogether. That would speed up things considerably.
> 
>> On Sun, Dec 8, 2019 at 6:43 PM James K. Lowden 
>> wrote:
>> 
>> On Sat, 7 Dec 2019 05:23:15 +
>> Simon Slavin  wrote:
>> 
>>> (Your operating system is allowed to do this.  Checking how much
>>> memory is available for every malloc takes too much time.)
>> 
>> Not really.  Consider that many (all?) operating systems before Linux
>> that supported dynamic memory returned an error if the requested amount
>> couldn't be supplied.  Some of those machines had 0.1% of the
>> processing capacity, and yet managed to answer the question reasonably
>> quickly.
>> 
>> The origin of oversubscribed memory rather has its origins in the
>> changed ratio of the speed of RAM to the speed of I/O, and the price of
>> RAM.
>> 
>> As RAM prices dropped, our machines got more RAM and the bigger
>> applications that RAM supported.  As memory got faster, relatively, the
>> disk (ipso facto) has gotten slower. Virtual memory -- the hallmark of
>> the VAX, 4 decades ago -- has become infeasibly slow both because
>> the disk is relatively slower than it was, and because more is being
>> demanded of it to support today's big-memory applications.  Swapping in
>> Firefox, at 1 GB of memory, who knows why, is a much bigger deal than
>> Eight Megabytes and Constantly Swapping.
>> 
>> If too much paging makes the machine too slow (however measured) one
>> solution is less paging.  One administrative lever is to constrain how
>> much paging is possible by limiting the paging resource: swap space.
>> However, limiting swap space may leave the machine underutilized,
>> because many applications allocate memory they never use.
>> 
>> Rather than prefer applications that use resources rationally or
>> administer machines to prevent thrashing, the best-effort, least-effort
>> answer was lazy allocation, and its infamous gap-toothed cousin, the
>> OOM.
>> 
>> Nothing technical mandates oversubscribed memory.  The problem, as
>> ever, is not with the stars, but with ourselves.
>> 
>> --jkl
>> 
>> 
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>> 
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A crash bug in sqlite

2019-12-09 Thread Richard Hipp
On 12/9/19, Rui Zhong  wrote:
> I tried the latest trunk version, checked in as
> commit 926f796e8feec15f3836aa0a060ed906f8ae04d3, and it crashed.
> I did not see any more recent commit on GitHub. Could you please
> double-check it?

GitHub is a mirror.  The mirror is updated about once per hour.

You and Yongheng Chen should both be using Fossil, which is the
canonical version control system.  There are instructions in the README.md
file and at https://www.sqlite.org/getthecode.html#clone



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A crash bug in sqlite

2019-12-09 Thread Rui Zhong
I tried the latest trunk version, checked in as
commit 926f796e8feec15f3836aa0a060ed906f8ae04d3, and it crashed.
I did not see any more recent commit on GitHub. Could you please
double-check it?




Richard Hipp  wrote on Monday, 9 December 2019 at 1:23 PM:

> On 12/9/19, Rui Zhong  wrote:
> > Hi,
> > We found this bug can be triggered again after fix.
>
> Yes.  I discovered the same thing independently.  The previous fix was
> subtly wrong.  Please try the latest trunk version.
>
> --
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Slightly offtopic: indexes vs. hash tables

2019-12-09 Thread Simon Slavin
During the lower-volume weekdays, I beg your indulgence for another off-topic 
post.



This article contrasts hash tables vs. indexes, in an attempt to explain why 
indexes are the basis of most DBMSes but hash tables are the basis of many 
in-memory search systems.

It's a semi-technical article, but if you don't understand what "O(log n)" 
means, don't worry, you can skim over that level and still understand the 
discussion.

SQLite uses hash tables internally, though not for data stored in tables.  But 
given there's already an embedded LSM library, it's not impossible that SQLite 
might use hashing for some data purpose in the future.
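
Not from the article -- just a toy C sketch of the tradeoff it describes: an 
ordered structure (the role a B-tree index plays) answers a range query with 
one O(log n) positioning step plus an in-order scan, something a plain hash 
table cannot do.

#include <stdio.h>

int main(void) {
    int keys[] = {3, 7, 12, 19, 25, 31, 44};   /* the "index": kept sorted */
    int n = sizeof keys / sizeof keys[0];
    int lo = 10, hi = 30;                      /* range query: 10..30 */

    /* O(log n): binary-search for the first key >= lo ... */
    int l = 0, r = n;
    while (l < r) {
        int m = (l + r) / 2;
        if (keys[m] < lo) l = m + 1; else r = m;
    }
    /* ... then an in-order scan covers the whole range.  A hash table
       has no notion of "next key", so it would have to probe every
       possible value between lo and hi. */
    for (int i = l; i < n && keys[i] <= hi; i++)
        printf("%d\n", keys[i]);               /* prints 12, 19, 25 */
    return 0;
}
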
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A crash bug in sqlite

2019-12-09 Thread Richard Hipp
On 12/9/19, Rui Zhong  wrote:
> Hi,
> We found this bug can be triggered again after fix.

Yes.  I discovered the same thing independently.  The previous fix was
subtly wrong.  Please try the latest trunk version.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A crash bug in sqlite

2019-12-09 Thread Rui Zhong
Hi,
We found this bug can be triggered again after fix.
PoC and sqlite version info are attached as follows.
--

SQLite version 3.31.0 2019-12-09 17:14:48
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE TABLE v0 ( v1 DOUBLE CHECK( ( v1 IN ( NULL   )  )  ) , v2
UNIQUE AS( v1 > v1  )  ) ;
sqlite> INSERT INTO v0 VALUES ( 10  ) ON CONFLICT DO NOTHING ;
sqlite> SELECT 10.10 , 10 FROM v0 CROSS JOIN v0 USING ( v1 ) ;
[1]141687 segmentation fault (core dumped)

---



Thanks,

Yongheng & Rui

Richard Hipp  wrote on Monday, 9 December 2019 at 12:15 PM:

> On 12/9/19, Yongheng Chen  wrote:
> > Hi,
> >
> > We found a crash bug in sqlite of master branch.
>
> Should now be fixed on trunk.
> --
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A crash bug in sqlite

2019-12-09 Thread Richard Hipp
On 12/9/19, Yongheng Chen  wrote:
> Hi,
>
> We found a crash bug in sqlite of master branch.

Should now be fixed on trunk.
-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A crash bug in sqlite

2019-12-09 Thread Jose Isaias Cabrera

Yongheng Chen, on Monday, December 9, 2019 11:14 AM, wrote...
>
> So should we just report the bugs after another release version? We think the 
> sooner the bugs get fixed, the better in terms of security, as this approach 
> can minimize the number of bugs in future releases.

Yongheng, ignore my comments.  I was trying to be funny.  Continue on as Dr. 
Hipp, or whoever has instructed.

Solomon, a proverbist once said, "In the multitude of words there wants not 
sin: but he that refrains his lips is wise." It should probably say, "he that 
refrains his typing, is wise."

> > On Dec 9, 2019, at 10:56 AM, Jose Isaias Cabrera, on
> >
> >
> > Since no one explains... ;-)
> >
> > Richard Hipp, on Monday, December 9, 2019 10:53 AM, wrote...
> >>
> >> On 12/9/19, Jose Isaias Cabrera, on
> >>> Error: near "AS": syntax error
> >>>
> >>> So, I can't replicate your problem.  thanks.
> >>
> >> You have to run off of the latest trunk version, as they are fuzzing
> >> for features that are unreleased.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A crash bug in sqlite

2019-12-09 Thread Yongheng Chen
I see. I totally agree with you. A better sqlite is what we all want.

Best.
Yongheng & Rui

> On Dec 9, 2019, at 11:23 AM, Richard Hipp  wrote:
> 
> On 12/9/19, Yongheng Chen  wrote:
>> So should we just report the bugs after another release version?
> 
> No.  You should report problems as soon as you see them.  That is why
> we have open-source.  That is why all of our changes are out there in
> the open where anybody can see them - so that people can find and
> report problems before they are released.  That is what makes
> open-source so much better than closed-source.
> 
> I was merely responding to Jose, who was wondering why he could not
> reproduce the problem.
> 
> 
> -- 
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A crash bug in sqlite

2019-12-09 Thread Richard Hipp
On 12/9/19, Yongheng Chen  wrote:
> So should we just report the bugs after another release version?

No.  You should report problems as soon as you see them.  That is why
we have open-source.  That is why all of our changes are out there in
the open where anybody can see them - so that people can find and
report problems before they are released.  That is what makes
open-source so much better than closed-source.

I was merely responding to Jose, who was wondering why he could not
reproduce the problem.


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A crash bug in sqlite

2019-12-09 Thread Yongheng Chen
So should we just report the bugs after another release version? We think the 
sooner the bugs get fixed, the better in terms of security, as this approach 
can minimize the number of bugs in future releases. 

> On Dec 9, 2019, at 10:56 AM, Jose Isaias Cabrera  wrote:
> 
> 
> Since no one explains... ;-)
> 
> Richard Hipp, on Monday, December 9, 2019 10:53 AM, wrote...
>> 
>> On 12/9/19, Jose Isaias Cabrera, on
>>> Error: near "AS": syntax error
>>> 
>>> So, I can't replicate your problem.  thanks.
>> 
>> You have to run off of the latest trunk version, as they are fuzzing
>> for features that are unreleased.
> 
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A crash bug in sqlite

2019-12-09 Thread Jose Isaias Cabrera

Since no one explains... ;-)

Richard Hipp, on Monday, December 9, 2019 10:53 AM, wrote...
>
> On 12/9/19, Jose Isaias Cabrera, on
> > Error: near "AS": syntax error
> >
> > So, I can't replicate your problem.  thanks.
>
> You have to run off of the latest trunk version, as they are fuzzing
> for features that are unreleased.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A crash bug in sqlite

2019-12-09 Thread Richard Hipp
On 12/9/19, Jose Isaias Cabrera  wrote:
> Error: near "AS": syntax error
>
> So, I can't replicate your problem.  thanks.

You have to run off of the latest trunk version, as they are fuzzing
for features that are unreleased.
-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] A crash bug in sqlite

2019-12-09 Thread Jose Isaias Cabrera

Yongheng Chen, on Monday, December 9, 2019 10:16 AM, wrote...
>
> Hi,
>
> We found a crash bug in sqlite of master branch. Here’s the POC
> —
> CREATE TABLE v0 ( v2 DOUBLE CHECK( ( v2 IN ( v2 , v1) ) ) , v1 UNIQUE AS( v2 > v2 ) ) ;
> INSERT INTO v0 VALUES ( 10 );
> SELECT v0 . v1 , v0 . v1 FROM v0 JOIN v0 USING ( v1 , v1) ;
> —
> The bug exists in "SQLite version 3.31.0 2019-12-08 00:06:39” and "SQLite 
> version 3.31.0
> 2019-12-09 08:13:43”. We haven’t tested other versions yet. Thanks

I get "Error: near "AS": syntax error" after the first line:

sqlite> CREATE TABLE v0 ( v2 DOUBLE CHECK( ( v2 IN ( v2 , v1) ) ) , v1 UNIQUE 
AS( v2 > v2 ) ) ;
Error: near "AS": syntax error

So, I can't replicate your problem.  thanks.

josé
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Crash Bug Report

2019-12-09 Thread Yongheng Chen
Thanks for the fix.

> On Dec 9, 2019, at 9:43 AM, Richard Hipp  wrote:
> 
> On 12/8/19, Yongheng Chen  wrote:
>> 
>> We found one crash bug in sqlite,
> 
> Simplified test case:
> 
> CREATE TABLE t1(a);
> CREATE VIEW v2(b) AS WITH t3 AS (SELECT b FROM v2) VALUES(1);
> ALTER TABLE t1 RENAME TO t4;
> 
> Notice how the definition of the V2 view refers to itself.  The ALTER
> TABLE command was failing to detect this circular reference in the
> view definition.  This caused an infinite recursion of attempts to
> resolve the definition of V2, ultimately resulting in a stack
> overflow, which on some systems manifests as a segfault.  Dan checked
> in a fix earlier today.
> 
> -- 
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sqlite sync over network

2019-12-09 Thread George
Thank you Jens, I have just found litesync 
(litesync - SQLite Replication and Synchronization). 
litesync seems to be a version of sqlite made by Ramos Bernardo from Brazil.

I'm testing this now. For Windows OS it's easy but I still don't know how to use 
it on Linux, and what about Qt integration, ... how to add it to my application. 
It is working in a simple demo on 3 machines (WinOS) but I'm wondering if 
anybody has tested it in a real-life situation.

George C.


Hello from GxG,
gxg_...@yahoo.com
 

On Friday, December 6, 2019, 2:32:53 AM GMT+2, Jens Alfke 
 wrote:  
 
 

> On Dec 5, 2019, at 8:05 AM, George  wrote:
> 
> Changes -> bidirectional. All terminals can save to the db. The amount of 
> collected data isn't big and the frequency between readings will be a minimum 
> of 2 sec when we push more readings. The data itself is very small, like 
> reading a temperature from a probe.

The product I work on — Couchbase Mobile* — does this, but it may not be 
exactly what you want because it uses SQLite internally on the client side, and 
not at all on the server. So it's not something you can plug your existing 
SQLite-based code into. (Our data model is not relational but JSON-based.)

Data sync is hard. I've been working in this area since 2011 so I think I have 
some expertise here :)
Problems you're likely to run into:

1. You can't use regular integer primary keys, because different clients will 
end up creating rows with the same keys and cause collisions when they sync. 
You either have to use UUIDs as keys, or else somehow partition the key space 
in a deterministic way, like prefixing a fixed client ID to a key.
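(A minimal code sketch of this point appears after the list below.)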

2. If multiple entities can update the db, there will likely be conflicts. 
Conflicts are hard to manage, and how you do it is entirely dependent on your 
high-level schema. In the worst case, conflicts require human intervention.

3. You need a message-oriented protocol. It's best to keep a continuous 
bidirectional connection open. WebSockets is a good protocol for this. You'll 
have to deal with errors establishing the connection, and unexpected 
disconnects, by periodic retries.

4. Schema upgrades in a distributed system are a mess. If your system is 
centralized enough you can take it down and upgrade every peer's database, then 
bring it back up, but of course that doesn't work in a decentralized system. 
(One of the main reasons Couchbase is schema-less.)

5. Keep in mind there is no central source of truth. Data takes finite time to 
propagate, and transient errors greatly increase that time. Even if you have a 
central server, it will be behind the clients that create the data, so it 
doesn't have the latest info. No one does.
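
The sketch promised under point 1 (my illustration, assuming the SQLite C API; 
the table and column names are invented): a 128-bit random blob as the primary 
key makes cross-client collisions astronomically unlikely, so independently 
syncing writers cannot step on each other's keys.

#include <stdio.h>
#include <sqlite3.h>

int main(void) {
    sqlite3 *db;
    char *err = NULL;
    if (sqlite3_open("readings.db", &db) != SQLITE_OK) return 1;

    const char *sql =
        "CREATE TABLE IF NOT EXISTS readings("
        "  id   BLOB PRIMARY KEY DEFAULT (randomblob(16)),"  /* 128-bit key */
        "  temp REAL,"
        "  at   TEXT DEFAULT (datetime('now')));"
        "INSERT INTO readings(temp) VALUES (21.5);";   /* id filled in by
                                                          the default     */
    if (sqlite3_exec(db, sql, NULL, NULL, &err) != SQLITE_OK) {
        fprintf(stderr, "error: %s\n", err);
        sqlite3_free(err);
    }
    sqlite3_close(db);
    return 0;
}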

—Jens

* https://www.couchbase.com/products/mobile
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
  
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] A crash bug in sqlite

2019-12-09 Thread Yongheng Chen
Hi,

We found a crash bug in sqlite of master branch. Here’s the POC
—
CREATE TABLE v0 ( v2 DOUBLE CHECK( ( v2 IN ( v2 , v1) ) ) , v1 UNIQUE AS( v2 > 
v2 ) ) ;
INSERT INTO v0 VALUES ( 10 );
SELECT v0 . v1 , v0 . v1 FROM v0 JOIN v0 USING ( v1 , v1) ;
—
The bug exists in "SQLite version 3.31.0 2019-12-08 00:06:39” and "SQLite 
version 3.31.0 2019-12-09 08:13:43”. We haven’t tested other versions yet. 
Thanks

Yongheng & Rui
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Crash Bug Report

2019-12-09 Thread Richard Hipp
On 12/8/19, Yongheng Chen  wrote:
>
> We found one crash bug in sqlite,

Simplified test case:

CREATE TABLE t1(a);
CREATE VIEW v2(b) AS WITH t3 AS (SELECT b FROM v2) VALUES(1);
ALTER TABLE t1 RENAME TO t4;

Notice how the definition of the V2 view refers to itself.  The ALTER
TABLE command was failing to detect this circular reference in the
view definition.  This caused an infinite recursion of attempts to
resolve the definition of V2, ultimately resulting in a stack
overflow, which on some systems manifests as a segfault.  Dan checked
in a fix earlier today.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Missed index opportunities with sorting?

2019-12-09 Thread Graham Holden
Monday, December 09, 2019, 1:32:40 PM, Digital Dog  
wrote:

> On Sat, Dec 7, 2019 at 3:50 AM Simon Slavin  wrote:

>> On 7 Dec 2019, at 2:26am, Shawn Wagner  wrote:
>>
>> > The first one uses the index for all sorting, but the second one only
>> uses it for sorting a, not b. I feel like the descending sort could make
>> use of the index too, just reading the b sections backwards to get the
>> right order. Is there something I'm overlooking that would make this sort
>> of optimization impractical or otherwise a bad idea?
>>
>> Hmm.  Try running ANALYZE and then doing the EXPLAIN QUERY PLAN lines
>> again.
>>
>> But I think that without 'chunkiness' information (how many distinct values
>> columns a and b have) it would not be worth doing the complicated programming
>> required for reverse-mini-scanning of that index.  The programming is quite
>> complicated and unless your index "b" is chunky it won't save you much time
>> over the plan shown.
>>

> So it's better to allocate memory, block the execution until all rows are
> read and use cpu time to do unnecessary sorting that could easily be
> avoided by just reading index backwards?

It should only need to collect-and-sort groups of rows with equal "a"
values, rather than ALL rows (and the EXPLAIN QUERY PLAN seems to
support this). In theory, it should be possible to emit rows in these
groups, but I don't know enough about reading VDBE (output of EXPLAIN
SELECT...) to know if it does this or accumulates ALL rows before
emitting the first.

>   Is it really so hard to program
> it? I do not think so.

I've no idea. If reverse-mini-scanning an index ISN'T "so hard", then
from past experience there's a moderate chance one of the devs is
looking to see if it CAN be done. However, the fact that it HASN'T
been done, when reverse-mini-scanning seems an "obvious" optimisation,
suggests to me it is not as easy as one might think, and that the
potential saving isn't that great. Especially when you could always
create a second index on "a, b DESC".
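
A sketch of that workaround (mine; assuming the SQLite C API, with invented 
table and index names). With an index declared on (a, b DESC), the query plan 
should show a plain index scan and no "USE TEMP B-TREE FOR ORDER BY" step:

#include <stdio.h>
#include <sqlite3.h>

static int print_row(void *arg, int ncol, char **vals, char **cols) {
    (void)arg; (void)cols;
    for (int i = 0; i < ncol; i++)
        printf("%s ", vals[i] ? vals[i] : "NULL");
    printf("\n");
    return 0;
}

int main(void) {
    sqlite3 *db;
    if (sqlite3_open(":memory:", &db) != SQLITE_OK) return 1;
    sqlite3_exec(db,
        "CREATE TABLE t(a, b);"
        "CREATE INDEX t_a_bdesc ON t(a, b DESC);"   /* mixed-order index */
        "EXPLAIN QUERY PLAN SELECT * FROM t ORDER BY a, b DESC;",
        print_row, NULL, NULL);                     /* prints the plan  */
    sqlite3_close(db);
    return 0;
}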

> However the heuristic to decide when to do backward index scan needs to be
> smart enough to select this only as last resort optimization, just before
> falling back to explicit sort.

This "decision problem" is -- I believe -- a key factor in deciding
whether to add any specific optimisation. For every (potential)
optimisation that could be added, you need to add code that decides
whether it's worth using that optimisation. The savings from USING the
optimisation only benefit SOME queries, but the code that decides
whether or not to USE the optimisation has to be executed for many,
many more queries. The risk is that you slow lots and lots of queries
down a little bit, for only an occasional gain for the few queries
where the optimisation does make sense.

I don't know the actual code, but I'm guessing the optimiser never
really "knows" there is an index where doing a "reverse-mini-scan"
could help. I suspect it (a) realises there isn't a "perfect" index;
(b) finds an index that sorts as much as possible (in this case, the
only index), and (c) fulfils the rest of the ORDER BY requirements
using a temp b-tree. (With, probably, some exceptions for
optimisations that HAVE been implemented). 

To implement a reverse-mini-scan, not only would you have to add the
code to do the reverse scan itself, but the code that does (b) above
(find AN index that sorts as much as possible) would need to be much
more complex and consider ALL indices that start by ordering "a ASC"
to see if any of THOSE would allow a reverse-scan. It might even need
to see if a "seemingly worse" index (+reverse-scan) might be better
than an "obviously better" index. For instance, with more fields, an
"ORDER BY a, b DESC, c" might be better served by an index on
"( a, b, c )" with reverse-scan than a seemingly better index on
"( a, b DESC, d ).

Graham


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SLOW execution: Simple Query Uses More than 1 min

2019-12-09 Thread Digital Dog
For reasons which you've described I'm a big fan of removing virtual memory
from CPUs altogether. That would speed up things considerably.

On Sun, Dec 8, 2019 at 6:43 PM James K. Lowden 
wrote:

> On Sat, 7 Dec 2019 05:23:15 +
> Simon Slavin  wrote:
>
> > (Your operating system is allowed to do this.  Checking how much
> > memory is available for every malloc takes too much time.)
>
> Not really.  Consider that many (all?) operating systems before Linux
> that supported dynamic memory returned an error if the requested amount
> couldn't be supplied.  Some of those machines had 0.1% of the
> processing capacity, and yet managed to answer the question reasonably
> quickly.
>
> The origin of oversubscribed memory rather has its origins in the
> changed ratio of the speed of RAM to the speed of I/O, and the price of
> RAM.
>
> As RAM prices dropped, our machines got more RAM and the bigger
> applications that RAM supported.  As memory got faster, relatively, the
> disk (ipso facto) has gotten slower. Virtual memory -- the hallmark of
> the the VAX, 4 decades ago -- has become infeasibly slow both because
> the disk is relatively slower than it was, and because more is being
> demanded of it to support today's big-memory applications.  Swapping in
> Firefox, at 1 GB of memory, who knows why, is a much bigger deal than
> Eight Megabytes and Constantly Swapping.
>
> If too much paging makes the machine too slow (however measured) one
> solution is less paging.  One administrative lever is to constrain how
> much paging is possible by limiting the paging resource: swap space.
> However, limiting swap space may leave the machine underutilized,
> because many applications allocate memory they never use.
>
> Rather than prefer applications that use resources rationally or
> administer machines to prevent thrashing, the best-effort, least-effort
> answer was lazy allocation, and its infamous gap-toothed cousin, the
> OOM.
>
> Nothing technical mandates oversubscribed memory.  The problem, as
> ever, is not with the stars, but with ourselves.
>
> --jkl
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Missed index opportunities with sorting?

2019-12-09 Thread Digital Dog
On Sat, Dec 7, 2019 at 3:50 AM Simon Slavin  wrote:

> On 7 Dec 2019, at 2:26am, Shawn Wagner  wrote:
>
> > The first one uses the index for all sorting, but the second one only
> uses it for sorting a, not b. I feel like the descending sort could make
> use of the index too, just reading the b sections backwards to get the
> right order. Is there something I'm overlooking that would make this sort
> of optimization impractical or otherwise a bad idea?
>
> Hmm.  Try running ANALYZE and then doing the EXPLAIN QUERY PLAN lines
> again.
>
> But I think that without 'chunkiness' information (how many distinct values
> columns a and b have) it would not be worth doing the complicated programming
> required for reverse-mini-scanning of that index.  The programming is quite
> complicated and unless your index "b" is chunky it won't save you much time
> over the plan shown.
>


So it's better to allocate memory, block the execution until all rows are
read and use cpu time to do unnecessary sorting that could easily be
avoided by just reading index backwards? Is it really so hard to program
it? I do not think so.

However the heuristic to decide when to do backward index scan needs to be
smart enough to select this only as last resort optimization, just before
falling back to explicit sort.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users