Re: [sqlite] Buffered stderr on Windows (Take 2)

2015-01-23 Thread Stephan Beal
On Fri, Jan 23, 2015 at 5:18 PM, Guilhem Malichier 
wrote:

> I've been experiencing an issue with SQLite's CLI tool on Windows 7, when
> used through a script or spawned process (not when used directly in the
> console).
>

If i'm not mistaken, that was fixed just last week:

http://sqlite.org/src/info/80541e8b94b713e8f9e588ae047ffc5ae804ef1c

Or maybe this is a related problem, not quite the same.

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
"Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do." -- Bigby Wolf
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] TEMP tables with the same name as of main tables

2015-01-23 Thread Jay Kreibich

On Jan 23, 2015, at 9:35 AM, Aldo Buratti  wrote:

> I had a bad programming experience with temporary tables and after some
> googling I found this old post
> 
>   [sqlite] How to select from a temp table with same name as a main table.
>   dochsm Tue, 18 Jun 2013 05:39:04 -0700
> 
> that illustrated exactly the same troubles.
> In short, if you have a table named A and a temporary table named TEMP.A,
> then if you want to refer to the A table, it is strongly recommended to
> explicitly call it MAIN.A ( otherwise if you simply refer to A, you will
> pick ... TEMP.A )


For what it is worth, this behavior is documented:

https://www.sqlite.org/lang_naming.html

Database Object Name Resolution

[…] If no database is specified as part of the object reference, then SQLite 
searches the main, temp and all attached databases for an object with a 
matching name. The temp database is searched first, followed by the main 
database, followed all attached databases in the order that they were attached. 
The reference resolves to the first match found. […]


> In my opinion, and for my experience, I find this behavior
> counter-intuitive,

I will have to disagree, as will, I believe, the SQL language designers.

Temp tables are scoped to the connection that created them, while other 
databases are, in a sense, “global.”  Think of your main database like global 
variables in a program, while the temp database has objects (tables, indexes, 
views, etc.) that are “local” to the execution context, similar to variables 
defined within a function.  In almost all languages, object names within a 
tighter scope hide similarly named objects that live a larger scope, just as a 
function variable named “A” will hide a global variable with the name “A” in 
most languages.  The behavior is consistent with the vast majority of 
programming languages and programming paradigms.

Of course, the wisdom of using the same name for two different objects, even if 
they are in a different scope, is a different discussion— but it is a very 
similar discussion to programmers that reference function variables using the 
same name as an existing global variable.

  -j

--  
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it, but showing 
it to the wrong people has the tendency to make them feel uncomfortable." -- 
Angela Johnson





___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sqlite3 very slow even after creating without rowid

2015-01-23 Thread Parakkal, Navin S (Software Engineer)
Hi,

> I also did another experiment. I created this table and did a vaccum and then 
> the select count(*) in sqlite3 was around 2 mins. 
> 
> When I create an index manually after the table is loaded (imported from 
> csv), select count(*) in sqlite3 was within 30 to 40 secs.

>In the second case, to calculate count(*) SQLite was able to use the index you 
>had created.  Since this index was smaller than the table, SQLite was able to 
>count the entries in it faster.  The result would have been the same if you 
>had done whenever the >index had been created

>CREATE TABLE
.>import
>CREATE INDEX
>time the 'select count(*) from hp_table1' command here


>should yield pretty-much the same result as

>CREATE TABLE
>CREATE INDEX
>.import
>time the 'select count(*) from hp_table1' command here

Actually this didn't give me what was expected. It also took more than 20 mins 
twice . 
I'll rerun it again if you insist. 
That is the reason I uploaded the file  to ftp and the schema.

Also I saw that autoindexes were present for the table (primary keys).


>If you are using a table for which rows are INSERTed but never DELETEd, then 
>you will get the same result almost instantly using

>   select max(rowid) from hp_table1

> instead of counting the rows.

We purge data once a week automatically and it is configurable. So we can't use 
the max(rowid) trick always. Yes it works if you don't DELETE..

I'm doing all this on CentOS 7 x64.
I built sqlite myself with latest sqlite-autoconf-3080801

Regards,
Navin
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Buffered stderr on Windows (Take 2)

2015-01-23 Thread Guilhem Malichier
[I got an unexpected _"message's content type was not explicitly allowed"_
bounce message after my first email, so I'm re-sending it in plain text. 
I apologize in advance for the subsequent lack of formatting, and am 
sorry for the double post if the first one actually reached you.]

Hi,

I've been experiencing an issue with SQLite's CLI tool on Windows 7, when 
used through a script or spawned process (not when used directly in the 
console).
For a bit more context, this problem namely affects the way the *dblite* 
library ([https://github.com/WebReflection/dblite], an SQLite wrapper for 
Node.js which spawns the CLI tool behind the scenes) works.

The problem lies in the fact that, as opposed to Linux where stderr is always 
unbuffered, Windows behaves differently depending on whether the output is an 
interactive console or a pipe: on an interactive console, stderr is 
unbuffered, but when piped it becomes buffered (which I personally think is 
overcomplicated, but meh, that's the way it is, although I don't know if it 
is a general rule on Windows, or specific to GCC's implementation, or 
whatever, but I don't think it matters here).

This difference in I/O buffering happens to create a discrepancy in the 
behavior of the CLI tool, between on one side Linux and Windows' 
interactive console, and on the other side piped output on Windows.

Here is a tiny piece of shell script (working on both Windows' cmd and 
Bash) that can be used to see the difference:

(echo fail\; && echo .version) | sqlite3 2>&1

This is the non-piped version. Simply add `| more` at the end to get a piped 
output.

The sections below show the result.
Windows 7 x64:

C:\>(echo fail\; && echo .version) | sqlite3 2>&1
Error: near line 1: near "fail": syntax error
SQLite 3.8.7.4 2014-12-09 01:34:36 f66f7a17b78ba617acde90fc810107f34f1a1f2e

C:\>(echo fail\; && echo .version) | sqlite3 2>&1 | more
SQLite 3.8.7.4 2014-12-09 01:34:36 f66f7a17b78ba617acde90fc810107f34f1a1f2e
Error: near line 1: near "fail": syntax error

Note the swapped output lines: the error shows up after the version when piping 
the output.

Lubuntu 14.10:

gmal@VLubuntuGM:~$ (echo fail\; && echo .version) | sqlite3 2>&1
Error: near line 1: near "fail": syntax error
SQLite 3.8.6 2014-08-15 11:46:33 9491ba7d738528f168657adb43a198238abde19e   
 

gmal@VLubuntuGM:~$ (echo fail\; && echo .version) | sqlite3 2>&1 | more
Error: near line 1: near "fail": syntax error
SQLite 3.8.6 2014-08-15 11:46:33 9491ba7d738528f168657adb43a198238abde19e

All is fine here.

(Unfortunately I don't have an OSX machine, so I don't know how it behaves.)


This issue basically makes it impossible to use the CLI tool the way *dblite* 
does, i.e. a long-living child process with piped input and output. Indeed, the
error messages come in too late, at seemingly random times (in fact, they show 
up whenever the system flushes the buffer, which is usually completely out of 
context).
I am not sure this way of using the CLI tool is actually meant to be supported 
by SQLite, but I'm pretty sure this discrepancy between the different OSes is 
not there on purpose, which is the reason why I'm submitting this as a 
potential bug to you guys. 

Additional info:
After quickly looking at the code (sorry if I'm breaking the _"tachyon 
modulation must be wrongly polarised"_ rule), I found out that SQLite's shell 
tool explicitly flushes stdout after each query but leaves stderr untouched, 
probably trusting it to be unbuffered.
I tried to compile a modified version of shell.c (just adding a few calls to 
`fflush(stderr)` along those to `fflush(p->out)` ), and it appears to work 
like a charm so far. But there may be compatibility issues with other usages, 
so I'm not going to pretend this is an easy fix. We all know there is no such 
thing as an easy fix in projects with a large-enough user base.


Thanks a lot for your patience and congratulations for the awesome work done 
on SQLite. I'm quite new to it but it really seems to be a GREAT piece of 
software.

Guilhem
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] sqlite3 very slow even after creating without rowid

2015-01-23 Thread Parakkal, Navin S (Software Engineer)
Hello,

My Process.csv is around 27G. I've gzipped it and put at 
ftp://navinps:sqlit...@h2.usa.hp.com as process.csv.gz

There is only 1 file there. 
md5sum process.csv.gz 
e77a322744a26d4c8a1ad4d61a84ee72  process.csv.gz

 [root@centosnavin sqlite-autoconf-3080801]# cat sqlite3commands.txt
CREATE TABLE [hp_table1] ( InstanceId INTEGER, LogTime INTEGER, ArrivalTime 
INTEGER, CollectionTime INTEGER, [dml_PROC_TIME] TIME, [dml_PROC_INTERVAL] 
INTEGER, [dml_PROC_INTEREST] TEXT, [dml_PROC_TOP_CPU_INDEX] INTEGER, 
[dml_PROC_TOP_DISK_INDEX] INTEGER, [dml_PROC_STATE_FLAG] INTEGER, 
[dml_PROC_RUN_TIME] REAL, [dml_PROC_STOP_REASON_FLAG] INTEGER, 
[dml_PROC_INTERVAL_ALIVE] INTEGER, [dml_PROC_STOP_REASON] TEXT, 
[dml_PROC_STATE] TEXT, [dml_PROC_PRI] INTEGER, [dml_PROC_NICE_PRI] INTEGER, 
[dml_PROC_CPU_LAST_USED] INTEGER, [dml_PROC_CPU_SWITCHES] INTEGER, 
[dml_PROC_IO_BYTE] REAL, [dml_PROC_VOLUNTARY_CSWITCH] INTEGER, 
[dml_PROC_FORCED_CSWITCH] INTEGER, [dml_PROC_IO_BYTE_RATE] REAL, 
[dml_PROC_CPU_TOTAL_UTIL] REAL, [dml_PROC_CPU_TOTAL_TIME] REAL, 
[dml_PROC_CPU_SYS_MODE_UTIL] REAL, [dml_PROC_CPU_SYS_MODE_TIME] REAL, 
[dml_PROC_CPU_USER_MODE_UTIL] REAL, [dml_PROC_CPU_USER_MODE_TIME] REAL, 
[dml_PROC_THREAD_COUNT] INTEGER, [dml_PROC_CPU_ALIVE_TOTAL_UTIL] REAL , 
[dml_PROC_CPU_ALIVE_USER_MODE_UTIL] 
 REAL, [dml_PROC_CPU_ALIVE_SYS_MODE_UTIL] REAL, [dml_PROC_CPU_SYSCALL_UTIL] 
REAL, [dml_PROC_CPU_SYSCALL_TIME] REAL, [dml_PROC_CHILD_CPU_USER_MODE_UTIL] 
REAL, [dml_PROC_CHILD_CPU_SYS_MODE_UTIL] REAL, [dml_PROC_CHILD_CPU_TOTAL_UTIL] 
REAL, [dml_PROC_DISK_PHYS_READ] INTEGER, [dml_PROC_DISK_PHYS_READ_RATE] REAL, 
[dml_PROC_DISK_PHYS_WRITE] INTEGER, [dml_PROC_DISK_PHYS_WRITE_RATE] REAL, 
[dml_PROC_DISK_PHYS_IO_RATE] REAL, [dml_PROC_MEM_RES] REAL, 
[dml_PROC_MEM_SHARED_RES] REAL, [dml_PROC_MEM_VIRT] REAL, 
[dml_PROC_MEM_DATA_VIRT] REAL, [dml_PROC_MEM_STACK_VIRT] REAL, 
[dml_PROC_PAGEFAULT] INTEGER, [dml_PROC_PAGEFAULT_RATE] REAL, 
[dml_PROC_MINOR_FAULT] INTEGER, [dml_PROC_MAJOR_FAULT] INTEGER, 
[dml_PROC_MEM_LOCKED] REAL, [dml_PROC_DISK_SUBSYSTEM_WAIT_PCT] REAL, 
[dml_PROC_DISK_SUBSYSTEM_WAIT_TIME] REAL, [dml_PROC_PRI_WAIT_PCT] REAL, 
[dml_PROC_PRI_WAIT_TIME] REAL, PRIMARY KEY (InstanceId, CollectionTime)) 
WITHOUT ROWID ;
.timer on
.mode csv
.import /home/navin/oa_nvn/process.csv hp_table1
[root@centosnavin sqlite-autoconf-3080801]# sync
[root@centosnavin sqlite-autoconf-3080801]# ./sqlite3 hptest.db < 
sqlite3commands.txt
[root@centosnavin sqlite-autoconf-3080801]# du -sh hptest.db
14G hptest.db
[root@centosnavin sqlite-autoconf-3080801]# time ./sqlite3 hptest.db "select 
count(*) from hp_table1; "


115349845

real26m56.435s
user0m1.591s
sys 0m21.262s
 [root@centosnavin sqlite-autoconf-3080801]# echo "pragma page_size; " | 
./sqlite3
65536  [ tried with default page size also ]
[root@centosnavin sqlite-autoconf-3080801]#



POSTGRES: [ Same machine, same FS disk , similar load, RAM 4G] 

DB size is around 34.2 GB.

bash-4.2$ time psql -c "select count(*) from dml_Scope__Process; "
   count   
---
 115349845
(1 row)


real4m28.946s
user0m0.001s
sys 0m0.004s
-bash-4.2$

The table is named differently. The data is same. I think postgres SQL is 
around 2 times the size of sqlite3 database but as you see it is like more than 
6 times faster.

I also did another experiment. I created this table and did a vaccum and then 
the select count(*) in sqlite3 was around 2 mins. 

When I create an index manually after the table is loaded (imported from csv), 
select count(*) in sqlite3 was within 30 to 40 secs.

I'm  stuck here how to go about achieving better speeds without always creating 
index after inserting data. ? 

Regards,
Navin

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sqlite3 very slow even after creating without rowid

2015-01-23 Thread Parakkal, Navin S (Software Engineer)
Repost:  Since it didn't get into the archives or in the mailing list. Sorry 
about that.
Quoted and replied to simon after [Repost End]


Hello,


[Repost Begin]

My Process.csv is around 27G. I've gzipped it and put at 
ftp://navinps:sqlit...@h2.usa.hp.com as process.csv.gz

There is only 1 file there. 
md5sum process.csv.gz
e77a322744a26d4c8a1ad4d61a84ee72  process.csv.gz

 [root@centosnavin sqlite-autoconf-3080801]# cat sqlite3commands.txt 
CREATE TABLE [hp_table1] ( InstanceId INTEGER, LogTime INTEGER, ArrivalTime 
INTEGER, CollectionTime INTEGER, [dml_PROC_TIME] TIME, [dml_PROC_INTERVAL] 
INTEGER, [dml_PROC_INTEREST] TEXT, [dml_PROC_TOP_CPU_INDEX] INTEGER, 
[dml_PROC_TOP_DISK_INDEX] INTEGER, [dml_PROC_STATE_FLAG] INTEGER, 
[dml_PROC_RUN_TIME] REAL, [dml_PROC_STOP_REASON_FLAG] INTEGER, 
[dml_PROC_INTERVAL_ALIVE] INTEGER, [dml_PROC_STOP_REASON] TEXT, 
[dml_PROC_STATE] TEXT, [dml_PROC_PRI] INTEGER, [dml_PROC_NICE_PRI] INTEGER, 
[dml_PROC_CPU_LAST_USED] INTEGER, [dml_PROC_CPU_SWITCHES] INTEGER, 
[dml_PROC_IO_BYTE] REAL, [dml_PROC_VOLUNTARY_CSWITCH] INTEGER, 
[dml_PROC_FORCED_CSWITCH] INTEGER, [dml_PROC_IO_BYTE_RATE] REAL, 
[dml_PROC_CPU_TOTAL_UTIL] REAL, [dml_PROC_CPU_TOTAL_TIME] REAL, 
[dml_PROC_CPU_SYS_MODE_UTIL] REAL, [dml_PROC_CPU_SYS_MODE_TIME] REAL, 
[dml_PROC_CPU_USER_MODE_UTIL] REAL, [dml_PROC_CPU_USER_MODE_TIME] REAL, 
[dml_PROC_THREAD_COUNT] INTEGER, [dml_PROC_CPU_ALIVE_TOTAL_UTIL] REAL , 
[dml_PROC_CPU_ALIVE_USER_MODE_UTIL] 
 REAL, [dml_PROC_CPU_ALIVE_SYS_MODE_UTIL] REAL, [dml_PROC_CPU_SYSCALL_UTIL] 
REAL, [dml_PROC_CPU_SYSCALL_TIME] REAL, [dml_PROC_CHILD_CPU_USER_MODE_UTIL] 
REAL, [dml_PROC_CHILD_CPU_SYS_MODE_UTIL] REAL, [dml_PROC_CHILD_CPU_TOTAL_UTIL] 
REAL, [dml_PROC_DISK_PHYS_READ] INTEGER, [dml_PROC_DISK_PHYS_READ_RATE] REAL, 
[dml_PROC_DISK_PHYS_WRITE] INTEGER, [dml_PROC_DISK_PHYS_WRITE_RATE] REAL, 
[dml_PROC_DISK_PHYS_IO_RATE] REAL, [dml_PROC_MEM_RES] REAL, 
[dml_PROC_MEM_SHARED_RES] REAL, [dml_PROC_MEM_VIRT] REAL, 
[dml_PROC_MEM_DATA_VIRT] REAL, [dml_PROC_MEM_STACK_VIRT] REAL, 
[dml_PROC_PAGEFAULT] INTEGER, [dml_PROC_PAGEFAULT_RATE] REAL, 
[dml_PROC_MINOR_FAULT] INTEGER, [dml_PROC_MAJOR_FAULT] INTEGER, 
[dml_PROC_MEM_LOCKED] REAL, [dml_PROC_DISK_SUBSYSTEM_WAIT_PCT] REAL, 
[dml_PROC_DISK_SUBSYSTEM_WAIT_TIME] REAL, [dml_PROC_PRI_WAIT_PCT] REAL, 
[dml_PROC_PRI_WAIT_TIME] REAL, PRIMARY KEY (InstanceId, CollectionTime)) 
WITHOUT ROWID ; 
.timer on .mode csv .import /home/navin/oa_nvn/process.csv hp_table1 

[root@centosnavin sqlite-autoconf-3080801]# sync 
[root@centosnavin sqlite-autoconf-3080801]# ./sqlite3 hptest.db < 
sqlite3commands.txt 
[root@centosnavin sqlite-autoconf-3080801]# du -sh hptest.db
14G hptest.db
[root@centosnavin sqlite-autoconf-3080801]# time ./sqlite3 hptest.db "select 
count(*) from hp_table1; "


115349845

real26m56.435s
user0m1.591s
sys 0m21.262s
 [root@centosnavin sqlite-autoconf-3080801]# echo "pragma page_size; " | 
./sqlite3
65536  [ tried with default page size also ]

 [root@centosnavin sqlite-autoconf-3080801]#



POSTGRES: [ Same machine, same FS disk , similar load, RAM 4G] 

DB size is around 34.2 GB.

bash-4.2$ time psql -c "select count(*) from dml_Scope__Process; "
   count   
---
 115349845
(1 row)


real4m28.946s
user0m0.001s
sys 0m0.004s
-bash-4.2$

The table is named differently. The data is same. I think postgres SQL is 
around 2 times the size of sqlite3 database but as you see it is like more than 
6 times faster.

I also did another experiment. I created this table and did a vaccum and then 
the select count(*) in sqlite3 was around 2 mins. 

When I create an index manually after the table is loaded (imported from csv), 
select count(*) in sqlite3 was within 30 to 40 secs.

I'm  stuck here how to go about achieving better speeds without always creating 
index after inserting data. ? 

[Repost end]


Quoting Simon:

> I also did another experiment. I created this table and did a vaccum and then 
> the select count(*) in sqlite3 was around 2 mins. 
> 
> When I create an index manually after the table is loaded (imported from 
> csv), select count(*) in sqlite3 was within 30 to 40 secs.

>In the second case, to calculate count(*) SQLite was able to use the index you 
>had created.  Since this index was smaller than the >table, SQLite was able to 
>count the entries in it faster.  The result would have been the same if you 
>had done whenever the >index >had been created

>CREATE TABLE
.>import
>CREATE INDEX
>time the 'select count(*) from hp_table1' command here


>should yield pretty-much the same result as

>CREATE TABLE
>CREATE INDEX
>.import
>time the 'select count(*) from hp_table1' command here

Actually this didn't give me what was expected. It also took more than 20 mins 
twice . 
I'll rerun it again if you insist. 
That is the reason I uploaded the file  to ftp and the schema.

Also I saw that autoindexes were present for the table (primary keys).


>If you are using a table for which rows are 

Re: [sqlite] sqlite3 performance on select count very slow for 16 GB file

2015-01-23 Thread Jim Wilcoxson
If you have a table where rows are inserted but never deleted, and you
have a rowid column, you can use this:

select seq from sqlite_sequence where name = 'tablename'

This will return instantly, without scanning any rows or indexes, and
is much faster than max(rowid) for huge tables.

If no rows have been inserted, you will get NULL.  If rows have been
inserted, you will get back the last rowid inserted.

Jim


Simon wrote:

If this is a table for which rows are inserted but never deleted, then
you will find that

SELECT max(rowid) FROM hp_table1

returns the same value almost immediately.  Perhaps value-1, but
whatever it is it'll be consistent.

-- 
HashBackup: easy onsite and offsite Unix backup
http://www.hashbackup.com
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Streaming group by in union of already sorted tables

2015-01-23 Thread Emmanouil Karvounis
Thank you very much, Richard.

I'm sure this enhancement, modelled in a more general way, e.g., like an
isSorted flag on the subquery to be used by the outer query, can be a great
enhancement for many other types of queries employing nesting. I think it
will help both in time and in space complexity, something very important
when aiming at the embedded systems enviroment.

Let me know if you need us to provide any further details or other
assistance.


Regards,
Manos

On 23 January 2015 at 21:30, Richard Hipp  wrote:

> On 1/23/15, Emmanouil Karvounis  wrote:
> > We have two tables that are already sorted on a combination of
> > two fields (which are their primary keys) and we want to union them and
> > apply group by on both the aforementioned fields, like so:
> >
> > select c1, c2, myagg(*) from (
> > select * from tableA
> > union all
> > select * from tableB
> > )
> > group by c1, c2;
> >
> > where tableA, tableB have primary key (c1, c2) and their schema
> comprises 3
> > integers: c1, c2, and prop.
> >
> > The sqlite query plan creates a temporary B-tree to hold all the records
> of
> > both tables to execute the group by.
>
> You are correct.  Here is the test case I am using:
>
> CREATE TABLE tableA(c1 INT, c2 INT, payload INT, PRIMARY KEY(c1,c2));
> CREATE TABLE tableB(c1 INT, c2 INT, payload INT, PRIMARY KEY(c1,c2));
>
> explain query plan
> SELECT * FROM tableA
> UNION ALL
> SELECT * FROM tableB
> ORDER BY 1, 2;
>
> .print -
>
> explain query plan
> SELECT c1, c2, sum(payload) FROM (
>   SELECT * FROM tableA
>   UNION ALL
>   SELECT * FROM tableB
> )
> GROUP BY c1, c2;
>
> The first query - the UNION ALL with the ORDER BY does not use a
> separate B-tree for sorting.  So it seems logical that the second
> query that does a GROUP BY over the same UNION ALL should be able to
> get by without a separate sorting B-Tree too.
>
> For various technical reasons, this is complex change in SQLite.  But
> it seems like a worthwhile enhancement, so we'll take your suggestion
> under advisement and attempt to do a better job of handling your query
> in future releases.
>
> Thanks for posting.
>
> --
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Having problems with Entity Framework code first db creation

2015-01-23 Thread Mike Nicolino
This sounds like a problem with the connection string being passed to 
SQLiteConnection.  Your attached package didn't come through, so could you send 
the connection string you're using?


-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Walter Williams
Sent: Friday, January 23, 2015 6:52 AM
To: sqlite-users@sqlite.org
Subject: [sqlite] Having problems with Entity Framework code first db creation

I'm trying to use a code first model using the System.Data.SQlite NuGet
(v1.0.94.1) package and Entity Framework.  I'm using VS 2013.

I have defined my objects, but when I try to create a new database file using 
them, I get an error "Unable to complete operation. The supplied SqlConnection 
does not specify an initial catalog or AttachDBFileName."

The attached project has a sample of the code which demonstrates the error.
For size the packages folder is not included.  I was able to use the same code 
with SQL Server Compact 4.0 but for preference I'd like to use SQLite.

Is this a bug in the package or am I missing something?


Walter Williams
Senior Software Engineer
Sawtooth Software, Inc.


"Do, or do not.  There is no try."

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Streaming group by in union of already sorted tables

2015-01-23 Thread Richard Hipp
On 1/23/15, Emmanouil Karvounis  wrote:
> We have two tables that are already sorted on a combination of
> two fields (which are their primary keys) and we want to union them and
> apply group by on both the aforementioned fields, like so:
>
> select c1, c2, myagg(*) from (
> select * from tableA
> union all
> select * from tableB
> )
> group by c1, c2;
>
> where tableA, tableB have primary key (c1, c2) and their schema comprises 3
> integers: c1, c2, and prop.
>
> The sqlite query plan creates a temporary B-tree to hold all the records of
> both tables to execute the group by.

You are correct.  Here is the test case I am using:

CREATE TABLE tableA(c1 INT, c2 INT, payload INT, PRIMARY KEY(c1,c2));
CREATE TABLE tableB(c1 INT, c2 INT, payload INT, PRIMARY KEY(c1,c2));

explain query plan
SELECT * FROM tableA
UNION ALL
SELECT * FROM tableB
ORDER BY 1, 2;

.print -

explain query plan
SELECT c1, c2, sum(payload) FROM (
  SELECT * FROM tableA
  UNION ALL
  SELECT * FROM tableB
)
GROUP BY c1, c2;

The first query - the UNION ALL with the ORDER BY does not use a
separate B-tree for sorting.  So it seems logical that the second
query that does a GROUP BY over the same UNION ALL should be able to
get by without a separate sorting B-Tree too.

For various technical reasons, this is complex change in SQLite.  But
it seems like a worthwhile enhancement, so we'll take your suggestion
under advisement and attempt to do a better job of handling your query
in future releases.

Thanks for posting.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] System.Data.SQLite - Exception Calling SQLiteModule.DeclareTable

2015-01-23 Thread Mike Nicolino
In my specific case, I'm using virtual tables to hook up a non-sql data source 
to SQLite.  Wanted to 'quote' column names to avoid issues with a column 
colliding with an sql keyword rather than avoiding use of strange characters.

MikeN


-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Stephen Chrzanowski
Sent: Friday, January 23, 2015 1:07 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] System.Data.SQLite - Exception Calling 
SQLiteModule.DeclareTable

In the 8.3 days, I routinely gave directories an underscore as a delimiter for 
version information, prior to my actually using a version control package.  So 
"game" would be the main thing, and if I wanted to test, "game_1" became the 
new WIP folder.  If I liked what I did, I'd move "game"
to "game__1" and rename "game_1" to "game".  I even sometimes went so far as 
adding an extension to a directory, so, "mkdir game.1".  LOVED how that screwed 
some of the DOS toys up when a directory contained an extension. :]

On Thu, Jan 22, 2015 at 8:13 PM, Keith Medcalf  wrote:

>
> Probably for the same reason people exceed 8.3 filename limitations on 
> Windows or try to embed non-alphanumeric characters in file/directory 
> names (Including spaces and shell special characters) -- they have 
> been told that they could do so and not told of the problems and 
> difficulties  created by doing so.
>
> I cannot fathom why one would use underscores either.
>
> In both cases the only rational explanation I can come up with as that 
> these folks are "users" or "coders" and not "programmers" ...
>
> ---
> Theory is when you know everything but nothing works.  Practice is 
> when everything works but no one knows why.  Sometimes theory and 
> practice are
> combined:  nothing works and no one knows why.
>
> >-Original Message-
> >From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users- 
> >boun...@sqlite.org] On Behalf Of Hick Gunter
> >Sent: Thursday, 22 January, 2015 01:39
> >To: 'General Discussion of SQLite Database'
> >Subject: Re: [sqlite] System.Data.SQLite - Exception Calling 
> >SQLiteModule.DeclareTable
> >
> >I have always wondered why people will insist on using human readable 
> >column names (with embedded spaces and special characters) in the 
> >implementation layer (SQL code) instead of the presentation layer 
> >(user interface). The clutter introduced into queries by having to 
> >quote the column names by far outweighs any gain from having "some 
> >strange field name" displayed instead of some_strange_field_name...
> >
> >-Ursprüngliche Nachricht-
> >Von: Mike Nicolino [mailto:mike.nicol...@centrify.com]
> >Gesendet: Donnerstag, 22. Jänner 2015 02:17
> >An: General Discussion of SQLite Database
> >Betreff: Re: [sqlite] System.Data.SQLite - Exception Calling 
> >SQLiteModule.DeclareTable
> >
> >Figured this one out.  DeclareTable doesn't like any 'quoting' around 
> >the column names in the sql.  It works fine with just straight column names.
> >
> >
> >-Original Message-
> >From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users- 
> >boun...@sqlite.org] On Behalf Of Mike Nicolino
> >Sent: Saturday, January 17, 2015 6:17 PM
> >To: sqlite-users@sqlite.org
> >Subject: [sqlite] System.Data.SQLite - Exception Calling 
> >SQLiteModule.DeclareTable
> >
> >I'm getting an exception calling SQLiteModule.DeclareTable that seems 
> >to imply the 'create table' sql being passed is invalid: "SQL logic 
> >error or missing database".  Yet using that same sql on the same 
> >connection as a create table call succeeds.  Reviewing the virtual 
> >table docs don't imply there are restrictions on the create table sql 
> >for virtual tables so I'm at a loss to what's wrong.
> >
> >The create table sql (the line breaks here are for readability and 
> >not present in the actual string send to DeclareTable):
> >
> >create table xxx(
> >"Username" text,
> >"DisplayName" text,
> >"Email" text,
> >"LastLogin" integer,
> >"LastInvite" integer,
> >"Status" text,
> >"SourceDs" text,
> >"Data" text,
> >"SourceDsLocalized" text
> >)
> >
> >Anyone have any input on what might be wrong?
> >Thanks!
> >
> >___
> >sqlite-users mailing list
> >sqlite-users@sqlite.org
> >http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >___
> >sqlite-users mailing list
> >sqlite-users@sqlite.org
> >http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
> >
> >___
> > Gunter Hick
> >Software Engineer
> >Scientific Games International GmbH
> >FN 157284 a, HG Wien
> >Klitschgasse 2-4, A-1130 Vienna, Austria
> >Tel: +43 1 80100 0
> >E-Mail: h...@scigames.at
> >
> >This communication (including any attachments) is intended for the 
> >use of the intended recipient(s) only and may contain information 
> >that is 

Re: [sqlite] Streaming group by in union of already sorted tables

2015-01-23 Thread Emmanouil Karvounis
>
> Sorry, but SQLite does not understand how the subquery (inside the
> brackets) is going to be used by the main query.  It hqs to complete the
> subquery first and only then can it inspect the main query to find out how
> to optimize it.  This is not a bug, there just isn't enough flexibility to
> do this the way you want.


This is unfortunate, especially for an RDBMS aiming to minimize memory
consumption for embedded systems. I believe there are many other queries
where optimizing the subqueries with regard to the external query will
result in optimal memory usage.


> One of the following may or may not be useful:
>
> You may be able to use
>
> select c1, c2, myagg(*) from (
> select c1, c2 from tableA group by c1,c2
> union all
> select c1, c2 from tableB group by c1,c2
> ) group by c1,c2
>
>
Thank you for your suggestion. But the query plan is the same, how would
that be any faster? We actually run an experiment and it is a bit slower, I
suppose because of the additional overhead of creating each group in the
subquery, which doesn't contribute anything given the way the external
query is handled.

Alternatively, is there a good reason for tableA and tableB not to be
> merged with, perhaps, an extra column indicating 'A' or 'B' ?  This would
> allow you to create an index and get your answer almost instantly.  When
> you see two tables with the same columns it's often an indication that
> there should really be one table.
>

This is an alternative we wish to explore, but in our usecase it would be
considered a hack, and might cause side-effects in other parts of the code.
If we are forced to do it, we might be able to pull it off though. Thank
you for your suggestion.


Best,
Manos



>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Streaming group by in union of already sorted tables

2015-01-23 Thread Simon Slavin

On 23 Jan 2015, at 4:59pm, Emmanouil Karvounis  wrote:

> tableA and tableB have both primary key on (c1, c2)
> 
> explain query plan
> select c1, c2, count(*) from (
> select c1, c2 from tableA
> union all
> select c1, c2 from tableB
> )
> group by c1,c2
> 
> 2|0|0|SCAN TABLE tableA USING COVERING INDEX sqlite_autoindex_tableA_1
> 3|0|0|SCAN TABLE tableB USING COVERING INDEX sqlite_autoindex_tableB_1
> 1|0|0|COMPOUND SUBQUERIES 2 AND 3 (UNION ALL)
> 0|0|0|SCAN SUBQUERY 1
> 0|0|0|USE TEMP B-TREE FOR GROUP BY
> 
> There is no reason to create a new temp B-tree when you can sequentially
> and in-synch scan the B-tree of tableA and of tableB and get the groups in
> one pass.

Sorry, but SQLite does not understand how the subquery (inside the brackets) is 
going to be used by the main query.  It hqs to complete the subquery first and 
only then can it inspect the main query to find out how to optimize it.  This 
is not a bug, there just isn't enough flexibility to do this the way you want.  
One of the following may or may not be useful:

You may be able to use

select c1, c2, myagg(*) from (
select c1, c2 from tableA group by c1,c2
union all
select c1, c2 from tableB group by c1,c2
) group by c1,c2

Alternatively, is there a good reason for tableA and tableB not to be merged 
with, perhaps, an extra column indicating 'A' or 'B' ?  This would allow you to 
create an index and get your answer almost instantly.  When you see two tables 
with the same columns it's often an indication that there should really be one 
table.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Streaming group by in union of already sorted tables

2015-01-23 Thread Emmanouil Karvounis
To explain that with an example:

tableA
(1,2) (2,3) (2,4) (3,5)

tableB
(1,2) (2,3) (4,6)


Get a pointer on tableA and one on tableB.

(1,2) and (1,2) form a group, run count(*) and output 2.

Advance both pointers.

(2,3) and (2,3) form a group, run count(*) and output 2.

Advance both pointers.

(2,4) is less than (4,6) (because 2 < 4). (2,4) forms a group, run count(*)
and output 1.

Advance pointer in tableA.

(3,5) is less than (4,6) (because 3 < 5). (3,5) forms a group, run count(*)
and output 1.

pointer on table A is depleted. Continue with tableB only.

(4,6) forms a group. run count(*) and output 1.


The sqlite plan calls for merging both tables in a B-tree and scanning it,
which incurs unneeded time and space complecity compared to the above.


Manos

On 23 January 2015 at 18:59, Emmanouil Karvounis  wrote:

> Dear Simon,
>
> Thank you for your answer and I'm sorry if I have used inappropriate
> wording and confused you.
>
> The issue is actually very simple:
>
> tableA and tableB have both primary key on (c1, c2)
>
> explain query plan
> select c1, c2, count(*) from (
> select c1, c2 from tableA
> union all
> select c1, c2 from tableB
> )
> group by c1,c2
>
> 2|0|0|SCAN TABLE tableA USING COVERING INDEX sqlite_autoindex_tableA_1
> 3|0|0|SCAN TABLE tableB USING COVERING INDEX sqlite_autoindex_tableB_1
> 1|0|0|COMPOUND SUBQUERIES 2 AND 3 (UNION ALL)
> 0|0|0|SCAN SUBQUERY 1
> 0|0|0|USE TEMP B-TREE FOR GROUP BY
>
> There is no reason to create a new temp B-tree when you can sequentially
> and in-synch scan the B-tree of tableA and of tableB and get the groups in
> one pass. Think of how you would execute the sort step of mergesort on two
> already sorted subarrays.
>
>
> Manos
>
> On 23 January 2015 at 18:42, Simon Slavin  wrote:
>
>>
>> On 23 Jan 2015, at 4:15pm, Emmanouil Karvounis  wrote:
>>
>> > In short, we have two tables that are already sorted on a combination of
>> > two fields
>>
>> There is no such thing as a 'sorted table' in SQL.  Each table is a set
>> of rows and the rows have no order.
>>
>> If you want to make it easy for SQL to access a table's rows in a
>> particular order, create an index or make that order the table's primary
>> key (which is another way of making an index).
>>
>> > select c1, c2, myagg(*) from (
>> > select * from tableA
>> > union all
>> > select * from tableB
>> > )
>> > group by c1, c2;
>>
>> This command tells SQL that you want to construct a list of every row of
>> tableA and every row of tableB.  In other words, if you have 300 rows in
>> tableA and 500 rows in tableB, you are telling SQL to construct a new table
>> of 800 rows.  And because this table doesn't yet exist, it doesn't have any
>> indexes so it can't be searched quickly.  Is that what you wanted ?
>>
>> Is there a good reason for needing this data in two separate tables
>> rather than one for which you can create an index on (c1, c2) ?
>>
>> Do the groups occur entirely within one table or do you have to add the
>> tables together before SQL can figure out the groups.
>>
>> > where tableA, tableB have primary key (c1, c2) and their schema
>> comprises 3
>> > integers: c1, c2, and prop.
>>
>> It might be worth testing with something like 'total(*)' just to make
>> sure it isn't your own function which is causing the problems.
>>
>> Simon.
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Streaming group by in union of already sorted tables

2015-01-23 Thread Emmanouil Karvounis
Dear Simon,

Thank you for your answer and I'm sorry if I have used inappropriate
wording and confused you.

The issue is actually very simple:

tableA and tableB have both primary key on (c1, c2)

explain query plan
select c1, c2, count(*) from (
select c1, c2 from tableA
union all
select c1, c2 from tableB
)
group by c1,c2

2|0|0|SCAN TABLE tableA USING COVERING INDEX sqlite_autoindex_tableA_1
3|0|0|SCAN TABLE tableB USING COVERING INDEX sqlite_autoindex_tableB_1
1|0|0|COMPOUND SUBQUERIES 2 AND 3 (UNION ALL)
0|0|0|SCAN SUBQUERY 1
0|0|0|USE TEMP B-TREE FOR GROUP BY

There is no reason to create a new temp B-tree when you can sequentially
and in-synch scan the B-tree of tableA and of tableB and get the groups in
one pass. Think of how you would execute the sort step of mergesort on two
already sorted subarrays.


Manos

On 23 January 2015 at 18:42, Simon Slavin  wrote:

>
> On 23 Jan 2015, at 4:15pm, Emmanouil Karvounis  wrote:
>
> > In short, we have two tables that are already sorted on a combination of
> > two fields
>
> There is no such thing as a 'sorted table' in SQL.  Each table is a set of
> rows and the rows have no order.
>
> If you want to make it easy for SQL to access a table's rows in a
> particular order, create an index or make that order the table's primary
> key (which is another way of making an index).
>
> > select c1, c2, myagg(*) from (
> > select * from tableA
> > union all
> > select * from tableB
> > )
> > group by c1, c2;
>
> This command tells SQL that you want to construct a list of every row of
> tableA and every row of tableB.  In other words, if you have 300 rows in
> tableA and 500 rows in tableB, you are telling SQL to construct a new table
> of 800 rows.  And because this table doesn't yet exist, it doesn't have any
> indexes so it can't be searched quickly.  Is that what you wanted ?
>
> Is there a good reason for needing this data in two separate tables rather
> than one for which you can create an index on (c1, c2) ?
>
> Do the groups occur entirely within one table or do you have to add the
> tables together before SQL can figure out the groups.
>
> > where tableA, tableB have primary key (c1, c2) and their schema
> comprises 3
> > integers: c1, c2, and prop.
>
> It might be worth testing with something like 'total(*)' just to make sure
> it isn't your own function which is causing the problems.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Streaming group by in union of already sorted tables

2015-01-23 Thread Emmanouil Karvounis
Thank you Clemens.

In the general case, myagg() doesn't have the appropriate property to
distribute over tableA and tableB seperately.

Let me be clear, we are not looking for alternatives to get our query
running more efficiently, we sincerely believe that we have found a defect
in the sqlite query planner. This is a bug report, not a request for
assistance on formulating our query.

I do appreciate the effort and your good will, though. Thank you again.

Manos

On 23 January 2015 at 18:48, Clemens Ladisch  wrote:

> Emmanouil Karvounis wrote:
> > In short, we have two tables that are already sorted on a combination of
> > two fields (which are their primary keys) and we want to union them and
> > apply group by on both the aforementioned fields, like so:
> >
> > select c1, c2, myagg(*) from (
> > select * from tableA
> > union all
> > select * from tableB
> > )
> > group by c1, c2;
> >
> > The sqlite query plan creates a temporary B-tree to hold all the records
> of
> > both tables to execute the group by. This incurs too much time and space
> > complexity
>
> If the number of result rows is small compared to the number of table
> rows, and if myagg() works correctly when called on its own results,
> then you can reduce the size of the temporary B-tree by doing another
> GROUP BY in the subqueries:
>
> select c1, c2, myagg(prop) from (
>   select c1, c2, myagg(prop) as prop from tableA group by c1, c2
>   union all
>   select c1, c2, myagg(prop) from tableB group by c2, c2
> )
> group by c1, c2;
>
>
> Regards,
> Clemens
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Streaming group by in union of already sorted tables

2015-01-23 Thread Clemens Ladisch
Emmanouil Karvounis wrote:
> In short, we have two tables that are already sorted on a combination of
> two fields (which are their primary keys) and we want to union them and
> apply group by on both the aforementioned fields, like so:
>
> select c1, c2, myagg(*) from (
> select * from tableA
> union all
> select * from tableB
> )
> group by c1, c2;
>
> The sqlite query plan creates a temporary B-tree to hold all the records of
> both tables to execute the group by. This incurs too much time and space
> complexity

If the number of result rows is small compared to the number of table
rows, and if myagg() works correctly when called on its own results,
then you can reduce the size of the temporary B-tree by doing another
GROUP BY in the subqueries:

select c1, c2, myagg(prop) from (
  select c1, c2, myagg(prop) as prop from tableA group by c1, c2
  union all
  select c1, c2, myagg(prop) from tableB group by c2, c2
)
group by c1, c2;


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Streaming group by in union of already sorted tables

2015-01-23 Thread Simon Slavin

On 23 Jan 2015, at 4:15pm, Emmanouil Karvounis  wrote:

> In short, we have two tables that are already sorted on a combination of
> two fields

There is no such thing as a 'sorted table' in SQL.  Each table is a set of rows 
and the rows have no order.

If you want to make it easy for SQL to access a table's rows in a particular 
order, create an index or make that order the table's primary key (which is 
another way of making an index).

> select c1, c2, myagg(*) from (
> select * from tableA
> union all
> select * from tableB
> )
> group by c1, c2;

This command tells SQL that you want to construct a list of every row of tableA 
and every row of tableB.  In other words, if you have 300 rows in tableA and 
500 rows in tableB, you are telling SQL to construct a new table of 800 rows.  
And because this table doesn't yet exist, it doesn't have any indexes so it 
can't be searched quickly.  Is that what you wanted ?

Is there a good reason for needing this data in two separate tables rather than 
one for which you can create an index on (c1, c2) ?

Do the groups occur entirely within one table or do you have to add the tables 
together before SQL can figure out the groups.

> where tableA, tableB have primary key (c1, c2) and their schema comprises 3
> integers: c1, c2, and prop.

It might be worth testing with something like 'total(*)' just to make sure it 
isn't your own function which is causing the problems.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Streaming group by in union of already sorted tables

2015-01-23 Thread Emmanouil Karvounis
Greetings,

We are having an issue with the sqlite query plan for one of our queries,
which seems to be sub-optimal both in time and in space complexity.

In short, we have two tables that are already sorted on a combination of
two fields (which are their primary keys) and we want to union them and
apply group by on both the aforementioned fields, like so:

select c1, c2, myagg(*) from (
select * from tableA
union all
select * from tableB
)
group by c1, c2;

where tableA, tableB have primary key (c1, c2) and their schema comprises 3
integers: c1, c2, and prop.

The sqlite query plan creates a temporary B-tree to hold all the records of
both tables to execute the group by. This incurs too much time and space
complexity compared to the better plan of sequentially iterating over both
tables using their B-tree indices on the primary key (since they contain
the records already sorted in the group by fields). Like what one would do
on the sort step of mergesort to sort two already sorted lists.

When we manually implemented the idea of the sequential scan of the two
tables and manually gathering the groups and applying the aggregate
function, we had a 2X speedup compared to the sqlite plan of creating a
temp B-tree. Of course, this speedup would be even greater if we could push
the agg function inside the sqlite engine and having it do the sequential
scanning plan on the indices directly.

Note: In the general case, tableA and tableB have a non empty intersection
and neither one is a subset of the other.

At the end of the email we provide extensive details for the tables we use
and the query we want to run, plus a rundown on the complexity of the
sqlite plan and our proposed solution.

We are at your disposal for any further clarifications and we are looking
forward to your reaction on this.


Kind Regards,
Manos Karvounis


--



SQLite version: 3.8.8.1 (latest)


pragma table_info(tableA);

0|c1|int(11)|0||1
1|c2|int(11)|0||2
2|prop|int(11)|0||0


pragma table_info(tableB);

0|c1|int(11)|0||1
1|c2|int(11)|0||2
2|prop|int(11)|0||0


select count(*) from tableA;

11095298


select count(*) from tableB;

100


pragma index_list(tableA);

0|sqlite_autoindex_tableA_1|1


pragma index_info(sqlite_autoindex_tableA_1);

0|0|c1
1|1|c2


pragma index_list(tableB);

0|sqlite_autoindex_tableB_1|1


pragma index_info(sqlite_autoindex_tableB_1);

0|0|c1
1|1|c2


explain query plan
select c1, c2, myagg(*) from (
select * from tableA
union all
select * from tableB
)
group by c1, c2;

2|0|0|SCAN TABLE tableA
3|0|0|SCAN TABLE tableB
1|0|0|COMPOUND SUBQUERIES 2 AND 3 (UNION ALL)
0|0|0|SCAN SUBQUERY 1
0|0|0|USE TEMP B-TREE FOR GROUP BY


Note that we get the same query plan even if we explicitly ask for ordering
tableA, tableB on c1, c2 prior to group by.

explain query plan
select c1, c2, myagg(*) from (
select * from tableA
union all
select * from tableB
order by c1, c2
)
group by c1, c2;

2|0|0|SCAN TABLE tableA USING INDEX sqlite_autoindex_tableA_1
3|0|0|SCAN TABLE tableB USING INDEX sqlite_autoindex_tableB_1
1|0|0|COMPOUND SUBQUERIES 2 AND 3 (UNION ALL)
0|0|0|SCAN SUBQUERY 1
0|0|0|USE TEMP B-TREE FOR GROUP BY


SQLite Query Plan:

(1) Add every record of tableA and tableB on a new B-tree
Time complexity: O((|tableA| + |tableB|)log(|tableA|+|tableB|)).
Space overhead: O(|tableA|+|tableB|).

(2) Run through the B-tree to get groups
Time complexity: O(|tableA| + |tableB|).


Alternative Query Plan:

(1) Iterate over both B-trees in synch since they are already sorted (i.e.,
the way we would merge two already sorted lists using two running pointers,
or the behavior you indicate in 3.2 here
https://www.sqlite.org/queryplanner.html).
Time complexity: O(|tableA|+|tableB|)
Space overhead: O(1)

Yet in the following queries the plan (seems to) correctly use the primary
key index to speed up the order by.

explain query plan
select c1, c2, prop from (
select * from tableA
union all
select * from tableB
)
order by c1, c2;

1|0|0|SCAN TABLE tableA USING INDEX sqlite_autoindex_tableA_1
2|0|0|SCAN TABLE tableB USING INDEX sqlite_autoindex_tableB_1
0|0|0|COMPOUND SUBQUERIES 1 AND 2 (UNION ALL)


explain query plan
select c1, c2, prop from table A
order by c1, c2;

0|0|0|SCAN TABLE tableA USING INDEX sqlite_autoindex_tableA_1


Even if one of the tables did not have an index, again the better plan
would be to create a temp B-tree for that table and then iterate in synch.
Yet sqlite again creates a temp B-tree as shown below.

pragma table_info(tableC);

0|c1|int(11)|0||0
1|c2|int(11)|0||0
2|prop|int(11)|0||0


select count(*) from tableC;

100


pragma index_list(tableC);


explain query plan
select c1, c2, myagg(*) from (
select * from tableA
union all
select * from tableC
)
group by c1, c2;

2|0|0|SCAN TABLE tableA
3|0|0|SCAN TABLE tableC
1|0|0|COMPOUND SUBQUERIES 2 AND 3 (UNION ALL)
0|0|0|SCAN SUBQUERY 1
0|0|0|USE TEMP B-TREE FOR GROUP BY



Re: [sqlite] Having problems with Entity Framework code first db creation

2015-01-23 Thread RSmith


On 2015/01/23 16:51, Walter Williams wrote:

I'm trying to use a code first model ///... (snipped)

then in the signature...


"Do, or do not.  There is no try."


Thank you for the chuckle.  As to the actual question, when you say "when I try to create a new database file", do you mean that you 
are using the open command (or whatever in the wrapper tries to call the sqlite3_open* commands) with a valid filename (which need 
not exist, but must be a valid name) and then it fails with the mentioned message?


The list unfortunately (or fortunately) does not permit attachments, could you use a file upload/sharing service kindly so we can 
access the file and (hopefully) better understand the question? - thanks.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sqlite3 very slow even after creating without rowid

2015-01-23 Thread Simon Slavin

On 23 Jan 2015, at 3:16pm, Parakkal, Navin S (Software Engineer) 
 wrote:

> I also did another experiment. I created this table and did a vaccum and then 
> the select count(*) in sqlite3 was around 2 mins. 
> 
> When I create an index manually after the table is loaded (imported from 
> csv), select count(*) in sqlite3 was within 30 to 40 secs.

In the second case, to calculate count(*) SQLite was able to use the index you 
had created.  Since this index was smaller than the table, SQLite was able to 
count the entries in it faster.  The result would have been the same if you had 
done whenever the index had been created

CREATE TABLE
.import
CREATE INDEX
time the 'select count(*) from hp_table1' command here


should yield pretty-much the same result as

CREATE TABLE
CREATE INDEX
.import
time the 'select count(*) from hp_table1' command here

If you are using a table for which rows are INSERTed but never DELETEd, then 
you will get the same result almost instantly using

select max(rowid) from hp_table1

instead of counting the rows.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] TEMP tables with the same name as of main tables

2015-01-23 Thread Aldo Buratti
I had a bad programming experience with temporary tables and after some
googling I found this old post

   [sqlite] How to select from a temp table with same name as a main table.
   dochsm Tue, 18 Jun 2013 05:39:04 -0700

that illustrated exactly the same troubles.
In short, if you have a table named A and a temporary table named TEMP.A,
then if you want to refer to the A table, it is strongly recommended to
explicitly call it MAIN.A ( otherwise if you simply refer to A, you will
pick ... TEMP.A )

In my opinion, and for my experience, I find this behavior
counter-intuitive, so I ask if it's possibile to change it in a future
release,
so that "A"  always refer to MAIN.A and TEMP.A refers to TEMP.A.

Since this change may break some old programs logic, I suggest to introduce
a new PRAGMA named "IMPLICIT_MAINDB_PREFIX" (or or better name ..), so that
if a table-name is not prefixed by a db-name, then MAIN is assumed.

regards
  A.Buratti
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Core dump with shared cache enabled

2015-01-23 Thread Tim Streater
On 22 Jan 2015 at 21:34, Daniel Roberts  wrote:

> Attached is a core file (sorry I don’t have symbols) as well as a copy of
> the script I was running.

Attachments are not allowed here.

--
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Having problems with Entity Framework code first db creation

2015-01-23 Thread Walter Williams
I'm trying to use a code first model using the System.Data.SQlite NuGet
(v1.0.94.1) package and Entity Framework.  I'm using VS 2013.

I have defined my objects, but when I try to create a new database file
using them, I get an error "Unable to complete operation. The supplied
SqlConnection does not specify an initial catalog or AttachDBFileName."

The attached project has a sample of the code which demonstrates the error.
For size the packages folder is not included.  I was able to use the same
code with SQL Server Compact 4.0 but for preference I'd like to use SQLite.

Is this a bug in the package or am I missing something?


Walter Williams
Senior Software Engineer
Sawtooth Software, Inc.


"Do, or do not.  There is no try."

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Report a warning bug about Lemon parser

2015-01-23 Thread Richard Hipp
On 1/22/15, Tang Tianyong  wrote:
> Hi, yy_destructor function can not suppress warning about unused
> %extra_argument variable.

Sure it can.  Just add code to one of your destructors that references
the %extra_argument variable.  It doesn't have to actually do anything
with the variable.  If the variable is named "xyzzy" then it will
probably suffice to just say "(void)xyzzy;" inside one of your
destructors.

> My yy_destructor function that Lemon generated
> like this:
>
> ```
> static void yy_destructor(
>   yyParser *yypParser,/* The parser */
>   YYCODETYPE yymajor, /* Type code for object to destroy */
>   YYMINORTYPE *yypminor   /* The object to be destroyed */
> ){
>   COSStyleParseARG_FETCH;
>   switch( yymajor ){
> /* Here is inserted the actions which take place when a
> ** terminal or non-terminal is destroyed.  This can happen
> ** when the symbol is popped from the stack during a
> ** reduce or during error processing or when a parser is
> ** being destroyed before it is finished parsing.
> **
> ** Note: during a reduce, the only symbols destroyed are those
> ** which appear on the RHS of the rule, but which are not used
> ** inside the C code.
> */
> default:  break;   /* If no destructor action specified: do nothing */
>   }
> }
> ```
>
> --
> *By tan...@gmail.com *
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] FW: Core dump with shared cache enabled

2015-01-23 Thread Roberts, Daniel
Hello,

While doing some profiling in python, I ran into a crash in sqlite3. The 
workflow was pretty simple – one thread doing writes, another doing reads (of 
course, these are python threads, so it’s only “sort-of” concurrent).

This workflow worked fine with the shared cache disabled, but when I enabled 
the shared cache I was able to generate a core every time.

Attached is a core file (sorry I don’t have symbols) as well as a copy of the 
script I was running.

System info: running Linux Mint 17 64-bit
➜  scratch $ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
>>> sqlite3.version
'2.6.0'
>>> sqlite3.sqlite_version
'3.8.2'

Thanks,
--Dan Roberts

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Report a warning bug about Lemon parser

2015-01-23 Thread Tang Tianyong
Hi, yy_destructor function can not suppress warning about unused
%extra_argument variable. My yy_destructor function that Lemon generated
like this:

```
static void yy_destructor(
  yyParser *yypParser,/* The parser */
  YYCODETYPE yymajor, /* Type code for object to destroy */
  YYMINORTYPE *yypminor   /* The object to be destroyed */
){
  COSStyleParseARG_FETCH;
  switch( yymajor ){
/* Here is inserted the actions which take place when a
** terminal or non-terminal is destroyed.  This can happen
** when the symbol is popped from the stack during a
** reduce or during error processing or when a parser is
** being destroyed before it is finished parsing.
**
** Note: during a reduce, the only symbols destroyed are those
** which appear on the RHS of the rule, but which are not used
** inside the C code.
*/
default:  break;   /* If no destructor action specified: do nothing */
  }
}
```

-- 
*By tan...@gmail.com *
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Huge WAL log

2015-01-23 Thread Jan Slodicka
Thanks for the documentation update.

>From my point of view I would invite more details related to the term "large
transaction". Specifically the role of indexes is important. (They are often
overlooked, added by an admin after the development is over etc.)


> Defenses against this failure mode include: ... (2) making sure that the
> cache_size is at least as big as the transaction working set

a) This is not realistic in many cases
b) It is even not needed. What we need to avoid is multiple spills of the
same page to the disk. AFAIK this should not concern the "true" data written
to the database, but only the indexes. Hence the cache size should be
related to the size of the active indexes.

Maybe the point (2) could be re-formulated along these lines:
Increase of the cache size reduces the number of cache spills and helps thus
to reduce the growth of the WAL file. The more indexes are updated during
the transaction, the larger should be the cache size. In some cases it might
be better to delete the indexes and re-create them after the write operation
is done.

P.S. Another typo:

This scenario can be avoiding by ensuring  ---> avoided

P.P.S.

Does the latest SQLite optimize the case "DELETE FROM table"? Or for
commands that have the effect of deleting all table rows? This was the case
with the SQL command I wrote in my last post:

DELETE FROM discount WHERE discounttypeid NOT IN (SELECT discounttypeid FROM
discounttype) 

What happens in v3.7.15 is that WAL log growths to a huge size. (The reason
is as before - the indexes.)

This should be possible to avoid.







--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Huge-WAL-log-tp79991p80209.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] System.Data.SQLite - Exception Calling SQLiteModule.DeclareTable

2015-01-23 Thread Stephen Chrzanowski
In the 8.3 days, I routinely gave directories an underscore as a delimiter
for version information, prior to my actually using a version control
package.  So "game" would be the main thing, and if I wanted to test,
"game_1" became the new WIP folder.  If I liked what I did, I'd move "game"
to "game__1" and rename "game_1" to "game".  I even sometimes went so far
as adding an extension to a directory, so, "mkdir game.1".  LOVED how that
screwed some of the DOS toys up when a directory contained an extension. :]

On Thu, Jan 22, 2015 at 8:13 PM, Keith Medcalf  wrote:

>
> Probably for the same reason people exceed 8.3 filename limitations on
> Windows or try to embed non-alphanumeric characters in file/directory names
> (Including spaces and shell special characters) -- they have been told that
> they could do so and not told of the problems and difficulties  created by
> doing so.
>
> I cannot fathom why one would use underscores either.
>
> In both cases the only rational explanation I can come up with as that
> these folks are "users" or "coders" and not "programmers" ...
>
> ---
> Theory is when you know everything but nothing works.  Practice is when
> everything works but no one knows why.  Sometimes theory and practice are
> combined:  nothing works and no one knows why.
>
> >-Original Message-
> >From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> >boun...@sqlite.org] On Behalf Of Hick Gunter
> >Sent: Thursday, 22 January, 2015 01:39
> >To: 'General Discussion of SQLite Database'
> >Subject: Re: [sqlite] System.Data.SQLite - Exception Calling
> >SQLiteModule.DeclareTable
> >
> >I have always wondered why people will insist on using human readable
> >column names (with embedded spaces and special characters) in the
> >implementation layer (SQL code) instead of the presentation layer (user
> >interface). The clutter introduced into queries by having to quote the
> >column names by far outweighs any gain from having "some strange field
> >name" displayed instead of some_strange_field_name...
> >
> >-Ursprüngliche Nachricht-
> >Von: Mike Nicolino [mailto:mike.nicol...@centrify.com]
> >Gesendet: Donnerstag, 22. Jänner 2015 02:17
> >An: General Discussion of SQLite Database
> >Betreff: Re: [sqlite] System.Data.SQLite - Exception Calling
> >SQLiteModule.DeclareTable
> >
> >Figured this one out.  DeclareTable doesn't like any 'quoting' around the
> >column names in the sql.  It works fine with just straight column names.
> >
> >
> >-Original Message-
> >From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> >boun...@sqlite.org] On Behalf Of Mike Nicolino
> >Sent: Saturday, January 17, 2015 6:17 PM
> >To: sqlite-users@sqlite.org
> >Subject: [sqlite] System.Data.SQLite - Exception Calling
> >SQLiteModule.DeclareTable
> >
> >I'm getting an exception calling SQLiteModule.DeclareTable that seems to
> >imply the 'create table' sql being passed is invalid: "SQL logic error or
> >missing database".  Yet using that same sql on the same connection as a
> >create table call succeeds.  Reviewing the virtual table docs don't imply
> >there are restrictions on the create table sql for virtual tables so I'm
> >at a loss to what's wrong.
> >
> >The create table sql (the line breaks here are for readability and not
> >present in the actual string send to DeclareTable):
> >
> >create table xxx(
> >"Username" text,
> >"DisplayName" text,
> >"Email" text,
> >"LastLogin" integer,
> >"LastInvite" integer,
> >"Status" text,
> >"SourceDs" text,
> >"Data" text,
> >"SourceDsLocalized" text
> >)
> >
> >Anyone have any input on what might be wrong?
> >Thanks!
> >
> >___
> >sqlite-users mailing list
> >sqlite-users@sqlite.org
> >http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >___
> >sqlite-users mailing list
> >sqlite-users@sqlite.org
> >http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
> >
> >___
> > Gunter Hick
> >Software Engineer
> >Scientific Games International GmbH
> >FN 157284 a, HG Wien
> >Klitschgasse 2-4, A-1130 Vienna, Austria
> >Tel: +43 1 80100 0
> >E-Mail: h...@scigames.at
> >
> >This communication (including any attachments) is intended for the use of
> >the intended recipient(s) only and may contain information that is
> >confidential, privileged or legally protected. Any unauthorized use or
> >dissemination of this communication is strictly prohibited. If you have
> >received this communication in error, please immediately notify the
> >sender by return e-mail message and delete all copies of the original
> >communication. Thank you for your cooperation.
> >
> >
> >___
> >sqlite-users mailing list
> >sqlite-users@sqlite.org
> >http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
>
>
> ___
> sqlite-users 

Re: [sqlite] Compiling error on Cygwin on Windows 8.1: 3.8.7.4 and 3.8.8

2015-01-23 Thread Jan Nijtmans
2015-01-19 11:01 GMT+01:00 Frank Ho :
> I compiled the SQLite on the Cygwin 1.7.33 running on a Windows 8.1, here's 
> the error:
>
> .libs/sqlite3.o: In function `sqlite3ThreadProc':
> ../sqlite-autoconf-3080800/sqlite3.c:22471: undefined reference to 
> `_endthreadex'
> .libs/sqlite3.o: In function `sqlite3ThreadCreate':
> ../sqlite-autoconf-3080800/sqlite3.c:22493: undefined reference to 
> `_beginthreadex'
> collect2: error: ld returned 1 exit status

The problem is here:


For Cywin, SQLITE_MUTEX_PTHREADS should be set to 1, but in those
lines SQLITE_MUTEX_W32 is set in stead.

There was a commit which was

Any chance this will be fixed for SQLite 3.8.9?

There was a commit in SQLite which simply disables all
multi-threading for Cygwin:

that's indeed another possible 'solution' ;-)

Regards,
   Jan Nijtmans
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Huge WAL log

2015-01-23 Thread Dominique Devienne
On Thu, Jan 22, 2015 at 6:49 PM, Richard Hipp  wrote:

> Let me know if that helps.  Note that I have only quickly read over my
> writing so there is a high probability of typos, which I will be happy
> to correct when brought to my attention.


Thanks for the new doc. Very interesting. A couple typos (I think):
1) when no readers a+++re+++ using the WAL file
2) using a rollback journal mode---s---

--DD
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users