Re: [sqlite] Unexplained table bloat

2020-01-11 Thread Kevin Youren

Hi,

I checked the download mentioned in the original email. Not sure if the
table changed since the previous posts.


It seems LUTFullString holds 3 large binary (BLOB-like) values stored as
TEXT, so LENGTH() treats them as strings and stops counting at the first
embedded NUL byte: it reports 1, while HEX() shows roughly 190 KB per value.

I'm in Melbourne, Oz, so I added the UTC datetime.

regs, Kev

kevin@KCYDell:~$ cd /mnt/KCY/KCYDocs/
kevin@KCYDell:/mnt/KCY/KCYDocs$ sqlite3 /mnt/KCY/KCYDocs/200k-per-row.sqlite
SQLite version 3.30.1 2019-10-10 20:19:45
Enter ".help" for usage hints.
sqlite> .schema
CREATE TABLE copied(
  id_local INT,
  LUTFullString,
  LUTHash
);
sqlite> 
sqlite> select '1',rowid,LENGTH(HEX(LUTFullString))/2 from copied
   ...> UNION ALL
   ...> select '2',rowid,INSTR(HEX(LUTFullString),'00') from copied
   ...> UNION ALL
   ...> select '3',rowid,substr(HEX(LUTFullString),0,10) from copied
   ...> UNION ALL
   ...> select '4',rowid,INSTR(SUBSTR(HEX(LUTFullString),4),'00') from copied
   ...> UNION ALL
   ...> select '5',rowid,LENGTH(LUTFullString) from copied;
1|1|194238
1|2|183050
1|3|193908
2|1|3
2|2|3
2|3|3
3|1|2C0003007
3|2|2C0003007
3|3|2C0003007
4|1|1
4|2|1
4|3|1
5|1|1
5|2|1
5|3|1
sqlite> .quit
kevin@KCYDell:/mnt/KCY/KCYDocs$ date -u
Sat 11 Jan 23:39:43 UTC 2020
kevin@KCYDell:/mnt/KCY/KCYDocs$ 
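
A quicker way to confirm the storage class directly (a sketch, not run
against this file): typeof() shows whether the values are TEXT or BLOB, and
casting to BLOB makes LENGTH() count bytes rather than characters before the
first NUL.

sqlite> select rowid, typeof(LUTFullString),
   ...>        LENGTH(LUTFullString),               -- characters up to the first NUL
   ...>        LENGTH(CAST(LUTFullString AS BLOB))  -- true byte count
   ...> from copied;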

Message: 6
Date: Fri, 10 Jan 2020 08:48:21 -0500
From: Ryan Mack 
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] Unexplained table bloat
Message-ID: <CABhGdGRbR1kT+3_BU6ob9L7tpSPZ09HJn=ofPyK6OXvgQK=_...@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"

Hi list,

I'm trying to understand unexplained table bloat I found in what should be
a very small table in an old database file. If you dump/restore the
database, the problem goes away. If you duplicate the table, the problem
propagates. Schema:

CREATE TABLE copied(
  id_local INT,
  LUTFullString,
  LUTHash
);

Test Data:
85960605|,|0DE19F8AA100D2962FF22C60B3FA8940
85960465|,|A44615408E8B3E48A684C60CA2967031
85960451|,|33C8804354984071A740788AD308B279

In my test database file, each of these 3 rows is allocating about 40 4k
overflow pages for a total database size of about 500k. The full database
has about 4MB of actual data which takes up over 500MB on disk. If you want
to see/reproduce the problem you'll need my test database file which I've
uploaded here: https://mackman.net/200k-per-row.sqlite.zip (500kb
download). I don't know why it doesn't compress better; those extra
overflow pages must be getting filled with random garbage.

My uninformed guess is there was a bug in the version of sqlite used at
database creation time that computed an incorrect overflow threshold and is
storing each byte of the row to its own page. Since the problem goes away
with a dump/restore, I'm considering releasing a script to do that and
mitigate the problem for affected users. Before doing that I would like to
understand the problem better.

Thanks for reading, Ryan

PS: Here's some output from my debug session showing that the 123 bytes of
data occupy 582k of space on disk in a freshly created table.

% sqlite3 200k-per-row.sqlite
SQLite version 3.28.0 2019-04-15 14:49:49
Enter ".help" for usage hints.
sqlite> .tables
copied

sqlite> .schema copied
CREATE TABLE copied(
  id_local INT,
  LUTFullString,
  LUTHash
);

sqlite> select * from copied;
85960605|,|0DE19F8AA100D2962FF22C60B3FA8940
85960465|,|A44615408E8B3E48A684C60CA2967031
85960451|,|33C8804354984071A740788AD308B279

sqlite> select sum(length(id_local) + length(lutfullstring) +
length(luthash)) from copied;
123

sqlite> create table copied2 as select * from copied;
sqlite> SELECT name, SUM("pgsize") as sz FROM dbstat group by name;
copied|581632
copied2|581632
sqlite_master|4096
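
For anyone digging further, the dbstat virtual table used above can also
break the space down by page type (a sketch; dbstat needs a build with
SQLITE_ENABLE_DBSTAT_VTAB, which the session above evidently has):

sqlite> SELECT pagetype, count(*) AS pages,
   ...>        sum(payload) AS payload, sum(unused) AS unused
   ...> FROM dbstat WHERE name='copied' GROUP BY pagetype;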





Re: [sqlite] Things you shouldn't assume when you store names

2019-11-10 Thread Kevin O'Gorman
And "full legal name"   How about my dad, whose full name was Dr. John
Michael Patrick Dennis Emmet O'Gorman, PhD.  How many rules does that
break?  I've fought many companies over that apostrophe in my life.
Governments tend to throw it away, but it's on my old passport and birth
certificate.

---
Dictionary.com's word of the year: *misinformation*
Merriam-Webster word of the year: *justice*


On Sun, Nov 10, 2019 at 4:01 AM Richard Damon 
wrote:

> On 11/10/19 1:21 AM, Gary R. Schmidt wrote:
> > On 10/11/2019 13:44, Doug wrote:
> >> Au Contraire, Jens! In many local contexts you can normalize people's
> >> names. I was born in Kansas, USA. My parents filled out a birth
> >> certificate for me. It had a place on the form for first name, middle
> >> name, last name, and a suffix like II or III.
> >>
> >> That birth certificate form determined that everyone born in Kansas
> >> (at that time), had a first, middle, and last name. There was no
> >> discussion of the matter. That's the way it was. The form led the
> >> way; people never thought about whether it was effective or not. Each
> >> newly-born child was given a first, middle, and last name.
> >>
> >> Effective was irrelevant for that system. There was no option, no
> >> alternative. It simply was.
> >>
> >> All systems are like that at each moment in time. They are what they
> >> are at any moment in time, and they force the users to behave the way
> >> the system wants them to behave. If you want to change the system and
> >> momentum is on your side, then immediately you have a new system - at
> >> that moment in time. It is composed of the old system and the momentum.
> >>
> >> Back to names: just like the birth certificate, a system which
> >> assigns a name to you, actually coerces you to have that name,
> >> because within that system, you exist as that name. The "names"
> >> article is totally wrong when it says that each assumption is wrong.
> >> Each of those assumptions is correct, and I can find at least one
> >> system which makes each one correct. Within each system, the
> >> assumption works, and is valid.
> >>
> >> My two cents...
> > Is not worth the paper it is written on!
> >
> > So what happens when someone from a family who only uses first- and
> > last-names moves to Kansas?
> >
> > Do they have to make up a middle-name so that the idiots can fill out
> > the forms?
> >
> > Well, in the case of the US Navy back in the late 1980's, when a
> > friend of mine from here in Australia, who only has a first and
> > last-name married a USN pilot and moved to the USA, she was told that,
> > "Yes, you have a middle name."  No amount of arguing, or producing of
> > official documents, (well, it's the USA, most people there don't know
> > what a passport is), could prevail.  In the end she conceded defeat
> > and became  Doe , for the duration.
> >
> > Names are impossible, unless you use a free-form, infinite-length
> > field, you won't be safe, and even then, someone will turn up whose
> > name is 'n' recurring to an infinite number of characters or something!
> >
> > Cheers,
> > GaryB-)
> Actually, 'The Artist whose name formerly was Prince' (which wasn't his
> name, his legal name was an unpronounceable pictograph), breaks every
> computer system I know.
>
> --
> Richard Damon
>


Re: [sqlite] Date time input

2019-10-08 Thread Kevin Benson
On Tue, Oct 8, 2019 at 2:40 PM James K. Lowden 
wrote:

> On Tue, 8 Oct 2019 09:06:24 -0700
> Jens Alfke  wrote:
>
> > I think the idea of a semi-official ?SQLite++? has been floated here
> > before
>
> OK, but it needs a better name. --
>

SQLiteXTD

   --
  --
 --Ö¿Ö--
K e V i N


[sqlite] sqlite3 doesn't build with -DSQLITE_ENABLE_DBPAGE_VTAB with bitbake

2019-10-04 Thread Kevin Dankwardt
I need sqlite3 to be built with -DSQLITE_ENABLE_DBPAGE_VTAB so that I can
use the ".recover" command. The ".dbinfo" command doesn't work either.

I add  -DSQLITE_ENABLE_DBPAGE_VTAB to the sqlite recipe CFLAGS in various
ways but I never get sqlite3 built with the option. Example error: the
".dbinfo" command requires the -DSQLITE_ENABLE_DBPAGE_VTAB compile-time
options

I have tried various ways to add -DSQLITE_ENABLE_DBPAGE_VTAB in the recipe
file sqlite3.inc, including setting CFLAGS_append, CFLAGS +=, and
EXTRA_CFLAGS+=

Interestingly if I change into the expanded source directory, and do
./configure
--host=x86_64and then make (I have an ARM toolchain set-up), I get a
version of sqlite3 that does have SQLITE_ENABLE_DBPAGE_VTAB.

I am doing bitbake sqlite, and even tried rm -rf build/tmp and have
everything rebuilt. The sqlite3 command just doesn't get
SQLITE_ENABLE_DBPAGE_VTAB.

EXTRA_CFLAGS += " -DSQLITE_ENABLE_DBPAGE_VTAB "

This is the error.

SQLite version 3.29.0 2019-07-10 17:32:03
Enter ".help" for usage hints.
sqlite> .dbinfo
the ".dbinfo" command requires the -DSQLITE_ENABLE_DBPAGE_VTAB compile-time options
sqlite> .quit

This is with the hand-made version

#/new/sqlite3 my.db
SQLite version 3.29.0 2019-07-10 17:32:03
Enter ".help" for usage hints.
sqlite> .dbinfo
database page size:  4096
write format:        1
read format:         1
reserved bytes:      0
file change counter: 97071
database page count: 1795
freelist page count: 0
schema cookie:       10
schema format:       4
default cache size:  0
autovacuum top root: 0
incremental vacuum:  0
text encoding:       1 (utf8)
user version:        0
application id:      0
software version:    3029000
number of tables:    10
number of indexes:   2
number of triggers:  0
number of views:     0
schema size:         977
data version         1
sqlite>
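
A quick way to confirm which options a given sqlite3 binary was actually
built with, from inside the shell itself (a sketch; the table-valued
pragma_compile_options form needs SQLite 3.16 or later, and the reported
names omit the SQLITE_ prefix):

sqlite> PRAGMA compile_options;
sqlite> SELECT compile_options FROM pragma_compile_options
   ...>  WHERE compile_options LIKE '%DBPAGE%';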


Re: [sqlite] [EXTERNAL] Re: Documentation update request

2019-08-16 Thread Kevin Benson
On Fri, Aug 16, 2019 at 3:15 AM Hick Gunter  wrote:

> Reminds me of "... two mice ran up the clock, the clock struck one, and
> the other escaped with minor injuries"
>

'Twas my impetus (rather than the wording that mistakenly sounded like I
was "besmirching" Dr. Hipp)


>
> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On behalf of Kevin Benson
> Sent: Thursday, 15 August 2019 20:40
> To: SQLite mailing list 
> Subject: [EXTERNAL] Re: [sqlite] Documentation update request
>
> On Thu, Aug 15, 2019 at 2:33 PM Jose Isaias Cabrera 
> wrote:
>
> > Richard Hipp, on Thursday, August 15, 2019 01:32 PM, wrote...
> > >
> > > On 8/15/19, Simon Slavin, on
> > > > On 15 Aug 2019, at 5:20pm, Richard Damon, on
> > > >
> > > >> You under quote, the faq says it “can not be changed (except
> > > >> under extra-ordinary conditions).”, and those extra-ordinary
> > > >> conditions are
> > a
> > > >> link to the second section you mention.
> > > >
> > > > I didn't notice that, for some reason.  Thanks for the correction.
> > >
> > > You didn't notice it because I only added it moments ago, in
> > > response to your documentation update request.
> >
> > Oh! Trickery!
> >
> >
> Dickery (sometimes a nickname for RICHARD ;-)

Doc (...umentation; the something Richard "Dick" Hipp was doing ;-)
>


Re: [sqlite] Documentation update request

2019-08-15 Thread Kevin Benson
On Thu, Aug 15, 2019 at 2:33 PM Jose Isaias Cabrera 
wrote:

> Richard Hipp, on Thursday, August 15, 2019 01:32 PM, wrote...
> >
> > On 8/15/19, Simon Slavin, on
> > > On 15 Aug 2019, at 5:20pm, Richard Damon, on
> > >
> > >> You under quote, the faq says it “can not be changed (except under
> > >> extra-ordinary conditions).”, and those extra-ordinary conditions are
> a
> > >> link to the second section you mention.
> > >
> > > I didn't notice that, for some reason.  Thanks for the correction.
> >
> > You didn't notice it because I only added it moments ago, in response
> > to your documentation update request.
>
> Oh! Trickery!
>
>
Dickery (sometimes a nickname for RICHARD ;-)
Doc (...umentation; the something Richard "Dick" Hipp was doing ;-)


Re: [sqlite] Correct use of sqlite3_vtab_nochange/sqlite3_value_nochange

2019-08-13 Thread Kevin Martin

> On 13 Aug 2019, at 14:08, Richard Hipp  wrote:
> 
> I think that is correct.
> 

Great, thanks.

> But it never occurred to me that somebody might do this on the PRIMARY
> KEY.  I don't see any reason why it wouldn't work, though.

I have a C++ interface built on top of the virtual table API, with multiple
modules built on top of that. The interface doesn't know which columns are
large/expensive, so the idea was just to call sqlite3_vtab_nochange for all
of them, which includes the primary key.


> On 13 Aug 2019, at 13:00, Hick Gunter  wrote:
> 
> Very strange and AFAICT not documented. I would not have thought that calling 
> sqlite3_value_nochange on argv[1] was even legal, given that it would 
> correspond to field number -1. Could you provide an "explain" (SQLite 
> bytecode program dump) of your statement?

See the JSON below for the explain, as the table is not accessible in the
command-line tool.

Thanks,
Kev

---

[
  {
"sql":"pragma table_info(modeloption_vt_writable);",
"cols":["cid", "name", "type", "notnull", "dflt_value", "pk"],
"time":0.016,
"results":[
  ["0", "option", "text", "1", "", "1"],
  ["1", "value", "text", "0", "", "0"]
]
  },
  {
"sql":"\n\nexplain update modeloption_vt_writable set value = 'v' where 
option='o';",
"cols":["addr", "opcode", "p1", "p2", "p3", "p4", "p5", "comment"],
"time":0,
"results":[
  ["0", "Init", "0", "25", "0", "", "00", "Start at 25"],
  ["1", "OpenEphemeral", "2", "4", "0", "", "00", "nColumn=4"],
  ["2", "VOpen", "1", "0", "0", "vtab:C3289DFC0", "00", ""],
  ["3", "Integer", "0", "10", "0", "", "00", "r[10]=0"],
  ["4", "Integer", "0", "11", "0", "", "00", "r[11]=0"],
  ["5", "VFilter", "1", "16", "10", "", "00", "iplan=r[10] zplan=''"],
  ["6", "VColumn", "1", "0", "12", "", "00", "r[12]=vcolumn(0); 
modeloption_vt_writable.option"],
  ["7", "Ne", "13", "15", "12", "(BINARY)", "52", "if r[12]!=r[13] goto 
15"],
  ["8", "VColumn", "1", "0", "6", "", "01", "r[6]=vcolumn(0)"],
  ["9", "String8", "0", "7", "0", "v", "00", "r[7]='v'"],
  ["10", "VColumn", "1", "0", "4", "", "00", "r[4]=vcolumn(0)"],
  ["11", "SCopy", "6", "5", "0", "", "00", "r[5]=r[6]"],
  ["12", "MakeRecord", "4", "4", "8", "", "00", "r[8]=mkrec(r[4..7])"],
  ["13", "NewRowid", "2", "9", "0", "", "00", "r[9]=rowid"],
  ["14", "Insert", "2", "8", "9", "", "00", "intkey=r[9] data=r[8]"],
  ["15", "VNext", "1", "6", "0", "", "00", ""],
  ["16", "Rewind", "2", "23", "0", "", "00", ""],
  ["17", "Column", "2", "0", "4", "", "00", "r[4]="],
  ["18", "Column", "2", "1", "5", "", "00", "r[5]="],
  ["19", "Column", "2", "2", "6", "", "00", "r[6]="],
  ["20", "Column", "2", "3", "7", "", "00", "r[7]="],
  ["21", "VUpdate", "0", "4", "4", "vtab:C3289DFC0", "02", "data=r[4..7]"],
  ["22", "Next", "2", "17", "0", "", "00", ""],
  ["23", "Close", "2", "0", "0", "", "00", ""],
  ["24", "Halt", "0", "0", "0", "", "00", ""],
  ["25", "Transaction", "0", "1", "1", "0", "01", "usesStmtJournal=1"],
  ["26", "VBegin", "0", "0", "0", "vtab:C3289DFC0", "00", ""],
  ["27", "String8", "0", "13", "0", "o", "00", "r[13]='o'"],
  ["28", "Goto", "0", "1", "0", "", "00", ""]
]
  }
]


Re: [sqlite] [EXTERNAL] Correct use of sqlite3_vtab_nochange/sqlite3_value_nochange

2019-08-13 Thread Kevin Martin

> On 12 Aug 2019, at 07:53, Hick Gunter  wrote:
> 
> You did not state your argc and argv[0] values,

Apologies, it is a 2 column table. Full details are:

- argc is 4
- argv[0] is the value of the primary key for the row I want to update.
- argv[1] is SQLITE_NULL, but as described, sqlite3_value_nochange(argv[1]) returns true
- argv[2] is the same as argv[1] (null, but sqlite3_value_nochange returns true)
- argv[3] is the new value of the non-primary key column. 

> so looking at the documentation would suggest that SQLite is actually asking 
> for an INSERT into a WITHOUT ROWID virtual table.

I see that is what the documentation leads you to believe, but I can assure you
I am executing an update of the form

update t set notprimarykey='some value' where primarykey='other value'

Removing the sqlite3_vtab_nochange from the xColumn call gets the documented 
behaviour. If I keep the sqlite3_vtab_nochange and change my code in xUpdate as 
described in the last email, everything seems to work. I just want to check it 
is correct.

Thanks,
Kev


[sqlite] Correct use of sqlite3_vtab_nochange/sqlite3_value_nochange

2019-08-10 Thread Kevin Martin
Hi,

I have a without rowid virtual table with an implementation of xColumn that 
begins with

if(sqlite3_vtab_nochange(ctx)) return SQLITE_OK;

If I try to perform an update on this table that doesn't involve a primary key 
change, then my understanding from the documentation is that xUpdate will be 
called and the value of argv[0] and argv[1] will be the same. What I am seeing
is that argv[1] is set to an SQL null value, although when I call
sqlite3_value_nochange(argv[1]) I do get true returned.

Am I therefore right in thinking that, when using sqlite3_vtab_nochange, the
correct test for an update without a primary key change is actually

sqlite3_value_nochange(argv[1]) || values_are_equal(argv[0], argv[1])

Thanks,
Kev


Re: [sqlite] SQLITE_LOCKED means "database table is locked"?

2019-06-19 Thread Kevin Benson
On Wed, Jun 19, 2019 at 12:12 PM Carsten Müncheberg <
carsten.muencheb...@native-instruments.de> wrote:

> Am 19.06.2019 um 16:47 schrieb Igor Tandetnik:
> > On 6/19/2019 10:39 AM, Carsten Müncheberg wrote:
> >> Is there really something like a table lock?
> >
> > Yes there is: https://sqlite.org/sharedcache.html
>
> Thanks, I wasn't aware of that.
>
>
There's an inadvertent word omission in the second paragraph under "Table
Level Locking":

IS:        "To read data a table, a connection must first obtain a read-lock."
SHOULD BE: "To read data from a table, a connection must first obtain a read-lock."


[sqlite] Optimising multiple group by clauses

2019-06-08 Thread Kevin Martin
Hi,

I am trying to find the best way to write a query that has two levels of group 
by where the outer group by columns are a subset of the inner group by columns. 
In my example below I want to do an aggregation grouping by per, prod, and mar, 
then I want aggregate the results of this aggregation, grouping by just prod, 
and per. From the results of explain query plan, I can see a B-Tree is not used 
if I only do the first group by - this is mentioned in the query optimisation 
page. However, a B-Tree is used for the second group by when both group bys are 
present, and I don't understand why, as I think the rows from the subquery come 
out in an order already suitable for the second group by?

My actual data is a bit more complex, but I am seeing a 10x-20x speed 
difference between the query with the single group by and the query with both. 
If the B-Tree is necessary, it would be good to be able to understand why, and 
if not, it would be great if there were some way to communicate this to the 
query planner.

The output below was produced on a freshly downloaded and compiled 3.28.

Thanks,
Kev

---

QUERY PLAN
`--SCAN TABLE data USING INDEX sqlite_autoindex_data_1
QUERY PLAN
|--CO-ROUTINE 1
|  `--SCAN TABLE data USING INDEX sqlite_autoindex_data_1
|--SCAN SUBQUERY 1
`--USE TEMP B-TREE FOR GROUP BY



create table data(
  prod integer not null,
  per integer not null,
  mar integer not null,
  off integer not null,
  val real not null,
  primary key(prod, per, mar, off)
);

explain query plan select
  prod,
  per,
  mar,
  sum(val) as val
from
  data
group by
  prod,
  per,
  mar
;

explain query plan select
  prod,
  per,
  min(val)
from
  (select
prod,
per,
mar,
sum(val) as val
  from
data
  group by
prod,
per,
mar)
group by
  prod,
  per
;


   


Re: [sqlite] picking random subset of rows

2019-03-18 Thread Kevin Martin

> On 18 Mar 2019, at 16:15, Dan Kennedy  wrote:
> 
> 
> In SQLite, a correlated sub-query on the RHS of an IN(...) operator may be 
> rerun every time the IN(...) test is required. And if that sub-query contains 
> "random()" it might return a different result every time.
> 
> Your words suggest that you are hoping it will be run once for each different 
> value of "da.area", with different results each time. But it will not.

Ah yes, this makes complete sense now, thanks. I was going off a stack overflow 
post as to how to emulate outer apply in sqlite3. I didn't really think through 
what I was writing.

In my case, I only need a different ordering each time I create the database, 
as opposed to each time the query is run. So, I can remove the random() from 
the subquery, and instead create a new table populated by a trigger mapping 
product to a random number and order by that in the subquery instead.
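
Something along these lines (a sketch against the test tables from my
earlier post; the product_rand table and trigger names are just
illustrative):

create table if not exists product_rand (product text primary key, r);

create trigger if not exists productattribs_rand
after insert on test_productattribs
begin
  insert or ignore into product_rand values (new.product, random());
end;

--the correlated subquery then orders by
--  (select r from product_rand where product = test_productattribs.product)
--instead of by random()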

I would be interested if there is a solution for sqlite 3.22 (i.e. no window 
functions) where it can be done so that the query gives a possibly different 
result each time it is executed.

Thanks,
Kevin


[sqlite] picking random subset of rows

2019-03-18 Thread Kevin Martin
Hi,

I am trying to use a correlated subquery with an 'order by random() limit 2' to 
pick up to two random rows for each value in the outer query. I am not sure if I 
am doing this correctly, but the number of rows I am getting seems to vary 
randomly which doesn't make sense to me. If i replace the order by random() 
with order by product I always get the expected number of rows. I have tried to 
create a simplified version of the code below to replicate the issue.

I am experiencing the problem on sqlite 3.22.0, but I have tried on 
sqliteonline.com which I think is using 3.27.2 and am seeing similar results.

Thanks,
Kevin

---

create table if not exists test_productattribs (product text primary key, attr, val);

insert or ignore into test_productattribs values
  ('1', 'area', 'a'),
  ('2', 'area', 'b'),
  ('3', 'area', 'a'),
  ('4', 'area', 'a')
;

--In the real query, this is done inside the with, but it does not seem relevant
--to the issue.
create table if not exists
  test_productarea
as select
  product,
  val as area
from
  test_productattribs
where
  attr='area'
;

--I have two areas, 'a' and 'b'. I limit to two random products from each area.
--As area 'b' only has one product, I always expect to get 3 rows: 1 for area 'b',
--and 2 for area 'a'.

with
  dareas as (select distinct
    area
  from

[sqlite] SEE

2019-03-13 Thread Williams, Kevin
Any articles out there giving some hint how we use the SEE version of SQLite we 
compiled with our C# program that references System.Data.SQLite?

Thanks in advance!

Kevin


Re: [sqlite] Regarding CoC

2018-10-24 Thread Kevin Youren
Richard,

thank you for your further explanation of your team's Code of Conduct.

After a bit of research on the Internet, everything makes sense.

Well done.

regs,

Kev





[sqlite] weekday time modifier

2018-09-23 Thread Kevin Martin
Hi,

Not sure if this is me misreading it, but the description of the weekday 
modifier in the documentation seems a bit ambiguous.

It says:

> The "weekday" modifier advances the date forward to the next date where the 
> weekday number is N. Sunday is 0, Monday is 1, and so forth.

It is not clear what happens when the date before the modifier is already the 
correct weekday. I interpreted this as meaning it would advance by a full week,
but it doesn't; it leaves the date untouched:

SQLite version 3.8.10.2 2015-05-20 18:17:19
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> select strftime('%Y-%m-%d', '2018-09-23', 'weekday 0');
2018-09-23
sqlite> 

I see the same behaviour in 3.24.
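
For anyone wanting the strict "advance a full week" reading, a small
workaround (a sketch): step one day forward first, so a date already on the
target weekday still moves to the next occurrence.

sqlite> select date('2018-09-23', '+1 day', 'weekday 0');
2018-09-30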

Thanks,
Kev


Re: [sqlite] SQLiteJava

2018-09-07 Thread Kevin Rushforth

[Bcc'ing the openjfx-dev list]

I see no indication that there is anything relating to JavaFX in your 
question, so there seems to be no need to include the openjfx-dev list.


-- Kevin


On 9/6/2018 3:21 PM, AmnoJeeuw wrote:

I have a database table named company that looks like this:

[attached image: snapshot2]


When the application is asked to change the value of ArbolOne to ArbolOn,

[attached image: db_snapshot1]


the application generates the following schema represented on this 
message box:

[attached image: db_snapshot3]


which produces this Exception error:
[attached image: db_snapshot4]


I am attaching a snip of the code, in case you'd like to know it.
Having said that, I'd like to know if this is a bug or if I have not
understood the information found here:
<https://www.tutorialspoint.com/jdbc/jdbc-update-records.htm>.


Any help would be much appreciated.





Re: [sqlite] Back on-line. Was: Mailing list shutting down...

2018-06-14 Thread Kevin O'Gorman
On Thu, Jun 14, 2018 at 8:58 AM, Richard Hipp  wrote:

> ...



>
> So there you have it:  If you want to harass someone by sending them
> thousands of subscription confirmations, there is now a website to
> assist you.  Do we need any further evidence that the heart of man is
> deceitful above all things, and desperately wicked?
>
>
There is no doubt that SOME folks have deceitful hearts and are wicked.
Please don't paint all members of the species with just one brush.

-- 
Dictionary.com's word of the year: *complicit*


Re: [sqlite] Usage of temporary files of SQLite3 on Android / SQLite commands fail

2018-06-10 Thread Kevin Benson
On Sun, Jun 10, 2018 at 12:45 PM skywind mailing lists <
mailingli...@skywind.eu> wrote:

> Hi,
>
> but to which directory should I set it? There is no general tmp directory
> accessible.
>
> Regards,
> Hartwig
>
> > Am 2018-06-10 um 02:30 schrieb Bob Friesenhahn <
> bfrie...@simple.dallas.tx.us>:
> >
> > On Sat, 9 Jun 2018, skywind mailing lists wrote:
> >
> >> Hi,
> >>
> >> currently I am not creating large subqueries or views and therefore
> storing the temporary data in memory is a solution but I would like to have
> a future proof solution. And I do not like to think about it anymore in the
> future.
> >
> > Have you tried setting the POSIX standard TMPDIR environment variable?
> This might have useful influence under Android.
> >
> > Bob
>
>
Android is foreign to me, but the thread below mentions: /data/local/tmp

https://stackoverflow.com/a/41105574/1828624
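
If an app-writable directory like that exists, something along these lines
might work (a sketch; temp_store_directory is deprecated but still honoured,
and unix builds also consult the SQLITE_TMPDIR environment variable before
TMPDIR):

PRAGMA temp_store = MEMORY;  -- sidestep the filesystem entirely
-- or:
PRAGMA temp_store_directory = '/data/local/tmp';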
--
   --
  --
 --Ö¿Ö--
K e V i N


Re: [sqlite] ROWID....

2018-06-09 Thread Kevin Benson
On Sat, Jun 9, 2018 at 7:03 AM Luuk  wrote:

>
> In the docs (https://www.sqlite.org/autoinc.html) it says:
> In SQLite, table rows normally have a 64-bit signed integer ROWID
>  
>
> Question:
> Why is this a signed integer, and not an unsigned integer?
>
> Simply by choice? Or is there something more to say about this?
>

http://sqlite.1065341.n5.nabble.com/Use-of-AUTOINCREMENT-td74775.html#a74786
--
   --
  --
 --Ö¿Ö--
K e V i N


Re: [sqlite] Multi threaded readers on memory sqlite cannot scale

2018-05-13 Thread Kevin O'Gorman
On Sun, May 13, 2018 at 9:01 AM, Dennis Clarke 
wrote:

> On 05/13/2018 11:57 AM, Kevin O'Gorman wrote:
>
>> The arguments here are simplified
>>
>
>
> Will you stop top posting please?
>
> I am trying to follow along here about some x86 boxen stuff but
> you are top posting madly. Also is that a single socket machine
> with a single big memory bank or is it NUMA and multiple sockets
> or is it just a single motherboard unit?
>
>
> Dennis
>
>
It is a single motherboard with two Xeon sockets, and 16 memory sockets.
I think those sockets are in two banks, which may be relevant.

Anyhow, those details weren't my point.  The point was that computing
hardware
is a galaxy of possibilities, some with more inherent parallelism than
others.  I'm
not personally having throughput troubles.

-- 
Dictionary.com's word of the year: *complicit*


Re: [sqlite] Multi threaded readers on memory sqlite cannot scale

2018-05-13 Thread Kevin O'Gorman
The arguments here are simplified, and assume some things that may or may
not be true.  The server I keep in my garage has  16 real cores, 32
threads.  More importantly, it uses DDR4 memory which I think means there
are 4 channels to memory which can be used in parallel -- perhaps not on
exactly the same address but the memory is spread among 16 DIMMs. (It's a
System76 "Silverback" with 256 GB RAM).  Lots of opportunities for
parallelism.  Moreover, depending on the actual work to be done, there may
be a considerable amount of "inherently parallelizable" work.  You don't
know until you try it -- or better yet, measure it.

Sure, there will be limits to how far this can go, but modern machines are
designed to take advantage of opportunities for parallelism.  You just have
to get rid of unnecessary locking.  So I repeat, why is a read-only
database being serialized at all?

On Sun, May 13, 2018 at 8:08 AM, Keith Medcalf  wrote:

>
> >2. Running the same example on sqlite *file *(multi threaded mode;
> >WAL journal) scales almost linearly;  so 6 threads provide nearly 6xN
> >throughput. Single threaded throughput is a bit slower (around 15-
> >20%) than single threaded in-memory access (expected).
>
> So, there is some "part" of the process that is "inherently
> parallelizable" and you have managed to have a one "thread" to some work
> during the time some "other thread" is consumed doing something you cannot
> see.  Congratulations.  This will increase by diminishing returns.
> Eventually adding more parallelism will make things slower.
>
> >So, memory sqlite is not really usable with multiple threads
> >(readers).  While one might expect that multiple readers of
> >*memory *content could scale even better than with file content.
>
> I would expect that a single connection to a single in memory database is
> 100% efficient and cannot be further optimized, and therefore I would not
> try.
>
> Why would I (or anyone of sound mind) want to insert "inefficiencies" so
> that one can then spend inordinate amounts of time to never quite eliminate
> them, and only go forever slower in the process?
>
>
>
>
>



-- 
Dictionary.com's word of the year: *complicit*


Re: [sqlite] Multi threaded readers on memory sqlite cannot scale

2018-05-13 Thread Kevin O'Gorman
It's not clear to me why reads must be serialized at all.  Maybe this could
be re-thought?  Maybe there should be a way to tell SQLite that a certain
DB or table is to be read-only and unserialized?
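
For the file-database case there is at least a partial knob already (a
sketch, assuming URI filenames are enabled in the build and that nothing can
possibly write the file while it is open; 'pos.db' is a placeholder name):

ATTACH 'file:pos.db?mode=ro&immutable=1' AS ro;
-- immutable=1 declares the file unchangeable, so SQLite skips locking
-- and change detection on it entirely.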

On Sun, May 13, 2018 at 7:15 AM, Keith Medcalf  wrote:

>
> Say Hi to Gene!
>
> https://en.wikipedia.org/wiki/Amdahl%27s_law
>
> So I believe what you are saying is something like this:  If I take a
> child and have it count as fast as it can then it can count to X in an
> hour.  However, if I take the same child but have it count as fast as it can
> at five minute stretches, the sum of the X's is less than it was at one
> go.  If I get the child to do this at random intervals consuming juice
> boxes in between, the sum of the X's is even lower, the higher the number
> of interruptions becomes.
>
> In the second case the task consists of counting to ten and then drinking
> a juice box.  If you get one child, then it takes time X.  Interestingly,
> if you get two children, the tasks (empty juice boxes) stack up twice as
> fast.  There is some overlap between the operations.  As you add more and
> more children it goes faster and faster, but not quite.  Eventually all the
> children are drinking the juice box at the same time and adding more
> children does not make things go faster.
>
>
> ---
> The fact that there's a Highway to Hell but only a Stairway to Heaven says
> a lot about anticipated traffic volume.
>
>
> >-Original Message-
> >From: sqlite-users [mailto:sqlite-users-
> >boun...@mailinglists.sqlite.org] On Behalf Of Techno Magos
> >Sent: Sunday, 13 May, 2018 04:51
> >To: sqlite-users@mailinglists.sqlite.org
> >Subject: [sqlite] Multi threaded readers on memory sqlite cannot
> >scale
> >
> >Hello
> >
> >I do not have clear examples to post  on this but would like to
> >report
> >findings around multi threaded read access (single process) in a
> >large
> >system that uses sqlite.
> >
> >This may be a known issue/restriction of memory sqlite behaviour, but
> >wanted to check with the list first:
> >
> >1. Running 2, 3, ... 6 multi threaded readers of a single *memory
> >*sqlite
> >database (via shared cache mode) on an 8 core cpu shows no throughput
> >gain
> >at all compared to single threaded throughput. In fact, it shows a
> >throughput drop: i.e. if a single thread can do N simple queries/sec,
> >2
> >threads .. up to 6 threads do a little less (10% drop) in total. This
> >suggests that access to memory sqlite can only be serialized?
> >
> >2. Running the same example on sqlite *file *(multi threaded mode;
> >WAL
> >journal) scales almost linearly;  so 6 threads provide nearly 6xN
> >throughput. Single threaded throughput is a bit slower (around 15-
> >20%)
> >than single threaded in-memory access (expected).
> >
> >So, memory sqlite is not really usable with multiple threads
> >(readers).
> >While one might expect  that multiple readers of *memory *content
> >could
> >scale even better than with file content.
> >
> >Can this restriction be lifted?
> >Is there some special mode possible to achieve scaling up throughput
> >with
> >multiple threads for memory sqlite content?
> >
> >
> >Thanks
>
>
>



-- 
Dictionary.com's word of the year: *complicit*


Re: [sqlite] Broken DB?

2018-04-27 Thread Kevin O'Gorman
Oops.  Wrong list.  Should go to a Django group.  I noticed as soon as I
sent this.  Please ignore.

On Fri, Apr 27, 2018 at 2:58 PM, Kevin O'Gorman 
wrote:

> I've got a working site, but I made a copy of the database in order to do
> some development work.
> I've hit a snag that looks like a problem in the data.
>
> Ive written a management command to show the problem:
>
> from django.core.management.base import BaseCommand, CommandError
>
> # Stuff for the library
> from oil.models import Packet, Signature, Log, Voter
>
> class Command(BaseCommand):
>     help = 'Shows a quick count of validations'
>     BaseCommand.requires_migrations_checks = True
>
>
>     def handle(self, *args, **options):
>         voters = Log.objects.all()
>         self.stdout.write(repr(voters[0]))
>
> I'm suspecting a problem has crept into my Log table, because it works
> fine if I change Log on the
> second line of handle() to any of the other tables.  If it runs as shown
> here however, I get
>
> kevin@camelot-x:/build/comprosloco$ manage oiltest
> Traceback (most recent call last):
>   File "./manage", line 22, in 
> execute_from_command_line(sys.argv)
>   File "/build/django/django/core/management/__init__.py", line 364, in
> execute_from_command_line
> utility.execute()
>   File "/build/django/django/core/management/__init__.py", line 356, in
> execute
> self.fetch_command(subcommand).run_from_argv(self.argv)
>   File "/build/django/django/core/management/base.py", line 283, in
> run_from_argv
> self.execute(*args, **cmd_options)
>   File "/build/django/django/core/management/base.py", line 330, in
> execute
> output = self.handle(*args, **options)
>   File "/raid3/build/comprosloco/oil/management/commands/oiltest.py",
> line 15, in handle
> self.stdout.write(repr(voters[0]))
>   File "/build/django/django/db/models/base.py", line 590, in __repr__
> u = six.text_type(self)
>   File "/raid3/build/comprosloco/oil/models.py", line 172, in __str__
> self.accepted
> TypeError: sequence item 0: expected str instance, datetime.datetime found
> kevin@camelot-x:/build/comprosloco$
>
> And I have no idea how to debug it further.  The schema of Log is
> sqlite> .schema oil_log
> CREATE TABLE "oil_log" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
> "packet" integer NOT NULL, "signature" integer NOT NULL, "action"
> varchar(20) NOT NULL, "criteria" varchar(150) NOT NULL, "candidates"
> varchar(100) NOT NULL, "accepted" varchar(10) NOT NULL, "user_id" integer
> NOT NULL REFERENCES "auth_user" ("id"), "timestamp" datetime NOT NULL);
> CREATE INDEX "oil_log_packet_ecd59bc4" ON "oil_log" ("packet");
> CREATE INDEX "oil_log_user_id_7f26e501" ON "oil_log" ("user_id");
> sqlite>
>
>
> Help???
>
>
> --
> Dictionary.com's word of the year: *complicit*
>



-- 
Dictionary.com's word of the year: *complicit*


[sqlite] Broken DB?

2018-04-27 Thread Kevin O'Gorman
I've got a working site, but I made a copy of the database in order to do
some development work.
I've hit a snag that looks like a problem in the data.

Ive written a management command to show the problem:

from django.core.management.base import BaseCommand, CommandError

# Stuff for the library
from oil.models import Packet, Signature, Log, Voter

class Command(BaseCommand):
    help = 'Shows a quick count of validations'
    BaseCommand.requires_migrations_checks = True


    def handle(self, *args, **options):
        voters = Log.objects.all()
        self.stdout.write(repr(voters[0]))

I'm suspecting a problem has crept into my Log table, because it works fine
if I change Log on the
second line of handle() to any of the other tables.  If it runs as shown
here however, I get

kevin@camelot-x:/build/comprosloco$ manage oiltest
Traceback (most recent call last):
  File "./manage", line 22, in 
execute_from_command_line(sys.argv)
  File "/build/django/django/core/management/__init__.py", line 364, in
execute_from_command_line
utility.execute()
  File "/build/django/django/core/management/__init__.py", line 356, in
execute
self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/build/django/django/core/management/base.py", line 283, in
run_from_argv
self.execute(*args, **cmd_options)
  File "/build/django/django/core/management/base.py", line 330, in execute
output = self.handle(*args, **options)
  File "/raid3/build/comprosloco/oil/management/commands/oiltest.py", line
15, in handle
self.stdout.write(repr(voters[0]))
  File "/build/django/django/db/models/base.py", line 590, in __repr__
u = six.text_type(self)
  File "/raid3/build/comprosloco/oil/models.py", line 172, in __str__
self.accepted
TypeError: sequence item 0: expected str instance, datetime.datetime found
kevin@camelot-x:/build/comprosloco$

And I have no idea how to debug it further.  The schema of Log is
sqlite> .schema oil_log
CREATE TABLE "oil_log" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"packet" integer NOT NULL, "signature" integer NOT NULL, "action"
varchar(20) NOT NULL, "criteria" varchar(150) NOT NULL, "candidates"
varchar(100) NOT NULL, "accepted" varchar(10) NOT NULL, "user_id" integer
NOT NULL REFERENCES "auth_user" ("id"), "timestamp" datetime NOT NULL);
CREATE INDEX "oil_log_packet_ecd59bc4" ON "oil_log" ("packet");
CREATE INDEX "oil_log_user_id_7f26e501" ON "oil_log" ("user_id");
sqlite>


Help???


-- 
Dictionary.com's word of the year: *complicit*


Re: [sqlite] Simple read-only program very slow

2017-12-04 Thread Kevin O'Gorman
On Sun, Dec 3, 2017 at 8:49 AM, Keith Medcalf  wrote:

>
> On Sunday, 3 December, 2017 08:24, Richard Rousselot <
> richard.rousse...@gmail.com> wrote:
>
> >Had similar issue a few years ago; we were using the SQLite3.exe.  We
> >recompiled the exe as 64 bit which allowed it to use more than 4 GB
> >of ram, loaded the machine with as much memory as we could.  Voila,
> >super fast processing.
>
> >Can the Python libraries be made 64 bit some how?
>
> Yes.  You must be using a 64-bit version of Python and the procedure to
> replace the sqlite3.dll / sqlite3.so it uses is the same as for the 32-bit
> version, or to compile and use a 64-bit version of the apsw extension is
> unchanged.
>
> Neither Windows nor Linux can thunk a dynamic load module such that the
> one used is a different model than the running process (it was proprietary
> IBM technology that no one else seems smart enough to duplicate), so you
> have to update Python to the 64-bit model as well.
>
> On Windows 10 16299.98 I have both a 32-bit (Python 2.7.14) and 64-bit
> (Python 3.6.4) installed and build 32-bit DLLs for the former and 64-bit
> for the latter from the same source (just selecting -m32 or -m64 as
> appropriate).  I use the MinGW64/GCC compiler because (a) it can compile in
> either model depending on the switch you use without requiring any code
> changes, (b) supports long long and long double in 32-bit; and, (c) does
> not require the use of the Microsoft C Runtime "moving target" libraries --
> it can compile to the subsystem runtime (MSVCRT) that has been stable
> since, oh, the first OS/2 New Technology (which later became Windows NT)
> way back when.  Oh, and MinGW/GCC does "true" position independent code and
> when you do a static link of a module to either and executable or dynamic
> load library, it is truly static with no strange external dependencies.
>
> Since SQLite3 is heavily I/O bound (or at least syscall/kernel call bound
> for mutexes, etc) in practically everything it does, the 64-bit version is
> much faster (about 20%) than the 32-bit version, when running on a 64-bit
> OS, since the OS does not have to thunk the call stack when
> accessing/returning from  the kernel.
>
> >On Thu, Nov 30, 2017 at 7:01 PM Keith Medcalf 
> >wrote:
> >
> >>
> >> Is there an index on pos where ppos is the left-most field (or the
> >only
> >> field) in the index?
> >> What is the column affinity of ppos?  Of the fiold you are passing
> >as a
> >> parameter?
> >> Is ppos unique?
> >>
> >> If you CREATE [UNIQUE] INDEX searchindex ON pos (ppos, pnum);
> >>
> >> then your query can be satisfied only using the searchindex
> >covering index.
> >>
> >> If there is not an index on ppos, then you will be wasting time
> >recreating
> >> the index for each query.
> >>
> >> You will probably need to increase the cache size beyond the paltry
> >> default in order for the entire btree structures to be cached in
> >RAM -- you
> >> probably want to make it as big as you can.
> >>
> >> ---
> >> The fact that there's a Highway to Hell but only a Stairway to
> >Heaven says
> >> a lot about anticipated traffic volume.
> >>
> >>
> >> >-Original Message-
> >> >From: sqlite-users [mailto:sqlite-users-
> >> >boun...@mailinglists.sqlite.org] On Behalf Of Kevin O'Gorman
> >> >Sent: Saturday, 25 November, 2017 20:14
> >> >To: sqlite-users
> >> >Subject: [sqlite] Simple read-only program very slow
> >> >
> >> >I'm pretty new at SQLite, so this may seem obvious to you.  Be
> >kind.
> >> >I'm using Python on Ubuntu Linux 16.04 LTS, and the sqlite that is
> >> >built
> >> >into Python.  The database
> >> >is using WAL.
> >> >
> >> >I've got a database of some 100 million records, and a file of
> >just
> >> >over
> >> >300 thousand that I want represented in it.  I wanted to check how
> >> >much
> >> >difference it was going to make, so I wrote a super
> >> >simple program to the read the file and count how many records are
> >> >already
> >> >there.  I got impatient waiting for it so I killed the process and
> >> >added an
> >> >output of one dot (".") per 1000 records.  It went very fast for
> >what
> >> >I
> >> >estimate was around 200 dots and hit a wall.  It mad

Re: [sqlite] Simple read-only program very slow

2017-11-30 Thread Kevin O'Gorman
On Sun, Nov 26, 2017 at 12:02 AM, Clemens Ladisch 
wrote:

> Kevin O'Gorman wrote:
> > I wrote a super simple program to the read the file and count how many
> > records are already there.  I got impatient waiting for it so I killed
> > the process and added an output of one dot (".") per 1000 records.  It
> > went very fast for what I estimate was around 200 dots and hit a wall.
> > It made progress, but very very slowly.  [...]
> > The field being used for the lookup has an index.
>
> I'd guess that most records are found, and that the file and the table
> happen to be somewhat sorted.  The search becomes slow when the amount
> of data that needs to be read exceeds the available memory.
>
> > Why does commit() make a difference?
>
> Hmmm ... interesting.
>
> > for row in conn.execute("""
> > SELECT pnum
> > FROM pos
> > WHERE ppos=?
> > """,(pos,)):
> > pnum = row[0]
> > break
> > if pnum is None:
> > missing += 1
> > else:
> > present += 1
>
> Even with the index on ppos, the DB still has to look up the table row
> to read the pnum value.
>
> You do not care about the actual pnum value, so you could replace it
> with a constant value ("SELECT 1 FROM ...").  Or just use EXISTS to
> show what you actually want to do:
>
> cursor = conn.execute("""
> SELECT EXISTS (
> SELECT *
> FROM pos
> WHERE ppos = ?)
> """, (pos,))
> exists = cursor.fetchone()[0]
> if exists:
> present += 1
> else:
> missing += 1


That's real interesting and I'll keep it in mind for the future, but note
that my actual code DOES care about the contents of pmain, so it has to
look in the database anyway.  I'm still left wondering why it ran so slow
and why a commit() helped a read-only program running alone on the machine.

I'm gonna try this on my Xeon with 256 GB of RAM to check out the idea it
was running out of space.  It is true that the database is 50 GB so of
course it's bigger than the RAM on the usual desktop.
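
Keith's covering-index suggestion earlier in the thread may be part of the
answer (a sketch; with both columns in one index, the lookup never has to
touch the table's main b-tree at all):

CREATE INDEX IF NOT EXISTS searchindex ON pos (ppos, pnum);
-- then check that the plan reports a covering index:
EXPLAIN QUERY PLAN SELECT pnum FROM pos WHERE ppos = ?;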


-- 
Dictionary.com's word of the year: *complicit*


Re: [sqlite] Simple read-only program very slow

2017-11-30 Thread Kevin O'Gorman
On Sun, Nov 26, 2017 at 1:39 AM, Simon Slavin  wrote:

>
>
> On 26 Nov 2017, at 3:13am, Kevin O'Gorman  wrote:
> >
> > I've got a database of some 100 million records, and a file of just over
> > 300 thousand that I want represented in it.  I wanted to check how much
> > difference it was going to make, so I wrote a super
> > simple program to the read the file and count how many records are
> already
> > there.
>
> You can use COUNT(*) to find out how many rows there are in a table.  SQL
> is optimized to handle this faster than reading individual row data.
>
> SELECT COUNT(*) FROM pos
>
> Simon.
>

I know, but that does not help much when I'm trying to match the database
against a file, as I am here.

-- 
Dictionary.com's word of the year: *complicit*


[sqlite] Simple read-only program very slow

2017-11-25 Thread Kevin O'Gorman
I'm pretty new at SQLite, so this may seem obvious to you.  Be kind.
I'm using Python on Ubuntu Linux 16.04 LTS, and the sqlite that is built
into Python.  The database
is using WAL.

I've got a database of some 100 million records, and a file of just over
300 thousand that I want represented in it.  I wanted to check how much
difference it was going to make, so I wrote a super
simple program to read the file and count how many records are already
there.  I got impatient waiting for it so I killed the process and added an
output of one dot (".") per 1000 records.  It went very fast for what I
estimate was around 200 dots and hit a wall.  It made progress, but very
very slowly.

So I killed it again and added a commit() call every time it output a dot.
It didn't hit a wall, just some rough road (that is, it slowed down at
about the same spot but not nearly so drastically).

The code makes no changes to the database at all.  Why does commit() make a
difference?  What else should I learn from this?

The field being used for the lookup has an index.

++ kevin

Code follows:
#!/usr/bin/env python3
"""Count the number of records that represent rows in the database 'pos'
table.
The database is not modified.

 Last Modified: Sat Nov 25 18:56:49 PST 2017
"""

import os.path  # https://docs.python.org/3.5/library/os.path.html
import sys  # https://docs.python.org/3.5/library/sys.html
import argparse # https://docs.python.org/3.5/library/argparse.html
import sqlite3  # https://docs.python.org/3.5/library/sqlite3.html
import re   # https://docs.python.org/3.5/library/re.html

# from /usr/local/lib/python3.5/dist-packages
import qcreate
from qerror import *
import myparser

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description="""A program to read
        positions and count how many are in the database""",)
    parser.add_argument("--dbname", default=None,
        help="name of the database to work on (overrides qubic.ini file)")
    parser.add_argument("file", nargs='?', type=argparse.FileType('r'),
        default=sys.stdin,
        help="file containing the qsearch results (default stdin)")
    args=parser.parse_args()
    infile = args.file

    if args.dbname is None:
        here=os.path.split(os.path.realpath('.'))[1]
        for confdir in ".", os.path.join(os.environ["HOME"],".config"), "/etc/xdg/":
            f = os.path.join(confdir, "qubic.ini")
            if os.path.exists(f):
                args.dbname = myparser.parse(f, here, "dbname")
                if args.dbname is not None:
                    break
        if args.dbname is None:
            print(" *** ERROR: no database name provided and none found in qubic.ini files")
            sys.exit(1)

    present = missing = lines = 0
    with sqlite3.connect(args.dbname) as conn:
        for line in infile:
            fields = line.split()
            pos = fields[0]
            if len(pos) != 64:  # Important test to catch grep without --no-filename
                raise InputError(" ERROR: input line has wrong-sized position: " + line)

            pnum = None
            for row in conn.execute("""
                SELECT pnum
                FROM pos
                WHERE ppos=?
                """,(pos,)):
                pnum = row[0]
                break
            if pnum is None:
                missing += 1
            else:
                present += 1
            lines += 1
            if lines % 1000 == 0:
                print(".",flush=True,end="")
                conn.commit()
    print("there were",present,"records on file and",missing,"were missing")
    print("out of a total of", lines, "records.")



-- 
word of the year: *kakistocracy*


Re: [sqlite] Seg fault with core dump. How to explore?

2017-10-01 Thread Kevin O'Gorman
On Sat, Sep 30, 2017 at 11:41 PM, Clemens Ladisch 
wrote:

> Kevin O'Gorman wrote:
> > my latest trial run ended with a segmentation fault
>
> Really a segmentation fault?  What is the error message?
>

What such things always say: "segmentation fault (core dumped)" and the
name of the program.

>
> > This particular program is merging two databases.  The result has reached
> > 25 GB, roughly 1/3 of what I expect of the final result (over 100M rows).
> > The filesystem is a RAID with 2+ TB free.
>
> Does the /var/tmp filesystem fill up?
>

No.  And /var/tmp is not used as I've redirected tmp onto my RAID

>
> > Here's my prime suspect: I'm using WAL, and the journal is 543 MB.
>
> In WAL mode, the log stores the new versions of all changed pages.
> In rollback journal mode, the journal stores the old version of all
> changed pages.  So when you're creating a new DB (where the old version
> is empty), journal rollback mode is likely to be more efficient.
>
> I'm not creating a new database.  I'm merging one into the other.


>
> Regards,
> Clemens

-- 
word of the year: *kakistocracy*


Re: [sqlite] Seg fault with core dump. How to explore?

2017-09-30 Thread Kevin O'Gorman
What I'm testing is my code.  I want to be sure the code is going to work.
A crash is a primary indication that it won't.  That's information, not
just an annoyance.

On Sat, Sep 30, 2017 at 5:37 PM, Joseph L. Casale  wrote:

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On
> Behalf Of Kevin O'Gorman
> Sent: Saturday, September 30, 2017 3:55 PM
> To: sqlite-users 
> Subject: [sqlite] Seg fault with core dump. How to explore?
>
> > Here's my prime suspect: I'm using WAL, and the journal is 543 MB.
>
> Do you really need any reliability at all for a test? Who cares if the
> power goes out or the program crashes? If this is a test, you will simply
> restart it and the data is irrelevant so why impede any potential
> performance for data integrity?
>
> Try setting the journal_mode off...



-- 
word of the year: *kakistocracy*


Re: [sqlite] Seg fault with core dump. How to explore?

2017-09-30 Thread Kevin O'Gorman
I'm using the standard shim, because I've been using it forever and first
heard of APSW just a few days ago.  I'm guessing it should be pretty easy to
switch because I'm not doing anything weird.  All my columns are INTEGER or
CHAR, there are not even any foreign keys, although one of the two main
tables does contain primary keys (integer autoincrement primary key) of the
other.

I'm a little leery of switching on account of one crash, as it may well be
an over-reaction.

On Sat, Sep 30, 2017 at 4:30 PM, Simon Slavin  wrote:

>
>
> On 30 Sep 2017, at 10:54pm, Kevin O'Gorman 
> wrote:
>
> > Here's my prime suspect: I'm using WAL, and the journal is 543 MB.  I
> > hadn't given it much thought, but could this be more than the software
> > really wants to deal with?
>
> Not SQLite.  Possibly something else you’re using.  I used to work daily
> with a 43 Gigabyte SQLite database.  And most of that space was used by one
> tall thin table.  SQLite has known limits and is not thoroughly tested near
> those limits (because nobody can afford to buy enough hardware to do it) ,
> but those limits are a lot more than half a Gigabyte.
>
> <https://sqlite.org/limits.html>
>
> A crash sometimes happens because the programmer continues to call sqlite_
> routines after one of them has already reported a problem.  Are you
> checking the values returned by all sqlite_() calls to see that it is
> SQLITE_OK?  You may have to learn how your Python shim works to know: it
> may interpret other results as "catch" triggers or some equivalent.
>
> Are you using the standard Python shim or APSW?  The standard Python shim
> does complicated magic to make SQLite behave the way Python wants it to
> behave.  This complication can make it difficult to track down faults.  You
> might instead want to try APSW:
>
> <https://rogerbinns.github.io/apsw/>
>
> This is an extremely thin shim that does almost nothing itself.  That
> makes it easy to track down all errors to a Python problem or a SQLite
> problem.  I’m not saying we can’t help with the standard Python import,
> just that it’s a little more complicated.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Seg fault with core dump. How to explore?

2017-09-30 Thread Kevin O'Gorman
I'm testing new code, and my latest trial run ended with a segmentation
fault after about 5 hours.
I'm running Python 3.5 and its standard sqlite3 module on Xubuntu 16.04.3
LTS.  The code is short -- about 300 lines.

This particular program is merging two databases.  The result has reached
25 GB, roughly 1/3 of what I expect of the final result (over 100M rows).
The filesystem is a RAID with 2+ TB free.  The machine is a Core i7 with 32
GB RAM and 0 swap has been used since the last reboot.  Nothing else much
is running on this machine except some idle terminal and browser windows.

Here's my prime suspect: I'm using WAL, and the journal is 543 MB.  I
hadn't given it much thought, but could this be more than the software
really wants to deal with?  I'm going to try doing occasional commits
(every 100K inserts/updates, perhaps; see the sketch after this list), but I'd like some help:
1. If I'm on the right track, tell me so I can stop worrying and proceed
with development.
2. If I'm on the wrong track, help me figure out how to debug the problem.
I can probably find out what particular part of the merge it had reached,
but it's going to take quite a while.  I'm pretty good with GDB  but I have
no idea how to explore a running Python program.
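
A minimal sketch of that commit pattern, with a hypothetical table and row
source standing in for the real merge:

import sqlite3

conn = sqlite3.connect("merged.db")  # hypothetical filename
conn.execute("CREATE TABLE IF NOT EXISTS t (k TEXT, v INT)")  # stand-in schema
rows = [("a", 1), ("b", 2)]  # stand-in for the real merge source
cur = conn.cursor()
for n, row in enumerate(rows, 1):
    cur.execute("INSERT INTO t VALUES (?, ?)", row)
    if n % 100000 == 0:  # commit every 100K inserts to bound the journal/WAL
        conn.commit()
conn.commit()  # flush the final partial batch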

The project is a hobby, so there's nothing proprietary, and I can post any
information that would help.

++ kevin

-- 
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Fwd: Problem on Windows 10 machines

2017-09-10 Thread Kevin Benson
I wonder if the answer at the following inquiry suggests that your
last posted results are unremarkable:

https://stackoverflow.com/questions/11872539/windbg-crash-dump-analysis
and whether the alternative approach suggested there may be more helpful than WinDbg?

--
   --
  --
 --Ö¿Ö--
K e V i N

On Sat, Sep 9, 2017 at 10:23 AM, Bart Smissaert 
wrote:

> I did debugging with WinDbg and got the following with !analyse:
>
> GetUrlPageData2 (WinHttp) failed: 12002.
> DUMP_CLASS: 2
> DUMP_QUALIFIER: 0
> FAULTING_IP:
> ntdll!LdrpDoDebuggerBreak+30
> 7ffb`0a4871b8 cc  int 3
> EXCEPTION_RECORD:  (.exr -1)
> ExceptionAddress: 7ffb0a4871b8
> (ntdll!LdrpDoDebuggerBreak+0x0030)
>ExceptionCode: 8003 (Break instruction exception)
>   ExceptionFlags: 
> NumberParameters: 1
>Parameter[0]: 
> FAULTING_THREAD:  1b24
> BUGCHECK_STR:  BREAKPOINT
> DEFAULT_BUCKET_ID:  BREAKPOINT
> PROCESS_NAME:  COM32on64.exe
> ERROR_CODE: (NTSTATUS) 0x8003 - {EXCEPTION}  Breakpoint  A breakpoint
> has been reached.
> EXCEPTION_CODE: (HRESULT) 0x8003 (2147483651) - One or more arguments
> are invalid
> EXCEPTION_CODE_STR:  8003
> EXCEPTION_PARAMETER1:  
> WATSON_BKT_PROCSTAMP:  56efa71c
> WATSON_BKT_PROCVER:  1.0.0.0
> PROCESS_VER_PRODUCT:  COM32on64
> WATSON_BKT_MODULE:  ntdll.dll
> WATSON_BKT_MODSTAMP:  a329d3a8
> WATSON_BKT_MODOFFSET:  d71b8
> WATSON_BKT_MODVER:  10.0.15063.447
> MODULE_VER_PRODUCT:  Microsoft® Windows® Operating System
> BUILD_VERSION_STRING:  10.0.15063.447 (WinBuild.160101.0800)
> MODLIST_WITH_TSCHKSUM_HASH:  38cbe79f85e7dfdb265ff9afb82039794b9b5c6a
> MODLIST_SHA1_HASH:  6299f03ade45bd5452864c8b494e355b11bc3b25
> NTGLOBALFLAG:  70
> PROCESS_BAM_CURRENT_THROTTLED: 0
> PROCESS_BAM_PREVIOUS_THROTTLED: 0
> APPLICATION_VERIFIER_FLAGS:  0
> PRODUCT_TYPE:  1
> SUITE_MASK:  784
> DUMP_TYPE:  fe
> ANALYSIS_SESSION_HOST:  DESKTOP-9UDFVUA
> ANALYSIS_SESSION_TIME:  09-09-2017 14:37:25.0292
> ANALYSIS_VERSION: 10.0.15063.468 amd64fre
> THREAD_ATTRIBUTES:
> OS_LOCALE:  ENG
> PROBLEM_CLASSES:
> ID: [0n300]
> Type:   [@APPLICATION_FAULT_STRING]
> Class:  Primary
> Scope:  DEFAULT_BUCKET_ID (Failure Bucket ID prefix)
> BUCKET_ID
> Name:   Omit
> Data:   Add
> String: [BREAKPOINT]
> PID:[Unspecified]
> TID:[Unspecified]
> Frame:  [0]
> PRIMARY_PROBLEM_CLASS:  BREAKPOINT
> LAST_CONTROL_TRANSFER:  from 7ffb0a442b28 to 7ffb0a4871b8
> STACK_TEXT:
> `0009f2b0 7ffb`0a442b28 : `0040 `0003
> ` `0021d000 : ntdll!LdrpDoDebuggerBreak+0x30
> `0009f2f0 7ffb`0a47a1fc : ` `
> ` `0001 : ntdll!LdrpInitializeProcess+0xfa4
> `0009f720 7ffb`0a429b1b : 7ffb`0a3b `
> ` `0021d000 : ntdll!_LdrpInitialize+0x506cc
> `0009f7a0 7ffb`0a429ace : `0009f820 `
> ` ` : ntdll!LdrpInitialize+0x3b
> `0009f7d0 ` : ` `
> ` ` : ntdll!LdrInitializeThunk+0xe
> THREAD_SHA1_HASH_MOD_FUNC:  f6d506c4546bb5d137f475a99d6cb238658f395a
> THREAD_SHA1_HASH_MOD_FUNC_OFFSET:  6cbeafe1708871adca27efbe26e88c
> d700c132ee
> THREAD_SHA1_HASH_MOD:  421247e39d7ac2afa4b65a67c08f22894942361d
> FOLLOWUP_IP:
> ntdll!LdrpDoDebuggerBreak+30
> 7ffb`0a4871b8 cc  int 3
> FAULT_INSTR_CODE:  4800ebcc
> SYMBOL_STACK_INDEX:  0
> SYMBOL_NAME:  ntdll!LdrpDoDebuggerBreak+30
> FOLLOWUP_NAME:  MachineOwner
> MODULE_NAME: ntdll
> IMAGE_NAME:  ntdll.dll
> DEBUG_FLR_IMAGE_TIMESTAMP:  0
> STACK_COMMAND:  dt ntdll!LdrpLastDllInitializer BaseDllName ; dt
> ntdll!LdrpFailureData ; ~0s ; kb
> BUCKET_ID:  BREAKPOINT_ntdll!LdrpDoDebuggerBreak+30
> FAILURE_EXCEPTION_CODE:  8003
> FAILURE_IMAGE_NAME:  ntdll.dll
> BUCKET_ID_IMAGE_STR:  ntdll.dll
> FAILURE_MODULE_NAME:  ntdll
> BUCKET_ID_MODULE_STR:  ntdll
> FAILURE_FUNCTION_NAME:  LdrpDoDebuggerBreak
> BUCKET_ID_FUNCTION_STR:  LdrpDoDebuggerBreak
> BUCKET_ID_OFFSET:  30
> BUCKET_ID_MODTIMEDATESTAMP:  0
> BUCKET_ID_MODCHECKSUM:  1dd2f4
> BUCKET_ID_MODVER_STR:  10.0.15063.447
> BUCKET_ID_PREFIX_STR:  BREAKPOINT_
> FAILURE_PROBLEM_CLASS:  BREAKPOINT
> FAILURE_SYMBOL_NAME:  ntdll.dll!LdrpDoDebuggerBreak
> FAILURE_BUCKET_ID:  BREAKPOINT_8003_ntdll.dll!LdrpDoDebuggerBreak
> WATSON_STAGEONE_URL:
> http://watson.microsoft.com/StageOne/COM32on64.exe/1.0.0.
> 0/56efa71c/ntdll.dll/10.0.15063.447/a329d3a8/8003/
> 000d71b8.htm?Retriage=1
> TARGET_TIME:  2017-09-09T13:37:25.000Z
> OSBUILD:  15063
> OSSERVICEPACK:  447
> SERVICEPACK_NUMBER: 0
> OS_REVISION: 0
> OSPLATFORM_TYPE:  x64
> OSNAME:  Windows 10
> OSEDITION:  Windows 10 WinNt SingleUserTS Personal
> USER_LCID:  0
> OSBUILD_TIMESTAMP:  unknown_date
> BUILDDATESTAM

Re: [sqlite] Linux top command and sqlite

2017-02-21 Thread Kevin O'Gorman
I'll try synchronous=off the next time I run it.  As it happens, this one
finished in 57 minutes, which is not bad, considering.

But I wonder: could I not tell if it's doing lots of commits by looking at
the size of the ?.db-journal file?  It gets pretty large.

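For what it's worth, polling the journal from outside the process is easy; a
minimal sketch, with a hypothetical database path:

import os
import time

DB = "game.db"             # hypothetical path; substitute the real file
JOURNAL = DB + "-journal"  # rollback journal; in WAL mode watch DB + "-wal"

# A journal that only grows suggests one long transaction; a size that keeps
# resetting toward zero suggests many small commits.  Ctrl-C to stop.
while True:
    size = os.path.getsize(JOURNAL) if os.path.exists(JOURNAL) else 0
    print("%s  journal: %d bytes" % (time.strftime("%H:%M:%S"), size))
    time.sleep(1)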

On Tue, Feb 21, 2017 at 10:38 AM, Richard Hipp  wrote:

> On 2/21/17, Kevin O'Gorman  wrote:
> > I'm not at all sure this is the right place to ask, but as it only comes
> up
> > when I'm running one of
> > my sqlite jobs, I thought I'd start here.  I'm running Python 3.5 scripts
> > in Linux 16.04.1 using the sqlite3 package.  Machine is Core i5, 32GB
> RAM.
> >
> > Some of my stuff takes a while to run, and I like to keep tabs on it.
> > Right now, I'm running one of those, and the Linux top command shows
> > extremely small CPU usage, and a status ("S" column) of "D" which the man
> > page defines as "uninterruptable sleep".  Huh?
>
> My guess:  It is busying doing an fsync() after a transaction commit.
> To find out, temporarily set "PRAGMA synchronous=off" in your script
> and see if that makes the pauses go away.
>
> A better long-term solution would be:
>
> (1) Group multiple changes into a single transaction using BEGIN...COMMIT.
> (2) Set PRAGMA journal_mode=WAL with PRAGMA synchronous=NORMAL.
>
>
> >
> > My code does not use an intentional sleep at all.  It's traversing parts
> of
> > a graph, and not interacting with anything else.  As far as I know,
> anyway,
> > and I wrote the whole thing.
> >
> > Now it's true that it's updating the consequences of changes to graph
> nodes
> > that took about 3 hours just to install, so taking a while is not a
> > surprise by itself.  I just wonder what that status means and how it
> could
> > arise.
> >
> > --
> > word of the year: *kakistocracy*
> > ___
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
>
>
> --
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Linux top command and sqlite

2017-02-21 Thread Kevin O'Gorman
I'm not at all sure this is the right place to ask, but as it only comes up
when I'm running one of
my sqlite jobs, I thought I'd start here.  I'm running Python 3.5 scripts
in Linux 16.04.1 using the sqlite3 package.  Machine is Core i5, 32GB RAM.

Some of my stuff takes a while to run, and I like to keep tabs on it.
Right now, I'm running one of those, and the Linux top command shows
extremely small CPU usage, and a status ("S" column) of "D" which the man
page defines as "uninterruptable sleep".  Huh?

My code does not use an intentional sleep at all.  It's traversing parts of
a graph, and not interacting with anything else.  As far as I know, anyway,
and I wrote the whole thing.

Now it's true that it's updating the consequences of changes to graph nodes
that took about 3 hours just to install, so taking a while is not a
surprise by itself.  I just wonder what that status means and how it could
arise.

-- 
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Problem compiling 3.17.0 in MSVS 12

2017-02-14 Thread Kevin Benson
On Tue, Feb 14, 2017 at 7:56 PM, Bart Smissaert 
wrote:

> I had to install WinZip though and at the first unzip I got directly to the
> files whereas now I had a .tar file first and had to unzip that.
>

Following on from David Empson's post... the WinZip docs show an
Advanced->File handling option named  that
likely facilitated your .TAR file attempt:
http://kb.winzip.com/help/HELP_CONFIG_MISC.htm
--
   --
  --
 --Ö¿Ö--
K e V i N
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SQLAR, slight mod for symlink recursion & a high volume test

2017-02-10 Thread Kevin
Hi, Richard,

just to let you know, I have continued to use sqlar rather than tar.gz
for my personal use.

I found a problem with recursive symlinks.

For example, I have a copy of sqliteodbc-0.9994.tar.gz which has a
symlink "source -> .".  Note: the Gnome Archive Manager shows a nice arrow
icon, but seems to otherwise ignore it.

So, I get directories with .../source/source/ forever 

I had a similar issue with some math software I use, called Sage. 
( http://www.sagemath.org/)


To fix it, I made a minimal change: stat() to lstat() within the add_file
function.

I didn't have the need to recover the symlink from the sqlar file, but
others may.


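The distinction is the same one Python draws between os.stat() and os.lstat().
A minimal illustration of the idea (the directory name is hypothetical), not
the actual C patch:

import os
import stat

top = "sage-7.4"  # hypothetical tree to archive
# os.walk() does not follow directory symlinks by default.
for dirpath, dirnames, filenames in os.walk(top):
    for name in filenames:
        path = os.path.join(dirpath, name)
        st = os.lstat(path)  # lstat() reports on the link itself, not its target
        if stat.S_ISLNK(st.st_mode):
            print("symlink, not followed:", path)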

I then did a volume test on my laptop, an Intel i7 about 3 years old.
The Sage software is sage-7.4: 114,000 files taking up 6.9G.

Sqlar took just short of 13 minutes to run against it, and the
resultant sqlar file is 2.5G. 

The sqlar -lv against the file took 50 seconds to produce the file list
which I piped to a file that was 16.9M. However, I normally just read
the "sqlar" table.

regs,

Kev









___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Retrieve INTEGER PRIMARY KEY

2017-02-07 Thread Kevin Benson
On Tue, Feb 7, 2017 at 4:11 PM, Clyde Eisenbeis  wrote:

> int iKeyID = (int)sqliteCmd.ExecuteScalar();
>

I believe the type of last_insert_rowid() is *always* INT64

--
   --
  --
 --Ö¿Ö--
K e V i N
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] "DISTINCT" makes a query take 37 times as long

2017-02-02 Thread Kevin O'Gorman
When I read this, it seemed like it made sense.  The thing is, it does not
match up with reality.

First, the analysis of what happens when I pipe the results to 'sort'
misses the fact that the sort process executes within the 31 minutes of
that version.  It would not make a dent in the time of the slow version.

But the big thing is that I took a look at EXPLAIN QUERY PLAN using this
script:
#!/usr/bin/env python3

"""Output positions that are reachable but unsolved at census 18
See page 76 of Qubic log

Last Modified: Thu Feb  2 07:46:03 PST 2017
"""

import sqlite3  # https://docs.python.org/3.5/library/sqlite3.html

with sqlite3.connect("917.db") as conn:
    print("BEFORE ANALYZE")
    for row in conn.execute("""
            EXPLAIN
            SELECT DISTINCT ppos
            FROM move JOIN pos ON mto = pnum
            WHERE pcensus = 18 and pmin < pmax
            """):
        print(row)
    print()
    print()
    conn.execute("ANALYZE")
    print("AFTER ANALYZE")
    for row in conn.execute("""
            EXPLAIN
            SELECT DISTINCT ppos
            FROM move JOIN pos ON mto = pnum
            WHERE pcensus = 18 and pmin < pmax
            """):
        print(row)

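One note on the script above: plain EXPLAIN dumps VDBE bytecode, which is why
the output below is so hard to read.  EXPLAIN QUERY PLAN is the human-readable
variant; a minimal sketch of the substitution, against the same 917.db:

import sqlite3

with sqlite3.connect("917.db") as conn:
    for row in conn.execute("""
            EXPLAIN QUERY PLAN
            SELECT DISTINCT ppos
            FROM move JOIN pos ON mto = pnum
            WHERE pcensus = 18 and pmin < pmax
            """):
        print(row)  # rows like (selectid, order, from, detail) in this vintage
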
and after waiting most of the day for the analyze to finish, I got two
identical query plans,
neither of which I could decipher:
BEFORE ANALYZE
(0, 'Init', 0, 25, 0, '', '00', None)
(1, 'Null', 1, 7, 0, '', '08', None)
(2, 'OpenRead', 1, 4, 0, '7', '00', None)
(3, 'OpenRead', 3, 6, 0, 'k(1,)', '00', None)
(4, 'OpenRead', 4, 11, 0, 'k(3,,,)', '02', None)
(5, 'Rewind', 3, 21, 1, '0', '00', None)
(6, 'Seek', 3, 0, 1, '', '00', None)
(7, 'Column', 1, 2, 1, '', '00', None)
(8, 'Ne', 2, 20, 1, '(BINARY)', '54', None)
(9, 'Column', 1, 5, 3, '-99', '00', None)
(10, 'Column', 1, 6, 4, '99', '00', None)
(11, 'Ge', 4, 20, 3, '(BINARY)', '53', None)
(12, 'IdxRowid', 3, 5, 0, '', '00', None)
(13, 'SeekGE', 4, 20, 5, '1', '00', None)
(14, 'IdxGT', 4, 20, 5, '1', '00', None)
(15, 'Column', 3, 0, 6, '', '00', None)
(16, 'Eq', 6, 19, 7, '(BINARY)', '80', None)
(17, 'Copy', 6, 7, 0, '', '00', None)
(18, 'ResultRow', 6, 1, 0, '', '00', None)
(19, 'Next', 4, 14, 0, '', '00', None)
(20, 'Next', 3, 6, 0, '', '01', None)
(21, 'Close', 1, 0, 0, '', '00', None)
(22, 'Close', 3, 0, 0, '', '00', None)
(23, 'Close', 4, 0, 0, '', '00', None)
(24, 'Halt', 0, 0, 0, '', '00', None)
(25, 'Transaction', 0, 0, 155, '0', '01', None)
(26, 'TableLock', 0, 4, 0, 'pos', '00', None)
(27, 'TableLock', 0, 7, 0, 'move', '00', None)
(28, 'Integer', 18, 2, 0, '', '00', None)
(29, 'Goto', 0, 1, 0, '', '00', None)


AFTER ANALYZE
(0, 'Init', 0, 25, 0, '', '00', None)
(1, 'Null', 1, 7, 0, '', '08', None)
(2, 'OpenRead', 1, 4, 0, '7', '00', None)
(3, 'OpenRead', 3, 6, 0, 'k(1,)', '00', None)
(4, 'OpenRead', 4, 11, 0, 'k(3,,,)', '02', None)
(5, 'Rewind', 3, 21, 1, '0', '00', None)
(6, 'Seek', 3, 0, 1, '', '00', None)
(7, 'Column', 1, 2, 1, '', '00', None)
(8, 'Ne', 2, 20, 1, '(BINARY)', '54', None)
(9, 'Column', 1, 5, 3, '-99', '00', None)
(10, 'Column', 1, 6, 4, '99', '00', None)
(11, 'Ge', 4, 20, 3, '(BINARY)', '53', None)
(12, 'IdxRowid', 3, 5, 0, '', '00', None)
(13, 'SeekGE', 4, 20, 5, '1', '00', None)
(14, 'IdxGT', 4, 20, 5, '1', '00', None)
(15, 'Column', 3, 0, 6, '', '00', None)
(16, 'Eq', 6, 19, 7, '(BINARY)', '80', None)
(17, 'Copy', 6, 7, 0, '', '00', None)
(1

Re: [sqlite] "DISTINCT" makes a query take 37 times as long

2017-02-01 Thread Kevin O'Gorman
On Wed, Feb 1, 2017 at 6:35 PM, Richard Hipp  wrote:

> On 2/1/17, Kevin O'Gorman  wrote:
> > I have a database of positions and moves in a strategic game, and I'm
> > searching for unsolved positions that have been connected to an immediate
> > ancestor.  I'm using Python 3.5.2, and the code looks like
>
> Please provide us with the following additional information:
>
> (1) In python, run the query: "SELECT sqlite_version(),
> sqlite_source_id();"
>
> (2) In a recent sqlite3 command-line shell (the latest release, not
> whatever 5-year-old release happens to be installed on your system)
> bring up your database and run the command:
>
>  .fullschema --indent
>
> And send in the output.
>
> (3) Download the bundle of command-line tools for your OS, then run
> the command "sqlite3_analyzer" on your database, and send in the
> output.
>
> Thanks.
>
>
>
I am unable to comply with items 2 and 3.  I can download the linux x86
versions, which I expected would run on my x86-64 system, but they don't.
Instead, even when I point right at them, they report "No such file or
directory".  I take this to mean that there is some file they do not find,
like a library, and they report the error code in their return status.

However, my "recent" software reports:
 SQLite version 3.11.0 2016-02-15 17:29:24
Enter ".help" for usage hints.
sqlite> .fullschema --indent
Usage: .fullschema
sqlite> .fullschema
CREATE TABLE base64 (
b64char CHAR NOT NULL PRIMARY KEY,
b64val  INTEGER);
CREATE TABLE pos (
pnum INTEGER PRIMARY KEY AUTOINCREMENT,
ppos CHAR(64) NOT NULL,
pcensus INTEGER NOT NULL,
pscore INTEGER,
pstate CHAR DEFAULT "N" NOT NULL,
pmin INTEGER DEFAULT -99 NOT NULL,
pmax INTEGER DEFAULT 99 NOT NULL,
pmain CHAR(64));
CREATE UNIQUE INDEX pipos ON pos (ppos);
CREATE TABLE move (
mfrom INTEGER NOT NULL,
mto   INTEGER NOT NULL,
mtype CHAR NOT NULL,
mcell INTEGER NOT NULL,
mvalue INTEGER,
ma INTEGER DEFAULT -99,
mb INTEGER DEFAULT 99,
PRIMARY KEY (mfrom, mto, mcell));
CREATE UNIQUE INDEX mrev ON move (mto, mfrom, mcell);
CREATE TABLE expanded (
census INTEGER NOT NULL,
number INTEGER NOT NULL,
pos CHAR(64),
PRIMARY KEY (census, number));
ANALYZE sqlite_master;
ANALYZE sqlite_master;
INSERT INTO sqlite_stat1 VALUES('move','mrev','48329866 2 2 1');
INSERT INTO sqlite_stat1 VALUES('move','sqlite_autoindex_move_1','48329866
38 2 1');
INSERT INTO sqlite_stat1 VALUES('pos','pipos','74409802 1');
INSERT INTO sqlite_stat1 VALUES('base64','sqlite_autoindex_base64_1','64
1');
ANALYZE sqlite_master;
sqlite>

The analyzer is not included in my distribution or its repositories, as
far as I can tell.  This is Xubuntu, which is a flavor of Ubuntu, which is
derived from Debian.

I'm not sure I want to build your entire software suite.  Perhaps you'd
care to download my database, which I freshly tar-ed and gzip-ed to
http://kosmanor.com/917/917.db.tgz
The database is 21 GB; the tar is 3.1 GB.

> >
> > #!/usr/bin/env python3
> > """Output positions that are reachable but unsolved at census 18 or
> greater
> > See page 76 of Qubic log
> >
> > Last Modified: Tue Jan 31 12:13:07 PST 2017
> > """
> >
> > import sqlite3  # https://docs.python.org/3.5/
> library/sqlite3.html
> >
> > with sqlite3.connect("917.db") as conn:
> > for row in conn.execute("""
> > SELECT DISTINCT ppos
> > FROM move JOIN pos ON mto = pnum
> > WHERE pcensus = 18 and pmin < pmax
> > """):
> > print(row[0])
> >
> > As written here, this query runs for 1193 minutes (just short of 20
> > hours).  If I remove the "DISTINCT" and instead pipe the result into the
> > sort program that comes with Linux "sort --unique" the query and sort
> takes
> > only 31 minutes.  The results are the same, and consist of 4.2 million
> rows.
> >
> > This seems extreme.
> >
> > --
> > word of the year: *kakistocracy*
> > ___
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
>
>
> --
> D. Richard Hipp
> d...@sqlite.org
>



-- 
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] "DISTINCT" makes a query take 37 times as long

2017-02-01 Thread Kevin O'Gorman
I have a database of positions and moves in a strategic game, and I'm
searching for unsolved positions that have been connected to an immediate
ancestor.  I'm using Python 3.5.2, and the code looks like

#!/usr/bin/env python3
"""Output positions that are reachable but unsolved at census 18 or greater
See page 76 of Qubic log

Last Modified: Tue Jan 31 12:13:07 PST 2017
"""

import sqlite3  # https://docs.python.org/3.5/library/sqlite3.html

with sqlite3.connect("917.db") as conn:
    for row in conn.execute("""
            SELECT DISTINCT ppos
            FROM move JOIN pos ON mto = pnum
            WHERE pcensus = 18 and pmin < pmax
            """):
        print(row[0])

As written here, this query runs for 1193 minutes (just short of 20
hours).  If I remove the "DISTINCT" and instead pipe the result into the
sort program that comes with Linux "sort --unique" the query and sort takes
only 31 minutes.  The results are the same, and consist of 4.2 million rows.

This seems extreme.

-- 
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] possible integrity problem

2017-01-15 Thread Kevin O'Gorman
On Sat, Jan 14, 2017 at 5:04 PM, Simon Slavin  wrote:

>
> On 15 Jan 2017, at 1:01am, Kevin O'Gorman  wrote:
>
> > Update: the integrity check said "ok" after about 1/2 hour.
> > the record count now takes about 4 seconds -- maybe I remembered wrong
> and
> > it always took this long, but I wasn't stopping it until it had hung for
> > several minutes.
>
> What you describe actually sounds more like a hardware problem.  You had a
> 'sticky' disk, affecting at least some of the sectors in which that
> database is stored, which has now sorted itself out.  But sometime in the
> future it may become sticky again.  If you have some sort of disk checking
> software you might like to try it.
>
> Given your 5 indexes, 30 minutes to check an 11GB file is completely
> reasonable.  Don’t worry about that.
>
> Good luck with it.
>

It turns out you're right about the time for a check.  However, I don't buy
the "sticky disk" idea.  I could copy the files just fine.  I could create
a new file of the same size just fine.  But sqlite3 was stuck.  How does
that happen?  I don't know, and my imagination is stuck.


-- 
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] possible integrity problem

2017-01-14 Thread Kevin O'Gorman
Update: the integrity check said "ok" after about 1/2 hour.
the record count now takes about 4 seconds -- maybe I remembered wrong and
it always took this long, but I wasn't stopping it until it had hung for
several minutes.
Color me baffled.

On Sat, Jan 14, 2017 at 4:49 PM, Kevin O'Gorman 
wrote:

> I've got a database that has acted strangely from time to time.  Or
> actually a series of them, since I erase and build from scratch sometimes,
> as I'm just starting this project.
>
> Anyway, the latest is that the DB is about 11 GB.  It's pretty simple,
> just 2 main tables and maybe 5 indexes, no foreign keys, triggers, or much
> of anything else.  Suddenly just about anything I do seems to hang.
>
> In particular SELECT COUNT(*) FROM pos, which used to take under a
> second.  And I haven't make any changes to the DB since then.  This is true
> even if I access a write-protected copy I made some time ago.
>
> That includes PRAGMA integrity_check, which I started about 20 minutes
> ago.  It's the first time I've tried it so I don't know how long it should
> take, but copying the whole database takes under 3 minutes.
>
> I can interrupt the process with control-C, but cannot make progress.
> About the only thing that seems to be working is in sqlite3 I can ask for
> the schema.
>
> All of this is sqlite on Xubuntu Linux in python 3.5.2, and with sqlite3
> command-line.  If I reboot and try again, things still hang.  There are no
> journals hanging around.
>
> My first question: how long should I expect PRAGMA integrity-check to take?
>
> --
> word of the year: *kakistocracy*
>



-- 
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] possible integrity problem

2017-01-14 Thread Kevin O'Gorman
I've got a database that has acted strangely from time to time.  Or
actually a series of them, since I erase and build from scratch sometimes,
as I'm just starting this project.

Anyway, the latest is that the DB is about 11 GB.  It's pretty simple, just
2 main tables and maybe 5 indexes, no foreign keys, triggers, or much of
anything else.  Suddenly just about anything I do seems to hang.

In particular SELECT COUNT(*) FROM pos, which used to take under a second.
And I haven't made any changes to the DB since then.  This is true even if
I access a write-protected copy I made some time ago.

That includes PRAGMA integrity_check, which I started about 20 minutes
ago.  It's the first time I've tried it so I don't know how long it should
take, but copying the whole database takes under 3 minutes.

I can interrupt the process with control-C, but cannot make progress.
About the only thing that seems to be working is in sqlite3 I can ask for
the schema.

All of this is sqlite on Xubuntu Linux in python 3.5.2, and with sqlite3
command-line.  If I reboot and try again, things still hang.  There are no
journals hanging around.

My first question: how long should I expect PRAGMA integrity_check to take?

-- 
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INSERT is corrupting a database

2017-01-13 Thread Kevin O'Gorman
On Fri, Jan 13, 2017 at 3:34 AM, Clemens Ladisch  wrote:

> Kevin O'Gorman wrote:
> > On Tue, Jan 10, 2017 at 11:29 PM, Clemens Ladisch 
> wrote:
> >> Kevin O'Gorman wrote:
> >>> If I go on to the second table, it appears to finish normally, but
> when I
> >>> try to look at the database with sqlite3, a command-line tool for
> >>> interacting with SQLite, it says the database is corrupt.
> >>
> >> What version?
> >
> > It's whatever is in Python 3.5.2.'s builtin sqlite package.
>
> The sqlite3 command-line shell does not ship with Python.
>
> >> It's possible that there is a bug in your code.  Which you have not
> shown.
> >
> > My opinion is that no user bug whatever should cause DB integrity
> problems without
> > raising an exception.
>
> <http://www.sqlite.org/howtocorrupt.html>
> But it's unlikely that you'd manage to do any of this in Python.
>
> Anyway, my own test program works.
>

That's twisted backwards.

My database builder is built with pure Python, using the SQLite package
that comes with it.  Then sqlite3 just refuses to open the result.  That's
just what's in the Xubuntu 16.04 repos, i.e. version 3.11.0-1ubuntu1; I
didn't report that because I don't suspect sqlite3 of being a cause.
Indeed, it was changes to the Python code that seem to have stopped
provoking the error -- nothing about sqlite3 has changed.  Anyway, I did
not save the problem code, so I can no longer pursue this.

-- 
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INSERT is corrupting a database

2017-01-12 Thread Kevin O'Gorman
On Tue, Jan 10, 2017 at 7:52 PM, Simon Slavin  wrote:

>
> On 11 Jan 2017, at 3:28am, Kevin O'Gorman  wrote:
>
> > I have a modest amount of data that I'm loading into an SQLite database
> for
> > the first time.  For the moment it contains just two tables and a few
> > indices, nothing else.  The first table loads okay, and if I stop the
> > process at that point, all is well and I can look at the database.
> >
> > If I go on to the second table, it appears to finish normally, but when I
> > try to look at the database with sqlite3, a command-line tool for
> > interacting with SQLite, it says the database is corrupt.
>
> Make absolutely sure you’re starting with a new database file each time,
> not continuing to write to an already-corrupt file.
>
> I'm sure.  The program tests for the existence of the main table before
starting, and throws an exception if it's there, then creates that table as
its first action.


> At stages during your Python program, including after you’ve finished
> loading the first table, use the following command to check to see whether
> the database is correct:
>
It's no longer possible.  In fixing other things, the program has changed,
and it no longer corrupts the database.  Thanks for this next thing,
though.



> PRAGMA integrity_check
>

Thanks for that.  I was not aware of this tool.  I'll keep it handy.

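Running it from Python is a one-liner as well; a minimal sketch, with a
hypothetical filename:

import sqlite3

conn = sqlite3.connect("some.db")  # hypothetical filename
# Returns a single row ('ok',) when the file checks out, otherwise rows
# describing the problems found.
print(conn.execute("PRAGMA integrity_check").fetchall())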

> Use the same command in the command-line tool.
>
> Simon.
>

-- 
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INSERT is corrupting a database

2017-01-12 Thread Kevin O'Gorman
On Tue, Jan 10, 2017 at 11:29 PM, Clemens Ladisch 
wrote:

> Kevin O'Gorman wrote:
> > If I go on to the second table, it appears to finish normally, but when I
> > try to look at the database with sqlite3, a command-line tool for
> > interacting with SQLite, it says the database is corrupt.
>
> What version?
>

It's whatever is in Python 3.5.2.'s builtin sqlite package.


> > If however, I split the program into two programs, one for each table,
> and
> > run them one after another, all is well.  Same code, each with parts of
> it
> > if-else-ed out.
>
> It's possible that there is a bug in your code.  Which you have not shown.
>
>
My opinion is that no user bug whatsoever should cause DB integrity
problems without raising an exception.




-- 
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] INSERT is corrupting a database

2017-01-10 Thread Kevin O'Gorman
This is a problem I don't quite know how to report in a way that will be
useful.

I'm using Python 3.5 and its builtin sqlite package.

I have a modest amount of data that I'm loading into an SQLite database for
the first time.  For the moment it contains just two tables and a few
indices, nothing else.  The first table loads okay, and if I stop the
process at that point, all is well and I can look at the database.

If I go on to the second table, it appears to finish normally, but when I
try to look at the database with sqlite3, a command-line tool for
interacting with SQLite, it says the database is corrupt.

If however, I split the program into two programs, one for each table, and
run them one after another, all is well.  Same code, each with parts of it
if-else-ed out.

I don't know what to blame, or what to try.

-- 
word of the year: *kakistocracy*
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] new user seeking help

2017-01-05 Thread Kevin Benson
 On Thu, Jan 5, 2017 at 1:26 PM, Chen, Hui  wrote:

> Hi there,
>
> I just recently started integrating System.Data.SQLite.dll into my
> project, I started with a very simple winform application.
>
> I have a very simple method to create a connection, txtSqliteFile is
> textbox for db file location.
> SQLiteConnection GetConnection()
> {
> string connectionString = "Data Source=" + txtSqliteFile.Text +
> ";Version=3;datetimeformat=CurrentCulture";
> SQLiteConnection conn = new SQLiteConnection(connectionString);
> return conn;
> }
>
> Then in a button handler I have these to open a connection.
>
> SQLiteConnection conn = GetConnection();
> conn.Open();
>
> whenever Open method is called, Following exception thrown
> Attempted to read or write protected memory. This is often an indication
> that other memory is corrupt.
>
> If I remove [STAThread] attribute from Main, it opens connections without
> problem, but my OpenFileDialgue blows up, I have to have [STAThread] on
> Main method.
> SQLite documentation says that I can change the threading model, but
> doesn't say how this can be achieved in C#. all examples are in C/C++
>
> Can anyone give me a hint?
>

https://richnewman.wordpress.com/2007/04/08/top-level-exception-handling-in-windows-forms-applications-part-1/

--
   --
  --
 --Ô¿Ô--
K e V i N


>
> Thanks
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Out of date online documents

2016-12-31 Thread Kevin Benson
 On Sat, Dec 31, 2016 at 4:04 PM, Simon Slavin  wrote:

>
> On 31 Dec 2016, at 8:59pm, Kevin Benson  wrote:
> >
> > The Home page Search *DOES* result in linking to
> > "other dot commands" but *NOT*
> > "special commands to sqlite3 dot commands" nor
> > "rules for dot commands".
>
> Because there are no such commands.
>
> The dot commands are part of the SQLite command-line tool and are
> documented in the web page for that tool.  The tool itself implements
> them.  They are not part of SQLite itself.
>

Not sure what you mean; the site has a page for the command-line tool, and
the OP was looking (at the site) for a reference to "dot command"
info.  When anyone types 'dot' or 'dot command' (as search terms) into the
Search facility on the Home page of sqlite.org, a top result page link
points to https://sqlite.org/cli.html#other_dot_commands (go ahead... do
it ;-).  Now that search result is related to the HREFs in the HTML that I
listed (from cli.html)... BUT the other HREFs (for the same keywords) DO
NOT come up in the search results, and that may be because the HREFs
with trailing underscores are not being indexed.

--
   --
  --
 --Ô¿Ô--
K e V i N


>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Out of date online documents

2016-12-31 Thread Kevin Benson
On Sat, Dec 31, 2016 at 3:20 PM, Paul Lambert  wrote:

> Web search.  In fact I cannot find any link on the sqlite.org page that
> links to the (dot) functions, alias special functions.
>
>
The Home page Search *DOES* result in linking to
"other dot commands" but *NOT*
"special commands to sqlite3 dot commands" nor
"rules for dot commands".

I wonder if the trailing underscores in the underlying HTML are somehow
deleterious?

1. Getting Started
2. Double-click Startup On Windows
--->3. Special commands to sqlite3 (dot-commands)
--->4. Rules for "dot-commands"
5. Changing Output Formats
6. Writing results to a file
6.1. File I/O Functions
7. Querying the database schema
8. CSV Import
9. CSV Export
10. Converting An Entire Database To An ASCII Text File
11. Loading Extensions
--->12. Other Dot Commands
13. Using sqlite3 in a shell script
14. Ending shell commands
15. Compiling the sqlite3 program from sources

--
   --
  --
 --Ô¿Ô--
K e V i N
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Weird chars inserted

2016-12-20 Thread Kevin Youren

Ariel, Keith, Rowan, 

apologies, I re-ran the queries a few times, and I decided to include
"rowid" to keep track of the changes.



The experiments were conducted by cut-and-paste of the í character from
the email, hence UTF8, and using x'...' for inserts and concats.

Note, I use sqlite3 shell by preference, but I use both the Firefox
addon and "DB Browser for Sqlite" for GUI convenience - however, for
inserts and updates I use the sqlite3 shell or the C programming
interface. 

In C, I use int rather than char:

FILE *pinfile = NULL;
...
pinfile = fopen(argv[1], "rb");
...
int ch = fgetc(pinfile);  /* changed from char to int to allow >127 & UTF */


Also, I use .mode csv and then a spreadsheet quite a lot.

Note, at the end, I added typeof( ) - and most were BLOBs and a couple
as TEXT.




kevin@kevin-Aspire-V5-571G:~$ sqlite3 dir_md5sum_db.sqlite
SQLite version 3.15.2 2016-11-28 19:13:37
Enter ".help" for usage hints.
sqlite> SELECT * FROM dir_md5sum
   ...> where rowid >= 194576;
kev|
kev2|
kev3|
kev4|
sqlite> insert into dir_md5sum values ( 'kev5', x'C3AD');
sqlite> SELECT * FROM dir_md5sum where rowid >= 194576;
kev|
kev2|
kev3|
kev4|
kev5|í
sqlite> .schema
CREATE TABLE dir_md5sum (dir_name text, dir_md5sum text);
sqlite> SELECT rowid, dir_name, hex(dir_name), dir_md5sum,
hex(dir_md5sum), unicode(dir_md5sum)  FROM dir_md5sum where rowid >=
194576;
194576|kev|6B6576|�|EE|65533
194577|kev2|6B657632|�|EE|65533
194578|kev3|6B657633|�|EE|65533
194579|kev4|6B657634|�|ED|65533
194580|kev5|6B657635|í|C3AD|237
sqlite> insert into dir_md5sum values ( 'kev6', 'a' || x'C3AD' || 'b'
);
sqlite> SELECT rowid, dir_name, hex(dir_name), dir_md5sum,
hex(dir_md5sum), unicode(dir_md5sum)  FROM dir_md5sum where rowid >=
194576;
194576|kev|6B6576|�|EE|65533
194577|kev2|6B657632|�|EE|65533
194578|kev3|6B657633|�|EE|65533
194579|kev4|6B657634|�|ED|65533
194580|kev5|6B657635|í|C3AD|237
194581|kev6|6B657636|aíb|61C3AD62|97
sqlite> insert into dir_md5sum values ( 'kev7', 'c' || x'00ED' || 'd'
);
sqlite> SELECT rowid, dir_name, hex(dir_name), dir_md5sum,
hex(dir_md5sum), unicode(dir_md5sum)  FROM dir_md5sum where rowid >=
194576;
194576|kev|6B6576|�|EE|65533
194577|kev2|6B657632|�|EE|65533
194578|kev3|6B657633|�|EE|65533
194579|kev4|6B657634|�|ED|65533
194580|kev5|6B657635|í|C3AD|237
194581|kev6|6B657636|aíb|61C3AD62|97
194582|kev7|6B657637|c|6300ED64|99
sqlite> insert into dir_md5sum values ( 'kev8',  x'00ED'  );
sqlite> SELECT rowid, dir_name, hex(dir_name), dir_md5sum,
hex(dir_md5sum), unicode(dir_md5sum)  FROM dir_md5sum where rowid >=
194576;
194576|kev|6B6576|�|EE|65533
194577|kev2|6B657632|�|EE|65533
194578|kev3|6B657633|�|EE|65533
194579|kev4|6B657634|�|ED|65533
194580|kev5|6B657635|í|C3AD|237
194581|kev6|6B657636|aíb|61C3AD62|97
194582|kev7|6B657637|c|6300ED64|99
194583|kev8|6B657638||00ED|
sqlite> .mode csv
sqlite> .once /home/kevin/Martin.csv
sqlite> SELECT rowid, dir_name, hex(dir_name), dir_md5sum,
hex(dir_md5sum), unicode(dir_md5sum)  FROM dir_md5sum where rowid >=
194576;
sqlite> SELECT rowid, dir_name, hex(dir_name), dir_md5sum,
hex(dir_md5sum), unicode(dir_md5sum), typeof(dir_md5sum)  FROM
dir_md5sum where rowid >= 194576;
194576,kev,6B6576,"�",EE,65533,blob
194577,kev2,6B657632,"�",EE,65533,blob
194578,kev3,6B657633,"�",EE,65533,blob
194579,kev4,6B657634,"�",ED,65533,blob
194580,kev5,6B657635,"í",C3AD,237,blob
194581,kev6,6B657636,"aíb",61C3AD62,97,text
194582,kev7,6B657637,c,6300ED64,99,text
194583,kev8,6B657638,"",00ED,,blob
sqlite> 

regs,

Kev


Date: Mon, 19 Dec 2016 11:12:59 +0800
From: Rowan Worth 
To: SQLite mailing list 
Subject: Re: [sqlite] Weird chars inserted
Message-ID:

Content-Type: text/plain; charset=UTF-8

On 19 December 2016 at 08:24, Kevin  wrote:

> Hi Martin,
>
> I had a go using a terminal session, with default encoding UTF-8.
>
> Try using the hex( ) and unicode( ) functions to check what is
actually
> stored in the sqlite table.
>
> I put a couple of rows at the end of an existing simple table
>
> kevin@kevin-Aspire-V5-571G:~$ sqlite3
/home/kevin/dir_md5sum_db.sqlite
> SQLite version 3.15.2 2016-11-28 19:13:37
> Enter ".help" for usage hints.
> sqlite> SELECT dir_name, hex(dir_name), dir_md5sum, hex(dir_md5sum),
> unicode(dir_md5sum)  FROM dir_md5sum
>...> where rowid >= 194576;
> 194576|kev|6B6576|í|C3AD|237
> 194577|kev2|6B657632|�|ED|65533
> sqlite> .quit
> kevin@kevin-Aspire-V5-571G:~$
>

Hi Kevin,

The problem here lies in whatever inserted these rows. sqlite just
stores
what it is given - it is up to the application to take care of encoding
issues.

In th

[sqlite] Weird chars inserted

2016-12-18 Thread Kevin
Hi Martin,

I had a go using a terminal session, with default encoding UTF-8.

Try using the hex( ) and unicode( ) functions to check what is actually
stored in the sqlite table.

I put a couple of rows at the end of an existing simple table

kevin@kevin-Aspire-V5-571G:~$ sqlite3 /home/kevin/dir_md5sum_db.sqlite
SQLite version 3.15.2 2016-11-28 19:13:37
Enter ".help" for usage hints.
sqlite> SELECT dir_name, hex(dir_name), dir_md5sum, hex(dir_md5sum),
unicode(dir_md5sum)  FROM dir_md5sum 
   ...> where rowid >= 194576;
194576|kev|6B6576|í|C3AD|237
194577|kev2|6B657632|�|ED|65533
sqlite> .quit
kevin@kevin-Aspire-V5-571G:~$ 



I tried changing the terminal session to UTF-16, but I ended up with a
mess of Chinese characters. 






The � is a place marker inserted by the program displaying on your
screen, standing in for a character it doesn't recognize.

Check http://www.fileformat.info/info/unicode/char/00ed/index.htm

as a reference.
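
The same check is easy to script; a minimal sketch against an in-memory
database, showing that a correctly encoded í is stored as C3AD:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (s TEXT)")
conn.execute("INSERT INTO t VALUES (?)", ("Martín",))  # a Python str binds as UTF-8 text
print(conn.execute("SELECT s, hex(s), unicode(s) FROM t").fetchone())
# expected: ('Martín', '4D617274C3AD6E', 77) -- C3AD is the UTF-8 encoding of í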

I use Linux (Xubuntu)


regs,

Kev


From: "Ariel M. Martin" 
To: "   sqlite-users@mailinglists.sqlite.org"

Subject: [sqlite] Weird chars inserted
Message-ID: <835952.64146...@smtp113.mail.ne1.yahoo.com>
Content-Type: text/plain; charset="utf-8"

Hi people. I need some help; I’m lost here.
I’m writing an application using SQLite, and whenever I insert Spanish
characters I get this kind of string:
Mart�n (where it should read ‘Martín’)

Ok, so if I open my database with SQLiteManager (the Firefox addin) my
strings look
weird like that. But SQLiteManager is able to insert and display any
string correctly.
And any string inserted by SQLiteManager displays all right in my app
as well.

So I must be doing something wrong when I insert the strings. This is
my code:

Note: szSQL is the wchar-string my app uses

char szAux[2048];
ZeroMemory(szAux, 2048);
WideCharToMultiByte(CP_ACP, WC_COMPOSITECHECK, szSQL,
wcslen(szSQL), szAux, 2048, NULL, 0);

int nRet= sqlite3_exec(m_hDB, szAux, NULL, 0, &pErrMsg);


SQLiteManager tells me that my db is utf-16le. I’m using VS2015 on a
Windows 10 64-bit machine.

I’d appreciate any help.
Thanks in advance!


Ariel M. Martin

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I keep getting seg faults building my database using python sqlite3

2016-12-09 Thread Kevin O'Gorman
It wasn't memory.  Or at least nothing obvious.  I ran Memtest86+ 5.01 in
SMP mode for 3 full passes (about 10 hours) with no errors.

On reflection, it didn't seem likely to be a kernel freeze anyway.  It
wasn't just that the keyboard light was unresponsive: all the lights were
off, including the mouse laser.  The sort of
general "on" light in the tower was on, and the power supply fan was going,
but that's all.  I now wish I had looked at the motherboard readouts.
Anyway, that suggested a power distribution or power supply problem to me.
I just don't know how to bifurcate between the power supply and something
on the mobo USB interface.  And it hasn't happened again.

On Wed, Dec 7, 2016 at 5:18 PM, Kevin O'Gorman 
wrote:

> Good feedback.  I haven't done a memory check on that machine in a
> while
>
> Next on my list.
>
> On Sun, Dec 4, 2016 at 11:25 AM, Keith Medcalf 
> wrote:
>
>>
>> If pressing the CAPS LOCK or NUM LOCK keys on the keyboard does not
>> toggle the light on the keyboard then you have lost the all interrupt
>> processing since those keypresses have to be processed by the kernel mode
>> keyboard driver toggling the internal state of the keyboard driver, and
>> then the kernel driver sends output to the keyboard to change the status
>> LEDs.  Typically (all Operating Systems) this means you have suffered a
>> complete kernel crash (or halt) and the system is not running.
>>
>> Since the system must be running in order to output indicator status, all
>> indicators will stay "stuck" in their last known position (hold output).
>>
>> Only a power-cycle (or hardware reset -- assuming the RESET is a hardware
>> reset and not just a request to reset which will be ignored) triggering a
>> reboot will restart the system.
>>
>> The most frequent cause is a Parity Check.  Or in these latter days of
>> not having ECC or even Parity checked memory, just an undetected memory
>> fault which will cause random AHTBL.
>>
>> > -Original Message-
>> > From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org
>> ]
>> > On Behalf Of Kevin O'Gorman
>> > Sent: Sunday, 4 December, 2016 09:21
>> > To: SQLite mailing list
>> > Subject: Re: [sqlite] I keep getting seg faults building my database
>> using
>> > python sqlite3
>> >
>> > Well, the i7 system failed again, but this time it was quite different.
>> > And peculiar.
>> > The system wasn't doing anything, but it should have been.  So I tried
>> > something. It didn't matter what, because I could not get the mouse or
>> > keyboard to work -- it was like they weren't plugged in.  Really like
>> it,
>> > because the caps lock light wasn't on, nor was the laser light visible
>> in
>> > the mouse.  Even when I changed mouse, keyboard and USB slot.  I
>> couldn't
>> > get in with SSH from elsewhere either.  But the computer's "I'm running"
>> > light was on.
>> > So I'm suspecting a partial power failure.  I don't know enough about
>> > mobos
>> > and USB to diagnose whether the problem was on the mobo or the power
>> > supply.
>> >
>> > Creepty.  I had to do a hard reset  to get thing going again, and it's
>> > been
>> > running fine for a day now.
>> >
>> > On Mon, Nov 21, 2016 at 9:51 AM, Kevin O'Gorman <
>> kevinogorm...@gmail.com>
>> > wrote:
>> >
>> > > On Mon, Nov 21, 2016 at 9:41 AM, Roger Binns 
>> > > wrote:
>> > >
>> > >> On 19/11/16 08:08, Kevin O'Gorman wrote:
>> > >> > System with problems: Running Xubuntu Linux 16.04.1, Python 3.5.2.
>> > >> [...]
>> > >> > System without this problem: Running Ubuntu Linux 14.04.5, Python
>> > 3.4.3.
>> > >>
>> > >> You are good on Python versions then.  My remaining recommendation is
>> > to
>> > >> make the process that does SQLite be a child process (ie no making
>> its
>> > >> own children).  That will eliminate an entire class of potential
>> > >> problems, although it appears unlikely you are experiencing any of
>> > them.
>> > >>
>> > >> The final option is to run the process under valgrind.  That will
>> > >> definitively show the cause.  Do note however that you may want to
>> > >> change some of the default option

Re: [sqlite] I keep getting seg faults building my database using python sqlite3

2016-12-07 Thread Kevin O'Gorman
Good feedback.  I haven't done a memory check on that machine in a
while.

Next on my list.

On Sun, Dec 4, 2016 at 11:25 AM, Keith Medcalf  wrote:

>
> If pressing the CAPS LOCK or NUM LOCK keys on the keyboard does not toggle
> the light on the keyboard then you have lost the all interrupt processing
> since those keypresses have to be processed by the kernel mode keyboard
> driver toggling the internal state of the keyboard driver, and then the
> kernel driver sends output to the keyboard to change the status LEDs.
> Typically (all Operating Systems) this means you have suffered a complete
> kernel crash (or halt) and the system is not running.
>
> Since the system must be running in order to output indicator status, all
> indicators will stay "stuck" in their last known position (hold output).
>
> Only a power-cycle (or hardware reset -- assuming the RESET is a hardware
> reset and not just a request to reset which will be ignored) triggering a
> reboot will restart the system.
>
> The most frequent cause is a Parity Check.  Or in these latter days of not
> having ECC or even Parity checked memory, just an undetected memory fault
> which will cause random AHTBL.
>
> > -Original Message-
> > From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> > On Behalf Of Kevin O'Gorman
> > Sent: Sunday, 4 December, 2016 09:21
> > To: SQLite mailing list
> > Subject: Re: [sqlite] I keep getting seg faults building my database
> using
> > python sqlite3
> >
> > Well, the i7 system failed again, but this time it was quite different.
> > And peculiar.
> > The system wasn't doing anything, but it should have been.  So I tried
> > something. It didn't matter what, because I could not get the mouse or
> > keyboard to work -- it was like they weren't plugged in.  Really like it,
> > because the caps lock light wasn't on, nor was the laser light visible in
> > the mouse.  Even when I changed mouse, keyboard and USB slot.  I couldn't
> > get in with SSH from elsewhere either.  But the computer's "I'm running"
> > light was on.
> > So I'm suspecting a partial power failure.  I don't know enough about
> > mobos
> > and USB to diagnose whether the problem was on the mobo or the power
> > supply.
> >
> > Creepty.  I had to do a hard reset  to get thing going again, and it's
> > been
> > running fine for a day now.
> >
> > On Mon, Nov 21, 2016 at 9:51 AM, Kevin O'Gorman  >
> > wrote:
> >
> > > On Mon, Nov 21, 2016 at 9:41 AM, Roger Binns 
> > > wrote:
> > >
> > >> On 19/11/16 08:08, Kevin O'Gorman wrote:
> > >> > System with problems: Running Xubuntu Linux 16.04.1, Python 3.5.2.
> > >> [...]
> > >> > System without this problem: Running Ubuntu Linux 14.04.5, Python
> > 3.4.3.
> > >>
> > >> You are good on Python versions then.  My remaining recommendation is
> > to
> > >> make the process that does SQLite be a child process (ie no making its
> > >> own children).  That will eliminate an entire class of potential
> > >> problems, although it appears unlikely you are experiencing any of
> > them.
> > >>
> > >> The final option is to run the process under valgrind.  That will
> > >> definitively show the cause.  Do note however that you may want to
> > >> change some of the default options since you have nice big systems.
> > For
> > >> example I like to set --freelist-vol and related to very big numbers
> > >> (several gigabytes) which ensures that freed memory is not reused for
> a
> > >> long time.  You could also set the valgrind option so that only one
> > >> thread is allowed - it will catch inadvertent threading you may note
> be
> > >> aware of.
> > >>
> > >> Roger
> > >>
> > >
> > > Thanks for that.  I may do the valgrind thing -- it sounds useful.  But
> > > just to add
> > > to my annoyance about this whole things, I've been having both systems
> > > running
> > > for a couple of days now with no problems or interruptions.  Remember,
> > the
> > > i7 system was failing after 2 hours at most.  I did tweak the code a
> > > little, but
> > > the only thing that seems likely to have stopped the problem is that I
> > put
> > > in
> > > code to do a commit after every 10,000 INSERT statements.  The two
> > systems
> > > are runnin

Re: [sqlite] I keep getting seg faults building my database using python sqlite3

2016-12-04 Thread Kevin O'Gorman
Well, the i7 system failed again, but this time it was quite different.
And peculiar.
The system wasn't doing anything, but it should have been.  So I tried
something. It didn't matter what, because I could not get the mouse or
keyboard to work -- it was like they weren't plugged in.  Really like it,
because the caps lock light wasn't on, nor was the laser light visible in
the mouse.  Even when I changed mouse, keyboard and USB slot.  I couldn't
get in with SSH from elsewhere either.  But the computer's "I'm running"
light was on.
So I'm suspecting a partial power failure.  I don't know enough about mobos
and USB to diagnose whether the problem was on the mobo or the power supply.

Creepy.  I had to do a hard reset to get things going again, and it's been
running fine for a day now.

On Mon, Nov 21, 2016 at 9:51 AM, Kevin O'Gorman 
wrote:

> On Mon, Nov 21, 2016 at 9:41 AM, Roger Binns 
> wrote:
>
>> On 19/11/16 08:08, Kevin O'Gorman wrote:
>> > System with problems: Running Xubuntu Linux 16.04.1, Python 3.5.2.
>> [...]
>> > System without this problem: Running Ubuntu Linux 14.04.5, Python 3.4.3.
>>
>> You are good on Python versions then.  My remaining recommendation is to
>> make the process that does SQLite be a child process (ie no making its
>> own children).  That will eliminate an entire class of potential
>> problems, although it appears unlikely you are experiencing any of them.
>>
>> The final option is to run the process under valgrind.  That will
>> definitively show the cause.  Do note however that you may want to
>> change some of the default options since you have nice big systems.  For
>> example I like to set --freelist-vol and related to very big numbers
>> (several gigabytes) which ensures that freed memory is not reused for a
>> long time.  You could also set the valgrind option so that only one
>> thread is allowed - it will catch inadvertent threading you may not be
>> aware of.
>>
>> Roger
>>
>
> Thanks for that.  I may do the valgrind thing -- it sounds useful.  But
> just to add
> to my annoyance about this whole things, I've been having both systems
> running
> for a couple of days now with no problems or interruptions.  Remember, the
> i7 system was failing after 2 hours at most.  I did tweak the code a
> little, but
> the only thing that seems likely to have stopped the problem is that I put
> in
> code to do a commit after every 10,000 INSERT statements.  The two systems
> are running identical Python code on the same inputs.  I had intended this
> to
> verify that one fails and the other does not.  What I got is something
> different,
> but on balance I like it best when my processes do not fail out.  Maybe
> this
> time the code will finish (at this rate it will be at least a week, maybe
> three.
>
> --
> #define QUESTION ((bb) || (!bb)) /* Shakespeare */
>



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Speeding up a query

2016-11-29 Thread Kevin Baggett

Hi,
I am trying to speed up a query on my SQLite database using SQLite 
version 3.7.17 2013-05-20 00:56:22.

The daily database has 2 tables: file_list and station_list.
Yesterday's database had a file_list table of over 1.7 million records. 
station_list is pretty much constant at 21549 records.
For my query, I am looking for records that have a wmo_prefix field of
"SA" or "SP".

So, that reduces the records from file_list to 363,710.
I have a field "rmkcorr_flag" that has a value of 0 to 3 based on the 
underlying data that the record is referring to.
What I want returned is the record with the maximum rmkcorr_flag
for the associated wmo_ID and observation_time (e.g. 2 for KMSN at 213347Z).


Here's the query:
SELECT a.observation_day, a.observation_hour, a.observation_time, 
a.text_file_name, a.start_byte, a.message_length, a.wmo_header, 
a.wmo_prefix, max(a.rmkcorr_flag),b.wmo_ID,b.latitude,b.longitude from 
main.file_list a, main.station_list b WHERE a.wmo_ID=b.wmo_ID AND 
(a.wmo_prefix IN ("SA","SP")) GROUP BY a.wmo_ID, a.observation_time 
ORDER by a.observation_time;


I put the following index on file_list:
create index combo_index on 
file_list(wmo_prefix,wmo_ID,observation_time,rmkcorr_flag);

on station_list:
create index wmo_station_index on station_list (wmo_ID);

I ran ANALYZE and EXPLAIN QUERY PLAN.

ANALYZE: table sqlite_stat1 shows file_list|combo_index|1708131 18172 43 5 4
station_list|wmo_station_index|21549 2
EXPLAIN QUERY PLAN:
0|0|0|SEARCH TABLE file_list AS a USING INDEX combo_index (wmo_prefix=?) 
(~36344 rows)

0|0|0|EXECUTE LIST SUBQUERY 1
0|1|1|SEARCH TABLE station_list AS b USING INDEX wmo_station_index 
(wmo_ID=?) (~2 rows)

0|0|0|USE TEMP B-TREE FOR GROUP BY
0|0|0|USE TEMP B-TREE FOR ORDER BY

So, when I first run the query in the database with .timer on, it hangs 
and hangs and hangs, and then says:

CPU Time: user 3.920404 sys 1.800726

I didn't measure the hanging time, but it was definitely over a minute, 
if not two.  (I see later versions of SQLite also report elapsed time.)

Subsequent runs of the query match up with the times above.
Seems like there is some kind of loading into cache the first time?

Anyway, I'm not a heavy SQLite/database user, so please be gentle :-)
I just want to see if there is anything that stands out to anyone that I 
can do to speed up my query, especially the first time through.
I can/should update the SQLite version, but I don't think that's the 
problem.


Thanks for any help!
Kevin

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I keep getting seg faults building my database using python sqlite3

2016-11-21 Thread Kevin O'Gorman
On Mon, Nov 21, 2016 at 9:41 AM, Roger Binns  wrote:

> On 19/11/16 08:08, Kevin O'Gorman wrote:
> > System with problems: Running Xubuntu Linux 16.04.1, Python 3.5.2.
> [...]
> > System without this problem: Running Ubuntu Linux 14.04.5, Python 3.4.3.
>
> You are good on Python versions then.  My remaining recommendation is to
> make the process that does SQLite be a child process (ie no making its
> own children).  That will eliminate an entire class of potential
> problems, although it appears unlikely you are experiencing any of them.
>
> The final option is to run the process under valgrind.  That will
> definitively show the cause.  Do note however that you may want to
> change some of the default options since you have nice big systems.  For
> example I like to set --freelist-vol and related to very big numbers
> (several gigabytes) which ensures that freed memory is not reused for a
> long time.  You could also set the valgrind option so that only one
> thread is allowed - it will catch inadvertent threading you may note be
> aware of.
>
> Roger
>

Thanks for that.  I may do the valgrind thing -- it sounds useful.  But
just to add to my annoyance about this whole thing, I've been having both
systems running for a couple of days now with no problems or interruptions.
Remember, the i7 system was failing after 2 hours at most.  I did tweak the
code a little, but the only thing that seems likely to have stopped the
problem is that I put in code to do a commit after every 10,000 INSERT
statements.  The two systems are running identical Python code on the same
inputs.  I had intended this to verify that one fails and the other does
not.  What I got is something different, but on balance I like it best when
my processes do not fail out.  Maybe this time the code will finish (at
this rate it will be at least a week, maybe three).
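
For the record, the change amounts to something like this (a sketch;
"positions", "input.txt", and parse() stand in for my real schema, input,
and line-splitting code):

import sqlite3

conn = sqlite3.connect("positions.db")       # illustrative filename
cur = conn.cursor()
with open("input.txt") as f:                 # illustrative input
    for n, line in enumerate(f, start=1):
        # assumes the positions table already exists
        cur.execute("INSERT INTO positions VALUES (?, ?)", parse(line))
        if n % 10000 == 0:
            conn.commit()                    # commit every 10,000 INSERTs
conn.commit()                                # flush the final partial batch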

-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I keep getting seg faults building my database using python sqlite3

2016-11-19 Thread Kevin O'Gorman
On Fri, Nov 18, 2016 at 10:18 AM, James K. Lowden 
wrote:

> On Fri, 18 Nov 2016 08:55:11 -0800
> "Kevin O'Gorman"  wrote:
>
> > All of the python code is a single thread.  The closest I come
> > is a few times where I use subprocess.Popen to create what amounts to
> > a pipeline, and one place where I start a number of copies of a C
> > program in parallel, but each is a separate process with its own
> > input and output files.  These C programs have been in use for a
> > number of months for earlier stages of this project, and I regard
> > them as quite reliable.  None of them uses threads, and they are
> > mostly very simple filters.
>
> As you know, a process started with Popen cannot corrupt the Python
> process's memory.  If you're not doing anything to defeat the GIL, a
> segfault inside the Python interpreter would be considered a bug.
>
> But is it happening in the interpreter, or in SQLite for that matter?
> ISTM that's what you need to know.  To know that, you're going to need
> to run a debug version of the interpreter under gdb.  When it faults, a
> backtrace will tell you where.  That's not definitive proof; memory
> corruption is often detected far from where it was caused.  But if the
> fault is at a consistent place in SQLite code, for example, you can
> use a hardware watchpoint to discover what's writing to it.
>

I'm game to try that, but unsure how to get such a thing.  If I have to
build it, it's gonna take a while to assemble all the pieces, because it's
gotta have all the parts I use, and be arranged so as not to interfere
with normal use of python 3.  Seems pretty error-prone itself, but as I
said, I'm game.


> I don't know what more to suggest.  I would be surprised if you find a
> fault in Python, in the Python standard library, or in SQLite.  I'm
> sure it won't be in anything on the other side of a popen call.  Are
> there non-standard libraries or Python modules in use that you haven't
> mentioned?
>
> The most likely culprit in my mind is RAM.  You're exercising new memory
> pretty hard, running a bunch of processes at it at full tilt.  Any
> defect in the chips or DMA could explain what you're seeing.  An easy
> test, not necessarily cheap, would be to replace the RAM (or, if
> possible, run with some removed).
>
> I have two war stories related to rotten I/O hardware, where the device
> appeared to work for all intents and purposes, but was actually a
> high-speed bit munger. Those were both over 20 years ago.  It will
> be interesting to hear if that turns out to be your issue.
>
> HTH.
>
> --jkl
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I keep getting seg faults building my database using python sqlite3

2016-11-19 Thread Kevin O'Gorman
Ran Memtest86+ 5.01, two complete passes, with no errors.



On Sat, Nov 19, 2016 at 8:19 AM, Kevin O'Gorman 
wrote:

>
>
> On Sat, Nov 19, 2016 at 8:11 AM, Simon Slavin 
> wrote:
>
>>
>> On 19 Nov 2016, at 4:08pm, Kevin O'Gorman 
>> wrote:
>>
>> > I have two different machines running this stuff.  Only one is having
>> the
>> > seg faults, but they are different enough that this does not convince me
>> > to blame hardware.
>>
>> Could you, anyway, run a memory test on the computer which is failing ?
>> I don't know how one does that under Xubuntu but perhaps you do, or perhaps
>> there's one built into the BIOS.
>>
>> Simon.
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
> Sure.  Memtest86+ is an option on boot.  I just have to find a time slot
> because it takes a while.  Later today.
>
> --
> #define QUESTION ((bb) || (!bb)) /* Shakespeare */
>



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I keep getting seg faults building my database using python sqlite3

2016-11-19 Thread Kevin O'Gorman
On Sat, Nov 19, 2016 at 8:11 AM, Simon Slavin  wrote:

>
> On 19 Nov 2016, at 4:08pm, Kevin O'Gorman  wrote:
>
> > I have two different machines running this stuff.  Only one is having the
> > seg faults, but they are different enough that this does not convince me
> > to blame hardware.
>
> Could you, anyway, run a memory test on the computer which is failing ?  I
> don't know how one does that under Xubuntu but perhaps you do, or perhaps
> there's one built into the BIOS.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>

Sure.  Memtest86+ is an option on boot.  I just have to find a time slot
because it takes a while.  Later today.

-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I keep getting seg faults building my database using python sqlite3

2016-11-19 Thread Kevin O'Gorman
On Fri, Nov 18, 2016 at 3:19 PM, James K. Lowden 
wrote:

> On Fri, 18 Nov 2016 10:56:37 -0800
> Roger Binns  wrote:
>
> > Popen calls fork (it seems like you are doing Unix/Mac, not Windows).
> > fork() duplicates the process including all open file descriptors.
> > One or more of those descriptors belong to open SQLite databases and
> > ancillary files.
>
> Good catch, Roger.  It's a liability, but I slightly disagree with your
> characterization.
>
> > - Running any Python code (destructors can be called which then run in
> > the parent and child)
>
> Yes, if those destructors affect shared resources.  The OP did say the
> processes on the other side of popen were C programs.
>
> > - Not having file descriptors closed so the child process trashes them
> > (close_fds Popen argument is False in python 2.x but True in python
> > 3.x).
>
> The child process can't "trash" the parent's descriptors.  When the
> child exits, the OS will close its descriptors, that's all.  But, yes,
> if the child process is making some assumption about open descriptors
> it receives at startup, that could lead to problems.  Especially if
> it scribbles on the SQLite database.
>
> > Also python 2.x subprocess module is broken in many ways.
>
> My foray into Unicode in Python convinced me once and for all that
> Python 3 is the only way to go.  But would you care to elaborate on the
> problems with 2.x subprocess?
>
>

You can all put aside worries about Python 2.  I only started using Python
a couple of months ago, and went with 3.

I have two different machines running this stuff.  Only one is having the
seg faults, but they are different enough that this does not convince me
to blame hardware.

System with problems: Running Xubuntu Linux 16.04.1, Python 3.5.2.
CPU: core i7, MOBO: Asus Z97 Deluxe, 32GB ram.  4 cores, hyperthreaded
to 8.  Home-built.

System without this problem: Running Ubuntu Linux 14.04.5, Python 3.4.3.
This is a System76 Silverback.  CPU: dual Intel Xeon, 256 GB RAM(!)
16 cores, hyperthreaded to 32.

On both: Filesystem is an 11-TB software RAID (Linux LVM) on top of
three 4-TB hard drives.

--
Python modules are all pretty standard (I think):
stat argparse datetime fileinput glob os re sqlite3 subprocess sys time

children on the other side of Popen are either standard Linux utilities like
split/grep/sort or my own C code, which is mostly very simple with one
exception, and that is a pretty well-tested implementation of minimax search
through a game tree.  But even in that one, the I/O is very simple: one
input line yields one output line on one of two output streams.

I never set close_fds=False, so the Python3 default protects the database.
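
(A sketch of why I believe that default protects me; on Python 3,
close_fds=True means a child never sees the connection's descriptors.
The filenames here are illustrative:)

import sqlite3, subprocess

conn = sqlite3.connect("game.db")            # illustrative name
# close_fds defaults to True on Python 3, so this child does NOT
# inherit the file descriptors behind `conn`.
p = subprocess.Popen(["grep", "-c", "pattern", "input.txt"],
                     stdout=subprocess.PIPE)
out, _ = p.communicate()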

Questions, though: I don't bother to commit in the midst of anything, so
you'll notice my journal gets pretty big.  That's not a problem for my
filesystem,
but could it be a problem internally to Python?
-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I keep getting seg faults building my database using python sqlite3

2016-11-18 Thread Kevin O'Gorman
On Fri, Nov 18, 2016 at 8:38 AM, Roger Binns  wrote:

> On 17/11/16 19:14, Kevin O'Gorman wrote:
> > SO: I need help bifurcating this problem.  For instance, how can I tell
> if
> > the fault lies in SQLite, or in python? Or even in the hardware, given
> that
> > the time to failure is so variable?
>
> Are you using threads, threading related settings etc in any way?  The
> python sqlite3 module (aka pysqlite) is not threadsafe.  That doesn't
> stop people "working around" it, which can lead to crashes.
>
> Roger
>
>

I am not.  All of the python code is a single thread.  The closest I come
is a few times where I use subprocess.Popen to create what amounts to a
pipeline, and one place where I start a number of copies of a C program in
parallel, but each is a separate process with its own input and output
files.  These C programs have been in use for a number of months for
earlier stages of this project, and I regard them as quite reliable.  None
of them uses threads, and they are mostly very simple filters.
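
The pipeline pattern looks essentially like this (a sketch; the stand-in
commands and filenames replace my real filters):

import subprocess

p1 = subprocess.Popen(["grep", "-v", "^#", "raw.txt"],
                      stdout=subprocess.PIPE)
with open("sorted.txt", "w") as out:
    p2 = subprocess.Popen(["sort"], stdin=p1.stdout, stdout=out)
    p1.stdout.close()        # so p1 gets SIGPIPE if p2 exits early
    p2.wait()
p1.wait()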

The one that runs in parallel cannot be the culprit, however, because the
code has not reached the point where it would come into play.  That is the
step where the results get "cached" (in flat files) and all of the early
results are in those files.  This thing is dying before reaching unknown
territory where new results are needed.  The reason the results exist is
that they were generated by previous versions of the software that did not
use Python.  I am switching because (a) I want a database instead of flat
files for speed reasons and (b) the C code was getting too hard to maintain.

All of this is a hobby project, and I can share any parts of it that you
care to see.
-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] I keep getting seg faults building my database using python sqlite3

2016-11-18 Thread Kevin O'Gorman
On Fri, Nov 18, 2016 at 3:11 AM, Simon Slavin  wrote:

> Forgot to say ...
>
> Most of these problems result from attempting to reuse memory you've
> already released.  Even if the error is happening inside a SQLite routine,
> it will be because you passed it a pointer to an SQLite connection which
> had already been _close()'d or a statement which had already been
> _finalize()'d.
>
>

Well, the entire process runs under a single connection that is never
closed. Although cursors are opened and closed within some of the Python
functions, the close is always at the end of the function.  And I have no
idea how things get finalized under python, or if that's an issue.  Anyway,
most of the time I use patterns like
  for row in c1.execute(...)
or
  with dbfile.connect() as conn:
and as I understand it, these protect me from most errors of that kind.
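
(My mental model of the second pattern, in sketch form; the table name is
made up.  Note the context manager wraps a transaction -- it does not
close the connection:)

import sqlite3

conn = sqlite3.connect("game.db")     # illustrative name
with conn:                            # commits on success, rolls back on exception
    # assumes a table t(k, v) exists -- illustrative
    conn.execute("UPDATE t SET v = v + 1 WHERE k = ?", ("x",))
# the connection is still open here; close it explicitly when done
conn.close()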

Moreover, it seems to me that these would all lead to failure at exactly
the same point, rather than at three points up to an hour different from
each other.

I was thinking it more likely to be one of
a) random hardware malfunction
b) some bug in memory handling that was subject to influence by other
activity in the machine
c) some bug in handling pointers to the python structures (it would have to
be a python or sqlite3 bug.)




-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] I keep getting seg faults building my database using python sqlite3

2016-11-17 Thread Kevin O'Gorman
I ran this thing 3 times with identical inputs, it is deterministic, but it
failed after 66, 128 and 96 minutes respectively.  Each run started with no
database at all, and gets a single input from which the rest is
calculated.  The calculations are cached (in flat files), so and it never
got to the poiint where it needed to calculate more.  I get core files, but
don't know how to interpret them.

The machine is a new i7 MOBO with 32 GB RAM, 32 GB swap (mostly unused),
running Xubuntu Linux. The last run left me with a 16 GB database and a
600MB journal.

SO: I need help bifurcating this problem.  For instance, how can I tell if
the fault lies in SQLite, or in python? Or even in the hardware, given that
the time to failure is so variable?

Can anyone help me make sense of the core file?
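
(One thing I plan to try in the meantime: Python's built-in faulthandler
module, which prints a Python-level traceback when a segfault arrives.
Minimal sketch:)

import faulthandler, sys

faulthandler.enable(file=sys.stderr, all_threads=True)
# ... the rest of the program runs as before; on SIGSEGV a traceback of
# the Python stack is written to stderr before the core dump happens ...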

-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Accessing sqlite3Apis pointer when loading sqlite3 dll/so.

2016-10-29 Thread Kevin Martin
Hi,

I have a shared library that internally uses a statically linked sqlite for a 
few different internal tasks. Amongst these there is some code that provides a 
few virtual tables. I would like to extend the interface of the library so that 
as well as its normal interface, it can be accessed as an sqlite extension 
through some of these virtual tables. This will allow my library to be used in 
two ways:

a) Through its normal C interface.
b) From an sqlite3 extension loadable from the sqlite3 client app, or the R or 
python interfaces to sqlite3.

The problem is how to build the internal code that accesses sqlite3, including 
the virtual table code that I also want to make available through the 
extension. For the virtual table code I think I would need to build two copies, 
one with SQLITE_CORE defined (to be used internally), and the other without (to 
be used in case b). Doing this, in case b, I would suffer the problems I have 
read about where there would be two copies of the static global that implements 
the posix locking workaround. Although in my use case it is almost certain 
this would not cause a problem (I can’t envisage a use case where both my 
statically linked sqlite3 and the one being used at the interface layer 
would open the same db), it still feels like an ugly solution.

My thought instead is to build all of my sqlite-accessing code using 
sqlite3ext.h and without SQLITE_CORE defined, and in case a, to dynamically 
load an sqlite3 library. However I need to get access to the 
sqlite3_api_routines pointer 
within whichever sqlite3 library is being used. My question is how to do this?

In case b, the pointer is passed when the extension is first loaded, and I can 
pass it through to the rest of my library. For case a I’m struggling a bit. The 
best I can figure is:

(Assume I have called dlopen/LoadLibrary and can access symbols with 
dlsym/GetProcAddress)

1. Set up a fake extension function whose purpose is to capture the sqlite3Apis 
pointer passed to xInit within sqlite3AutoLoadExtensions.
2. Use dlsym/GetProcAddress to find sqlite3_auto_extension, 
sqlite3_cancel_auto_extension, sqlite3_open_v2, sqlite3_close.
3. Register my extension with sqlite3_auto_extension.
4. Open an in memory database to load the extension and capture the 
sqlite3_api_routines pointer.
5. Close the database.
6. Cancel the auto extension.

My questions are:

1) Am I missing anything, i.e. does this work?
2) Will it be guaranteed in the future that the sqlite3_api_routines pointer 
will be passed to the extensions loaded by sqlite3_auto_extension.
3) Is there a better way to do this as it feels a little hacky.

Thanks,
Kevin
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Import 5000 xml files in a sqlite database file

2016-10-24 Thread Kevin Youren
Bob,

my name is Kevin Youren, and I did this task about 4 years ago in
Python 3, by parsing XML files and creating CSV files. The CSV files
were used to load Sqlite tables, MS Xcel spreadsheets and IBM mainframe
DB2 tables. The XML data was mildly complex, large, and error prone.

If you have a sample, say 2 or 3 of the normal files, I could make some
suggestions.

Please note that unless the data is ultra simple, XML is generally
better translated as several tables.

For example, my application stored Control-M scheduler information.

Two tables for the schedule group or table name.

The third table for the jobs in each schedule group/table.

The fourth table for the conditions for the jobs for the schedule
group/table.

Each table had columns for the tags or attributes.
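
A minimal sketch of the flow (the tag, attribute, and table names here
are illustrative, not Control-M's actual ones):

import csv, sqlite3
import xml.etree.ElementTree as ET

# Flatten one XML file's elements into CSV rows.
rows = [(job.get("JOBNAME"), job.get("TASKTYPE"))
        for job in ET.parse("schedule.xml").getroot().iter("JOB")]
with open("jobs.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Load the CSV into an SQLite table.
conn = sqlite3.connect("scheduler.db")
conn.execute("CREATE TABLE IF NOT EXISTS jobs (jobname TEXT, tasktype TEXT)")
with open("jobs.csv", newline="") as f:
    conn.executemany("INSERT INTO jobs VALUES (?, ?)", csv.reader(f))
conn.commit()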

regs,

Kev






___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Is there a best practice for breaking up a large update?

2016-10-16 Thread Kevin O'Gorman
There are other writers, but they create new stuff, while this big update
wants to consolidate info about existing stuff.  There are also pure
readers.

Some of the docs on WAL logging had caveats that put me off.  Not knowing
exactly what I was going to be doing with the database, I could not be sure
if the caveats applied to me or not.  Particularly because I had some
trouble understanding the caveats, since some of the terminology is a bit
new to me.  Not all of it -- I actually taught a database course at the
undergrad level once, SQL-based, but it was quite a while ago.  But most of
my database experience was either following SQL recipes, or -- a long time
ago -- writing a full database server from scratch (not relational --
hierarchical) in assembler.

On Sat, Oct 15, 2016 at 6:14 PM, Darren Duncan 
wrote:

> You didn't say if the other tasks need write access to the database or if
> it is just read-only.  If the others only need read-only, let them access a
> copy of the database while you make your changes in another copy, then just
> swap the databases when done. -- Darren Duncan
>
>
> On 2016-10-15 1:21 PM, Kevin O'Gorman wrote:
>
>> I'm new to this, and working in Python's sqlite3.  So be patient, and
>> don't
>> expect me to know too much.  This is also a personal hobby, so there's
>> nobody else for me to ask.
>>
>> I've got a database of a some tens of millions of positions in a board
>> game.  It may be over a billion before I'm done (working in an 11-TB
>> partition at the moment.) I want to process a major slice of these
>> records,
>> in order to update other records.  I might want to break the work up
>> into chunks to allow other access to the database while this is going on.
>>
>> So I have some questions:
>> (1) If I do all of my updates to a temporary table, does the database
>> still
>> get locked?
>>
>> (2) Is there another way to keep it available?  It happens for this
>> activity that consistency is not at risk.
>>
>> (3) If it turns out that I can avoid locking, it there still a performance
>> reason to break the transaction into chunks, or would I be as well off
>> doing it as a single transaction (assuming I have disk space for the
>> journal).
>>
>> (4) If I break it up into chunks, I can think of several ways to do that
>> and keep track of what's been done and what has not.  Is there a best
>> practice for this?
>>
>> (5) Is there a forum specifically for folks doing Python database
>> programming?  It occurs to me that much of what I'm asking about is not
>> specific to SQLite.  But my code, for now at least, is going to be in
>> Python because it's the easiest way I can see, and my own efforts are the
>> real bottleneck in making progress.
>>
>>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Is there a best practice for breaking up a large update?

2016-10-15 Thread Kevin O'Gorman
I'm new to this, and working in Python's sqlite3.  So be patient, and don't
expect me to know too much.  This is also a personal hobby, so there's
nobody else for me to ask.

I've got a database of a some tens of millions of positions in a board
game.  It may be over a billion before I'm done (working in an 11-TB
partition at the moment.) I want to process a major slice of these records,
in order to update other records.  I might want to break the work up
into chunks to allow other access to the database while this is going on.

So I have some questions:
(1) If I do all of my updates to a temporary table, does the database still
get locked?

(2) Is there another way to keep it available?  It happens for this
activity that consistency is not at risk.

(3) If it turns out that I can avoid locking, it there still a performance
reason to break the transaction into chunks, or would I be as well off
doing it as a single transaction (assuming I have disk space for the
journal).

(4) If I break it up into chunks, I can think of several ways to do that
and keep track of what's been done and what has not (see the sketch after
these questions).  Is there a best practice for this?

(5) Is there a forum specifically for folks doing Python database
programming?  It occurs to me that much of what I'm asking about is not
specific to SQLite.  But my code, for now at least, is going to be in
Python because it's the easiest way I can see, and my own efforts are the
real bottleneck in making progress.
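
For (4), here is the kind of chunking loop I have in mind (a sketch; the
table and column names are made up):

import sqlite3

conn = sqlite3.connect("positions.db")   # illustrative filename
# assumes a non-empty positions table exists -- illustrative
(maxid,) = conn.execute("SELECT max(rowid) FROM positions").fetchone()
chunk = 100000
for low in range(0, maxid, chunk):
    with conn:   # each chunk commits as its own transaction
        conn.execute("UPDATE positions SET flags = flags | 1 "
                     "WHERE rowid > ? AND rowid <= ?", (low, low + chunk))
    # between chunks, other connections get a turn at the database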

-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] enabling FK in VisualStudio

2016-09-24 Thread Kevin Benson
On Sat, Sep 24, 2016 at 5:14 AM, denis  wrote:

> Hello all, I want enabling foreign key in the file app.config of visual
> studio 2015 is-it possible? if yes, what is the string?
> I tested connectionString = "data source=d:\db.sqlite; ForeignKeys=true;"
> but the FK does not work.
> thanks
>

The accompanying .CHM file shows the parameter as "Foreign Keys", not
"ForeignKeys". Have you tried that?

http://s11.postimg.org/qy2q9attf/2016_09_24_085430.jpg

--
   --
  --
 --Ô¿Ô--
K e V i N
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Consistency, rollback and such

2016-09-20 Thread Kevin O'Gorman
Thanks.  That seems clear, and I think I understand it, and the conflicts
that I was referring to result in exceptions thrown at commit() time.  That
makes sense.  I don't think I'll be doing any rollbacks -- the logic is
hard enough as it stands.  I'll just wrap it all in IMMEDIATE
transactions.  They're quick, but conflicts are fairly likely, so it's
probably the right solution.

On Tue, Sep 20, 2016 at 3:02 PM, Richard Hipp  wrote:

> On 9/20/16, Kevin O'Gorman  wrote:
> > Surely, Mr. Hipp is an authority, but I'm slightly puzzled by this
> answer.
> > And it doesn't answer the part of the question about what happens if I do
> > it wrong, and transactions conflict.  Based on what I now think is true,
> if
> > I don't do something, transactions begin with the first modification.
> They
> > may be prevented from executing UPDATEs simultaneously, but could still
> > result in non-serializable execution of the whole, and inconsistent or
> > erroneous data in the database, because each would be based on the SELECT
> > statements before either had written.  Or do they result in exceptions?
>
> If you do a "BEGIN;" followed by a "SELECT..." then the transaction
> starts before the SELECT is run.  So it is serializable.
>
> But if you do just a "BEGIN", then some other process might jump in
> line ahead of you and make some other changes to the database.  Your
> transaction will not be able to see those changes, due to isolation.
> But they will be in the database file. Then when you got to COMMIT,
> SQLite will see that your transaction is based on an historical and
> out-of-date version of the database and hence will refuse to do the
> commit.  You'll have to ROLLBACK and try again.
>
> When you do "BEGIN IMMEDIATE" that reserves your spot in line and
> ensures that no other transactions will commit in front of you.
>
>
> --
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Consistency, rollback and such

2016-09-20 Thread Kevin O'Gorman
If by explicitly, you mean starting my own transactions, then I guess I'll
be doing it as you are.  I do not even want to think about savepoints and
figuring out what has to be re-done.  My transactions, while numerous, will
be pretty simple and mostly quite quick, so there's not much advantage to
redoing over just delaying.  Besides, I'm not sure I know how to detect a
conflict that would have caused or should cause a rollback.
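
Concretely, I take the suggestion to mean something like this (a sketch,
with a toy table "t" standing in for my schema):

import sqlite3

conn = sqlite3.connect("mydb", isolation_level=None)  # autocommit; no implicit BEGIN
cur = conn.cursor()
cur.execute("BEGIN IMMEDIATE")     # reserve the writer slot before reading
row = cur.execute("SELECT v FROM t WHERE k = ?", ("x",)).fetchone()
cur.execute("UPDATE t SET v = ? WHERE k = ?", (row[0] + 1, "x"))
cur.execute("COMMIT")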

On Tue, Sep 20, 2016 at 12:21 PM, David Raymond 
wrote:

> Your understanding of the isolation_level parameter there is correct. It
> only affects how the transactions are created. If you're going
> to be doing a few selects before you update then doing the
> "begin someSortOf transaction;" is what you want.
>
> After having some trouble getting savepoints to work correctly I've gotten
> into the habit of always doing isolation_level = None, and doing everything
> explicitly, but as long as you know what's going on then you're good.
>
>
> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Kevin O'Gorman
> Sent: Tuesday, September 20, 2016 12:35 PM
> To: sqlite-users
> Subject: [sqlite] Consistency, rollback and such
>
> I'm also wondering if setting
>   conn = sqlite3.connect("mydb", isolation_level="IMMEDIATE")
> does what I need.  Reading the docs, it would appear this does not start a
> transaction until the UPDATE, and I think I want the transactions to start
> before the first SELECT.  Should I instead do
>   c = conn.cursor()
>   c.execute("BEGIN TRANSACTION IMMEDIATE")
>
> and is IMMEDIATE the right thing, or do I need EXCLUSIVE.
>
>
> --
> #define QUESTION ((bb) || (!bb)) /* Shakespeare */
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Consistency, rollback and such

2016-09-20 Thread Kevin O'Gorman
Surely, Mr. Hipp is an authority, but I'm slightly puzzled by this answer.
And it doesn't answer the part of the question about what happens if I do
it wrong, and transactions conflict.  Based on what I now think is true, if
I don't do something, transactions begin with the first modification.  They
may be prevented from executing UPDATEs simultaneously, but could still
result in non-serializable execution of the whole, and inconsistent or
erroneous data in the database, because each would be based on the SELECT
statements before either had written.  Or do they result in exceptions?

On Tue, Sep 20, 2016 at 10:09 AM, Richard Hipp  wrote:

> On 9/20/16, Kevin O'Gorman  wrote:
> >   c.execute("BEGIN TRANSACTION IMMEDIATE")
> >
> > and is IMMEDIATE the right thing, or do I need EXCLUSIVE.
>
> IMMEDIATE is the right thing.  That lets other readers continue and
> new readers to start, but blocks all other writers.
>

Please confirm or refute my understanding that this would let transactions
without EXCLUSIVE begin, but not EXCLUSIVE ones, even if they begin with
reading.


> EXCLUSIVE would block everybody - readers and writers - which is more
> than you need.
>

If my understanding above is correct, then this would indeed be more than
required.  But of course, it would work so long as the SELECTs are wrapped
along with their resultant UPDATEs.


>
> --
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Consistency, rollback and such

2016-09-20 Thread Kevin O'Gorman
I think I understand the basics of SQL and ACID properties, but I'm new to
SQLite and not really experienced in any of these.  So I'm having some
trouble figuring out the detailed consequences of IMMEDIATE, EXCLUSIVE and
DEFERRED and the autocommit mode of python's sqlite3.

I expect my transactions to be fairly short, and to average three or less
per second, so conflict is likely, but not likely to overload whatever
mechanisms are involved.  However, it will be very common for a transaction
to begin with SELECT queries, and to compose an UPDATE based on what it has
and what it finds.  It will be quite possible, even frequent, for multiple
processes to decide to update the same records.

As background, I'd like to know what happens with each of the kinds of
transaction.  Do any of them do rollbacks, and if so what does that look
like (in Python)?  Do any of them throw exceptions?
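
(For concreteness, I imagine the failure mode looks something like this
sketch, where a conflicting writer surfaces as an exception:)

import sqlite3

conn = sqlite3.connect("mydb")
try:
    cur = conn.cursor()
    cur.execute("BEGIN IMMEDIATE")
    # ... SELECTs, then the UPDATE composed from them ...
    conn.commit()
except sqlite3.OperationalError:   # e.g. "database is locked"
    conn.rollback()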

I'm guessing I'm going to want one of IMMEDIATE or EXCLUSIVE, but I'm not
sure which one.  I'm also wondering if setting
  conn = sqlite3.connect("mydb", isolation_level="IMMEDIATE")
does what I need.  Reading the docs, it would appear this does not start a
transaction until the UPDATE, and I think I want the transactions to start
before the first SELECT.  Should I instead do
  c = conn.cursor()
  c.execute("BEGIN TRANSACTION IMMEDIATE")

and is IMMEDIATE the right thing, or do I need EXCLUSIVE.

That's a bunch of questions, so please answer any where you're _sure_ you
know the answer.


-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Clarification on "No Isolation On Same Db Connection"

2016-09-08 Thread Kevin O'Gorman
It seems to me that the simplest, most portable approach for this sort of
thing would to be having the SELECT create a temporary table of the desired
actions, and not apply them until after the select has concluded.  This
would work in any database -- it does not depend on precise semantics of
WAL, for instance.

Of course, it could be that this is inefficient for some reason, and that
might cause you to take a different course, but all the alternatives are
going to be more troublesome to understand and maintain.  It's up to you
whether this is worth it.
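
A sketch of what I mean (needs_update() and fix() are placeholders for
the real decision logic, and the table "t" is illustrative):

import sqlite3

conn = sqlite3.connect("mydb")
conn.execute("CREATE TEMP TABLE pending (id INTEGER, newval TEXT)")
for id_, val in conn.execute("SELECT id, val FROM t"):  # the long SELECT
    if needs_update(val):
        conn.execute("INSERT INTO pending VALUES (?, ?)", (id_, fix(val)))
# the SELECT has finished; now apply the queued actions
conn.execute("UPDATE t SET val = (SELECT newval FROM pending p "
             "WHERE p.id = t.id) WHERE id IN (SELECT id FROM pending)")
conn.commit()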

On Wed, Sep 7, 2016 at 7:24 PM, Igor Tandetnik  wrote:

> On 9/7/2016 6:11 PM, Stephan Mueller wrote:
>
>> I understand that a way to ensure "SELECT is unperturbed" semantics is to
>> use separate connections for SELECT and updates.
>>
>
> If you go down that route, make sure you are using WAL journaling mode; it
> won't work otherwise.
>
> This is undesirable since I'd have to (IIUC) do all my updates (possibly
>> millions) in a single transaction.
>>
>
> I don't see how this follows.
>
> I'd prefer to commit after each update
>>
>
> You can't commit on a single connection either, while there's an
> unfinalized SELECT statement traversal going on. So you aren't gaining
> anything by trying to interleave SELECT and updates on the same connection.
>
> That is, if I ever receive a record that ought to have arrived earlier
>> because of ORDER BY, it must be a since-SELECT-began update, and should be
>> ignored.
>>
>
> When data is modified under SELECT's feet, phantom rows are just one
> problem; it's also possible for the statement to skip rows it would have
> otherwise returned, and to return rows containing stale data. Basically,
> undefined behavior is undefined.
> --
> Igor Tandetnik
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Setting temp location with Python

2016-08-11 Thread Kevin O'Gorman
On Wed, Aug 10, 2016 at 6:50 AM, Jonathan Moules <
jonathan-li...@lightpear.com> wrote:

> Hi List,
>I'm using Python's sqlite3 library to access a SQLite db. I'd like to
> set the location for the temporary databases in a platform agnostic fashion
> (*nix or Windows).
>
> This page - https://www.sqlite.org/tempfiles.html - gives a number of
> options, but the only good one I can see for using it via Python would be
> the PRAGMA.
> But the docs for PRAGMA temp_store_directory; (
> https://www.sqlite.org/pragma.html#pragma_temp_store_directory ) say this
> is deprecated.
>
> So what's the recommended way to achieve this?
>
>

Take a look at https://www.sqlite.org/tempfiles.html, specifically item
5.0, which addresses this question.  It varies a bit depending on your
environment, so there's no single answer.  On my Linux system, I fiddle
with the SQLITE_TMPDIR environment variable.
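
(On my machine that amounts to the following sketch; the directory path
is illustrative, just some big partition I want scratch space on:)

import os, sqlite3

# Must be set before SQLite first needs temporary space.
os.environ["SQLITE_TMPDIR"] = "/tmp/bigscratch"   # illustrative path
conn = sqlite3.connect("mydb")
# big sorts and index builds now put their temp files there (on Linux)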

You might want to describe your environment so that folks with a similar
environment can respond more helpfully.

-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bug in CREATE INDEX

2016-08-08 Thread Kevin O'Gorman
On Sun, Aug 7, 2016 at 11:11 PM, Dan Kennedy  wrote:

> On 08/08/2016 02:03 AM, Dominique Pellé wrote:
>
>> Kevin O'Gorman wrote:
>>
>> CREATE INDEX has two problems:
>>> 1) poor default location of temporary storage.
>>> 2) gets wedged on very large indexes.
>>>
>>> I'm using the sqlite that came with Xubuntu 14.04, I think it's version
>>> 3.8.2.
>>>
>> SQLite-3.8.2 is old (Dec 2013). It's better to download and
>> compile SQLite yourself.
>>
>> There has been several optimizations since that release.
>> In particular, looking at release notes at
>> https://sqlite.org/changes.html the following
>> improvement which may be relevant for your issue:
>>
>> === BEGIN QUOTE https://sqlite.org/changes.html ===
>> 3.8.10:
>>
>> Performance improvements for ORDER BY, VACUUM,
>> CREATE INDEX, PRAGMA integrity_check, and
>> PRAGMA quick_check.
>> === END QUOTE ===
>>
>
> 3.8.7 introduced the multi-threaded sorter too. So with a more recent
> version of SQLite,
>
>   PRAGMA threads = 4
>
> might help this case.
>
>   https://sqlite.org/pragma.html#pragma_threads
>
> Dan.
>

Very cool.  But 4?  I will be running this on machines with 8 and 16
cores.  Does going beyond 4 not help much?


-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bug in CREATE INDEX

2016-08-08 Thread Kevin O'Gorman
On Mon, Aug 8, 2016 at 2:41 AM, Philip Newton 
wrote:

> On 7 August 2016 at 22:37, Kevin O'Gorman  wrote:
> > I use the LTS (long-term support) version of Ubuntu, and like not having
> to
> > keep up with all the latest.  My current 14.04 is at end-of-life
>
> LTS are supported for 5 years; your 14.04 is good till April 2019.
>
> Ph.
>

I guess I'm not paying attention -- that's the point, I don't want to have
to -- and just consider it end-of-life when it starts bugging me twice a
day about the next LTS.  I'm planning to install 16.04.1 into the partition
with the even older 12.04 today.

-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bug in CREATE INDEX

2016-08-07 Thread Kevin O'Gorman
On Sun, Aug 7, 2016 at 12:03 PM, Dominique Pellé 
wrote:

> Kevin O'Gorman wrote:
>
> > CREATE INDEX has two problems:
> > 1) poor default location of temporary storage.
> > 2) gets wedged on very large indexes.
> >
> > I'm using the sqlite that came with Xubuntu 14.04, I think it's version
> > 3.8.2.
>
> SQLite-3.8.2 is old (Dec 2013). It's better to download and
> compile SQLite yourself.
>
> There has been several optimizations since that release.
> In particular, looking at release notes at
> https://sqlite.org/changes.html the following
> improvement which may be relevant for your issue:
>
> === BEGIN QUOTE https://sqlite.org/changes.html ===
> 3.8.10:
>
> Performance improvements for ORDER BY, VACUUM,
> CREATE INDEX, PRAGMA integrity_check, and
> PRAGMA quick_check.
> === END QUOTE ===
>
> Regards
> Dominique
>
>
I use the LTS (long-term support) version of Ubuntu, and like not having to
keep up with all the latest.  My current 14.04 is at end-of-life, and I'll
upgrade to 16.04 soon -- probably this weekend -- and get sqlite 3.8.17
automatically.

I'm a hobbyist going solo with a lot of interests.  This means I have to
accept the concept of "good enough" or pare down my interests to those I
can devote admin time to on a regular basis. Even more so when I haven't
really decided to adopt the package yet.

Works for me, but YMMV.



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Bug in CREATE INDEX

2016-08-07 Thread Kevin O'Gorman
RETRACTED: Not wedged, just somewhat slower than I expected.  I was
expecting about an hour, and it takes 6.  On reflection, it's okay.  My
ideas about what's going on were wrong too -- anonymous files are used, but
only two of them at a time.

I still think it is using a poor choice of default directory for temporary
files.  Is it /var/tmp instead of the more usual /tmp?  That's the only
other choice that would not be just plain wrong, and maybe it was chosen
thinking that it's likely to be on a bigger partition.  Maybe, sometimes.
But a lot of the time, /var/tmp is on the same partition as /tmp, or just
ignored because the sort utility defaults to /tmp and often that's the
biggest user of big temporary files.  That's the reason that I've placed
/tmp on its own huge partition (3 TB) and left /var/tmp alone.  The big
reason to use /var/tmp is that the files there are not deleted
automatically -- the standard says they're to be preserved -- but anonymous
files aren't ever preserved -- they've disappeared from the file system by
definition.

In all, not a big issue since $TMPDIR is honored, but I don't see that it's
documented.  I just guessed it since the sort utility honors it and I
thought it was possible sort was being used under the covers.  It's not,
but it all worked out okay.

Does anybody know where the actual defaults and controlling environment
variables are documented, by operating system?  Or are they?

On Fri, Aug 5, 2016 at 12:36 PM, Kevin O'Gorman 
wrote:

> CREATE INDEX has two problems:
> 1) poor default location of temporary storage.
> 2) gets wedged on very large indexes.
>
> I'm using the sqlite that came with Xubuntu 14.04, I think it's version
> 3.8.2.
>
> I created a table, and used .import to populate it with records, about 1.4
> billion of them.  The resulting database is 232 GB.  All seemed well.
>
> I then went to CREATE INDEX on two fields, one CHAR(64) and the other an
> INT.  This initially died quickly, but I determined that it was because it
> was attempting to use my smallest partition for workspace.  It was in my
> root partition, but not in /tmp which is separate.  I submit that /tmp
> would be the natural choice, as it is what the sort command uses by
> default.  That's problem 1.
>
> Fortunately, it honors the $TMPDIR environment variable, but gets wedged
> after a bit.  I have an idea why.  While it was still actively adding
> space, there were no filenames showing, so I infer that it's using
> "anonymous" files, perhaps by calling tmpfile().  This could be bad news,
> as anonymous files have to be held open, but there are limits on how many
> open files any process is allowed.  Unless your merge process starts with
> at least 1.4 million keys in each buffer, this is going to fail on my
> data.  I suggest using tempnam() and closing the files when you're not
> actively using them.
>
> --
> #define QUESTION ((bb) || (!bb)) /* Shakespeare */
>



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] newbie has waited days for a DB build to complete. what's up with this.

2016-08-07 Thread Kevin O'Gorman
On Sat, Aug 6, 2016 at 2:49 PM, Kevin O'Gorman 
wrote:

> On Sat, Aug 6, 2016 at 2:09 AM, Dan Kennedy  wrote:
>
>> On 08/06/2016 09:52 AM, Kevin O'Gorman wrote:
>>
>>> On Fri, Aug 5, 2016 at 2:03 PM, Dan Kennedy 
>>> wrote:
>>>
>>> On 08/06/2016 03:28 AM, Kevin O'Gorman wrote:
>>>>
>>>> On Fri, Aug 5, 2016 at 1:08 PM, David Raymond >>>> >
>>>>> wrote:
>>>>>
>>>>> ..
>>>>>
>>>>> Apart from the default location of the files, it reads like your next
>>>>> main
>>>>>
>>>>>> concern is how many temp files get opened up. My bet is that it'll be
>>>>>> a
>>>>>> very small number, just potentially huge in file size while it's doing
>>>>>> its
>>>>>> thing. But again, try that pragma and take a look.
>>>>>>
>>>>>> My best bet is the contrary:  it starts with small files and makes
>>>>>>
>>>>> increasingly larger ones, like the sort utility does.  The problem is
>>>>> that
>>>>> there are too many of them at the beginning for it to work with
>>>>> anonymous
>>>>> files (which sort does not use).  This at least offers a possible
>>>>> explanation of its getting wedged on large indexes: an unexpected and
>>>>> untested error, handled poorly.
>>>>>
>>>>> You could verify this by checking the number of open handles in
>>>> "/proc//fd" after your process is wedged.
>>>>
>>>> Excellent idea.  I did not know about that possibility.  And sure
>>>> enough,
>>>>
>>> I'm wrong.  It's using anonymous files, all right, but only one or two
>>> at a
>>> time.  I assume they're big.  I'm in the process of bracketing where size
>>> begins to matter.  So far, 1/10 of the data loads and indexes just fine,
>>> albeit somewhat more slowly than the smaller samples predicted.  The
>>> database load took 6.5 minutes, the troublesome index 10 minutes.  At
>>> smaller sizes, indexing is faster than the database load.
>>>
>>> I'm trying 1/3 now (500 million lines)
>>>
>>
>> What does [top] tell you once the process becomes wedged? What percentage
>> is the CPU running at? Or is it completely bogged down waiting for IO?
>>
>> Dan
>
>
> I'm waiting for a good time to get the answer to this.  It takes a good
> long while to get to wedged, so I'll probably do it overnight tonight.
>

RETRACTION: it doesn't get wedged after all, it just takes about 4 times
longer than I expected.  On small inputs, it makes the indexes faster than
the data table.  When I test on the whole data file, it takes 4 times
longer to build each index than it took to make the table.  I guess that's
what it looks like when building the table is O(n) (linear) while the index
is O(n * log(n)), and log(n) is getting bigger.

Sorry for the kerfuffle.  This is my first time working with datasets this
big.  Since it takes almost 6 hours to build each index (there are 2 at
this point), and I didn't know to expect that, I drew the wrong
conclusion.  My bad.

I'm just glad Mr. Hipp got into this thread and advised to build the
indexes after the data is loaded.  It would have been _lots_ slower
inserting each index individually, and in fact my first attempt -- the one
that gave me the initial idea that this had wedged -- took this approach
and I decided it was wedged after a few _days_.  I suspect it wasn't
either, but might as well have been.

So far, this has been a feasibility exercise.  Building a 500 GB database
overnight is definitely feasible.  Now I can go on to see if it helps me
solve my problem.


-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] newbie has waited days for a DB build to complete. what's up with this.

2016-08-06 Thread Kevin O'Gorman
On Sat, Aug 6, 2016 at 2:09 AM, Dan Kennedy  wrote:

> On 08/06/2016 09:52 AM, Kevin O'Gorman wrote:
>
>> On Fri, Aug 5, 2016 at 2:03 PM, Dan Kennedy 
>> wrote:
>>
>> On 08/06/2016 03:28 AM, Kevin O'Gorman wrote:
>>>
>>> On Fri, Aug 5, 2016 at 1:08 PM, David Raymond 
>>>> wrote:
>>>>
>>>> ..
>>>>
>>>> Apart from the default location of the files, it reads like your next
>>>> main
>>>>
>>>>> concern is how many temp files get opened up. My bet is that it'll be a
>>>>> very small number, just potentially huge in file size while it's doing
>>>>> its
>>>>> thing. But again, try that pragma and take a look.
>>>>>
>>>>> My best bet is the contrary:  it starts with small files and makes
>>>>>
>>>> increasingly larger ones, like the sort utility does.  The problem is
>>>> that
>>>> there are too many of them at the beginning for it to work with
>>>> anonymous
>>>> files (which sort does not use).  This at least offers a possible
>>>> explanation of its getting wedged on large indexes: an unexpected and
>>>> untested error, handled poorly.
>>>>
>>>> You could verify this by checking the number of open handles in
>>> "/proc//fd" after your process is wedged.
>>>
>>> Excellent idea.  I did not know about that possibility.  And sure enough,
>>>
>> I'm wrong.  It's using anonymous files, all right, but only one or two at
>> a
>> time.  I assume they're big.  I'm in the process of bracketing where size
>> begins to matter.  So far, 1/10 of the data loads and indexes just fine,
>> albeit somewhat more slowly than the smaller samples predicted.  The
>> database load took 6.5 minutes, the troublesome index 10 minutes.  At
>> smaller sizes, indexing is faster than the database load.
>>
>> I'm trying 1/3 now (500 million lines)
>>
>
> What does [top] tell you once the process becomes wedged? What percentage
> is the CPU running at? Or is it completely bogged down waiting for IO?
>
> Dan


I'm waiting for a good time to get the answer to this.  It takes a good
long while to get wedged, so I'll probably do it overnight tonight.


-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] newbie has waited days for a DB build to complete. what's up with this.

2016-08-05 Thread Kevin O'Gorman
On Fri, Aug 5, 2016 at 2:03 PM, Dan Kennedy  wrote:

> On 08/06/2016 03:28 AM, Kevin O'Gorman wrote:
>
>> On Fri, Aug 5, 2016 at 1:08 PM, David Raymond 
>> wrote:
>>
>> ..
>>
>> Apart from the default location of the files, it reads like your next main
>>> concern is how many temp files get opened up. My bet is that it'll be a
>>> very small number, just potentially huge in file size while it's doing
>>> its
>>> thing. But again, try that pragma and take a look.
>>>
>>> My best bet is the contrary:  it starts with small files and makes
>> increasingly larger ones, like the sort utility does.  The problem is that
>> there are too many of them at the beginning for it to work with anonymous
>> files (which sort does not use).  This at least offers a possible
>> explanation of its getting wedged on large indexes: an unexpected and
>> untested error, handled poorly.
>>
>
> You could verify this by checking the number of open handles in
> "/proc//fd" after your process is wedged.
>

Excellent idea.  I did not know about that possibility.  And sure enough,
I'm wrong.  It's using anonymous files, all right, but only one or two at a
time.  I assume they're big.  I'm in the process of bracketing where size
begins to matter.  So far, 1/10 of the data loads and indexes just fine,
albeit somewhat more slowly than the smaller samples predicted.  The
database load took 6.5 minutes, the troublesome index 10 minutes.  At
smaller sizes, indexing is faster than the database load.
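
(The check itself is a one-liner from Python; the pid below stands in for
the wedged process's actual id:)

import os

pid = 12345                                   # illustrative pid
fds = os.listdir("/proc/%d/fd" % pid)
print(len(fds), "descriptors open")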

I'm trying 1/3 now (500 million lines)

-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] newbie has waited days for a DB build to complete. what's up with this.

2016-08-05 Thread Kevin O'Gorman
On Fri, Aug 5, 2016 at 3:03 PM, Darren Duncan 
wrote:

> On 2016-08-04 7:27 AM, Jim Callahan wrote:
>
>> Steps
>> Agree with Darren Duncan and Dr. Hipp you may want to have at least 3
>> separate steps
>> (each step should be a separate transaction):
>>
>> 1. Simple load
>> 2. Create additional column
>> 3. Create index
>>
>> Have you pre-defined the table you are loading data into? (step 0 CREATE
>> TABLE)
>>
>> If "Step 1 Simple Load" does not complete; then may want to load a fixed
>> number of rows into separate tables (per Darren Duncan)  and then combine
>> using an APPEND
>> or a UNION query (doing so before steps 2 and 3).
>>
>
> To be clear, my proposal of only loading a subset of rows was just
> intended for debugging the performance issues the OP was having and try and
> figure out what is causing the problem without having to wait too long for
> larger sets to complete.  I wasn't proposing splitting the load into
> separate databases and unioning later, rather the attempt database would be
> thrown away after each trial. -- Darren Duncan
>

Thanks, I understood that and it was a good set of suggestions.  Using
separate transactions led to focusing on the CREATE INDEX part of the
process.  That's where it gets wedged.  Working up to the full database, by
first processing smaller extracts, convinced me that size matters.  Smaller
(even millions of rows) extracts do not wedge.

-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] newbie has waited days for a DB build to complete. what's up with this.

2016-08-05 Thread Kevin O'Gorman
On Fri, Aug 5, 2016 at 1:08 PM, David Raymond 
wrote:

> There's a deprecated pragma, PRAGMA temp_store_directory =
> 'directory-name'; that appears to still work ok for now at least.
> http://www.sqlite.org/pragma.html#pragma_temp_store_directory
>
> I've used it for the same reasons you've both mentioned about space,
> though for me it was vacuuming a huge db, and the full-size db copy that
> makes, which ate up my entire C drive. But with that pragma you can specify
> exactly which folder you want to use for your temp files. I'd suggest
> trying that, then monitoring the folder you give it to see what shows up.
>

Since it's deprecated, I'll stick with TMPDIR because it's pretty standard
on Linux (it also works with the sort utility for instance).


> Apart from the default location of the files, it reads like your next main
> concern is how many temp files get opened up. My bet is that it'll be a
> very small number, just potentially huge in file size while it's doing its
> thing. But again, try that pragma and take a look.
>

My best bet is the contrary:  it starts with small files and makes
increasingly larger ones, like the sort utility does.  The problem is that
there are too many of them at the beginning for it to work with anonymous
files (which sort does not use).  This at least offers a possible
explanation of its getting wedged on large indexes: an unexpected and
untested error, handled poorly.


>
> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Kevin O'Gorman
> Sent: Friday, August 05, 2016 3:41 PM
> To: SQLite mailing list
> Subject: Re: [sqlite] newbie has waited days for a DB build to complete.
> what's up with this.
>
> On Fri, Aug 5, 2016 at 12:30 PM, Igor Korot  wrote:
>
> > Hi, Kevin,
> >
> > On Fri, Aug 5, 2016 at 3:18 PM, Kevin O'Gorman 
> > wrote:
> > > Okay, I followed some of the advice y'all gave and got some results.
> > >
> > > 1. The original problem was compromised by malformed input.  However,
> it
> > > appears that did not cause the wedging of the process.  See (3) below.
> >
> > Where will the data come from?
> > From the user? Internet?
> >
> > What I'm getting at is - you need to look for some malformed data in
> > the future as well.
> >
>
> I generate it.  I goofed, and I'll try not to goof in the future.
>
>
> >
> > >
> > > 2. I separated the steps, and started small.  Time increased slightly
> > > sub-linearly with dataset size, so I jumped to doing the whole thing.
> > With
> > > proper input, the data was loaded in 68 minutes.
> > >
> > > 3. The CREATE INDEX steps failed quickly (2 minutes), reporting
> "database
> > > or disk is full" which seemed odd since most of my partitions have much
> > > more free space than the entire database.  It turns out that whatever
> > does
> > > the creation was using space on my root partition (this is Linux, so
> that
> > > means "/").  That's the only partition in my setup without a huge
> amount
> > of
> > > free space.  One would expect temporary stuff to go to /tmp (which has
> 3TB
> > > free), but it doesn't go there.  I would go there if the system's
> native
> > > "sort" program were used.  Fortunately, it turns out that the TMPDIR
> > > environment variable is honored, but while I could see space was being
> > > used, there were no files visible.  I take that to mean that the
> > tmpfile()
> > > function (or equivalent) was used.  This could be a bad idea for large
> > > indexes because anonymous files have to be kept open, and there's a
> limit
> > > on the number of files that can be open at a time, around 1,000.  Sure
> > > enough, the index creation appears to be wedged like the original run,
> > and
> > > after a few hours I killed it manually.  This is a deal-killer.
> >
> > The failure you saw - is it on the table with the complete data set?
> > Or did you get it while experimenting?
> >
> > Only on the complete data set.
>
>
> > >
> > > So the questions are: Where do bug reports go?  I seem to be running
> > 3.8.2;
> > > is this fixed in any later version?
> >
> > You can try the "3.14" pre-released one right now. ;-)
> >
>
> Meh.  I submitted a bug report to this list.  I'll see what happens.
>
>
> > Thank you.
> >
> > >
> > >
> > > On Thu, Au

Re: [sqlite] newbie has waited days for a DB build to complete. what's up with this.

2016-08-05 Thread Kevin O'Gorman
On Fri, Aug 5, 2016 at 12:30 PM, Igor Korot  wrote:

> Hi, Kevin,
>
> On Fri, Aug 5, 2016 at 3:18 PM, Kevin O'Gorman 
> wrote:
> > Okay, I followed some of the advice y'all gave and got some results.
> >
> > 1. The original problem was compromised by malformed input.  However, it
> > appears that did not cause the wedging of the process.  See (3) below.
>
> Where will the data come from?
> From the user? The Internet?
>
> What I'm getting at is: you need to look out for malformed data in
> the future as well.
>

I generate it.  I goofed, and I'll try not to goof in the future.


>
> >
> > 2. I separated the steps, and started small.  Time increased slightly
> > sub-linearly with dataset size, so I jumped to doing the whole thing.
> With
> > proper input, the data was loaded in 68 minutes.
> >
> > 3. The CREATE INDEX steps failed quickly (2 minutes), reporting "database
> > or disk is full" which seemed odd since most of my partitions have much
> > more free space than the entire database.  It turns out that whatever
> does
> > the creation was using space on my root partition (this is Linux, so that
> > means "/").  That's the only partition in my setup without a huge amount
> of
> > free space.  One would expect temporary stuff to go to /tmp (which has 3TB
> > free), but it doesn't go there.  I would go there if the system's native
> > "sort" program were used.  Fortunately, it turns out that the TMPDIR
> > environment variable is honored, but while I could see space was being
> > used, there were no files visible.  I take that to mean that the
> tmpfile()
> > function (or equivalent) was used.  This could be a bad idea for large
> > indexes because anonymous files have to be kept open, and there's a limit
> > on the number of files that can be open at a time, around 1,000.  Sure
> > enough, the index creation appears to be wedged like the original run,
> and
> > after a few hours I killed it manually.  This is a deal-killer.
>
> The failure you saw - is it on the table with the complete data set?
> Or did you get it while experimenting?
>
> Only on the complete data set.


> >
> > So the questions are: Where do bug reports go?  I seem to be running
> 3.8.2;
> > is this fixed in any later version?
>
> You can try the "3.14" pre-released one right now. ;-)
>

Meh.  I submitted a bug report to this list.  I'll see what happens.


> Thank you.
>
> >
> >
> > On Thu, Aug 4, 2016 at 9:27 AM, Kevin O'Gorman 
> > wrote:
> >
> >> The metric for feasibility is coding ease, not runtime.  I'm the
> >> bottleneck, not the machine, at least at this point.
> >>
> >> As for adding rows, it will be about like this time: a billion or so at
> a
> >> time.  But there's no need to save the old data.  Each round can be
> >> separate except for a persistent "solutions" table of much more modest
> >> size.  I've been doing this for a while now, and the solutions file has
> >> only 10 million or so lines, each representing a game position for which
> >> optimum moves are known.  Getting this file to include the starting
> >> position is the point of the exercise.
> >>
> >> If I ever get to anything like "production" in this project, I expect it
> >> to run for maybe three years...  That's after I tweak it for speed.
> >>
> >> Background: in production, this will be running on a dual-Xeon with 16
> >> cores (32 hyperthreads) and 1/4 TiB RAM.  It has sequential file update
> >> through Linux flock() calls at the moment.  The code is bash gluing
> >> together a collection of UNIX utilities and some custom C code.  The C
> is
> >> kept as simple as possible, to minimize errors.
> >>
> >> As you may surmise, this "hobby" is important to me.
> >>
> >>
> >> On Thu, Aug 4, 2016 at 9:09 AM, R Smith  wrote:
> >>
> >>>
> >>>
> >>> On 2016/08/04 5:56 PM, Kevin O'Gorman wrote:
> >>>
> >>>> On Thu, Aug 4, 2016 at 8:29 AM, Dominique Devienne <
> ddevie...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>
> >>>> It's even less dense than that.  Each character has only 3 possible
> >>>> values,
> >>>> and thus it's pretty easy to compress down to 2 bits each, for a 16
> byte
> >

[sqlite] Bug in CREATE INDEX

2016-08-05 Thread Kevin O'Gorman
CREATE INDEX has two problems:
1) poor default location of temporary storage.
2) gets wedged on very large indexes.

I'm using the sqlite that came with Xubuntu 14.04, I think it's version
3.8.2.

I created a table, and used .import to populate it with records, about 1.4
billion of them.  The resulting database is 232 GB.  All seemed well.

I then went to CREATE INDEX on two fields, one CHAR(64) and the other an
INT.  This initially died quickly, but I determined that it was because it
was attempting to use my smallest partition for workspace.  It was in my
root partition, but not in /tmp which is separate.  I submit that /tmp
would be the natural choice, as it is what the sort command uses by
default.  That's problem 1.

Fortunately, it honors the $TMPDIR environment variable, but gets wedged
after a bit.  I have an idea why.  While it was still actively adding
space, there were no filenames showing, so I infer that it's using
"anonymous" files, perhaps by calling tmpfile().  This could be bad news,
as anonymous files have to be held open, but there are limits on how many
open files any process is allowed.  Unless your merge process starts with
at least 1.4 million keys in each buffer, this is going to fail on my
data.  I suggest using tempnam() and closing the files when you're not
actively using them.
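
For reference, the per-process cap is easy to check from the shell; 1024 is
a typical Linux default, though local settings vary:

$ ulimit -n
1024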

-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] newbie has waited days for a DB build to complete. what's up with this.

2016-08-05 Thread Kevin O'Gorman
Okay, I followed some of the advice y'all gave and got some results.

1. The original problem was compromised by malformed input.  However, it
appears that did not cause the wedging of the process.  See (3) below.

2. I separated the steps, and started small.  Time increased slightly
sub-linearly with dataset size, so I jumped to doing the whole thing.  With
proper input, the data was loaded in 68 minutes.

3. The CREATE INDEX steps failed quickly (2 minutes), reporting "database
or disk is full" which seemed odd since most of my partitions have much
more free space than the entire database.  It turns out that whatever does
the creation was using space on my root partition (this is Linux, so that
means "/").  That's the only partition in my setup without a huge amount of
free space.  One would expect temporary stuff to go to /tmp (which has 3TB
free), but it doesn't go there.  I would go there if the system's native
"sort" program were used.  Fortunately, it turns out that the TMPDIR
environment variable is honored, but while I could see space was being
used, there were no files visible.  I take that to mean that the tmpfile()
function (or equivalent) was used.  This could be a bad idea for large
indexes because anonymous files have to be kept open, and there's a limit
on the number of files that can be open at a time, around 1,000.  Sure
enough, the index creation appears to be wedged like the original run, and
after a few hours I killed it manually.  This is a deal-killer.

So the questions are: Where do bug reports go?  I seem to be running 3.8.2;
is this fixed in any later version?


On Thu, Aug 4, 2016 at 9:27 AM, Kevin O'Gorman 
wrote:

> The metric for feasibility is coding ease, not runtime.  I'm the
> bottleneck, not the machine, at least at this point.
>
> As for adding rows, it will be about like this time: a billion or so at a
> time.  But there's no need to save the old data.  Each round can be
> separate except for a persistent "solutions" table of much more modest
> size.  I've been doing this for a while now, and the solutions file has
> only 10 million or so lines, each representing a game position for which
> optimum moves are known.  Getting this file to include the starting
> position is the point of the exercise.
>
> If I ever get to anything like "production" in this project, I expect it
> to run for maybe three years...  That's after I tweak it for speed.
>
> Background: in production, this will be running on a dual-Xeon with 16
> cores (32 hyperthreads) and 1/4 TiB RAM.  It has sequential file update
> through Linux flock() calls at the moment.  The code is bash gluing
> together a collection of UNIX utilities and some custom C code.  The C is
> kept as simple as possible, to minimize errors.
>
> As you may surmise, this "hobby" is important to me.
>
>
> On Thu, Aug 4, 2016 at 9:09 AM, R Smith  wrote:
>
>>
>>
>> On 2016/08/04 5:56 PM, Kevin O'Gorman wrote:
>>
>>> On Thu, Aug 4, 2016 at 8:29 AM, Dominique Devienne 
>>> wrote:
>>>
>>>
>>> It's even less dense than that.  Each character has only 3 possible
>>> values,
>>> and thus it's pretty easy to compress down to 2 bits each, for a 16 byte
>>> blob.
>>> It's just hard to do that without a bunch of SQLite code I'd have to
>>> learn
>>> how to write.  The current effort amounts to a feasibility study, and I
>>> want
>>> to keep it as simple as possible.
>>>
>>
>> A feasibility study using equipment that is hamstrung by weights it
>> won't have in the real situation is not an accurate study.
>>
>> It's like studying fuel consumption on a different kind of road surface,
>> but for the test purposes, the cars had to tow caravans containing their
>> testing equipment - the study will not look feasible at all.
>>
>> It might of course be that the feasibility you are studying is completely
>> unrelated to the data handling - in which case the point is moot.
>>
>> Let us know how it goes :)
>> Ryan
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
>
> --
> #define QUESTION ((bb) || (!bb)) /* Shakespeare */
>



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] newbie has waited days for a DB build to complete. what's up with this.

2016-08-04 Thread Kevin O'Gorman
The metric for feasibility is coding ease, not runtime.  I'm the
bottleneck, not the machine, at least at this point.

As for adding rows, it will be about like this time: a billion or so at a
time.  But there's no need to save the old data.  Each round can be
separate except for a persistent "solutions" table of much more modest
size.  I've been doing this for a while now, and the solutions file has
only 10 million or so lines, each representing a game position for which
optimum moves are known.  Getting this file to include the starting
position is the point of the exercise.

If I ever get to anything like "production" in this project, I expect it to
run for maybe three years...  That's after I tweak it for speed.

Background: in production, this will be running on a dual-Xeon with 16
cores (32 hyperthreads) and 1/4 TiB RAM.  It has sequential file update
through Linux flock() calls at the moment.  The code is bash gluing
together a collection of UNIX utilities and some custom C code.  The C is
kept as simple as possible, to minimize errors.

As you may surmise, this "hobby" is important to me.


On Thu, Aug 4, 2016 at 9:09 AM, R Smith  wrote:

>
>
> On 2016/08/04 5:56 PM, Kevin O'Gorman wrote:
>
>> On Thu, Aug 4, 2016 at 8:29 AM, Dominique Devienne 
>> wrote:
>>
>>
>> It's even less dense than that.  Each character has only 3 possible
>> values,
>> and thus it's pretty easy to compress down to 2 bits each, for a 16 byte
>> blob.
>> It's just hard to do that without a bunch of SQLite code I'd have to learn
>> how to write.  The current effort amounts to a feasibility study, and I
>> want
>> to keep it as simple as possible.
>>
>
> A feasibility study using equipment that is hamstrung by weights it
> won't have in the real situation is not an accurate study.
>
> It's like studying fuel consumption on a different kind of road surface,
> but for the test purposes, the cars had to tow caravans containing their
> testing equipment - the study will not look feasible at all.
>
> It might of course be that the feasibility you are studying is completely
> unrelated to the data handling - in which case the point is moot.
>
> Let us know how it goes :)
> Ryan
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] newbie has waited days for a DB build to complete. what's up with this.

2016-08-04 Thread Kevin O'Gorman
On Thu, Aug 4, 2016 at 8:29 AM, Dominique Devienne 
wrote:

> On Thu, Aug 4, 2016 at 5:05 PM, Kevin O'Gorman 
> wrote:
>
> > 3. Positions are 64 bytes always, so your size guesses are right.  They
> are
> > in no particular order.  I like the suggestion of a separate position
> > table, because they're going to appear in multiple qmove records, with an
> > average of about 3 or 4 appearances I think.  Maybe more.
> >
>
> 3x or 4x duplication doesn't sound like a lot. What you'd gain in the moves
> table,
> you'd lose having to join to the positions table I suspect. Higher level
> duplicates, maybe.
>
> 64-bytes always? Is that a human readable string, or some 'code' in
> hexadecimal?
> If the latter, use a blob, which requires only 32-bytes to store the same
> info. You can
> use the blob literal notation x'abcdef01' (that's a 4-bytes blob).
>

It's even less dense than that.  Each character has only 3 possible values,
and thus it's pretty easy to compress down to 2 bits each, for a 16 byte
blob.
It's just hard to do that without a bunch of SQLite code I'd have to learn
how to write.  The current effort amounts to a feasibility study, and I want
to keep it as simple as possible.
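
For what it's worth, the packing step itself needs no SQLite code at all; a
minimal C sketch, using '.', 'x' and 'o' as stand-ins for the three values,
would be:

#include <stdint.h>
#include <string.h>

/* Pack a 64-character position (3-symbol alphabet, '.', 'x', 'o' assumed)
 * into a 16-byte blob at 2 bits per cell.
 * Returns 0 on success, -1 on a malformed input character. */
static int pack_position(const char pos[64], uint8_t out[16])
{
    memset(out, 0, 16);
    for (int i = 0; i < 64; i++) {
        uint8_t v;
        switch (pos[i]) {
            case '.': v = 0; break;
            case 'x': v = 1; break;
            case 'o': v = 2; break;
            default:  return -1;           /* catch malformed input early */
        }
        out[i / 4] |= (uint8_t)(v << (2 * (i % 4)));   /* 4 cells per byte */
    }
    return 0;
}

Unpacking is the same loop with the shifts and masks reversed.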


> Finally, note that if your program writes a huge text file with all your
> values, that you
> .import into sqlite3 as you showed, you're IMHO wasting time, since you
> can't use
> prepared statements and binds, and you also force SQLite's SQL parser to
> parse
> a huge amount of text. By embedding SQLite into your generator program, you
> remove all parsing except for a trivial "insert into qmoves values (:1, :2,
> ...)", and
> all the rest is sqlite_bind*() and co. calls. (and if blob situation for
> positions, then
> you can bind the 32-bytes blob directly, no need to convert/parse to/from
> hex).
>

I understand the concept of prepared statements in principle, sort of, but
I don't really know how binds work, so I'm not quite ready to write the
code you allude to.  And I'd no longer be able to use sqlite3 at all to do
simple experiments -- I'd have to be blobbing and de-blobbing to make sense
of anything.
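
For the archives, the loop Dominique describes is only a handful of calls;
a hedged C sketch, with an assumed two-column qmoves layout and abbreviated
error handling:

#include <stdint.h>
#include <sqlite3.h>

/* Insert n (position, move) rows through one prepared statement. */
static int insert_moves(sqlite3 *db, const uint8_t (*pos)[16],
                        const int *moves, int n)
{
    sqlite3_stmt *stmt;
    if (sqlite3_prepare_v2(db, "INSERT INTO qmoves VALUES (?1, ?2)",
                           -1, &stmt, 0) != SQLITE_OK)
        return -1;
    sqlite3_exec(db, "BEGIN", 0, 0, 0);        /* one big transaction */
    for (int i = 0; i < n; i++) {
        sqlite3_bind_blob(stmt, 1, pos[i], 16, SQLITE_STATIC);
        sqlite3_bind_int(stmt, 2, moves[i]);
        sqlite3_step(stmt);                    /* executes the INSERT */
        sqlite3_reset(stmt);                   /* re-arm for the next row */
    }
    sqlite3_exec(db, "COMMIT", 0, 0, 0);
    sqlite3_finalize(stmt);
    return 0;
}

The SQL text is parsed exactly once; every row after that is just bind and
step calls.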


>
> My $0.02. --DD
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] newbie has waited days for a DB build to complete. what's up with this.

2016-08-04 Thread Kevin O'Gorman
Lots of answers, so thanks all around.  Some more info:

1. All partitions have at least 3 GB free, and it's not changing.  /tmp is
3 TiB and empty.
2. I have a RAID partition, for size, but no RAID controller.  As a hobby
project, I don't have spare parts, and I fear the results of a failure of a
hardware RAID without a spare, so I use Linux mdadm to manage software RAID
across three 4-TB drives.
3. Positions are 64 bytes always, so your size guesses are right.  They are
in no particular order.  I like the suggestion of a separate position
table, because they're going to appear in multiple qmove records, with an
average of about 3 or 4 appearances I think.  Maybe more.

I'm going to retry, using some of the suggestions above: smaller
transactions, IGNORE, position table rowids in the moves table, smaller
subsets being loaded, developing ideas of how time depends on data size.

If it doesn't go well fairly quickly, I'll probably go back to flat files
and writing the queries the hard way.  At least I know what I'm dealing
with there.

QUESTIONS:
If it's best in general to build indexes as a separate step, does this also
apply to primary indexes?  Can a table without a primary index have one
added later?  Isn't ROWID the real primary, presuming it has ROWIDs?  And
if so, then is a primary index on a ROWID table just for compliance with
standard SQL, and really no better than any other index?  Obviously, I'm a
bit confused about this.

While I'm at it, I may as well ask if ROWID has any physical significance,
such that a VACUUM operation might change it.  Or is it just an arbitrary
ID inserted by SQLite and added to each record when they exist at all?

The current dataset is intended to solve one particular issue in the
overall project.  It looks like I'd want to build each such dataset
separately, as there will likely be a few hundred, and I gather that adding
to these tables will be pretty slow once the indexes have been built.  Or
is it sensible to drop indexes, add data and rebuild?
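
For concreteness, the drop/add/rebuild pattern I'm asking about would look
something like this (table, file and index names made up):

sqlite> DROP INDEX IF EXISTS qmoves_pos;
sqlite> BEGIN;
sqlite> .import more_moves.txt qmoves
sqlite> COMMIT;
sqlite> CREATE INDEX qmoves_pos ON qmoves(position);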

On Thu, Aug 4, 2016 at 7:27 AM, Jim Callahan  wrote:

> Temp Files
> Have you checked how much storage is available to the temporary file
> locations?
> The temporary file locations are different depending on the OS, build, VFS
> and PRAGMA settings.
> See the last section "5.0 Temporary File Storage Locations" of:
> https://www.sqlite.org/tempfiles.html
>
>
> The database was growing for about 1-1/2 days.  Then its journal
> > disappeared, the file size dropped to zero, but sqlite3 is still running
> > 100% CPU time, now for a total of 3800+ minutes (63+ hours).  The
> database
> > is still locked, but I have no idea what sqlite3 is doing, or if it will
> > ever stop.  All partitions still have lots of space left (most of this is
> > running in a RAID partition of 11 TiB).  Here's what I gave to sqlite3 on
> > my Linux system:
>
>
> You might have a huge storage allocation for the main file and log, but
> some other temp file might be being dumped
> to a more constrained storage location.
>
> RAM
> Since you are using a RAID disk controller, I assume you have a 64-bit CPU
> and more than 8 GB of RAM?
> If you have 8 GB or more of RAM would it help to use an in memory database?
>
> Transactions
> Are you using explicit or implicit transactions?
> https://www.sqlite.org/lang_transaction.html
>
>
> Steps
> Agree with Darren Duncan and Dr. Hipp: you may want to have at least 3
> separate steps
> (each step should be a separate transaction):
>
> 1. Simple load
> 2. Create additional column
> 3. Create index
>
> Have you pre-defined the table you are loading data into? (step 0 CREATE
> TABLE)
>
> If "Step 1 Simple Load" does not complete; then may want to load a fixed
> number of rows into separate tables (per Darren Duncan)  and then combine
> using an APPEND
> or a UNION query (doing so before steps 2 and 3).
>
> HTH
>
> Jim Callahan
> Data Scientist
> Orlando, FL
>
>
>
>
> On Wed, Aug 3, 2016 at 11:00 PM, Kevin O'Gorman 
> wrote:
>
> > I'm working on a hobby project, but the data has gotten a bit out of
> hand.
> > I thought I'd put it in a real database rather than flat ASCII files.
> >
> > I've got a problem set of about 1 billion game positions and 187GB to
> work
> > on (no, I won't have to solve them all) that took about 4 hours for a
> > generator program just to write.  I wrote code to turn them into
> something
> > SQLite could import.  Actually, it's import, build a non-primary index,
> and
> > alter table to add a column, all in sqlite3.
> >
> > The database was growing for about 1-1/2 days.  Then its journal
> > disappear

[sqlite] newbie has waited days for a DB build to complete. what's up with this.

2016-08-03 Thread Kevin O'Gorman
I'm working on a hobby project, but the data has gotten a bit out of hand.
I thought I'd put it in a real database rather than flat ASCII files.

I've got a problem set of about 1 billion game positions and 187GB to work
on (no, I won't have to solve them all) that took about 4 hours for a
generator program just to write.  I wrote code to turn them into something
SQLite could import.  Actually, it's import, build a non-primary index, and
alter table to add a column, all in sqlite3.

The database was growing for about 1-1/2 days.  Then its journal
disappeared, the file size dropped to zero, but sqlite3 is still running
100% CPU time, now for a total of 3800+ minutes (63+ hours).  The database
is still locked, but I have no idea what sqlite3 is doing, or if it will
ever stop.  All partitions still have lots of space left (most of this is
running in a RAID partition of 11 TiB).  Here's what I gave to sqlite3 on
my Linux system:

time sqlite3 qubic.db 

Re: [sqlite] Conversion failure

2016-06-23 Thread Kevin Benson
--
   --
  --
 --Ô¿Ô--
K e V i N

On Thu, Jun 23, 2016 at 10:38 AM, Igor Korot  wrote:

> Hi, Clemens,
>
> On Thu, Jun 23, 2016 at 10:33 AM, Clemens Ladisch 
> wrote:
> > Igor Korot wrote:
> >> I am trying to find out why the following code fails to do proper
> conversion.
> >> It works if the tableName has "abcd", but fails if it has "abcß" (the
> >> German letter for the "ss" (looks like Greek letter beta)).
> >>
> >> const unsigned char *tableName = sqlite3_column_text( stmt, 0 );
> >> pimpl->m_tables[m_catalog].push_back( Table( myconv.from_bytes( (const
> char *) tableName ), fields, foreign_keys ) );
> >
> > What are the byte values in tableName?
>
> I have no idea.
> I just did ALT+225 in the SQLite shell during the CREATE TABLE command.
>
> Thank you.
>
>
Have you tried changing to the Windows UTF-8 codepage by issuing
>chcp 65001
at the Windows command prompt before starting the SQLite shell and doing
ALT+225 during the CREATE TABLE command?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Note, using SQLAR compared to TGZ archive

2016-03-31 Thread Kevin
Thanks Richard,

I first saw a reference from the mailing list back in May 2014. I've
been reading the list since April 2010.

I'm not much of a C programmer, but I will let you know. 

I'm working on adding "comments" or tags to accompany the filenames. For
example, the file
"/mnt/KCYPictures/Pictures/Amys_tablet/Camera/20130727_204615.jpg" is a
picture of my daughter with her grandmother.

So, if I wanted to see files related to my daughter or her grandmother
or both together, a simple SQL query would help.
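
Since an SQLAR archive is an ordinary SQLite database with a single sqlar
table, the tags can live in the same file. A minimal sketch (the tag table
and its rows are invented for illustration; only sqlar itself is standard):

sqlite> CREATE TABLE tag(name TEXT, label TEXT);  -- name matches sqlar.name
sqlite> INSERT INTO tag VALUES
   ...>   ('Pictures/Amys_tablet/Camera/20130727_204615.jpg', 'daughter'),
   ...>   ('Pictures/Amys_tablet/Camera/20130727_204615.jpg', 'grandmother');
sqlite> SELECT DISTINCT s.name
   ...>   FROM sqlar AS s JOIN tag AS t ON t.name = s.name
   ...>  WHERE t.label IN ('daughter', 'grandmother');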

I was also interested in the concept of adding and removing files
incrementally, without having to read in an entire archive and then
rewrite the entire archive.

The application which precedes the archive function is where I take a
Linux LS listing and convert it into an SQLite database. I also add
"comments" and MD5SUMs of the files. 

I then compare the current LS SQLite database with a previous one. I
generate new MD5SUMs for new and modified files, and carry forward
comments and MD5SUMs for the unchanged files. The MD5SUMs tell me when a
file has a different name, different date or has moved folders.
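
A sketch of that comparison step (the files table and its columns are
illustrative, not my exact schema):

sqlite> ATTACH 'previous_ls.db' AS prev;
sqlite> SELECT cur.path, old.path        -- same content, new name or folder
   ...>   FROM files AS cur
   ...>   JOIN prev.files AS old ON old.md5 = cur.md5
   ...>  WHERE old.path <> cur.path;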

I will then add the new files and overwrite the modified files in the
SQLAR archive.



regs,

Kev




On 3/29/16, Kevin  wrote:
> Richard,
>
> Just a note of thanks for the SQLAR tool.

I'm glad you find it useful.  Really it is just a proof-of-concept.
If you are a C programmer and would like to try to enhance it, write
to me and I'll get you check-in privileges on the repository.

So where did you hear about SQLAR?

-- 
D. Richard Hipp
drh at sqlite.org



[sqlite] Note, using SQLAR compared to TGZ archive

2016-03-30 Thread Kevin
Richard,

Just a note of thanks for the SQLAR tool.

I am using it to store SQLite ".dump" output, which is text data.

The SQLAR file is 125.2M bytes.

The corresponding tgz file is 125.1M bytes.

The raw text content is in 20 files, totalling 746.9M bytes.

regards,

Kev Youren





[sqlite] The database disk image is malformed

2016-01-06 Thread Kevin
Please note that I ran into "The database disk image is
malformed" errors a month ago.

The cause was simply a fault in my original 4G RAM card.

I was trying to expand the memory by adding another 4G card. My first
card had only 3.6G working. When I was doing about 2.5M inserts, they
would appear to work, but an "analyze" immediately after would come out
with the error. 

I have an Ubuntu laptop. Note that memtest hasn't worked for a while in
Ubuntu (an EFI issue), so I had to download it and install it on a USB
stick to find the fault.

regs,

Kev



[sqlite] Order with another sequence

2015-10-09 Thread Kevin Benson
> On Fri, Oct 9, 2015 at 5:08 AM, Cecil Westerhof 
> wrote:
>
> > When I use ORDER BY an ? comes after a z. Is it possible to make an ?
> > come before a z?
> >
> > If it is important I am using SQLite 3.8.6 and Python 3.4.1.
>

Perhaps you might garner some insight from looking at others' code?
I did a search on GitHub of APSW (Another Python SQLite wrapper) by Roger
Binns:

https://github.com/rogerbinns/apsw/search?utf8=%E2%9C%93&q=collation

--
   --
  --
 --Ô¿Ô--
K e V i N


[sqlite] DbFunctions.TruncateTime

2015-08-24 Thread Kevin Benson
On Sun, Aug 23, 2015 at 10:53 PM, Joe Mistachkin 
wrote:

>
> The LINQ provider for System.Data.SQLite does not provide this function
> directly; however, it may be possible to use one of the core date-time
> related SQL functions to do it?
>
> https://www.sqlite.org/lang_datefunc.html
>


Maybe something involving the "unixepoch" modifier ...via the connection
string in App.config ?

http://stackoverflow.com/questions/30444222/sqlite-and-entityframework-formatexception-on-timestamp
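
For example, the core functions can strip the time-of-day directly; only
the log table and stamp column below are invented:

sqlite> SELECT date('2015-08-24 13:45:57');        -- 2015-08-24
sqlite> SELECT datetime(1440423957, 'unixepoch');  -- 2015-08-24 13:45:57
sqlite> SELECT * FROM log WHERE date(stamp) = '2015-08-24';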

--
   --
  --
 --Ô¿Ô--
K e V i N


[sqlite] : it seems to be only orber by and group

2015-08-22 Thread Kevin Benson
On Sat, Aug 22, 2015

> afriendandmore wrote:
> > The table shall be ordered according to A. (Order by A)
> > But if C is true and, for the value B1 in this record, there exists
> > another record with B2 and B2 = B1, then B1 should be ordered in the row
> > above B2.


If the B1 and B2, to which you refer, are just two equal values in
different rows of column B, then it sounds like you want:

SELECT * FROM "table"
 ORDER BY columnA, columnB,
 CASE WHEN columnC = 1
 THEN 0 ELSE 1 END;

If neither ASC nor DESC is specified, rows are sorted in ascending order
(smaller values first) by default.
--
   --
  --
 --Ô¿Ô--
K e V i N

