Re: [sqlite] Unexplained table bloat

2020-01-11 Thread Kevin Youren

Hi,

I checked the download mentioned in the original email. Not sure if the
table changed since the previous posts.


It seems LUTFullString has 3 BLOB rows, but LENGTH treats them as
strings.

I'm in Melbourne, Oz, so I added the UTC datetime.

regs, Kev

kevin@KCYDell:~$ cd /mnt/KCY/KCYDocs/
kevin@KCYDell:/mnt/KCY/KCYDocs$ sqlite3 /mnt/KCY/KCYDocs/200k-per-
row.sqlite
SQLite version 3.30.1 2019-10-10 20:19:45
Enter ".help" for usage hints.
sqlite> .schema
CREATE TABLE copied(
  id_local INT,
  LUTFullString,
  LUTHash
);
sqlite> 
sqlite> select '1',rowid,LENGTH(HEX(LUTFullString))/2 from copied
   ...> UNION ALL
   ...> select '2',rowid,INSTR(HEX(LUTFullString),'00') from copied
   ...> UNION ALL
   ...> select '3',rowid,substr(HEX(LUTFullString),0,10) from copied
   ...> UNION ALL
   ...> select '4',rowid,INSTR(SUBSTR(HEX(LUTFullString),4),'00') from
copied
   ...> UNION ALL
   ...> select '5',rowid,LENGTH(LUTFullString) from copied;
1|1|194238
1|2|183050
1|3|193908
2|1|3
2|2|3
2|3|3
3|1|2C0003007
3|2|2C0003007
3|3|2C0003007
4|1|1
4|2|1
4|3|1
5|1|1
5|2|1
5|3|1
sqlite> .quit
kevin@KCYDell:/mnt/KCY/KCYDocs$ date -u
Sat 11 Jan 23:39:43 UTC 2020
kevin@KCYDell:/mnt/KCY/KCYDocs$ 









Message: 6
Date: Fri, 10 Jan 2020 08:48:21 -0500
From: Ryan Mack 
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] Unexplained table bloat
Message-ID:
<
CABhGdGRbR1kT+3_BU6ob9L7tpSPZ09HJn=ofPyK6OXvgQK=_...@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"

Hi list,

I'm trying to understand unexplained table bloat I found in what should
be
a very small table in an old database file. If you dump/restore the
database, the problem goes away. If you duplicate the table, the
problem
propagates. Schema:

CREATE TABLE copied(
  id_local INT,
  LUTFullString,
  LUTHash
);

Test Data:
85960605|,|0DE19F8AA100D2962FF22C60B3FA8940
85960465|,|A44615408E8B3E48A684C60CA2967031
85960451|,|33C8804354984071A740788AD308B279

In my test database file, each of these 3 rows is allocating about 40
4k
overflow pages for a total database size of about 500k. The full
database
has about 4MB of actual data which takes up over 500MB on disk. If you
want
to see/reproduce the problem you'll need my test database file which
I've
uploaded here: https://mackman.net/200k-per-row.sqlite.zip (500kb
download). I don't know why it doesn't compress better, those extra
overflow pages must be getting filled with random garbage.

My uninformed guess is there was a bug in the version of sqlite used at

database creation time that computed an incorrect overflow threshold
and is
storing each byte of the row to its own page. Since the problem goes
away
with a dump/restore, I'm considering releasing a script to do that and
mitigate the problem for affected users. Before doing that I would like
to
understand the problem better.

Thanks for reading, Ryan

PS: Here's some output from my debug session showing the 123 bytes of
data
is occupying 582k of space on disk in a freshly created table.

% sqlite3 200k-per-row.sqlite
SQLite version 3.28.0 2019-04-15 14:49:49
Enter ".help" for usage hints.
sqlite> .tables
copied

sqlite> .schema copied
CREATE TABLE copied(
  id_local INT,
  LUTFullString,
  LUTHash
);

sqlite> select * from copied;
85960605|,|0DE19F8AA100D2962FF22C60B3FA8940
85960465|,|A44615408E8B3E48A684C60CA2967031
85960451|,|33C8804354984071A740788AD308B279

sqlite> select sum(length(id_local) + length(lutfullstring) +
length(luthash)) from copied;
123

sqlite> create table copied2 as select * from copied;
sqlite> SELECT name, SUM("pgsize") as sz FROM dbstat group by name;
copied|581632
copied2|581632
sqlite_master|4096



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Regarding CoC

2018-10-24 Thread Kevin Youren
Richard,

thank you for your further explanation of your team's Code of Conduct.

After a bit of research on the Internet, everything makes sense.

Well done.

regs,

Kev



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Weird chars inserted

2016-12-20 Thread Kevin Youren

Ariel, Keith, Rowan, 

apologies, I re-ran the queries a few times, and I decided to include
"rowid" to keep track of the changes.



The experiments were conducted by cut-and-paste of the í character from
the email, hence UTF8, and using x'...' for inserts and concats.

Note, I use sqlite3 shell by preference, but I use both the Firefox
addon and "DB Browser for Sqlite" for GUI convenience - however, for
inserts and updates I use the sqlite3 shell or the C programming
interface. 

In C, I use int rather than char -

FILE *pinfile = NULL;
...
pinfile = fopen(argv[1],"rb");



int ch = fgetc (pinfile); 
/* changed from char to int to allow >127 & UTF */


Also, I use .mode csv and then a spreadsheet quite a lot.

Note, at the end, I added typeof( ) - and most were BLOBs and a couple
as TEXT.




kevin@kevin-Aspire-V5-571G:~$ sqlite3 dir_md5sum_db.sqlite
SQLite version 3.15.2 2016-11-28 19:13:37
Enter ".help" for usage hints.
sqlite> SELECT * FROM dir_md5sum
   ...> where rowid >= 194576;
kev|
kev2|
kev3|
kev4|
sqlite> insert into dir_md5sum values ( 'kev5', x'C3AD');
sqlite> SELECT * FROM dir_md5sum where rowid >= 194576;
kev|
kev2|
kev3|
kev4|
kev5|í
sqlite> .schema
CREATE TABLE dir_md5sum (dir_name text, dir_md5sum text);
sqlite> SELECT rowid, dir_name, hex(dir_name), dir_md5sum,
hex(dir_md5sum), unicode(dir_md5sum)  FROM dir_md5sum where rowid >=
194576;
194576|kev|6B6576|�|EE|65533
194577|kev2|6B657632|�|EE|65533
194578|kev3|6B657633|�|EE|65533
194579|kev4|6B657634|�|ED|65533
194580|kev5|6B657635|í|C3AD|237
sqlite> insert into dir_md5sum values ( 'kev6', 'a' || x'C3AD' || 'b'
);
sqlite> SELECT rowid, dir_name, hex(dir_name), dir_md5sum,
hex(dir_md5sum), unicode(dir_md5sum)  FROM dir_md5sum where rowid >=
194576;
194576|kev|6B6576|�|EE|65533
194577|kev2|6B657632|�|EE|65533
194578|kev3|6B657633|�|EE|65533
194579|kev4|6B657634|�|ED|65533
194580|kev5|6B657635|í|C3AD|237
194581|kev6|6B657636|aíb|61C3AD62|97
sqlite> insert into dir_md5sum values ( 'kev7', 'c' || x'00ED' || 'd'
);
sqlite> SELECT rowid, dir_name, hex(dir_name), dir_md5sum,
hex(dir_md5sum), unicode(dir_md5sum)  FROM dir_md5sum where rowid >=
194576;
194576|kev|6B6576|�|EE|65533
194577|kev2|6B657632|�|EE|65533
194578|kev3|6B657633|�|EE|65533
194579|kev4|6B657634|�|ED|65533
194580|kev5|6B657635|í|C3AD|237
194581|kev6|6B657636|aíb|61C3AD62|97
194582|kev7|6B657637|c|6300ED64|99
sqlite> insert into dir_md5sum values ( 'kev8',  x'00ED'  );
sqlite> SELECT rowid, dir_name, hex(dir_name), dir_md5sum,
hex(dir_md5sum), unicode(dir_md5sum)  FROM dir_md5sum where rowid >=
194576;
194576|kev|6B6576|�|EE|65533
194577|kev2|6B657632|�|EE|65533
194578|kev3|6B657633|�|EE|65533
194579|kev4|6B657634|�|ED|65533
194580|kev5|6B657635|í|C3AD|237
194581|kev6|6B657636|aíb|61C3AD62|97
194582|kev7|6B657637|c|6300ED64|99
194583|kev8|6B657638||00ED|
sqlite> .mode csv
sqlite> .once /home/kevin/Martin.csv
sqlite> SELECT rowid, dir_name, hex(dir_name), dir_md5sum,
hex(dir_md5sum), unicode(dir_md5sum)  FROM dir_md5sum where rowid >=
194576;
sqlite> SELECT rowid, dir_name, hex(dir_name), dir_md5sum,
hex(dir_md5sum), unicode(dir_md5sum), typeof(dir_md5sum)  FROM
dir_md5sum where rowid >= 194576;
194576,kev,6B6576,"�",EE,65533,blob
194577,kev2,6B657632,"�",EE,65533,blob
194578,kev3,6B657633,"�",EE,65533,blob
194579,kev4,6B657634,"�",ED,65533,blob
194580,kev5,6B657635,"í",C3AD,237,blob
194581,kev6,6B657636,"aíb",61C3AD62,97,text
194582,kev7,6B657637,c,6300ED64,99,text
194583,kev8,6B657638,"",00ED,,blob
sqlite> 

regs,

Kev


Date: Mon, 19 Dec 2016 11:12:59 +0800
From: Rowan Worth 
To: SQLite mailing list 
Subject: Re: [sqlite] Weird chars inserted
Message-ID:

Content-Type: text/plain; charset=UTF-8

On 19 December 2016 at 08:24, Kevin  wrote:

> Hi Martin,
>
> I had a go using a terminal session, with default encoding UTF-8.
>
> Try using the hex( ) and unicode( ) functions to check what is
actually
> stored in the sqlite table.
>
> I put a couple of rows at the end of an existing simple table
>
> kevin@kevin-Aspire-V5-571G:~$ sqlite3
/home/kevin/dir_md5sum_db.sqlite
> SQLite version 3.15.2 2016-11-28 19:13:37
> Enter ".help" for usage hints.
> sqlite> SELECT dir_name, hex(dir_name), dir_md5sum, hex(dir_md5sum),
> unicode(dir_md5sum)  FROM dir_md5sum
>...> where rowid >= 194576;
> 194576|kev|6B6576|í|C3AD|237
> 194577|kev2|6B657632|�|ED|65533
> sqlite> .quit
> kevin@kevin-Aspire-V5-571G:~$
>

Hi Kevin,

The problem here lies in whatever inserted these rows. sqlite just
stores
what it is given - it is up to the application to take care of encoding
issues.

In this case the "kev" row has been inserted using utf-8 encoding, so
when
you retrieve this value sqlite emits the bytes 0xC3 0xAD (exactly as
they
were stored), which your terminal interprets as utf-8 and renders the
character í.

The "kev2" row however is not utf-8 encoded. The dir_md5sum column
contains
a single byte 0xED, which is not valid utf-8 (the encoding specifies
that
when the highest bit is on, the

Re: [sqlite] Import 5000 xml files in a sqlite database file

2016-10-24 Thread Kevin Youren
Bob,

my name is Kevin Youren, and I did this task about 4 years ago in
Python 3, by parsing XML files and creating CSV files. The CSV files
were used to load Sqlite tables, MS Xcel spreadsheets and IBM mainframe
DB2 tables. The XML data was mildly complex, large, and error prone.

If you have a sample, say 2 or 3 of the normal files, I could make some
suggestions.

Please note that unless the data is ultra simple, XML is generally
better translated as several tables.

For example, my application stored Control-M scheduler information.

Two tables for the schedule group or table name.

The third table for the jobs in each schedule group/table.

The fourth table for the conditions for the jobs for the schedule
group/table.

Each table had columns for the tags or attributes.

regs,

Kev






___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Mozilla wiki 'avoid SQLite'

2015-06-16 Thread Kevin Youren
In my tool box, I have a tack hammer, a claw hammer and a nail gun.

I broke my fingernail with the tack hammer.

I still have the bandaid on my thumb from the claw hammer.

Now I just have to read the instruction manual for the nail gun



[sqlite] Request: Metadata about C API constants and functions

2015-04-11 Thread Kevin Youren
Thanks, Richard

I found (https://www.sqlite.org/docsrc/timeline) &
https://www.sqlite.org/docsrc/artifact/5c48dd261dbe5804  very useful.

Last year I was parsing 100M of XML stuff and loading it into an Sqlite
database. A bit of hammering required as the XML had minor errors like
missing end tags. "malformed" 


In a meeting, we were discussing a very specific piece of functionality.
"How many times is this function used?" One guy said "About 300 times"
by I was able to answer 2365 times, within 30 seconds, just some simple
SQL.

Incidentally, it was far easier to regenerate the XML from the Sqlite,
similar to a comma delimited format, without all the "malformed"
problems. 

The malformed XML was targeted to be stored in Git. 


regs,

Kev





Re: [sqlite] 3.7.10 problem : SQLite header and source version mismatch

2012-01-26 Thread Kevin Youren
Thank you, Kevin Benson, for your hint about ldconfig

Using ldconfig -p , I found their is an older version of libsqlite3
in /usr/lib/i386-linux-gnu

So I replaced it with my compiled 3.7.10 version , and now sqlite3 works
from the command line.

regs,

Kevin Youren

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] 3.7.10 problem : SQLite header and source version mismatch

2012-01-19 Thread Kevin Youren
G'day,

thanks for such a great program.

Slight problem with sqlite-autoconf-3071000

When I issue from the command line: sqlite3

I get

SQLite header and source version mismatch
2011-06-23 19:49:22 4374b7e83ea0a3fbc3691f9c0c936272862f32f2
2012-01-16 13:28:40 ebd01a8deffb5024a5d7494eef800d2366d97204

I tried "make clean;make", but no luck.

I used:
CPPFLAGS="-DSQLITE_ENABLE_FTS3 -DSQLITE_ENABLE_FTS3_PARENTHESIS"
./configure
make
sudo make install (an Ubuntu system)


regs,

Kevin


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Select via Wi-fi very slow

2010-05-12 Thread Kevin Youren
Emany,

this has possibly already been done, but have you got an index on the table
for the column called "description"?

I would suggest a timed test on both the Symbol device and the computer
before and after the index is added.

Also, the Symbol may only have 64Mb of RAM, which probably would slow it
down.

There is a method, unfortunately not available is Sqlite, called a Stored
Procedure. A Stored Procedure is especially useful in your case because you
move the database access from the Symbol to a more powerful computer, and
reduce the back and forth WiFi traffic. The SQL would be sent from the
Symbol to the computer hosting the database, the query would executed on the
host computer and the result sent back to the Symbol.

If possible, a workaround or emulation of a Stored Procedure could be done.
You could send the SQL to a program on the host, and that program would
interrogate the database and return the result. It would need a reasonable
level of skill to implement.

regards,

Kevin
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users