Re: [sqlite] Free Page Data usage

2014-02-11 Thread Hick Gunter
You may enjoy reading up on Btrees here:

http://en.wikipedia.org/wiki/B-tree

-Ursprüngliche Nachricht-
Von: Raheel Gupta [mailto:raheel...@gmail.com]
Gesendet: Montag, 10. Februar 2014 10:49
An: General Discussion of SQLite Database
Betreff: Re: [sqlite] Free Page Data usage

>> Note that choosing a page size smaller than the typical row size
>> means
that the bottom level of the BTree degrades to 1 row per node.
What do you mean by this ? How is a smaller page bad for the database ?


On Mon, Feb 10, 2014 at 2:43 PM, Hick Gunter <h...@scigames.at> wrote:

> With a record size of a little over 4K (note that the on-disk space
> requirement of a integer+4k Blob row is not always 4k+8) and a page
> size of 2K you are storing 1 row in 3 pages (close to 50% overhead).
> Deleting a record will give you 3 pages of free space, which will be
> reused quickly; some of it for the higher levels of the B-Tree. Note
> that choosing a page size smaller than the typical row size means that
> the bottom level of the BTree degrades to 1 row per node.
>
> Changing to 4k or 8k will increase overhead to near 100% (as you now
> need
> 2 Pages of 4k or one page of 8k for each row).
>
> 16k pages ( 3 rows/page) reduce this to 25.00% while deleting 5
> adjacent rows is guaranteed to free up 1 page.
> 32k pages ( 7 rows/page) reduce this to 12.50% but requires 13
> adjacent deletes for 1 guaranteed free page.
> 64k pages (15 rows/page) reduce this to  6.25% but requires 29
> adjacent deletes for 1 guaranteed free page.
>
> You can choose the source of fragmentation: loosing close to 1 row per
> page (better in bigger pages) or having ununsed space due to
> nonadjacent deletes (better in smaller pages)
>
> -Ursprüngliche Nachricht-
> Von: Raheel Gupta [mailto:raheel...@gmail.com]
> Gesendet: Montag, 10. Februar 2014 07:14
> An: General Discussion of SQLite Database
> Betreff: Re: [sqlite] Free Page Data usage
>
> Hi,
>
> I tried the same database I had and used a 2KB page size.
> It works much faster and also the pages are reused immediattly to the
> extent of 95%.
>
> If only the number of pages could be increased somehow. Does anyone
> think its practical to make the pageNo from a 32 bit int to a 64 bit
> Unsigned Integer.
>
> I do understand that VACUUM is not a good option for me.
>
>
>
> On Sun, Feb 9, 2014 at 4:48 PM, Simon Slavin <slav...@bigfraud.org> wrote:
>
> >
> > On 9 Feb 2014, at 10:45am, RSmith <rsm...@rsweb.co.za> wrote:
> >
> > > On 2014/02/09 12:06, Raheel Gupta wrote:
> > >> Sir, I have only one auto increment primary key.
> > >> Since the new rows will always have a higher number will the
> > >> pages which have some free space by deleting rows with lower
> > >> numbered keys
> > never
> > >> be reused ? e.g. If row with ROWID "1" was deleted and freed,
> > >> will it
> > not
> > >> be used to store the NEW row which will be assigned ROWID 10001 ?
> > >
> > > Yes. That is the point of AutoIncrement, every new Key will always
> > > be
> > higher than any previous key ever used, and always exactly one
> > higher than the highest ever previously used key. As such, it cannot
> > be re-used within pages that are half filled from deletion (except
> > maybe the last page), and I believe pages that go completely empty
> > may be re-used without the need to vacuum etc. (need someone to confirm 
> > this).
> >
> > You are correct, depending on this PRAGMA:
> >
> > <http://www.sqlite.org/pragma.html#pragma_auto_vacuum>
> >
> > auto_vacuum = NONE
> >
> > A page which has all its data deleted is added to the 'free pages'
> > list and eventually reused.
> >
> > auto_vacuum = FULL
> >
> > A page which has all its data deleted is replaced by the last page
> > of the file.  The file is then truncated to release the space of the
> > last page for use in other files.
> >
> > auto_vacuum = INCREMENTAL
> >
> > A page which has all its data deleted is replaced by the last used
> > page of the file.  When you issue "PRAGMA incremental_vacuum(N)" the
> > file is truncated to release unused pages at the end for use in
> > other
> files.
> >
> > As in previous discussion, all this is about reclaiming space at the
> > page
> > level: releasing entire pages of space.  It has nothing to do with
> > reclaiming space within a page.  And also as in previous discussion,
> > the fastest of these is "auto_vacuum = NONE".  Copying one page to
> > another, releasing fil

Re: [sqlite] Free Page Data usage

2014-02-10 Thread Eduardo Morrás


>El lun, 10/2/14, Simon Slavin <slav...@bigfraud.org> escribió:
>
> Asunto: Re: [sqlite] Free Page Data usage
> Para: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
> Fecha: lunes, 10 de febrero, 2014 12:34
> 
> 
> On 10 Feb 2014, at 11:29am, Raheel Gupta <raheel...@gmail.com>
> wrote:
> 
> >>> 64-bit page numbers would not be backwards
> compatible.
> > 
> > It might be possible to implement it in backwards
> compatible mode. I guess
> > SQlite has some free flags in its superblock. Maybe we
> can use a single
> > byte to mark that this is a 64 bit page number ?
> 
> The problem is that people would create a file in the new
> version of SQLite which could handle 64 bits, and expect to
> be able to open it in old versions of SQLite which would not
> understand it.
> 
> It would seem that the move from SQLite3 to SQLite4 would be
> a good time to make all file pointers 64 bits.  But I
> don't know enough about the internal workings of SQLite4 to
> know for sure.
> 
> Simon.

 Sqlite4 is a completly different beast. It uses a hashmap & reduce. I don't 
see any improvement moving sqlite3 to 64 bits page number neither. The problem 
about inner fragmentation, using 64 KB page size is solved if, from time to 
time, make a backup, close db and open backup as main db file.

If db is very huge, backup and switch will cost time, and a DBMS where 
indexes/indexs, db statistics, tables, etc... are split in several files, I 
mean, Postgresql, SQLServer, etc... may work better ( inner fragmentation point 
of view ) and should be used.

HTH
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-10 Thread Simon Slavin

On 10 Feb 2014, at 11:29am, Raheel Gupta  wrote:

>>> 64-bit page numbers would not be backwards compatible.
> 
> It might be possible to implement it in backwards compatible mode. I guess
> SQlite has some free flags in its superblock. Maybe we can use a single
> byte to mark that this is a 64 bit page number ?

The problem is that people would create a file in the new version of SQLite 
which could handle 64 bits, and expect to be able to open it in old versions of 
SQLite which would not understand it.

It would seem that the move from SQLite3 to SQLite4 would be a good time to 
make all file pointers 64 bits.  But I don't know enough about the internal 
workings of SQLite4 to know for sure.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-10 Thread Raheel Gupta
Hi,

If I were to implement it for my use only, any heads up where I could start.
I do know that PgNo is a variable used everywhere. Will changing that help ?

>> 64-bit page numbers would not be backwards compatible.
It might be possible to implement it in backwards compatible mode. I guess
SQlite has some free flags in its superblock. Maybe we can use a single
byte to mark that this is a 64 bit page number ?

Ext File System does that as well :)


On Mon, Feb 10, 2014 at 4:42 PM, Clemens Ladisch  wrote:

> Raheel Gupta wrote:
> >If only the number of pages could be increased somehow. Does anyone think
> >its practical to make the pageNo from a 32 bit int to a 64 bit Unsigned
> >Integer.
>
> The 32-bit page number is part of the file format.
> 64-bit page numbers would not be backwards compatible.
>
>
> Regards,
> Clemens
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-10 Thread Clemens Ladisch
Raheel Gupta wrote:
>If only the number of pages could be increased somehow. Does anyone think
>its practical to make the pageNo from a 32 bit int to a 64 bit Unsigned
>Integer.

The 32-bit page number is part of the file format.
64-bit page numbers would not be backwards compatible.


Regards,
Clemens

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-10 Thread Raheel Gupta
>> Note that choosing a page size smaller than the typical row size means
that the bottom level of the BTree degrades to 1 row per node.
What do you mean by this ? How is a smaller page bad for the database ?


On Mon, Feb 10, 2014 at 2:43 PM, Hick Gunter <h...@scigames.at> wrote:

> With a record size of a little over 4K (note that the on-disk space
> requirement of a integer+4k Blob row is not always 4k+8) and a page size of
> 2K you are storing 1 row in 3 pages (close to 50% overhead). Deleting a
> record will give you 3 pages of free space, which will be reused quickly;
> some of it for the higher levels of the B-Tree. Note that choosing a page
> size smaller than the typical row size means that the bottom level of the
> BTree degrades to 1 row per node.
>
> Changing to 4k or 8k will increase overhead to near 100% (as you now need
> 2 Pages of 4k or one page of 8k for each row).
>
> 16k pages ( 3 rows/page) reduce this to 25.00% while deleting 5 adjacent
> rows is guaranteed to free up 1 page.
> 32k pages ( 7 rows/page) reduce this to 12.50% but requires 13 adjacent
> deletes for 1 guaranteed free page.
> 64k pages (15 rows/page) reduce this to  6.25% but requires 29 adjacent
> deletes for 1 guaranteed free page.
>
> You can choose the source of fragmentation: loosing close to 1 row per
> page (better in bigger pages) or having ununsed space due to nonadjacent
> deletes (better in smaller pages)
>
> -Ursprüngliche Nachricht-
> Von: Raheel Gupta [mailto:raheel...@gmail.com]
> Gesendet: Montag, 10. Februar 2014 07:14
> An: General Discussion of SQLite Database
> Betreff: Re: [sqlite] Free Page Data usage
>
> Hi,
>
> I tried the same database I had and used a 2KB page size.
> It works much faster and also the pages are reused immediattly to the
> extent of 95%.
>
> If only the number of pages could be increased somehow. Does anyone think
> its practical to make the pageNo from a 32 bit int to a 64 bit Unsigned
> Integer.
>
> I do understand that VACUUM is not a good option for me.
>
>
>
> On Sun, Feb 9, 2014 at 4:48 PM, Simon Slavin <slav...@bigfraud.org> wrote:
>
> >
> > On 9 Feb 2014, at 10:45am, RSmith <rsm...@rsweb.co.za> wrote:
> >
> > > On 2014/02/09 12:06, Raheel Gupta wrote:
> > >> Sir, I have only one auto increment primary key.
> > >> Since the new rows will always have a higher number will the pages
> > >> which have some free space by deleting rows with lower numbered
> > >> keys
> > never
> > >> be reused ? e.g. If row with ROWID "1" was deleted and freed, will
> > >> it
> > not
> > >> be used to store the NEW row which will be assigned ROWID 10001 ?
> > >
> > > Yes. That is the point of AutoIncrement, every new Key will always
> > > be
> > higher than any previous key ever used, and always exactly one higher
> > than the highest ever previously used key. As such, it cannot be
> > re-used within pages that are half filled from deletion (except maybe
> > the last page), and I believe pages that go completely empty may be
> > re-used without the need to vacuum etc. (need someone to confirm this).
> >
> > You are correct, depending on this PRAGMA:
> >
> > <http://www.sqlite.org/pragma.html#pragma_auto_vacuum>
> >
> > auto_vacuum = NONE
> >
> > A page which has all its data deleted is added to the 'free pages'
> > list and eventually reused.
> >
> > auto_vacuum = FULL
> >
> > A page which has all its data deleted is replaced by the last page of
> > the file.  The file is then truncated to release the space of the last
> > page for use in other files.
> >
> > auto_vacuum = INCREMENTAL
> >
> > A page which has all its data deleted is replaced by the last used
> > page of the file.  When you issue "PRAGMA incremental_vacuum(N)" the
> > file is truncated to release unused pages at the end for use in other
> files.
> >
> > As in previous discussion, all this is about reclaiming space at the
> > page
> > level: releasing entire pages of space.  It has nothing to do with
> > reclaiming space within a page.  And also as in previous discussion,
> > the fastest of these is "auto_vacuum = NONE".  Copying one page to
> > another, releasing filespace and claiming it back again are slow and
> > require much reading and writing.
> >
> > Simon.
> > ___
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
&

Re: [sqlite] Free Page Data usage

2014-02-10 Thread Hick Gunter
This thread has already gone through that discussion ;)

-Ursprüngliche Nachricht-
Von: Dominique Devienne [mailto:ddevie...@gmail.com]
Gesendet: Montag, 10. Februar 2014 10:32
An: General Discussion of SQLite Database
Betreff: Re: [sqlite] Free Page Data usage

On Mon, Feb 10, 2014 at 10:13 AM, Hick Gunter <h...@scigames.at> wrote:

> You can choose the source of fragmentation: loosing close to 1 row per
> page (better in bigger pages) or having ununsed space due to
> nonadjacent deletes (better in smaller pages)
>

For the latter, you do have http://www.sqlite.org/lang_vacuum.html. Just a 
reminder. --DD ___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


--
 Gunter Hick
Software Engineer
Scientific Games International GmbH
Klitschgasse 2 – 4, A - 1130 Vienna, Austria
FN 157284 a, HG Wien
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This e-mail is confidential and may well also be legally privileged. If you 
have received it in error, you are on notice as to its status and accordingly 
please notify us immediately by reply e-mail and then delete this message from 
your system. Please do not copy it or use it for any purposes, or disclose its 
contents to any person as to do so could be a breach of confidence. Thank you 
for your cooperation.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-10 Thread Dominique Devienne
On Mon, Feb 10, 2014 at 10:13 AM, Hick Gunter  wrote:

> You can choose the source of fragmentation: loosing close to 1 row per
> page (better in bigger pages) or having ununsed space due to nonadjacent
> deletes (better in smaller pages)
>

For the latter, you do have http://www.sqlite.org/lang_vacuum.html. Just a
reminder. --DD
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-10 Thread Hick Gunter
With a record size of a little over 4K (note that the on-disk space requirement 
of a integer+4k Blob row is not always 4k+8) and a page size of 2K you are 
storing 1 row in 3 pages (close to 50% overhead). Deleting a record will give 
you 3 pages of free space, which will be reused quickly; some of it for the 
higher levels of the B-Tree. Note that choosing a page size smaller than the 
typical row size means that the bottom level of the BTree degrades to 1 row per 
node.

Changing to 4k or 8k will increase overhead to near 100% (as you now need 2 
Pages of 4k or one page of 8k for each row).

16k pages ( 3 rows/page) reduce this to 25.00% while deleting 5 adjacent rows 
is guaranteed to free up 1 page.
32k pages ( 7 rows/page) reduce this to 12.50% but requires 13 adjacent deletes 
for 1 guaranteed free page.
64k pages (15 rows/page) reduce this to  6.25% but requires 29 adjacent deletes 
for 1 guaranteed free page.

You can choose the source of fragmentation: loosing close to 1 row per page 
(better in bigger pages) or having ununsed space due to nonadjacent deletes 
(better in smaller pages)

-Ursprüngliche Nachricht-
Von: Raheel Gupta [mailto:raheel...@gmail.com]
Gesendet: Montag, 10. Februar 2014 07:14
An: General Discussion of SQLite Database
Betreff: Re: [sqlite] Free Page Data usage

Hi,

I tried the same database I had and used a 2KB page size.
It works much faster and also the pages are reused immediattly to the extent of 
95%.

If only the number of pages could be increased somehow. Does anyone think its 
practical to make the pageNo from a 32 bit int to a 64 bit Unsigned Integer.

I do understand that VACUUM is not a good option for me.



On Sun, Feb 9, 2014 at 4:48 PM, Simon Slavin <slav...@bigfraud.org> wrote:

>
> On 9 Feb 2014, at 10:45am, RSmith <rsm...@rsweb.co.za> wrote:
>
> > On 2014/02/09 12:06, Raheel Gupta wrote:
> >> Sir, I have only one auto increment primary key.
> >> Since the new rows will always have a higher number will the pages
> >> which have some free space by deleting rows with lower numbered
> >> keys
> never
> >> be reused ? e.g. If row with ROWID "1" was deleted and freed, will
> >> it
> not
> >> be used to store the NEW row which will be assigned ROWID 10001 ?
> >
> > Yes. That is the point of AutoIncrement, every new Key will always
> > be
> higher than any previous key ever used, and always exactly one higher
> than the highest ever previously used key. As such, it cannot be
> re-used within pages that are half filled from deletion (except maybe
> the last page), and I believe pages that go completely empty may be
> re-used without the need to vacuum etc. (need someone to confirm this).
>
> You are correct, depending on this PRAGMA:
>
> <http://www.sqlite.org/pragma.html#pragma_auto_vacuum>
>
> auto_vacuum = NONE
>
> A page which has all its data deleted is added to the 'free pages'
> list and eventually reused.
>
> auto_vacuum = FULL
>
> A page which has all its data deleted is replaced by the last page of
> the file.  The file is then truncated to release the space of the last
> page for use in other files.
>
> auto_vacuum = INCREMENTAL
>
> A page which has all its data deleted is replaced by the last used
> page of the file.  When you issue "PRAGMA incremental_vacuum(N)" the
> file is truncated to release unused pages at the end for use in other files.
>
> As in previous discussion, all this is about reclaiming space at the
> page
> level: releasing entire pages of space.  It has nothing to do with
> reclaiming space within a page.  And also as in previous discussion,
> the fastest of these is "auto_vacuum = NONE".  Copying one page to
> another, releasing filespace and claiming it back again are slow and
> require much reading and writing.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


--
 Gunter Hick
Software Engineer
Scientific Games International GmbH
Klitschgasse 2 – 4, A - 1130 Vienna, Austria
FN 157284 a, HG Wien
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This e-mail is confidential and may well also be legally privileged. If you 
have received it in error, you are on notice as to its status and accordingly 
please notify us immediately by reply e-mail and then delete this message from 
your system. Please do not copy it or use it for any purposes, or disclose its 
contents to any person as to do so could be a breach of confidence. Thank you 
for your cooperation.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-09 Thread Raheel Gupta
Hi,

I tried the same database I had and used a 2KB page size.
It works much faster and also the pages are reused immediattly to the
extent of 95%.

If only the number of pages could be increased somehow. Does anyone think
its practical to make the pageNo from a 32 bit int to a 64 bit Unsigned
Integer.

I do understand that VACUUM is not a good option for me.



On Sun, Feb 9, 2014 at 4:48 PM, Simon Slavin  wrote:

>
> On 9 Feb 2014, at 10:45am, RSmith  wrote:
>
> > On 2014/02/09 12:06, Raheel Gupta wrote:
> >> Sir, I have only one auto increment primary key.
> >> Since the new rows will always have a higher number will the pages
> >> which have some free space by deleting rows with lower numbered keys
> never
> >> be reused ? e.g. If row with ROWID "1" was deleted and freed, will it
> not
> >> be used to store the NEW row which will be assigned ROWID 10001 ?
> >
> > Yes. That is the point of AutoIncrement, every new Key will always be
> higher than any previous key ever used, and always exactly one higher than
> the highest ever previously used key. As such, it cannot be re-used within
> pages that are half filled from deletion (except maybe the last page), and
> I believe pages that go completely empty may be re-used without the need to
> vacuum etc. (need someone to confirm this).
>
> You are correct, depending on this PRAGMA:
>
> 
>
> auto_vacuum = NONE
>
> A page which has all its data deleted is added to the 'free pages' list
> and eventually reused.
>
> auto_vacuum = FULL
>
> A page which has all its data deleted is replaced by the last page of the
> file.  The file is then truncated to release the space of the last page for
> use in other files.
>
> auto_vacuum = INCREMENTAL
>
> A page which has all its data deleted is replaced by the last used page of
> the file.  When you issue "PRAGMA incremental_vacuum(N)" the file is
> truncated to release unused pages at the end for use in other files.
>
> As in previous discussion, all this is about reclaiming space at the page
> level: releasing entire pages of space.  It has nothing to do with
> reclaiming space within a page.  And also as in previous discussion, the
> fastest of these is "auto_vacuum = NONE".  Copying one page to another,
> releasing filespace and claiming it back again are slow and require much
> reading and writing.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-09 Thread Simon Slavin

On 9 Feb 2014, at 10:45am, RSmith  wrote:

> On 2014/02/09 12:06, Raheel Gupta wrote:
>> Sir, I have only one auto increment primary key.
>> Since the new rows will always have a higher number will the pages
>> which have some free space by deleting rows with lower numbered keys never
>> be reused ? e.g. If row with ROWID "1" was deleted and freed, will it not
>> be used to store the NEW row which will be assigned ROWID 10001 ?
> 
> Yes. That is the point of AutoIncrement, every new Key will always be higher 
> than any previous key ever used, and always exactly one higher than the 
> highest ever previously used key. As such, it cannot be re-used within pages 
> that are half filled from deletion (except maybe the last page), and I 
> believe pages that go completely empty may be re-used without the need to 
> vacuum etc. (need someone to confirm this).

You are correct, depending on this PRAGMA:



auto_vacuum = NONE

A page which has all its data deleted is added to the 'free pages' list and 
eventually reused.

auto_vacuum = FULL

A page which has all its data deleted is replaced by the last page of the file. 
 The file is then truncated to release the space of the last page for use in 
other files.

auto_vacuum = INCREMENTAL

A page which has all its data deleted is replaced by the last used page of the 
file.  When you issue "PRAGMA incremental_vacuum(N)" the file is truncated to 
release unused pages at the end for use in other files.

As in previous discussion, all this is about reclaiming space at the page 
level: releasing entire pages of space.  It has nothing to do with reclaiming 
space within a page.  And also as in previous discussion, the fastest of these 
is "auto_vacuum = NONE".  Copying one page to another, releasing filespace and 
claiming it back again are slow and require much reading and writing.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-09 Thread RSmith


On 2014/02/09 12:06, Raheel Gupta wrote:

Hi,

Sir, I have only one auto increment primary key.
Since the new rows will always have a higher number will the pages
which have some free space by deleting rows with lower numbered keys never
be reused ? e.g. If row with ROWID "1" was deleted and freed, will it not
be used to store the NEW row which will be assigned ROWID 10001 ?


Yes. That is the point of AutoIncrement, every new Key will always be higher than any previous key ever used, and always exactly one 
higher than the highest ever previously used key. As such, it cannot be re-used within pages that are half filled from deletion 
(except maybe the last page), and I believe pages that go completely empty may be re-used without the need to vacuum etc. (need 
someone to confirm this).


I think if you have 2 columns, one with Autoinc Integer and another one with rowid alias as primary key (but not auto-incremented) 
you can get page re-using... but it will still be subject to many factors and never really mimic a full re-using system as you 
really want.


You could of course make the keys yourself, no need to leave it up to the DB to autoincrement them, which might be a cheap solution 
to the problem, but I would still much more favour separate file(s) with byte data alongside an SQLite indexing DB - it would be the 
least work with highest guarantee of working flawlessly.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-09 Thread Raheel Gupta
Hi,

Sir, I have only one auto increment primary key.
Since the new rows will always have a higher number will the pages
which have some free space by deleting rows with lower numbered keys never
be reused ? e.g. If row with ROWID "1" was deleted and freed, will it not
be used to store the NEW row which will be assigned ROWID 10001 ?


On Sat, Feb 8, 2014 at 11:38 PM, Richard Hipp  wrote:

> On Fri, Feb 7, 2014 at 7:39 AM, Raheel Gupta  wrote:
>
> > Hi,
> >
> > My Page size is 64KB and I store around 4KB of row data in one row.
> > I store around 1 rows in one table and the database size reaches
> 42MB.
> >
> > Now, I am facing a peculiar problem. When I delete just 2-3 rows, that
> page
> > is not reused for the new data which will be inserted in the future.
> >
>
> That space will be reused if your new data has the same (or similar) key as
> the rows that were deleted.
>
> In order to achieve fast lookup, content must be logically ordered by key.
> That means that all of the rows on a single page must have keys that are
> close to one another.  If you have space on a page, and you insert a new
> row with a nearby key, that space will be (re)used.  But if you insert a
> new row with a very different key, that new row must be placed on a page
> close to other rows with similar keys, and cannot appear on the same page
> with rows of very dissimilar keys.
>
>
>
> >
> > The pragma freelist_count shows 0 if I delete the 1st 10 rows (approx
> 40KB)
> > Only if I delete more than 20 rows does the freelist_count reflect 1 page
> > as free.
> >
> > How should I get SQLIte to use the free space within a partially used
> page
> > when rows from that page have been deleted.
> >
> > This causes a lot of space wastage when I store more rows.
> > ___
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
>
>
>
> --
> D. Richard Hipp
> d...@sqlite.org
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-08 Thread RSmith


On 2014/02/08 19:30, Raheel Gupta wrote:

@Simon, Sir I dont want to rearrange the data.

I will try to explain more.
All my rows have the exact same size. They will not differ in size.
My problem is due to the fact that I use 64kB page size.
My rows are exactly 8 Bytes + 4096 Bytes.

Now for the purpose of ease in calculation lets assume each row is exactly
4 KB.
So one page stores 16 rows.
Lets say 10 pages are in use and I have a total of 160 rows.


Sir,

We do understand exactly what you mean, no amount  of re-explaining will improve a 100% comprehension, and because we do understand, 
we know SQLite ain't doing it, and we are trying to offer other ways of achieving what you want to achieve, but this is not the road 
you seem to want to go down...  Which is OK.


The basic thing you need to understand is this: SQLite does not work the way you hope, it is not made to do the sort of work within 
the sort of restrictions you prescribe. Please consider using an alternative, or, accept the space vs. usage parameters. Even if you 
could adjust the code of SQLite to allow re-using pages with 1/3 free space (as opposed to 2/3 free space), then you are doomed 
because the code will be untested (unless you can download and run the entire test suite without errors) and even then, you will 
have to manually rebuild and repair and re-test your own version of the DB every time a new release happens and forever in future. 
Is this really feasible?  And even then... there is no guarantee SQLIte will re-use the exact rowids that fit inside a specific 
page, not to mention it will only even consider reusing a key if you did not specify "AUTOINCREMENT" in the schema (which, at least, 
is unlikely and fixable).


If you absolutely have to use SQLite, then maybe you can keep track of deleted rows, and in stead of deleting them, just mark them 
as "not used" while keeping the rowid or whatever primary key is used - add this key to a list of available keys maybe (to be 
faster), and when inserting new rows, first see if you have any items in your list of unused rows, then write them to that primary 
key using REPLACE etc.


A typical Schema could be like this:
CREATE TABLE datablocks (ID INTEGER PRIMARY KEY, Used INT DEFAULT 1, Data BLOB);
CREATE TABLE availrows (ID INTEGER PRIMARY KEY);

Some Pseudo code...
when deleting a row/rows:
UPDATE datablocks SET Used=0 WHERE ID=somerowid;
REPLACE INTO availrows VALUES (somerowid);

when adding a row
availRowID = (SELECT ID FROM availrows LIMIT 1);
if (availRowID!=NULL) then if (DELETE FROM availrows WHERE ID=availRowID) != SQL_OK then availRowID = NULL;  // Needed to ensure you 
can never overwrite a datablock

if (availRowID != NULL) then {
  REPLACE INTO datablocks (availRowID,1,blobValue);
} else {
  INSERT INTO datablocks (Data) VALUES (blobValue);
}

Of course adding BLOBs have some more processing to do, but you get the idea.

This way, no row will ever go unused and inserts wont ever use up any space other than that which already exists, unless no space 
exists, so the DB size will only grow if you have more actual data rows than before. Also, btw, this will have significant 
performance improvements if row-deletion is common.


Queries that need to check through the lists of data can simply reference the "Used" column in the where clause to ensure they list 
only rows that do contain valid data-blocks. (SELECT ... WHERE Used>0, etc.)


Of course, the caveat here is this other index-type table will consume a significant amount of diskspace on a DB the size you 
describe. Maybe have that in another DB file with different page size parameters too.


If it was me though, I would save the blob streams in another bytestreamed file, and only save the other data about it with indexes 
in the SQLite table, since you cannot really use a BLOB in a Where clause or for any other SQL-related function. Do the queries, get 
the index.. read the stream from the other file... SQLIte file size will be negligibly small and the data file will only ever be as 
big as is needed...  easy!


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-08 Thread Richard Hipp
On Fri, Feb 7, 2014 at 7:39 AM, Raheel Gupta  wrote:

> Hi,
>
> My Page size is 64KB and I store around 4KB of row data in one row.
> I store around 1 rows in one table and the database size reaches 42MB.
>
> Now, I am facing a peculiar problem. When I delete just 2-3 rows, that page
> is not reused for the new data which will be inserted in the future.
>

That space will be reused if your new data has the same (or similar) key as
the rows that were deleted.

In order to achieve fast lookup, content must be logically ordered by key.
That means that all of the rows on a single page must have keys that are
close to one another.  If you have space on a page, and you insert a new
row with a nearby key, that space will be (re)used.  But if you insert a
new row with a very different key, that new row must be placed on a page
close to other rows with similar keys, and cannot appear on the same page
with rows of very dissimilar keys.



>
> The pragma freelist_count shows 0 if I delete the 1st 10 rows (approx 40KB)
> Only if I delete more than 20 rows does the freelist_count reflect 1 page
> as free.
>
> How should I get SQLIte to use the free space within a partially used page
> when rows from that page have been deleted.
>
> This causes a lot of space wastage when I store more rows.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-08 Thread Richard Hipp
On Sat, Feb 8, 2014 at 11:51 AM, Simon Slavin  wrote:

>
> While a database is in use it might use perhaps 100 pages for a particular
> table.  Almost every one of those pages will have a little space free:
> anything from 1 byte to most of the page, depending on how much space each
> row takes up.  When writing a new row to a table, SQLite intelligently
> figures out which existing page it can write the row to (or does it ?
>  someone who has read the source code can tell me I'm wrong and if it
> searches for the 'best' page).
>

No.

SQLite (as most other database engines) use B-Trees.  Every row has a "key"
(which is often, but not always, the ROWID in SQLite.)  Rows need to be
stored in key-order.  Otherwise, you would not be able to find a row given
its key, except by doing a slow and wasteful scan of the entire table.

If four or five rows have adjacent keys, those rows can be placed
arbitrarily on one page.  There is a small index at the beginning of each
page that tells where to find each row on that page.  But you cannot spread
those keys out arbitrarily on different pages.  If they are adjacent, then
they need to be logically adjacent in the file.



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-08 Thread Simon Slavin

On 8 Feb 2014, at 5:30pm, Raheel Gupta  wrote:

> I will try to explain more.
> All my rows have the exact same size. They will not differ in size.
> My problem is due to the fact that I use 64kB page size.
> My rows are exactly 8 Bytes + 4096 Bytes.

Your very specific use of SQLite is not every situation that SQLite has to be 
able to handle.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-08 Thread Raheel Gupta
@Simon, Sir I dont want to rearrange the data.

I will try to explain more.
All my rows have the exact same size. They will not differ in size.
My problem is due to the fact that I use 64kB page size.
My rows are exactly 8 Bytes + 4096 Bytes.

Now for the purpose of ease in calculation lets assume each row is exactly
4 KB.
So one page stores 16 rows.
Lets say 10 pages are in use and I have a total of 160 rows.

Now I delete Rows 1-4 (total 4 rows) and I insert another 4 rows.
What I wanted is that the space freed by the first 4 rows being deleted be
used for the 4 new rows.

This should be done without any re-arrangement of data (so no vacuum and no
internal data rearrangement !).

As far as I am aware if a page is marked as free, sqlite will first use the
page to store new data. But since my page size is 64 KB, this will not be
possible.

Hence I am evaluating all options on this to optimize my storage space
utilization.



On Sat, Feb 8, 2014 at 10:21 PM, Simon Slavin  wrote:

>
> On 8 Feb 2014, at 11:24am, Raheel Gupta  wrote:
>
> > I dont want to repack the DB sir.
> > When a page becomes free I want to make sure that page is used up first
> and
> > then new pages are created.
>
> Just to explain that this would be extremely inefficient because a new row
> that you write to a database will not take up the same space as a row you
> have deleted.
>
> While a database is in use it might use perhaps 100 pages for a particular
> table.  Almost every one of those pages will have a little space free:
> anything from 1 byte to most of the page, depending on how much space each
> row takes up.  When writing a new row to a table, SQLite intelligently
> figures out which existing page it can write the row to (or does it ?
>  someone who has read the source code can tell me I'm wrong and if it
> searches for the 'best' page).
>
> What it won't do is rearrange existing pages so that they are used as much
> as possible.  That could be done whenever a row is deleted (including when
> a row is replaced using UPDATE).  But it would require a lot of checking,
> processing, reading and writing, and this would slow SQLite down a great
> deal for every DELETE and UPDATE operation.  As an the top of my head
> guess, individual operations could take unpredictable amounts of time since
> most efficient packing could require any number of pages to be rewritten.
>  I don't know of any database system that works like this.
>
> So that's one thing that might make you want to use VACUUM.  Even VACUUM
> does not reclaim the maximum amount of space possible.  Instead it prefers
> to keep the data for a row together and rows in primary index order
> together, to increase speeds
>
> There are also the auto_vacuum and incremental-vacuum PRAGMAs.  However
> they operate only on the level of pages: they will reap entire unused
> pages, but not interfere with the packing of data within a page.
>
> Simon.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-08 Thread Simon Slavin

On 8 Feb 2014, at 11:24am, Raheel Gupta  wrote:

> I dont want to repack the DB sir.
> When a page becomes free I want to make sure that page is used up first and
> then new pages are created.

Just to explain that this would be extremely inefficient because a new row that 
you write to a database will not take up the same space as a row you have 
deleted.

While a database is in use it might use perhaps 100 pages for a particular 
table.  Almost every one of those pages will have a little space free: anything 
from 1 byte to most of the page, depending on how much space each row takes up. 
 When writing a new row to a table, SQLite intelligently figures out which 
existing page it can write the row to (or does it ?  someone who has read the 
source code can tell me I'm wrong and if it searches for the 'best' page).

What it won't do is rearrange existing pages so that they are used as much as 
possible.  That could be done whenever a row is deleted (including when a row 
is replaced using UPDATE).  But it would require a lot of checking, processing, 
reading and writing, and this would slow SQLite down a great deal for every 
DELETE and UPDATE operation.  As an the top of my head guess, individual 
operations could take unpredictable amounts of time since most efficient 
packing could require any number of pages to be rewritten.  I don't know of any 
database system that works like this.

So that's one thing that might make you want to use VACUUM.  Even VACUUM does 
not reclaim the maximum amount of space possible.  Instead it prefers to keep 
the data for a row together and rows in primary index order together, to 
increase speeds

There are also the auto_vacuum and incremental-vacuum PRAGMAs.  However they 
operate only on the level of pages: they will reap entire unused pages, but not 
interfere with the packing of data within a page.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-08 Thread Gerry Snyder
Instead of delete and then insert, can you somehow just keep track of which
rows are to be deleted, and when new rows come in replace if you can and
otherwise insert?

A little more bookkeeping, but it might save the space you need.

Gerry
On Feb 7, 2014 10:57 PM, "Raheel Gupta"  wrote:

> Hi,
>
> Sir, the 32 TB size is not always going to be reached.
> The Database is going to be used to store blocks of a Block Device like
> /dev/sda1
> The size can reach 3-4 TB easily and would start from atleast 20-100 GB. 32
> TB of data though impractical as of today will be possible in 2-3 years.
> The issue happens when I delete the rows and new rows are inserted at the
> end of the database the size of the database exceeds that of the actual
> block device size even though many pages are having free space.
> Hence I am simply trying to optimize the utilization of the free space
> available.
>
> I would have loved to use the page size of 2KB which would give me a
> practical size of 4TB. But that would have this hard limit of 4TB.
> So I have two possible options which I am trying to help me solve this
> issue :
> 1) Either make the page size to 2KB and increase the maximum page count to
> 2^64 which will be more than sufficient.
> 2) Improve the free space utilization of each page when the page size is
> 64KB.
>
> I hope this makes sense.
>
>
>
> On Sat, Feb 8, 2014 at 12:54 AM, RSmith  wrote:
>
> > A database that is geared for 32TB size and you are concerned about
> rather
> > insignificant space wasted by the page size that is needed to reach the
> > 32TB max size... does not make any sense unless you are simply paranoid
> > about space.  Removing the gaps in the table space when deleting a row
> (or
> > rows) will render a delete query several magnitudes slower.
> >
> > If it IS that big of a concern, then maybe use standard files rather than
> > SQLite to save data in?  If the SQL functionality is a must, you can use
> > vacuum as often as is needed to clear unused space - but beware, 1 -
> Vacuum
> > takes some processing to re-pack a DB, especially a near 32TB one... in
> the
> > order of minutes on a computer I would guess, and much much more on
> > anything else.  2 - a 32TB DB will need up to 64TB total free disk space
> to
> > be sure to vacuum correctly - so having issues with it taking up maybe
> 40TB
> > for 32TB of data is in itself an irrelevant concern. Even large queries,
> > temporary tables etc will all need additional interim space for the sorts
> > of queries that might be requested of a 32TB data-set.
> >
> > The real point being: if you do not have at least 64TB free on whatever
> > that 32TB DB will sit, you are doing it wrong, and if you do have that
> much
> > free, you can ignore the 25% wasted deletion space problem.
> >
> > If the problem is simply your own pedanticism (at least I can sympathise
> > with that!) then it's simply a case of "Welcome to efficient databasing",
> > but if it is a real space deficit, then I'm afraid you will have to
> re-plan
> > or reconsider either the max allowable DB, or the physical layer's space
> > availability - sorry.
> >
> >
> >
> > On 2014/02/07 20:35, Raheel Gupta wrote:
> >
> >> Hi,
> >>
> >> I use a page size of 64 KB. But my row consists of 2 columns that is :
> >> i - Auto Increment Integer,
> >> b - 4096 Bytes of BLOB data
> >>
> >> Now for the sake of calculation, lets say 16 rows fit in a page and my
> >> table has 1 rows when I start.
> >>
> >> Now, lets say I delete some data which is not in sequence i.e. it can be
> >> deleted as per data which is not in use. To create such a hypothetical
> >> situation for explaining this to you, here is a simple query :
> >> DELETE from TABLE where i%4 = 0;
> >>
> >> As you may see that there is now 25% data deleted in each page.
> >>
> >> Now even if I do insert another 2500 rows (25% of original size) my
> >> database size reaches 125% of the original size when I inserted the
> 1
> >> rows initially.
> >>
> >> Hence there is significant space wastage. Anyway i can improve that ?
> >> It would be nice if the database size would be close to the original
> size
> >> after deleting 25% and adding some new 25% data.
> >>
> >> I know you would recommend to use smaller page sizes. Ideally 2KP page
> >> size
> >> is good but then, the number of pages is restricted to a max of 2^32
> which
> >> will restrict the total database size to 4TB only. I need the max size
> to
> >> be capable of atleast 32TB.
> >>
> >>
> >>
> >> On Fri, Feb 7, 2014 at 11:14 PM, Donald Griggs 
> >> wrote:
> >>
> >>  Can you write more about how this is causing you a problem? Most users
> >>> don't experience this as a problem
> >>> On Feb 7, 2014 10:30 AM, "Raheel Gupta"  wrote:
> >>>
> >>>  SQLite's tables are B-trees, sorted by the rowid.  Your new data will
> > probably get an autoincremented rowid, which will be appended at the
> >
> 

Re: [sqlite] Free Page Data usage

2014-02-08 Thread Raheel Gupta
>> No matter what size you make the pages, a delete function is never going
to re-pack the db

I dont want to repack the DB sir.
When a page becomes free I want to make sure that page is used up first and
then new pages are created.
VACUUM is not what I want to do.

I think that free pages are used up for new data in SQLIte as well and I
have no doubt of that.
The issue is when the page is 64 KB and it has lets say 16KB free then
atleast the 16KB should be used before an entirely new page is created.

@Clemens told that the page is used again if it has 2/3 free space.
My question is that can this 2/3 ratio be changed by me for my purpose to
1/3 or any other ratio.



On Sat, Feb 8, 2014 at 3:47 PM, RSmith  wrote:

> Hi Raheel,
>
> It does make sense what you would like to do, but your concern does not
> make sense. You say you are "trying to optimize the utilization of the free
> space available" but give no indication why, it certainly does not seem
> that space is a problem.
>
> I do understand the urge to optimize very much, but inside a Database
> engine you can optimize either for speed or for size, not for both. SQLIte
> as it stands is quite good at not wasting space unnecessarily, BUT, it is
> first and foremost optimized for speed (Thank goodness for that), which
> means the space-saving you are looking for is not going to happen.  In my
> previous post I made a passing comment / suggestion re using your own data
> files in stead of sqlite, and if it is a case of not needing the sql
> ability - which I seriously doubt since you are basically saving blocks of
> information from a blocked device and doing so as byte streams (or BLOB
> fields in SQL terms) - then I seriously suggest creating your own files and
> custom index mechanism and saving the byte streams in there.  It will be a
> lot faster and with zero space wastage and the size limits can be whatever
> you like them to be.
>
> Trying to use SQLite (or any other DB engine) for this purpose is akin to
> using a full-function bakery with ovens, humidifiers, provers, rising
> agents and bake timers when you just want to warm up your pizza (not to
> mention being restricted by the limitations that come with it).
>
> No matter what size you make the pages, a delete function is never going
> to re-pack the db, though you might get better results at re-using the
> space - but this is a compromise and one that does not sit well with you
> (if I read you right).
>
> Best of luck!
> Ryan
>
>
>
> On 2014/02/08 07:57, Raheel Gupta wrote:
>
>> Hi,
>> Sir, the 32 TB size is not always going to be reached.
>> The Database is going to be used to store blocks of a Block Device like
>> /dev/sda1
>> The size can reach 3-4 TB easily and would start from atleast 20-100 GB.
>> 32 TB of data though impractical as of today will be possible in 2-3 years.
>> The issue happens when I delete the rows and new rows are inserted at the
>> end of the database the size of the database exceeds that of the actual
>> block device size even though many pages are having free space.
>> Hence I am simply trying to optimize the utilization of the free space
>> available.
>> I would have loved to use the page size of 2KB which would give me a
>> practical size of 4TB. But that would have this hard limit of 4TB.
>> So I have two possible options which I am trying to help me solve this
>> issue :
>> 1) Either make the page size to 2KB and increase the maximum page count
>> to 2^64 which will be more than sufficient.
>> 2) Improve the free space utilization of each page when the page size is
>> 64KB.
>> I hope this makes sense.
>>
>>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-08 Thread RSmith

Hi Raheel,

It does make sense what you would like to do, but your concern does not make sense. You say you are "trying to optimize the 
utilization of the free space available" but give no indication why, it certainly does not seem that space is a problem.


I do understand the urge to optimize very much, but inside a Database engine you can optimize either for speed or for size, not for 
both. SQLIte as it stands is quite good at not wasting space unnecessarily, BUT, it is first and foremost optimized for speed (Thank 
goodness for that), which means the space-saving you are looking for is not going to happen.  In my previous post I made a passing 
comment / suggestion re using your own data files in stead of sqlite, and if it is a case of not needing the sql ability - which I 
seriously doubt since you are basically saving blocks of information from a blocked device and doing so as byte streams (or BLOB 
fields in SQL terms) - then I seriously suggest creating your own files and custom index mechanism and saving the byte streams in 
there.  It will be a lot faster and with zero space wastage and the size limits can be whatever you like them to be.


Trying to use SQLite (or any other DB engine) for this purpose is akin to using a full-function bakery with ovens, humidifiers, 
provers, rising agents and bake timers when you just want to warm up your pizza (not to mention being restricted by the limitations 
that come with it).


No matter what size you make the pages, a delete function is never going to re-pack the db, though you might get better results at 
re-using the space - but this is a compromise and one that does not sit well with you (if I read you right).


Best of luck!
Ryan


On 2014/02/08 07:57, Raheel Gupta wrote:

Hi,
Sir, the 32 TB size is not always going to be reached.
The Database is going to be used to store blocks of a Block Device like 
/dev/sda1
The size can reach 3-4 TB easily and would start from atleast 20-100 GB. 32 TB of data though impractical as of today will be 
possible in 2-3 years.
The issue happens when I delete the rows and new rows are inserted at the end of the database the size of the database exceeds 
that of the actual block device size even though many pages are having free space.

Hence I am simply trying to optimize the utilization of the free space 
available.
I would have loved to use the page size of 2KB which would give me a practical size of 4TB. But that would have this hard limit of 
4TB.

So I have two possible options which I am trying to help me solve this issue :
1) Either make the page size to 2KB and increase the maximum page count to 2^64 
which will be more than sufficient.
2) Improve the free space utilization of each page when the page size is 64KB.
I hope this makes sense.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-07 Thread Raheel Gupta
Hi,

Sir, the 32 TB size is not always going to be reached.
The Database is going to be used to store blocks of a Block Device like
/dev/sda1
The size can reach 3-4 TB easily and would start from atleast 20-100 GB. 32
TB of data though impractical as of today will be possible in 2-3 years.
The issue happens when I delete the rows and new rows are inserted at the
end of the database the size of the database exceeds that of the actual
block device size even though many pages are having free space.
Hence I am simply trying to optimize the utilization of the free space
available.

I would have loved to use the page size of 2KB which would give me a
practical size of 4TB. But that would have this hard limit of 4TB.
So I have two possible options which I am trying to help me solve this
issue :
1) Either make the page size to 2KB and increase the maximum page count to
2^64 which will be more than sufficient.
2) Improve the free space utilization of each page when the page size is
64KB.

I hope this makes sense.



On Sat, Feb 8, 2014 at 12:54 AM, RSmith  wrote:

> A database that is geared for 32TB size and you are concerned about rather
> insignificant space wasted by the page size that is needed to reach the
> 32TB max size... does not make any sense unless you are simply paranoid
> about space.  Removing the gaps in the table space when deleting a row (or
> rows) will render a delete query several magnitudes slower.
>
> If it IS that big of a concern, then maybe use standard files rather than
> SQLite to save data in?  If the SQL functionality is a must, you can use
> vacuum as often as is needed to clear unused space - but beware, 1 - Vacuum
> takes some processing to re-pack a DB, especially a near 32TB one... in the
> order of minutes on a computer I would guess, and much much more on
> anything else.  2 - a 32TB DB will need up to 64TB total free disk space to
> be sure to vacuum correctly - so having issues with it taking up maybe 40TB
> for 32TB of data is in itself an irrelevant concern. Even large queries,
> temporary tables etc will all need additional interim space for the sorts
> of queries that might be requested of a 32TB data-set.
>
> The real point being: if you do not have at least 64TB free on whatever
> that 32TB DB will sit, you are doing it wrong, and if you do have that much
> free, you can ignore the 25% wasted deletion space problem.
>
> If the problem is simply your own pedanticism (at least I can sympathise
> with that!) then it's simply a case of "Welcome to efficient databasing",
> but if it is a real space deficit, then I'm afraid you will have to re-plan
> or reconsider either the max allowable DB, or the physical layer's space
> availability - sorry.
>
>
>
> On 2014/02/07 20:35, Raheel Gupta wrote:
>
>> Hi,
>>
>> I use a page size of 64 KB. But my row consists of 2 columns that is :
>> i - Auto Increment Integer,
>> b - 4096 Bytes of BLOB data
>>
>> Now for the sake of calculation, lets say 16 rows fit in a page and my
>> table has 1 rows when I start.
>>
>> Now, lets say I delete some data which is not in sequence i.e. it can be
>> deleted as per data which is not in use. To create such a hypothetical
>> situation for explaining this to you, here is a simple query :
>> DELETE from TABLE where i%4 = 0;
>>
>> As you may see that there is now 25% data deleted in each page.
>>
>> Now even if I do insert another 2500 rows (25% of original size) my
>> database size reaches 125% of the original size when I inserted the 1
>> rows initially.
>>
>> Hence there is significant space wastage. Anyway i can improve that ?
>> It would be nice if the database size would be close to the original size
>> after deleting 25% and adding some new 25% data.
>>
>> I know you would recommend to use smaller page sizes. Ideally 2KP page
>> size
>> is good but then, the number of pages is restricted to a max of 2^32 which
>> will restrict the total database size to 4TB only. I need the max size to
>> be capable of atleast 32TB.
>>
>>
>>
>> On Fri, Feb 7, 2014 at 11:14 PM, Donald Griggs 
>> wrote:
>>
>>  Can you write more about how this is causing you a problem? Most users
>>> don't experience this as a problem
>>> On Feb 7, 2014 10:30 AM, "Raheel Gupta"  wrote:
>>>
>>>  SQLite's tables are B-trees, sorted by the rowid.  Your new data will
> probably get an autoincremented rowid, which will be appended at the
>
 end
>>>
 of the table.
>
> A page gets reorganized only when about 2/3 is free space.
>
>  Anyway to make this ratio to lets say 1/3 ?
 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

  ___
>>> sqlite-users mailing list
>>> sqlite-users@sqlite.org
>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>
>>>  

Re: [sqlite] Free Page Data usage

2014-02-07 Thread RSmith
A database that is geared for 32TB size and you are concerned about rather insignificant space wasted by the page size that is 
needed to reach the 32TB max size... does not make any sense unless you are simply paranoid about space.  Removing the gaps in the 
table space when deleting a row (or rows) will render a delete query several magnitudes slower.


If it IS that big of a concern, then maybe use standard files rather than SQLite to save data in?  If the SQL functionality is a 
must, you can use vacuum as often as is needed to clear unused space - but beware, 1 - Vacuum takes some processing to re-pack a DB, 
especially a near 32TB one... in the order of minutes on a computer I would guess, and much much more on anything else.  2 - a 32TB 
DB will need up to 64TB total free disk space to be sure to vacuum correctly - so having issues with it taking up maybe 40TB for 
32TB of data is in itself an irrelevant concern. Even large queries, temporary tables etc will all need additional interim space for 
the sorts of queries that might be requested of a 32TB data-set.


The real point being: if you do not have at least 64TB free on whatever that 32TB DB will sit, you are doing it wrong, and if you do 
have that much free, you can ignore the 25% wasted deletion space problem.


If the problem is simply your own pedanticism (at least I can sympathise with that!) then it's simply a case of "Welcome to 
efficient databasing", but if it is a real space deficit, then I'm afraid you will have to re-plan or reconsider either the max 
allowable DB, or the physical layer's space availability - sorry.



On 2014/02/07 20:35, Raheel Gupta wrote:

Hi,

I use a page size of 64 KB. But my row consists of 2 columns that is :
i - Auto Increment Integer,
b - 4096 Bytes of BLOB data

Now for the sake of calculation, lets say 16 rows fit in a page and my
table has 1 rows when I start.

Now, lets say I delete some data which is not in sequence i.e. it can be
deleted as per data which is not in use. To create such a hypothetical
situation for explaining this to you, here is a simple query :
DELETE from TABLE where i%4 = 0;

As you may see that there is now 25% data deleted in each page.

Now even if I do insert another 2500 rows (25% of original size) my
database size reaches 125% of the original size when I inserted the 1
rows initially.

Hence there is significant space wastage. Anyway i can improve that ?
It would be nice if the database size would be close to the original size
after deleting 25% and adding some new 25% data.

I know you would recommend to use smaller page sizes. Ideally 2KP page size
is good but then, the number of pages is restricted to a max of 2^32 which
will restrict the total database size to 4TB only. I need the max size to
be capable of atleast 32TB.



On Fri, Feb 7, 2014 at 11:14 PM, Donald Griggs  wrote:


Can you write more about how this is causing you a problem? Most users
don't experience this as a problem
On Feb 7, 2014 10:30 AM, "Raheel Gupta"  wrote:


SQLite's tables are B-trees, sorted by the rowid.  Your new data will
probably get an autoincremented rowid, which will be appended at the

end

of the table.

A page gets reorganized only when about 2/3 is free space.


Anyway to make this ratio to lets say 1/3 ?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-07 Thread Simon Slavin

On 7 Feb 2014, at 6:35pm, Raheel Gupta  wrote:

> As you may see that there is now 25% data deleted in each page.
> 
> Now even if I do insert another 2500 rows (25% of original size) my
> database size reaches 125% of the original size when I inserted the 1
> rows initially.
> 
> Hence there is significant space wastage. Anyway i can improve that ?

Whenever filespace matters to you, do a VACUUM.  Some apps do it on startup.  
Some do it on shutdown.  Some do it as part of a monthly maintenance period.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-07 Thread Raheel Gupta
Hi,

I use a page size of 64 KB. But my row consists of 2 columns that is :
i - Auto Increment Integer,
b - 4096 Bytes of BLOB data

Now for the sake of calculation, lets say 16 rows fit in a page and my
table has 1 rows when I start.

Now, lets say I delete some data which is not in sequence i.e. it can be
deleted as per data which is not in use. To create such a hypothetical
situation for explaining this to you, here is a simple query :
DELETE from TABLE where i%4 = 0;

As you may see that there is now 25% data deleted in each page.

Now even if I do insert another 2500 rows (25% of original size) my
database size reaches 125% of the original size when I inserted the 1
rows initially.

Hence there is significant space wastage. Anyway i can improve that ?
It would be nice if the database size would be close to the original size
after deleting 25% and adding some new 25% data.

I know you would recommend to use smaller page sizes. Ideally 2KP page size
is good but then, the number of pages is restricted to a max of 2^32 which
will restrict the total database size to 4TB only. I need the max size to
be capable of atleast 32TB.



On Fri, Feb 7, 2014 at 11:14 PM, Donald Griggs  wrote:

> Can you write more about how this is causing you a problem? Most users
> don't experience this as a problem
> On Feb 7, 2014 10:30 AM, "Raheel Gupta"  wrote:
>
> > >
> > > SQLite's tables are B-trees, sorted by the rowid.  Your new data will
> > > probably get an autoincremented rowid, which will be appended at the
> end
> > > of the table.
> > >
> > > A page gets reorganized only when about 2/3 is free space.
> > >
> >
> > Anyway to make this ratio to lets say 1/3 ?
> > ___
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-07 Thread Donald Griggs
Can you write more about how this is causing you a problem? Most users
don't experience this as a problem
On Feb 7, 2014 10:30 AM, "Raheel Gupta"  wrote:

> >
> > SQLite's tables are B-trees, sorted by the rowid.  Your new data will
> > probably get an autoincremented rowid, which will be appended at the end
> > of the table.
> >
> > A page gets reorganized only when about 2/3 is free space.
> >
>
> Anyway to make this ratio to lets say 1/3 ?
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-07 Thread Raheel Gupta
>
> SQLite's tables are B-trees, sorted by the rowid.  Your new data will
> probably get an autoincremented rowid, which will be appended at the end
> of the table.
>
> A page gets reorganized only when about 2/3 is free space.
>

Anyway to make this ratio to lets say 1/3 ?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Free Page Data usage

2014-02-07 Thread Clemens Ladisch
Raheel Gupta wrote:
> When I delete just 2-3 rows, that page is not reused for the new data
> which will be inserted in the future.

SQLite's tables are B-trees, sorted by the rowid.  Your new data will
probably get an autoincremented rowid, which will be appended at the end
of the table.

A page gets reorganized only when about 2/3 is free space.

> How should I get SQLIte to use the free space within a partially used page
> when rows from that page have been deleted.

VACUUM (which needs lots of free disk space, and should not be executed
too often).


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Free Page Data usage

2014-02-07 Thread Raheel Gupta
Hi,

My Page size is 64KB and I store around 4KB of row data in one row.
I store around 1 rows in one table and the database size reaches 42MB.

Now, I am facing a peculiar problem. When I delete just 2-3 rows, that page
is not reused for the new data which will be inserted in the future.

The pragma freelist_count shows 0 if I delete the 1st 10 rows (approx 40KB)
Only if I delete more than 20 rows does the freelist_count reflect 1 page
as free.

How should I get SQLIte to use the free space within a partially used page
when rows from that page have been deleted.

This causes a lot of space wastage when I store more rows.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users