Re: [sqlite] The IN (... ) clause

2014-09-13 Thread Darren Duncan

On 2014-09-13, 10:07 PM, jose isaias cabrera wrote:

I know that the IN clause contains a list of something. I.e.

IN ('2014-01-01', '2014-01-02', '2014-01-03', '2014-01-04', '2014-01-05')

So the question is, is there a shorter way for one to say something like,

IN ('2014-01-01', ..., '2014-01-05')

where the content of the IN would have the first item and the last item of the
list, but that's it?  Thanks.


You're talking about a range/interval.

In SQL it is spelled like this:

  BETWEEN '2014-01-01' AND '2014-01-05'
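For concreteness, here is a quick check with Python's built-in sqlite3 module (my sketch, not from the original message; the table name and data are invented):

```python
# Sketch: an explicit IN list vs. an inclusive BETWEEN range.
# Table "log" and its contents are invented for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (day TEXT)")
conn.executemany("INSERT INTO log VALUES (?)",
                 [("2014-01-0%d" % d,) for d in range(1, 8)])

in_rows = conn.execute(
    "SELECT day FROM log WHERE day IN "
    "('2014-01-01','2014-01-02','2014-01-03','2014-01-04','2014-01-05')"
).fetchall()

# BETWEEN is inclusive on both ends, so for a gap-free run of ISO-8601
# date strings it selects the same rows as the explicit IN list.
between_rows = conn.execute(
    "SELECT day FROM log WHERE day BETWEEN '2014-01-01' AND '2014-01-05'"
).fetchall()

print(in_rows == between_rows)  # True: identical row sets here
```

Note that BETWEEN expresses a contiguous inclusive range, so it only substitutes for IN when the list really is every value between the endpoints; for an arbitrary set of dates, IN is still the right tool.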

-- Darren Duncan


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] The IN (... ) clause

2014-09-13 Thread jose isaias cabrera


Greetings!

I know that the IN clause contains a list of something. I.e.

IN ('2014-01-01', '2014-01-02', '2014-01-03', '2014-01-04', '2014-01-05')

So the question is, is there a shorter way for one to say something like,

IN ('2014-01-01', ..., '2014-01-05')

where the content of the IN would have the first item and the last item of 
the list, but that's it?  Thanks.


josé 




Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Scott Robison
On Sat, Sep 13, 2014 at 2:53 PM, Howard Chu wrote:

> Scott Robison wrote:
>
>> On Sat, Sep 13, 2014 at 2:24 PM, Howard Chu wrote:
>>
>>> Scott Robison wrote:
>>>
>>>> A couple of academic thoughts.
>>>>
>>>> 1. If one wanted to embed the journal within the database, would it be
>>>> adequate to reserve a specific page as the "root" page of the journal,
>>>> then allocate the remaining pages as normal (either to the journal or
>>>> the main database)? This does leave the big hole problem so it may
>>>> still not be ideal, but it would give you a known location to find the
>>>> beginning of the journal without doubling the database size or
>>>> requiring an extra file.
>>>
>>> Starting with a known location is definitely a step in the right
>>> direction.
>>>
>>>> 2. Building on 1, could sparse files be used to accomplish this? Seek
>>>> to "really big constant offset" and do all journaling operations at
>>>> that point, allowing the operating system to manage actual disk
>>>> allocation? If
>>>
>>> We're talking about implementing a filesystem. "the operating system" is
>>> your own code, in this case, you don't get to foist the work off onto
>>> anyone else.
>>
>> No, Simon's original question was to the effect of why doesn't SQLite
>> just use the already open database file for journaling purposes as well.
>
> OK, maybe I missed that, but I thought that question itself arose from how
> to use SQLite to implement a filesystem, on a raw partition. And the
> answer to that question (operating inside a raw partition) could apply
> equally well to operating inside a single file.
>
> If you preassign a fixed maximum size to the file, you could e.g. reserve
> the tail of the file for the journal, growing backward toward the head of
> the file, while the main data grows the usual direction from the head of
> the file toward the tail. This would basically be your (2) above. On HDDs
> this approach would have horrible seek latencies but it could work OK on
> SSDs.
>
> The other point though - like the existing journaling filesystems, you
> should not limit yourself to using a single file/storage device. Allow the
> option of storing the journal somewhere else - the performance potential
> is worth it.
>
>> My point 1 was in response to the need to know where the journal file is,
>> so just pick a dedicated page in the file as the root page of the
>> journal, allowing the two files to be co-mingled. It doesn't address
>> every possible bad reason for co-mingling the data, but it would at least
>> answer the question "how do you find the journal".
>>
>> My second point was about existing SQLite database files that live in a
>> file system managed by some operating system. SQLite already foists that
>> work off on to someone else, this would be no different. It still may be
>> a bad idea, but that's not the reason why it wouldn't work. :)

To be fair, I may have read something out of context. As I said originally,
the questions were academic. I have not thought about these problems to
anywhere near the extent or depth the SQLite devs have. I just was thinking
out loud.

You are absolutely right, there are very good reasons to allow the
possibility to have external journals, perhaps on different physical
devices (much as the test_multiplex vfs uses multiple files) to support
much higher throughput and/or larger database sizes. At the same time,
there certainly could be reasons that keeping everything self contained in
a single file could be useful, and a VFS could be written to accommodate
that (similar to the test_onefile vfs) without needing to modify SQLite.

I was thinking some time back about a VFS that communicates with a network
service. The network service would be a replacement for NFS / SMB / CIFS
that would get file locking semantics "right" through a custom protocol
instead of hoping (and being disappointed) that the system's standard file
handling APIs got it right. The network service would essentially be a page
server. It would likely not be as fast as a local file, but it could allow
distributed access to a shared database in a secure manner.

Anyway, my point with the last paragraph was that my thought was to use a
SQLite database privately owned by the network service as the container
that would hold the pages read & written by the remote clients. A VFS can
do almost anything it wants as long as it has a reliable means of locking
out access. Not that it would be *easy*, just possible.

Sorry for the ramble there...

-- 
Scott Robison


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Howard Chu

Scott Robison wrote:
> On Sat, Sep 13, 2014 at 2:24 PM, Howard Chu wrote:
>
>> Scott Robison wrote:
>>
>>> A couple of academic thoughts.
>>>
>>> 1. If one wanted to embed the journal within the database, would it be
>>> adequate to reserve a specific page as the "root" page of the journal,
>>> then allocate the remaining pages as normal (either to the journal or
>>> the main database)? This does leave the big hole problem so it may still
>>> not be ideal, but it would give you a known location to find the
>>> beginning of the journal without doubling the database size or requiring
>>> an extra file.
>>
>> Starting with a known location is definitely a step in the right
>> direction.
>>
>>> 2. Building on 1, could sparse files be used to accomplish this? Seek to
>>> "really big constant offset" and do all journaling operations at that
>>> point, allowing the operating system to manage actual disk allocation? If
>>
>> We're talking about implementing a filesystem. "the operating system" is
>> your own code, in this case, you don't get to foist the work off onto
>> anyone else.
>
> No, Simon's original question was to the effect of why doesn't SQLite just
> use the already open database file for journaling purposes as well.

OK, maybe I missed that, but I thought that question itself arose from how
to use SQLite to implement a filesystem, on a raw partition. And the answer
to that question (operating inside a raw partition) could apply equally well
to operating inside a single file.

If you preassign a fixed maximum size to the file, you could e.g. reserve
the tail of the file for the journal, growing backward toward the head of
the file, while the main data grows the usual direction from the head of the
file toward the tail. This would basically be your (2) above. On HDDs this
approach would have horrible seek latencies but it could work OK on SSDs.

The other point though - like the existing journaling filesystems, you
should not limit yourself to using a single file/storage device. Allow the
option of storing the journal somewhere else - the performance potential is
worth it.

> My point 1 was in response to the need to know where the journal file is,
> so just pick a dedicated page in the file as the root page of the journal,
> allowing the two files to be co-mingled. It doesn't address every possible
> bad reason for co-mingling the data, but it would at least answer the
> question "how do you find the journal".
>
> My second point was about existing SQLite database files that live in a
> file system managed by some operating system. SQLite already foists that
> work off on to someone else, this would be no different. It still may be a
> bad idea, but that's not the reason why it wouldn't work. :)




--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Scott Robison
On Sat, Sep 13, 2014 at 2:24 PM, Howard Chu  wrote:

> Scott Robison wrote:
>
>> A couple of academic thoughts.
>>
>
>> 1. If one wanted to embed the journal within the database, would it be
>> adequate to reserve a specific page as the "root" page of the journal,
>> then
>> allocate the remaining pages as normal (either to the journal or the main
>> database)? This does leave the big hole problem so it may still not be
>> ideal, but it would give you a known location to find the beginning of the
>> journal without doubling the database size or requiring an extra file.
>>
>
> Starting with a known location is definitely a step in the right direction.
>
>  2. Building on 1, could sparse files be used to accomplish this? Seek to
>> "really big constant offset" and do all journaling operations at that
>> point, allowing the operating system to manage actual disk allocation? If
>>
>
> We're talking about implementing a filesystem. "the operating system" is
> your own code, in this case, you don't get to foist the work off onto
> anyone else.


No, Simon's original question was to the effect of why doesn't SQLite just
use the already open database file for journaling purposes as well. My
point 1 was in response to the need to know where the journal file is, so
just pick a dedicated page in the file as the root page of the journal,
allowing the two files to be co-mingled. It doesn't address every possible
bad reason for co-mingling the data, but it would at least answer the
question "how do you find the journal".

My second point was about existing SQLite database files that live in a
file system managed by some operating system. SQLite already foists that
work off on to someone else, this would be no different. It still may be a
bad idea, but that's not the reason why it wouldn't work. :)

-- 
Scott Robison


[sqlite] Database malformed

2014-09-13 Thread Harry Thompson
Hi all,

I have the following database where one of the entities in the database is
corrupted (dumping doesn't retrieve any data). Is there any way at all to
fix this database file or is it gone for good?

Harrys-Retina-MacBook-Pro:wetransfer-3bb6e0 Harry$ ls

ProjectDataBase.xcdatamodeld

Harrys-Retina-MacBook-Pro:wetransfer-3bb6e0 Harry$ sqlite3
ProjectDataBase.xcdatamodeld '.dump ZIMAGE'

PRAGMA foreign_keys=OFF;

BEGIN TRANSACTION;

CREATE TABLE ZIMAGE ( Z_PK INTEGER PRIMARY KEY, Z_ENT INTEGER, Z_OPT
INTEGER, ZCHANGED INTEGER, ZDBID INTEGER, ZDISPLAYORDER INTEGER,
ZPARENTPROJECT INTEGER, ZDATE TIMESTAMP, ZANNOTATIONPATH VARCHAR, ZCOMMENT
VARCHAR, ZIMAGEPATH VARCHAR, ZORDER VARCHAR, ZOWNER VARCHAR, ZTHUMBNAILPATH
VARCHAR, ZTITLE VARCHAR, ZDRAWNSTROKES BLOB, ZIMAGE BLOB, ZTHUMBNAIL BLOB );

/* ERROR: (11) database disk image is malformed */

/* ERROR: (11) database disk image is malformed */

CREATE INDEX ZIMAGE_ZPARENTPROJECT_INDEX ON ZIMAGE (ZPARENTPROJECT);

ROLLBACK; -- due to errors
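A quick first diagnostic (my addition, not from the thread) is to ask SQLite itself how badly the file is damaged, via PRAGMA integrity_check:

```python
# Sketch: run PRAGMA integrity_check against a database file.
# A healthy database answers 'ok'; a corrupt one returns a list of
# problem descriptions instead.
import sqlite3

def integrity_report(path):
    conn = sqlite3.connect(path)
    try:
        return [row[0] for row in conn.execute("PRAGMA integrity_check")]
    finally:
        conn.close()

# Demonstrated here against a fresh, healthy database:
print(integrity_report(":memory:"))  # ['ok']
```

If only some tables are unreadable, it is often possible to salvage the rest by .dump-ing the intact tables into a new database; recent versions of the sqlite3 shell also provide a dedicated .recover command for this.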




-- 
__
Harry Thompson
Managing Director
Veam Studios - http://www.veamstudios.com
Mobile: +44 (0)7951 563 942


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Howard Chu

Scott Robison wrote:
> On Fri, Sep 12, 2014 at 6:21 PM, Richard Hipp wrote:
>
>> On Fri, Sep 12, 2014 at 8:07 PM, Simon Slavin wrote:
>>
>>> one thing that annoys me about SQLite is that it needs to make a
>>> journal file which isn't part of the database file.  Why ?  Why can't it
>>> just write the journal to the database file it already has open ?  This
>>> would reduce the problems where the OS prevents an application from
>>> creating a new file because of permissions or sandboxing.
>>
>> Where in the database does the journal information get stored?  At the
>> end?  What happens then if the transaction is an INSERT and the size of
>> the content has to grow?  Does that leave a big hole in the middle of the
>> file when the journal is removed?  During recovery after a crash, where
>> does the recovery process go to look for the journal information?  If the
>> journal is at some arbitrary point in the file, where does it look?  Note
>> that we cannot write the journal location in the file header because the
>> header cannot be (safely) changed without first journaling it, but we
>> cannot journal the header without first writing the journal location into
>> the header.

Journaling filesystems already have this problem. By default they just use a
section of the partition, reserved at FS creation time. Which leads to the
problem already described in the video that started this thread - perform a
large enough write operation and you can exceed the fixed size of the
journal, which requires the journal data to be split, and the journal update
for the operation is no longer atomic.

Of course, most journaling filesystems also allow you to optionally specify
an external journal - i.e., instead of embedding the journal on the
filesystem's partition, you can use some other block device instead.
Naturally you can also choose a larger size when doing this. Putting the
journal on a separate device can bring some major performance benefits, as
well as accommodating larger transactions.

In the tests I did two years ago, JFS with an external journal was blazingly
fast. http://symas.com/mdb/microbench/july/#sec11

>> One idea that might work is to interleave the journal information with
>> the content.  So for each page in the database, there is a corresponding
>> page of journal content.  The downside there is that you double the size
>> of the database file without increasing its storage capacity.

This is why LMDB is much better suited to this task - it uses no journal at
all, nor does it require compaction/defragmentation/VACUUMing.

> A couple of academic thoughts.
>
> 1. If one wanted to embed the journal within the database, would it be
> adequate to reserve a specific page as the "root" page of the journal,
> then allocate the remaining pages as normal (either to the journal or the
> main database)? This does leave the big hole problem so it may still not
> be ideal, but it would give you a known location to find the beginning of
> the journal without doubling the database size or requiring an extra file.

Starting with a known location is definitely a step in the right direction.

> 2. Building on 1, could sparse files be used to accomplish this? Seek to
> "really big constant offset" and do all journaling operations at that
> point, allowing the operating system to manage actual disk allocation? If
> this were possible, deleting the journal would be a "fast" truncate
> operation. A custom VFS might be able to provide a proof of concept... hmm.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Simon Slavin

On 13 Sep 2014, at 9:16pm, Howard Chu  wrote:

> Scott Robison wrote:
> 
>> At the
>> very least (and I suspect more) Commodore 8-bit DOS (which was embedded
>> within a smart drive with its very own dedicated CPU & RAM) supported
>> (essentially) sequential byte stream files (no random seeking for these!)
>> and random access record oriented files (where the record size was set at
>> file creation time). Man were those a pain in the backside to use.
> 
> Now imagine writing an ftp client (or server) for one of these. I wrote both 
> for an IBM mainframe, way back when. 

You could have just used KERMIT.  Wait ... did KERMIT run on IBM mainframes ?



Hahahaha of course it did.

Simon.


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Howard Chu

Scott Robison wrote:
> On Sat, Sep 13, 2014 at 1:43 PM, Richard Hipp wrote:
>
>> Decades ago, files came in all kinds of varieties and permutations.
>> Details varied from one OS to the next.  But it was common to have a
>> distinction between text files and binary files (with different APIs for
>> accessing each.)  It was also common to have a difference between
>> "ordinary files" (that had to be read sequentially from beginning to end)
>> and "random-access files", which supported operations similar to lseek().
>> (Once again, completely incompatible APIs existed for accessing each file
>> type.)  With binary files, one often had to specify a "block size" which
>> was the increment in which the file was read and written.  The block size
>> was typically a property of the file and could not be changed after the
>> file had been created.  There were often restrictions on the permitted
>> values for block sizes.  And you couldn't ask the operating system to
>> tell you whether a file was text or binary or sequential or random-access
>> or what its block-size was;  you just had to know.  And bewildering
>> problems resulted if you got it wrong.
>
> And this was not true just of big expensive business class machines. At
> the very least (and I suspect more) Commodore 8-bit DOS (which was
> embedded within a smart drive with its very own dedicated CPU & RAM)
> supported (essentially) sequential byte stream files (no random seeking
> for these!) and random access record oriented files (where the record size
> was set at file creation time). Man were those a pain in the backside to
> use.

Now imagine writing an ftp client (or server) for one of these. I wrote both
for an IBM mainframe, way back when. You had to trust the user to select the
appropriate record mode for the ftp transfer to succeed without just getting
garbage on the other end.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Scott Robison
On Sat, Sep 13, 2014 at 1:43 PM, Richard Hipp  wrote:

> Decades ago, files came in all kinds of varieties and permutations.
> Details varied from one OS to the next.  But it was common to have a
> distinction between text files and binary files (with different APIs for
> accessing each.)  It was also common to have a difference between "ordinary
> files" (that had to be read sequentially from beginning to end) and
> "random-access files", which supported operations similar to lseek().
> (Once again, completely incompatible APIs existed for accessing each file
> type.)  With binary files, one often had to specify a "block size" which
> was the increment in which the file was read and written.  The block size
> was typically a property of the file and could not be changed after the
> file had been created.  There were often restrictions on the permitted
> values for block sizes.  And you couldn't ask the operating system to tell
> you whether a file was text or binary or sequential or random-access or
> what its block-size was;  you just had to know.  And bewildering problems
> resulted if you got it wrong.
>
> Then the boys at Bell Labs had the idea of "let's make every file be a
> sequence of bytes of arbitrary length".   Many observers scoffed and told
> them that you had to have all kinds of different types of files with
> different APIs for "efficiency".  But in the end, the Unix Filesystem was
> proven to be a Very Good Idea, and has become the norm ever since.
>
> You youngsters really have no concept of the chaos and nonsense we
> programmers had to put up with prior to the invention of the Unix
> Filesystem, do you?
>

And this was not true just of big expensive business class machines. At the
very least (and I suspect more) Commodore 8-bit DOS (which was embedded
within a smart drive with its very own dedicated CPU & RAM) supported
(essentially) sequential byte stream files (no random seeking for these!)
and random access record oriented files (where the record size was set at
file creation time). Man were those a pain in the backside to use.

-- 
Scott Robison


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Simon Slavin

On 13 Sep 2014, at 8:43pm, Richard Hipp  wrote:

> On Sat, Sep 13, 2014 at 2:46 PM, Simon Slavin  wrote:
> 
>> I would dispute that.  [snip]
> 
> Roger is correct.
> 
> [description]

Thanks for the correction.  Apologies to Roger for doubting him.

> You youngsters really have no concept of the chaos and nonsense we
> programmers had to put up with prior to the invention of the Unix
> Filesystem, do you?

Well, I started programming DEC mainframes in the 1970s but it seems I never 
used any calls which didn't allow random access to file contents.  What you 
describe does sound horrible to me.

Simon.


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Richard Hipp
On Sat, Sep 13, 2014 at 2:46 PM, Simon Slavin  wrote:

>
>
> > Before Unix came along it was quite common for files to be structured,
> > managed by the operating system and to be record based with file apis
> > working that way.  Unix turned files (and similar) into unstructured
> bags of
> > bytes.
>
> I would dispute that.  Unix is not at fault here.  Nor is Minix.  Even the
> mainframes of the day did not support transactional file systems
> routinely.  I can't think of any programming environment that 'expects' all
> file activity to be transactional.  We haven't got there yet.
>
>
Roger is correct.

Decades ago, files came in all kinds of varieties and permutations.
Details varied from one OS to the next.  But it was common to have a
distinction between text files and binary files (with different APIs for
accessing each.)  It was also common to have a difference between "ordinary
files" (that had to be read sequentially from beginning to end) and
"random-access files", which supported operations similar to lseek().
(Once again, completely incompatible APIs existed for accessing each file
type.)  With binary files, one often had to specify a "block size" which
was the increment in which the file was read and written.  The block size
was typically a property of the file and could not be changed after the
file had been created.  There were often restrictions on the permitted
values for block sizes.  And you couldn't ask the operating system to tell
you whether a file was text or binary or sequential or random-access or
what its block-size was;  you just had to know.  And bewildering problems
resulted if you got it wrong.

Then the boys at Bell Labs had the idea of "let's make every file be a
sequence of bytes of arbitrary length".   Many observers scoffed and told
them that you had to have all kinds of different types of files with
different APIs for "efficiency".  But in the end, the Unix Filesystem was
proven to be a Very Good Idea, and has become the norm ever since.

You youngsters really have no concept of the chaos and nonsense we programmers
had to put up with prior to the invention of the Unix Filesystem, do you?
;-)
-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Scott Robison
On Fri, Sep 12, 2014 at 6:21 PM, Richard Hipp  wrote:

> On Fri, Sep 12, 2014 at 8:07 PM, Simon Slavin 
> wrote:
> >
> >   one thing that annoys me about SQLite is that it needs to make a
> > journal file which isn't part of the database file.  Why ?  Why can't it
> > just write the journal to the database file it already has open ?  This
> > would reduce the problems where the OS prevents an application from
> > creating a new file because of permissions or sandboxing.
> >
> Where in the database does the journal information get stored?  At the
> end?  What happens then if the transaction is an INSERT and the size of the
> content has to grow?  Does that leave a big hole in the middle of the file
> when the journal is removed?  During recovery after a crash, where does the
> recovery process go to look for the journal information?   If the journal
> is at some arbitrary point in the file, where does it look.  Note that we
> cannot write the journal location in the file header because the header
> cannot be (safely) changed without first journaling it but we cannot
> journal the header without first writing the journal location into the
> header.
>
> One idea that might work is to interleave the journal information with the
> content.  So for each page in the database, there is a corresponding page
> of journal content.  The downside there is that you double the size of the
> database file without increasing its storage capacity.
>

A couple of academic thoughts.

1. If one wanted to embed the journal within the database, would it be
adequate to reserve a specific page as the "root" page of the journal, then
allocate the remaining pages as normal (either to the journal or the main
database)? This does leave the big hole problem so it may still not be
ideal, but it would give you a known location to find the beginning of the
journal without doubling the database size or requiring an extra file.

2. Building on 1, could sparse files be used to accomplish this? Seek to
"really big constant offset" and do all journaling operations at that
point, allowing the operating system to manage actual disk allocation? If
this were possible, deleting the journal would be a "fast" truncate
operation. A custom VFS might be able to provide a proof of concept... hmm.
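Point 2 above can be sketched in a few lines of Python (my illustration; the offset and payloads are invented, and whether the hole actually costs no disk space depends on the filesystem supporting sparse files, as ext4, APFS, and NTFS do):

```python
# Sketch: keep a "journal" at a huge fixed offset in a file and let a
# sparse-file-capable filesystem avoid allocating the hole in between.
# Deleting the journal is then a single fast truncate.
import os
import tempfile

JOURNAL_OFFSET = 1 << 32  # the "really big constant offset": 4 GiB

fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "r+b") as f:
        f.write(b"main database pages...")  # ordinary data at the front
        f.seek(JOURNAL_OFFSET)
        f.write(b"journal records...")      # journal far past the data
        f.flush()
        size_with_journal = os.fstat(f.fileno()).st_size
        f.truncate(JOURNAL_OFFSET)          # "delete" the journal
        size_after = os.fstat(f.fileno()).st_size
finally:
    os.remove(path)

print(size_with_journal > JOURNAL_OFFSET)  # True while journal existed
print(size_after == JOURNAL_OFFSET)        # True: journal bytes are gone
```

Note the logical file size stays at the journal offset even after the truncate; only the bytes past it are discarded.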

-- 
Scott Robison


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Simon Slavin

On 13 Sep 2014, at 7:01pm, Luuk  wrote:

> On 13-9-2014 19:45, Simon Slavin wrote:
> 
>> What I want is the entire volume as a database.  It would be something like
>> 
>> Key                       Example value
>> ---                       -------------
>> /robots.txt/path          /
>> /robots.txt/filename      robots.txt
>> /robots.txt/creationdate  whatever
>> /robots.txt/filetype      text
>> /robots.txt/creatorapp    /Applications/textwrangler.app
>> /robots.txt/content       BLOB
>> /index.html               /
>> /index.html/filename      /index.html
>> /index.html/creationdate  whatever
>> ...
>> 
>> The actual primary key would be a a two column index, of course.
> 
> My question would then be:
> Would the primary index not be the i-node number (in the case of an
> ext3/ext4 filesystem), or the sector number in the case of FAT?

Sorry, that still incorporates a sub-database disk format, and that's what I 
want to avoid.  The volume itself should be the equivalent of a SQLite 
database.  Sectors of the volume would be the pages of the SQLite database.  
The i-node numbers (or sector numbers) would be the pointers inside the SQLite 
database which normally point to page numbers in the database file.  For 
example, the binary trees mentioned in section 2.3 of



would contain page numbers which are, in this case, sector/inode numbers.  They 
wouldn't be integers stored in the database itself.

One big difference between this and how SQLite works right now is that the 
database would always take up the complete space reserved for the volume, thus 
making VACUUM do nothing.  Presumably some equivalent of VACUUM could be used 
to defragment tables and indexes, and it would have to be disabled for SSDs.  
Though with today's storage hardware defragmentation isn't of much benefit.

I must admit I'm having trouble getting my head around how to do journaling 
with this.

Another way of arranging the storage would, of course, be with every file a row 
of the Files table which has one column for each property stored:

Columns for the 'Files' table:

Path  Name        CreationDate  CreatorApp  Type  Content
/     robots.txt  1/1/2012      TextEdit    text  BLOB
/     index.html  1/1/2014      Webmaker    HTML  BLOB

Other tables in the database could be used to store other things.  Although the 
freespace and bad block list might be best stored as pseudo-files.
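The one-row-per-file variant maps directly onto an ordinary SQLite schema; a minimal sketch (mine, reusing the column names from the table above):

```python
# Sketch: Simon's 'Files' table as a SQLite schema, with the two-column
# (Path, Name) primary key. Row contents are invented for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Files (
        Path         TEXT NOT NULL,
        Name         TEXT NOT NULL,
        CreationDate TEXT,
        CreatorApp   TEXT,
        Type         TEXT,
        Content      BLOB,
        PRIMARY KEY (Path, Name)
    )
""")
conn.execute("INSERT INTO Files VALUES ('/', 'robots.txt', '1/1/2012', "
             "'TextEdit', 'text', X'0102')")

row = conn.execute(
    "SELECT Type FROM Files WHERE Path = '/' AND Name = 'robots.txt'"
).fetchone()
print(row[0])  # text
```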

Simon.


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Simon Slavin

On 13 Sep 2014, at 7:05pm, Roger Binns  wrote:

> Ever hear of Windows and Transactional NTFS :-)
> 
> http://msdn.microsoft.com/en-us/magazine/cc163388.aspx
> 
> It turns out that adding transactions indiscriminately doesn't magically
> make everything better and lots of thought does need to be applied to get
> everything right.  That leads to this:
> 
> http://msdn.microsoft.com/en-us/library/windows/desktop/hh802690(v=vs.85).aspx

I had heard of it but never encountered in the wild.  It was a good idea which 
I don't think can be implemented on any variation of NTFS.  For instance, a 
quote from your second reference:

"You shouldn’t use TxF if you have a system that will have long-running 
transactions. The definition of "long-running" here is relative, however. A 
long-running transaction is any transaction that has been alive longer than 
many other transactions within the same log. This could mean a few seconds if 
there are thousands of transactions happening and lasting for very brief 
periods of time. On the other hand, a day or two might not be a long time if 
there are very few transactions happening in the system."

So if I have a banking system that slowly does end-of-day processing, it can't 
be used on the same /volume/ that is being updated in real-time, even if 
they're working on completely different files.  That's not much use.

> Before Unix came along it was quite common for files to be structured,
> managed by the operating system and to be record based with file apis
> working that way.  Unix turned files (and similar) into unstructured bags of
> bytes.

I would dispute that.  Unix is not at fault here.  Nor is Minix.  Even the 
mainframes of the day did not support transactional file systems routinely.  I 
can't think of any programming environment that 'expects' all file activity to 
be transactional.  We haven't got there yet.

Simon.


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Howard Chu

Scott Robison wrote:

On Sat, Sep 13, 2014 at 9:39 AM, Richard Hipp  wrote:


I say that a filesystem is an eventually-consistent key/value database.
The keys are the filenames and the values are all big BLOBs, specifically
the file content.  Filesystems also have a hierarchical keyspace, which is
an extension from the usual key/value concept, but it is still key/value.



Dan Bernstein, author of qmail & djbdns (among others), used the file
system as a configuration database for those applications. Rather than
having a text configuration file, he used the directory and file names as
keys and their contents as values. I seem to recall him later regretting
this choice (in part, anyway) but I always thought there was a certain
elegance to that solution. It's not perfect, but what is?

OpenLDAP's config database currently uses the filesystem this way as well. 
It's no paragon of efficiency, but it doesn't need to be particularly 
performant in the first place, and it requires zero setup.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Scott Robison
On Sat, Sep 13, 2014 at 9:39 AM, Richard Hipp  wrote:

> I say that a filesystem is an eventually-consistent key/value database.
> The keys are the filenames and the values are all big BLOBs, specifically
> the file content.  Filesystems also have a hierarchical keyspace, which is
> an extension from the usual key/value concept, but it is still key/value.
>

Dan Bernstein, author of qmail & djbdns (among others), used the file
system as a configuration database for those applications. Rather than
having a text configuration file, he used the directory and file names as
keys and their contents as values. I seem to recall him later regretting
this choice (in part, anyway) but I always thought there was a certain
elegance to that solution. It's not perfect, but what is?
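[A minimal sketch of the filesystem-as-configuration-database approach described above: each key is a file name and each value is that file's contents.  The function names `put` and `get` are invented for illustration; they are not part of qmail or djbdns.]

```python
import os
import tempfile

def put(confdir: str, key: str, value: str) -> None:
    """Store a configuration value as a file named after its key."""
    with open(os.path.join(confdir, key), "w") as f:
        f.write(value)

def get(confdir: str, key: str, default=None):
    """Read a configuration value back; a missing file means a missing key."""
    try:
        with open(os.path.join(confdir, key)) as f:
            return f.read()
    except FileNotFoundError:
        return default

# Each setting becomes one small file, djbdns-style.
confdir = tempfile.mkdtemp()
put(confdir, "concurrencylimit", "20")
print(get(confdir, "concurrencylimit"))   # -> 20
print(get(confdir, "missing", "n/a"))     # -> n/a
```

[The appeal is that the "database" needs no parser, no schema, and no library; the cost is one inode per setting and no atomicity across keys.]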

-- 
Scott Robison


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Roger Binns
On 12/09/14 17:07, Simon Slavin wrote:
> Programmers don't expect file services to support transactions because file 
> services have never supported transactions. 

Ever hear of Windows and Transactional NTFS :-)

http://msdn.microsoft.com/en-us/magazine/cc163388.aspx

It turns out that adding transactions indiscriminately doesn't magically
make everything better and lots of thought does need to be applied to get
everything right.  That leads to this:

http://msdn.microsoft.com/en-us/library/windows/desktop/hh802690(v=vs.85).aspx

Before Unix came along it was quite common for files to be structured,
managed by the operating system and to be record based with file apis
working that way.  Unix turned files (and similar) into unstructured bags of
bytes.

Roger


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Luuk

On 13-9-2014 19:45, Simon Slavin wrote:


On 13 Sep 2014, at 6:16pm, Tim Streater  wrote:


On 13 Sep 2014 at 16:39, Richard Hipp  wrote:


I say that a filesystem is an eventually-consistent key/value database.
The keys are the filenames and the values are all big BLOBs, specifically
the file content.  Filesystems also have a hierarchical keyspace, which is
an extension from the usual key/value concept, but it is still key/value.


Key  Example value
---  -
filename somefile.txt
creation date
file typetext
opens with   textwrangler.app
content  
...

and so on.


That's not what I meant.  That's the file as a database.  What I want is the 
entire volume as a database.  It would be something like

Key Example value
--- -
/robots.txt/path/
/robots.txt/filenamerobots.txt
/robots.txt/creationdatewhatever
/robots.txt/filetypetext
/robots.txt/creatorapp  /Applications/textwrangler.app
/robots.txt/content BLOB
/index.html /
/index.html/filename/index.html
/index.html/creationdatewhatever
...

The actual primary key would be a two-column index, of course.

Simon.


My question would then be:
Wouldn't the primary index be the i-node number (in the case of an ext3/ext4 
filesystem), or the sector number in the case of FAT?


Files on disk get (or used to get?) fragmented, but for a database this should 
have no influence on the i-node where a file is stored.





Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Howard Chu

Simon Slavin wrote:

That's not what I meant.  That's the file as a database.  What I want is
the entire volume as a database.


That's exactly what I pointed you to before. The thesis is pretty enlightening 
too.


http://www.fsl.cs.sunysb.edu/docs/kbdbfs-msthesis/

--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Simon Slavin

On 13 Sep 2014, at 6:16pm, Tim Streater  wrote:

> On 13 Sep 2014 at 16:39, Richard Hipp  wrote: 
> 
>> I say that a filesystem is an eventually-consistent key/value database.
>> The keys are the filenames and the values are all big BLOBs, specifically
>> the file content.  Filesystems also have a hierarchical keyspace, which is
>> an extension from the usual key/value concept, but it is still key/value.
> 
> Key  Example value
> ---  -
> filename somefile.txt
> creation date
> file typetext
> opens with   textwrangler.app
> content  
> ...
> 
> and so on.

That's not what I meant.  That's the file as a database.  What I want is the 
entire volume as a database.  It would be something like

Key Example value
--- -
/robots.txt/path/
/robots.txt/filenamerobots.txt
/robots.txt/creationdatewhatever
/robots.txt/filetypetext
/robots.txt/creatorapp  /Applications/textwrangler.app
/robots.txt/content BLOB
/index.html /
/index.html/filename/index.html
/index.html/creationdatewhatever
...

The actual primary key would be a two-column index, of course.
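[Simon's volume-as-database layout above can be sketched with SQLite itself.  The table and column names are invented for illustration; the (path, attribute) primary key is the two-column index mentioned above.]

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE volume (
        path      TEXT NOT NULL,   -- e.g. '/robots.txt'
        attribute TEXT NOT NULL,   -- e.g. 'filename', 'creationdate', 'content'
        value     BLOB,
        PRIMARY KEY (path, attribute)
    )
""")
rows = [
    ("/robots.txt", "filename", "robots.txt"),
    ("/robots.txt", "filetype", "text"),
    ("/robots.txt", "content", b"User-agent: *\n"),
    ("/index.html", "filename", "index.html"),
]
db.executemany("INSERT INTO volume VALUES (?, ?, ?)", rows)

# One attribute of one "file" is a single primary-key lookup.
(ftype,) = db.execute(
    "SELECT value FROM volume WHERE path = ? AND attribute = ?",
    ("/robots.txt", "filetype")).fetchone()
print(ftype)  # -> text
```

[Listing a "directory" then becomes a range query on the path column, which a B-tree on the two-column primary key serves naturally.]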

Simon.


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Tim Streater
On 13 Sep 2014 at 16:39, Richard Hipp  wrote: 

> I say that a filesystem is an eventually-consistent key/value database.
> The keys are the filenames and the values are all big BLOBs, specifically
> the file content.  Filesystems also have a hierarchical keyspace, which is
> an extension from the usual key/value concept, but it is still key/value.

Key  Example value
---  -
filename somefile.txt
creation date
file typetext
opens with   textwrangler.app
content  
...

and so on.

--
Cheers  --  Tim


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread Richard Hipp
On Sat, Sep 13, 2014 at 11:01 AM, James K. Lowden wrote:

> I used "database" because that was his word.  Whether or not a
> filesystem is a database depends on the definition.  I would say that
> the files are data, but the filesystem is not a DBMS, partly for
> reasons you mention.  Do the files nonetheless constitute a database?
> I would say no, because they don't meet that treshhold.  Among other
> things, the filesystem offers no key: no way to identify the entities
> it stores.


I say that a filesystem is an eventually-consistent key/value database.
The keys are the filenames and the values are all big BLOBs, specifically
the file content.  Filesystems also have a hierarchical keyspace, which is
an extension from the usual key/value concept, but it is still key/value.

If your definition of "database" requires secondary keys, then lots of
things that go by the name "database" do not qualify.  Do you not consider
GDBM to be a database engine?  And yet it has only primary keys and values,
just like a filesystem.
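[The GDBM comparison above can be illustrated with Python's stdlib dbm family, which exposes the same access model: primary keys mapping to opaque byte values, and nothing else.  dbm.dumb is used here only so the sketch runs everywhere without the real GDBM installed; the paths and keys are invented.]

```python
import dbm.dumb
import os
import tempfile

# A dbm-style store offers exactly what a filesystem does at its core:
# a name (primary key) and a bag of bytes (value).  No secondary keys.
path = os.path.join(tempfile.mkdtemp(), "kv")
with dbm.dumb.open(path, "c") as store:
    store[b"/etc/hostname"] = b"example"
    store[b"/etc/motd"] = b"hello"

with dbm.dumb.open(path, "r") as store:
    value = store[b"/etc/hostname"]
print(value.decode())  # -> example
```

[Under this definition a filesystem qualifies as a key/value database just as GDBM does; what it adds is merely a hierarchical keyspace.]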

This idea of viewing a filesystem as a key/value database is developed
further in http://www.sqlite.org/affcase1.html and
http://www.sqlite.org/appfileformat.html

"Git" is a classic example of using a filesystem as a database.  The
repository history is stored in a "database" called ".git".  Compare this
with Monotone or Fossil which store the repository history in a relational
database.
-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] presentation about ordering and atomicity of filesystems

2014-09-13 Thread James K. Lowden
On Sat, 13 Sep 2014 01:07:59 +0100
Simon Slavin  wrote:

> > Implement a block-transaction store on the device
> > itself: no inodes, no directories, just writeable blocks managed in
> > transactions.  Build your DBMS on that.  
> 
> That would be ... erm ... perhaps a new disk volume format.  Where
> the blocks of the volume were pages of a SQLite database.  

Yes.  It can be done in userspace; Sybase calls it a "device".  Better,
surely, would be a kernel-side implementation, living alongside
FFS, ext4, etc., except *not* providing Posix filesystem services.  

As Richard pointed out, if the tranlog is in the "file" and the "file"
has observable size, then the "file" grows by the size of the tranlog.
In Sybase, the tranlog is a table in a "device", the size of which was
preallocated in a disk partition.  It's a different model.  As I said,
the filesystem affords conveniences to user and programmer both.  

> > I asked your question: why not add transactions to FFS?  
> > 
> > His answer: that's the province of a database.  
> 
> A file system is a database.  It accepts data from a user.  It stores
> it away for later retrieval.  It allows it to be searched in various
> ways more convenient than just reading it from beginning to end every
> time.

I used "database" because that was his word.  Whether or not a
filesystem is a database depends on the definition.  I would say that
the files are data, but the filesystem is not a DBMS, partly for
reasons you mention.  Do the files nonetheless constitute a database?
I would say no, because they don't meet that threshold.  Among other
things, the filesystem offers no key: no way to identify the entities
it stores.  If you say the offset into the file is the key, OK, but
then the "entity" is a byte, which is a pretty primitive database if
you ask me.  

I really think enriching the filesystem is not the way to go.  I saw a
presentation not long ago on running Postgres on ZFS.  ZFS, you may
know, has snapshots and transactions.  Now watch as the filesystem
journals writes to the transaction log, and takes snapshots of the
non-quiescent database files.  I didn't know whether to laugh or cry.  

--jkl