Re: [sqlite] Database corruption

2005-10-07 Thread Mrs. Brisby
On 2005-10-07 at 11:11, [EMAIL PROTECTED] wrote:
> "Mrs. Brisby" [EMAIL PROTECTED] wrote:
> > I had some MacOSX systems suffer a power failure, when they came back
> > I made some copies of the database and: [integrity_check failed]
>
> This suggests that you copied only the database file itself and not its
> rollback journal.  That can lead to corruption.

There was no rollback journal. This is why I'm worried.

> Failing that, you can try to do a ".dump" of the database into a fresh
> database.  If ".dump" gives problems, try running it separately for
> each table in the database file.  You should be able to recover most of
> your data that way, but some patching and fixing will still likely be
> necessary once you are done.

.dump appears to work for most data. I'm using this.
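A sketch of that recovery path, using Python's sqlite3 module (iterdump() emits the same SQL the shell's ".dump" command prints; the file names here are examples, not the poster's actual paths):

```python
import sqlite3

def dump_into_fresh(src_path: str, dst_path: str) -> None:
    # Dump the damaged database into a brand-new file, statement by
    # statement, skipping anything that trips over a corrupt page.
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    dst.isolation_level = None       # autocommit: let the dump's own BEGIN/COMMIT run
    for stmt in src.iterdump():      # yields CREATE/INSERT statements like .dump
        try:
            dst.execute(stmt)
        except sqlite3.DatabaseError:
            pass                     # skip statements that hit corrupt pages
    src.close()
    dst.close()
```

As the thread notes, some patching and fixing will likely still be needed on whatever survives the dump.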




[sqlite] Database corruption

2005-10-07 Thread Mrs. Brisby
I had some MacOSX systems suffer a power failure, when they came back I made 
some copies of the database and:

sqlite> PRAGMA integrity_check;
*** in database main ***
On tree page 39134 cell 65: 2nd reference to page 33247
On tree page 39134 cell 65: Child page depth differs
On tree page 39134 cell 66: Child page depth differs
Page 33220 is never used
wrong # of entries in index t1
SQL error: database disk image is malformed

I need to repair these databases in a hurry, but I'm not 100% sure where to 
start.

No, they're not all the same.

Any suggestions?



RE: [sqlite] CROSS keyword disables certain join optimizations

2005-09-12 Thread Mrs. Brisby
On Mon, 2005-09-12 at 07:15 -0400, Ned Batchelder wrote:
> I'm not sure if we agree or disagree.  You say:
> 
> > I don't think the number of optimizer controls should grow. 
> > It should be kept as close to zero as possible.
> 
> But then you say:
> 
> > If [the optimizer] can make a mistake, I want a simple
> > and _unobtrusive_ way to correct _just_that_mistake_ for
> > _just_that_query_.
> 
> As the optimizer grows, the number of mistakes it could make will grow. You
> seem to want a control for each mistake.  The number of controls will grow.

I don't want a control. A control is a conscious device, and I want no
part of one. Workarounds like swapping the order of entries, or
performing some bogus event, etc., are things that the SQLite list can
mention, that can always be considered SQLite oddities, but that users
shouldn't be expected to try "on their own".

These controls _are_bugs_. Period. Comments and pragmas have this nasty
tendency to stick around longer than necessary.

I realize that if the Pg people have decided _this_particular_ control
is necessary- that is, that the bug cannot be fixed- then pragmas are
not the way to go; instead, SQL itself needs to be fixed.

I like, however, to follow the conventions set up by others, so I think
CROSS isn't such a bad idea as a result.


That all said, it seems like this problem is already solved- SQLite does
the right thing after ANALYZE is called. Perhaps it wouldn't be too
difficult to update the statistics that ANALYZE collects after
COMMIT, or perhaps after a COMMIT that runs longer than a particular
amount of time.

This would obviate the need for a control, and as I've said, controls
are bad.
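For reference, the statistics that ANALYZE collects are ordinary rows in the sqlite_stat1 table, so refreshing them after a bulk commit is cheap. A minimal illustration in Python (table and index names are invented):

```python
import sqlite3

# Populate a table with an index, then run ANALYZE so the planner
# has fresh statistics to use when ordering joins and index scans.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(a INTEGER)")
con.execute("CREATE INDEX t_a ON t(a)")
con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
con.commit()

con.execute("ANALYZE")            # repopulates sqlite_stat1
stats = con.execute("SELECT tbl, idx, stat FROM sqlite_stat1").fetchall()
print(stats)                      # one row per analyzed index
```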


> As a few of us have mentioned, other database engines have used inline
> pragmas in the form of specially formatted comments:

Other database engines are wrong so...?

Other database engines support [] for attribute boundaries or backslash
as an escaping character. I suppose these should be used too? After all,
they solve a very specific non-problem at a horrible expense to the
user, so they must be good right?

Note what I am saying: _optimizer_ controls are bad. They make the user
miserable because they lead the user to believe that their SQL is
invalid or incorrect instead of Pg or DB2 or MSSQL or MySQL or SQLite
being incapable.

gcc -O2 makes faster code as long as it's not smaller. Why isn't it
default?

find -noleaf on the converse works around a bug in the kernel. Why not
fix the kernel? -noleaf should do NOTHING on current systems.

Why do I need to intermix -O0 on some non-speed sensitive code because
of gcc bugs, and why should I the programmer bother?

I am much more willing to accept and work around the failures of gcc,
some unixes, and SQLite than I am to have my own intelligence insulted
by a program I know very little about.

... after all, if I try a new program and it tells me I performed some
kind of syntax invocation error, I tend to believe it. I think _I_ must
have done something wrong. Controls simply reinforce this mind-think.


> select /* sqlite_stupid */ A.parent, A.child from /* sqlite_stupid */
> revision_ancestry ...
> 
> New keywords will also work.  The comments have the advantage that they are
> "portable": you can pass this exact query to any database.  Only SQLite will
> interpret the comments.  If SQLite invents a new STUPID keyword (or
> SQLITE_STUPID), the query will only be parsable by SQLite.

Agreed. That's exactly why I recommend them over code-hiding.

Code-hiding has the distinct advantage of keeping bugs around. People
love the fact that they can hide CSS from MSIE so that they can pretend
to have fully conforming XHTML1.1+CSS web sites that look absolutely
stunning in some latest version of Firefox- but that doesn't change the
fact that MSIE had a bug in it. People who visit a website that looks
like junk _NEVER_ say "oh, my web browser is being a piece of shit."

Instead, by forcing these bugs to be deliberate and visible, we prevent
people from deploying broken code to broken systems, but STILL allow
people who "have no other option" to proceed with the full knowledge
that a future SQLite release will require they change their code.


> Granted, using pragma comments in queries is unlikely in a system where
> queries are fed to multiple DBMS's, but at least it would be possible.

Which brings me back to PostgreSQL - which was commented as not having a
special pragma for this but instead a sly little hack identical to the
one SQLite is using. People who target PostgreSQL will already have had
the fix.


> -Original Message-
> From: Mrs. Brisby [mailto:[EMAIL PROTECTED] 
> Sent: Sunday, 11 September, 2005 10:07 PM
> To: sqlite-users@sqlite.org
> Subject: RE: [sqlite] CROSS keyword disables certain join optimizations
> 
> On Sat, 2005-09-10 at 21:38

RE: [sqlite] CROSS keyword disables certain join optimizations

2005-09-11 Thread Mrs. Brisby
On Sat, 2005-09-10 at 21:38 -0400, Ned Batchelder wrote:
> Rather than overload an existing SQL keyword, would it be possible to
> provide pragmas to control the optimizer?  Assigning meanings to particular
> combinations of SQL queries won't scale as the number of optimizer controls
> grows.

I don't think the number of optimizer controls should grow. It should be
kept as close to zero as possible.

The query optimizer isn't (and shouldn't be) designed to help out people
who don't know SQL, but should be for people who know SQL but don't know
SQLite.

When SQLite does this table-reordering optimization, it's doing so
because it has no optimal way to answer the query as stated- but if the
query is transformed into something seemingly (and sometimes
algebraically) equivalent that it can answer well, then it performs
that transformation.

Consider a mythical SQL engine that doesn't reorder WHERE terms before
searching an index of the form "a,b".

It can answer queries like:
WHERE a='abc' AND b='def';

but give it:
WHERE b='def' AND a='abc';

and it could take forever. Anyone reasonably aware of the various SQL
engines out there would undoubtedly know of engines that DO treat the
two as identical, but here is one that doesn't.

Given that this is indeed a _query_optimization_, do you really want a
control for this?

While I'm sure many SQL engines don't put this nonsense in the block of
code that looks for query optimizations, I want the query optimizer to
do the best job that it can. If it can make a mistake, I want a simple
and _unobtrusive_ way to correct _just_that_mistake_ for
_just_that_query_.
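For what it's worth, SQLite itself treats the two orderings identically; a quick check with EXPLAIN QUERY PLAN (table and index names invented):

```python
import sqlite3

# Both WHERE orderings against a two-column index: the planner reorders
# the terms, so both queries search the same index.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(a TEXT, b TEXT)")
con.execute("CREATE INDEX t_ab ON t(a, b)")

plan1 = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE a='abc' AND b='def'").fetchall()
plan2 = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE b='def' AND a='abc'").fetchall()
print(plan1[0][-1])   # the detail column names the index used
print(plan2[0][-1])
```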

I think anyone close to the various SQL working groups-that-be should
consider a "STUPID" operator that takes the next term (whatever that
might mean to the SQL engine) and stops any query optimization on that
term. The STUPID keyword would be allowed everywhere. It'll improve the
readability of SQL as it is:
SELECT STUPID A.parent, STUPID A.child STUPID FROM STUPID
revision_ancestry STUPID AS STUPID A ...


I don't like CROSS, but it is the closest thing to what I might find
acceptable.

:)



Re: [sqlite] CROSS keyword disables certain join optimizations

2005-09-11 Thread Mrs. Brisby
On Sat, 2005-09-10 at 19:25 -0400, D. Richard Hipp wrote:
> Beginning with SQLite 3.2.3, the query optimizer has had
> the ability to reorder tables in the FROM clause if it thinks
> that doing so will make the query run faster.  This has caused
> a few problems for some folks.  To ameliorate those
> difficulties, I have now modified the query optimizer so
> that it will no longer reorder tables in a join if the join
> is constructed using the CROSS keyword.  For additional
> detail see
> 
>   http://www.sqlite.org/cvstrac/tktview?tn=1414
>   http://www.sqlite.org/cvstrac/wiki?p=QueryPlans
> 
> Thoughtful feedback on this design choice will be appreciated.
> I'd like to do a release of version 3.2.6 containing this and
> other changes on or before September 16.
> 
> The rationale behind using the CROSS keyword to disable an
> optimization is that the CROSS keyword is perfectly
> valid SQL syntax but nobody ever uses it so I figured 
> we can put it to use to help control the optimizer without
> creating any incompatibilities.

The ticket mentions that postgres uses the same heuristic? Is there some
Pg documentation that demonstrates this?



Re: [sqlite] How to reduce the flash erasing times when I using sqlite

2005-09-11 Thread Mrs. Brisby
Don't?

Use a :memory: table or a ramdisk. Periodically (or at commit time?)
save a copy of the database elsewhere- perhaps in another form- perhaps
in a form that's less pleasant to query quickly.
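A sketch of that approach using Python's sqlite3 backup API (Python 3.7+): the working database lives in RAM, and a snapshot is written to flash as one sequential pass instead of many small in-place page updates. The file name is just an example.

```python
import sqlite3

# Work entirely in memory; flush a snapshot to persistent storage
# only at commit points.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE log(t INTEGER, msg TEXT)")
mem.execute("INSERT INTO log VALUES (1, 'hello')")
mem.commit()

def snapshot(mem_conn, path):
    disk = sqlite3.connect(path)
    mem_conn.backup(disk)        # copy the whole in-memory database to disk
    disk.close()

snapshot(mem, "snapshot.db")
```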

On Wed, 2005-09-07 at 14:48 +0800, SH L wrote:
> I found sqlite will update some fixed places
> frequently in the database file when an insert, update or
> even create operation happens. It means that if my
> database file is in flash and I do high-frequency inserts
> or updates (about 3000/day), the flash will die after a
> while. How can I solve this problem? Thank you.



RE: [sqlite] Encryption and fixed header values (was Re: [sqlite] Why can i open a textfile?)

2005-08-29 Thread Mrs. Brisby
On Fri, 2005-08-26 at 12:53 -0400, Griggs, Donald wrote:
> Regarding Mrs. Brisby's ending comment:
>"A better defense: use a different key each time. Encrypt the session key
> separately."
> 
> I may be way off in asking this, but:
>   Since we're talking about the encyption of data at rest, and aren't in a
> client-server situation, how could sqlite make session keys work?   What
> would constitute a session?

The lifetime of the session is the time between encrypting the data and
decrypting it. It doesn't have to be on line.

PGP (for example) encrypts the message using symmetric ciphers (like RC4
or Blowfish) using a completely random key. It then encrypts the key
using the public-key algorithms desired (RSA, DSA) and writes the
encrypted key.

This works because the receiver decrypts the encrypted random key and
uses it to decode the rest of the message. It has some fantastic
benefits:

1. Asymmetric ciphers are slow; this is much faster.
2. If any information is leaked about the "real key", it's very little.
Often very few bits of the real key are needed.
3. Brute-force attacks on the message will yield the session key, but
brute-force attacks on the encrypted session key won't yield enough
information to reconstruct the key.


This is similar to how SSL works as well, although the exchange is done
"on line" - which is probably why you're confused about it.

This is almost certainly what people are talking about when they refer
to session keys.


With some ciphers (like RC4), the session key can be obtained through
other means. Because RC4's encryption/decryption routine permutes the
keyspace, encrypting _anything_ causes permutations to occur.

But because those permutations are wildly predictable (at first), many
people discard the beginning of the RC4 keystream.

I disagree with this. I suggest taking some random junk and encrypting
it, then throwing away the result. This permutes the keystream state
faster, and it means that no plaintext is accidentally encrypted twice.

The random junk isn't secret. You can store it at the head of your
ciphertext. This "random junk" is often called an initialization vector-
and is a common way to reuse keys with more safety.

This "IV" and "key" combination are often considered "the session key".
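A toy sketch of that construction in Python. RC4 is broken and shown only to illustrate the IV idea; here the per-message IV is mixed into the key (one common construction, not necessarily the exact one described above) and the predictable early keystream is discarded:

```python
import os

def rc4_keystream(key: bytes):
    # Key-scheduling: permute a 256-byte state array using the key,
    # then run the PRGA, yielding one keystream byte per step.
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        yield S[(S[i] + S[j]) % 256]

def encrypt(key: bytes, plaintext: bytes, discard: int = 1024) -> bytes:
    iv = os.urandom(16)                   # public per-message "random junk"
    ks = rc4_keystream(key + iv)          # fresh effective key for each message
    for _ in range(discard):              # skip the predictable early output
        next(ks)
    return iv + bytes(b ^ next(ks) for b in plaintext)

def decrypt(key: bytes, blob: bytes, discard: int = 1024) -> bytes:
    iv, ct = blob[:16], blob[16:]
    ks = rc4_keystream(key + iv)
    for _ in range(discard):
        next(ks)
    return bytes(b ^ next(ks) for b in ct)
```

Because the IV is stored in the clear at the head of the ciphertext, two encryptions of the same plaintext under the same long-term key produce different output.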



Re: [sqlite] Encryption and fixed header values (was Re: [sqlite]Why can i open a textfile?)

2005-08-29 Thread Mrs. Brisby
On Fri, 2005-08-26 at 16:21 +0200, F.W.A. van Leeuwen wrote:
> > 
> > The usual defense against this attack is to mix some random information
> > into the beginning of the plaintext.
> > 
> > A better defense: use a different key each time. Encrypt the session key
> > separately.
> > 
> 
> And /or start encrypting after the fixed header.

Agreed. Thanks for pointing this out.



Re: [sqlite] Encryption and fixed header values (was Re: [sqlite] Why can i open a textfile?)

2005-08-25 Thread Mrs. Brisby
On Wed, 2005-08-24 at 22:55 -0400, D. Richard Hipp wrote:
>   Weaknesses in RC4 have been found where
> the first few numbers coming out of the PRNG leak information about the
> key.  If an attacker can guess the first few bytes of plaintext, and
> hence guess the first few numbers from the PRNG, and can do this many
> many times (millions of times) then the attacker can eventually
> reconstruct
> the key.

I noticed this. You understate how much it helps. The first few cycles
of RC4 are so bad that key recovery is easy for modern general purpose
computers.

> The usual defense against this attack (and the one used by SQLite)
> is to discard the first 1000 bytes or so of information coming out
> of the PRNG.  No key information leaks into later bytes of the
> PRNG stream (at least as far as we know) so this secures the cypher
> from attack.

It doesn't need to leak information about the key. A combination
known-plaintext and known-ciphertext attack works very well against RC4.

http://groups.google.com/group/sci.crypt/browse_frm/thread/2716ac20a3fc9971/64eba041932a98ae?lnk=st=1=en

Since the header is well known, convincing the program to encrypt the
database (by say, making a change to it) several times allows the user
to collect some known-plaintext and lots of ciphertext very quickly.

The usual defense against this attack is to mix some random information
into the beginning of the plaintext.

A better defense: use a different key each time. Encrypt the session key
separately.



Re: [sqlite] Encryption and fixed header values (was Re: [sqlite] Why can i open a textfile?)

2005-08-24 Thread Mrs. Brisby
On Mon, 2005-08-22 at 16:39 -0500, Dennis Jenkins wrote:
> RC4 is basically an XOR against a huge one-time pad.

No it's not. The only thing like a one-time pad is a one-time pad.

What makes an OTP secure isn't the protocol, but where the bits come
from.

If they come from ANYWHERE but a truly random source then it is not
secure.

RC4/ARCFOUR isn't a random source of bits, so it isn't a one-time pad.


> That pad is creating using a "key scheduling algorithm".

No it's not. The key scheduling system moves S-boxes around. There are
exactly 256 S-boxes with RC4/ARCFOUR. If you really did perform an XOR
operation on those SBOXes you would have no security unless:
* your key was exactly 256 bytes
* your data stream was exactly 256 bytes

Nevertheless, it still wouldn't be much security, because you probably
didn't choose a strong key to begin with.


> The key state for the random number generator is 256 bytes of data and two 
> 8-bit indicies.  

No it's not. The substitution array (S-boxes) is 256 bytes. It
essentially translates one byte into another byte, then performs a
permutation on the S-boxes. That permutation is fixed. The S-boxes are
moved around each step, thus increasing the run-length. Unfortunately,
it's not enough- getting the initial bytes of the keystream makes it
much easier to reconstruct the rest of the swapping routine.

> There are several attacks that can be used to derive the original key 
> state, but they all require huge samples of data to analyze.

No they don't. That's the problem. They only need a large number of
_uses_ of the key, not large amounts of data. Adding some random junk to
the beginning of the plaintext would help some, but I don't think SQLite
does this.


> RC4 is not the strongest encryption available, but for most of us, it is good 
> enough.

... and it wouldn't matter anyway, because the key exists in the
software that decrypts it, which is much easier to get at.

> The best way to attack any system using sqlite w/ the crypto extension 
> is to hook the call to "sqlite3_key()" and just steal the rc4 key 
> directly.  Much easier than crunching numbers on a super computer.

At least on this, I agree.



Re: Re[10]: [sqlite] Accessing Database on Network

2005-08-10 Thread Mrs. Brisby
On Wed, 2005-08-10 at 11:30 +0200, djm wrote:
> Hello,
> 
> 
> >> My understanding, after all of your helpful comments,  currently is:
> >> 
> >> Simultaneous reading of a server file by different clients is safe, as
> >> long as the file doesnt get changed at all (at least in a time scale
> >> where the Client Os could cache it).
> 
> Mrs.> Remove your parenthesized exception and this is correct.
> 
> My current plan is:
> 
> Store the (readonly) database on the server and include version info
> in a table in the database. Each time the client app starts, it
> accesses the server database and checks the version info. Based on
> this it decides whether it needs to copy the database file locally
> (the server version is newer or there is no local version) or can use
> the current local version (the version of the server database is the
> same as the local version).

So the server database never changes?

How do new versions get published?

[[ This is why I suggested a protocol with well known semantics. ]]


> Its possible that more than one client will be accessing this version
> info simulataneously. Can there be any problem? Using sqlite to read
> this version info via a SELECT should map cleanly to a read only
> access to the disk file, right? This is the only point where imnot
> fully sure. According to the above should then be fine?

Read-only access isn't always "safe" in the sense that all of ACID can
be satisfied- on MS-DFS, for example.

> The server database will need to be updated at some stage. The sys
> admin will haye to be responsible for seeing that no client is
> accessing the file while its being updated (by disconecting the server
> from the network or doing the job at niight when all clients are off,
> or just making sure that none are running my app or whtever). What
> should I advise him is a necessary/reasonable procedure, and what are
> the worst case scenarios for each appraoch?

I've already told you what's safe. SQLite has a broad range of
requirements; if you cannot pin your own requirements down any better
than "the client knows more about engineering than I do" and "I have no
idea what operating system, networked filesystems, and platforms will
ever be used for this application", then you're either absurdly
future-proofing yourself, or you only think you are.

NFS has well known and well defined semantics. So does SMB/CIFS. Their
semantics can be made compatible with SQLite (using -osync and the URLs
I posted earlier). DFS, AFS and Bullet cannot be made safe with SQLite
unless the filename has the version encoded in it and you use purely
atomic file creation (rename).

HTTP is probably the simplest.

> As far as I can understand, the only thing that absolutely must be
> ensured is that no app starts after the sys admin begins replacing the
> file, and before he's finished, because then the client would
> possibly get corrupt data. This could be ensured eg by
> temporarily disconnecting the server from the network.

Are you trying to remove race conditions or not?

> Given this worst that could happen (which is extremely unlikely, but not
> completely impossible), due to client os caching, would be that
> upon restarting the clients then dont know that the serever database
> is newer, and dont copy it locally it, and instead continue to work with the
> old data (local version).

...

> Does this sound like a good approach?

No.

> >> The same applies to sqlite accesses since it just uses regular files.
> >> And when you only querying the database (not changing the data), you
> >> are just reading the database file .. ie sqlite doesnt do anything
> >> which would break the clients only reading condition.
> >> 
> >> Am I right so far?
> 
> Mrs.> Almost.
> 
> Mrs.> You make a distinction between Client and Server and I don't
> Mrs.> think you know who they are.
> 
> Whats that supposed to mean?
> 
> Mrs.> It's helpful to understand that the server cannot update files. Ever.
> Mrs.> Only clients can update files. The server coordinates these updates. In
> Mrs.> reality, in many cases, the server coordinates fine. In others, it fails
> Mrs.> miserably.
> 
> You can work with the server just as a normal machine. It happens to
> be functioning as a network server for clients on the network. Eg. the
> system admin can copy a database file from a cd onto the harddisk of the
> server. The server updates the database here, not the client.

No you cannot. That's the point. You only think you can. When you update
a file (say) on CIFS, this could invalidate oplocks and force clients to
redownload. Those notifications can reach clients at a different rate.

Are you trying to remove race conditions? Or are you simply trying to
make them harder to find?

> >> The size of a file depends for example on the cluster size of the
> >> underlying partition. And regardless is of little value in comparing
> >> two files; the same file size/name doesnt mean two files are
> >> identical.
> 
> Mrs.> The size of a file does not 

Re: Re[8]: [sqlite] Accessing Database on Network

2005-08-09 Thread Mrs. Brisby
On Wed, 2005-08-10 at 00:30 +0200, djm wrote:
> Hello,
> 
> Henry> No. It might be on your particular systems, but we cannot make
> Henry> a general case. It is possible that it will always be safe on
> Henry> (for example) Windows 2000, while not safe on NT or XP! It
> Henry> depends on how the OS handles caching. They are supposed to
> Henry> check on each file access to see if the file changed, but
> Henry> networks are slow, so they might not. Thus you could start
> Henry> reading data that is in the cache, which is out of date.
> 
> My understanding, after all of your helpful comments,  currently is:
> 
> Simultaneous reading of a server file by different clients is safe, as
> long as the file doesnt get changed at all (at least in a time scale
> where the Client Os could cache it).

Remove your parenthesized exception and this is correct.


>  Updaing the file might cause
> problems becuase of caching on the Client Os side, no matter how
> cleanly and isolated this update was done (eg all clients off, and
> update on server machine). (Im not sure I fully understand what terms
> like "atomic and "durable" mean in this context but I presume Ive
> gotten the jist?)

This is correct.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cossdk/html/857d145c-710d-4097-8ed6-df11e8d52228.asp

You should really know what atomic and durable mean.



> The same applies to sqlite accesses since it just uses regular files.
> And when you only querying the database (not changing the data), you
> are just reading the database file .. ie sqlite doesnt do anything
> which would break the clients only reading condition.
> 
> Am I right so far?

Almost. You make a distinction between Client and Server and I don't
think you know who they are.

It's helpful to understand that the server cannot update files. Ever.
Only clients can update files. The server coordinates these updates. In
reality, in many cases, the server coordinates fine. In others, it fails
miserably.

The file cannot be updated by anything but a server. If you rename() it
on the "local machine" you are doing that rename as a client- at least
as far as all of these statements about networked/distributed
filesystems being unsafe are concerned.


> >> In my opinion system time stamps etc are not a reliable means of
> >> comparing 2 files. Many things can change the timestamp of a file,
> >> without changing the contents, and one (server) os/filesystem can
> >> report a different file size to another (local) for the same file
> >> (contents). As I said already, I think having a version number
> >> embedded in the databse itself is much more relible.
> 
> Mrs.> Wrong. The sqlite database is binary. Any size check is going to
> Mrs.> be in 8-bit bytes and will always be the same- except for
> Mrs.> platforms that SQLite doesn't run on- any comparison with a
> Mrs.> system that doesn't have 8-bit bytes will have to be aware of
> Mrs.> that fact.
> 
> The size of a file depends for example on the cluster size of the
> underlying partition. And regardless is of little value in comparing
> two files; the same file size/name doesnt mean two files are
> identical.

The size of a file does not depend on the cluster size of the underlying
partition. That's how much SPACE the file takes up.

Whether or not your filesystem records the size of a file isn't
important. The concept of file size exists, and it has nothing to do
with clusters or blocks or extents.

As a matter of fact, almost all filesystems record the correct file
size.

CP/M is a notable filesystem that does not. AFAIK, SQLite won't run in
CP/M so this is moot.


The test is not to determine whether the files are identical, but
whether one has been changed. This method will occasionally download a
file that hasn't actually changed (though that's unlikely), wasting
some bandwidth in that unusual case. Downloading the file every time,
in contrast, always wastes bandwidth.


> Mrs.> You still haven't said what platform you're developing for.
> 
> Windows in particular, but also others.

http://support.microsoft.com/kb/q163401/
http://support.microsoft.com/kb/q148367/
http://www.winguides.com/registry/display.php/1295/

WARNING: in this mode, SQLite will be much much slower than your earlier
tests would indicate.


Other networked filesystems have other semantics. I'm not going to even
try to list them all.



> Mrs.> djm: You still haven't told us exactly what you're using.
> 
> Im currently using C++ (Qt) but also plan to use php with sqlite for
> other projjects. In either cases the app should be cross platform.

If you're looking for cross-platform you need to select protocols that
have well defined semantics. HTTP sounds like a good bet. Full file copy
sounds like a better one.



Re: Re[6]: [sqlite] Accessing Database on Network

2005-08-03 Thread Mrs. Brisby
On Wed, 2005-08-03 at 08:46 -0500, Henry Miller wrote:
> >In my opinion system time stamps etc are not a reliable means of
> >comparing 2 files. Many things can change the timestamp of a file,
> >without changing the contents, and one (server) os/filesystem can
> >report a different file size to another (local) for the same file
> >(contents). As I said already, I think having a version number
> >embedded in the databse itself is much more relible.
> 
> You should be running NTP (network time protocol) on all computers.
> This will keep all your system times to within milliseconds.   Unix
> systems keep the last modified times tamp separately.  Microsoft
> Windows sets (resets?  I can never remember) the archive bit, which
> could be abused to tell you when a file is modified - at the cost of
> breaking backups so I can't recommend it.

NTP isn't relevant. Set the mtime to whatever you saw on the server
using wstat() or utime() or what have you. Don't bother trying "to get
close".




Re: Re[6]: [sqlite] Accessing Database on Network

2005-08-03 Thread Mrs. Brisby
On Wed, 2005-08-03 at 10:34 +0200, djm wrote:
> Hello,
> 
> Wednesday, August 3, 2005, 4:41:24 AM, you wrote:
> 
> > No, none of those things are guaranteed. If there's even a single
> > writer it can be unsafe (consider AFS putting a corrupt journal up
> > that clients notice, download, and use to thrash their local copy).
> 
> But Im saying there -wont- be a single writer (except the rare update
> process on the server, during which all other clients wont be
> reading), and the question is it it always safe then?

The answer is "sometimes not". Sometimes, it's even "often not".

It really depends on your situation. It depends on what networked
filesystem you're using, and what operating system you're using.

In the case of UNIX and AFS- if you have all clients turned off - or you
do it during AFS replication, then yes.


> > Okay. that's what's important. That update procedure is completely
> > unsafe UNLESS you can guarantee that the sqlite database will be
> > overwritten atomically. Most operating systems don't have such an
> > operation- the closest thing being rename().
> 
> And this is presumably only important if another client is reading the
> file while its being updated. If no client is reading the file on the
> server during update (and none are writing anyways), then the os and
> the filesystem should be irrelevant, right?

It depends on what you mean by "no client is reading."
If you really mean the machines are turned off [and have no cache to
upload], then maybe.

If you mean that AND you're doing the update "on the server" (instead of
by using another node/client/workstation) then the answer is "probably".


> > If you're willing to do that, why not download the database
> > periodically? Why not store the database via HTTP or otherwise check
> > the mtime of it, and download it anew if the size/mtime doesn't
> > match?
> 
> > This'll be a lot safer AND provides a mechanism by which corrupt
> > databases can otherwise be transparently updated.
> 
> Downloading a copy of the database is indeed a good
> suggestion, but it doesnt change my original question.
> 
> In my opinion system time stamps etc are not a reliable means of
> comparing 2 files. Many things can change the timestamp of a file,
> without changing the contents, and one (server) os/filesystem can
> report a different file size to another (local) for the same file
> (contents). As I said already, I think having a version number
> embedded in the databse itself is much more relible.

Wrong. The sqlite database is binary. Any size check is going to be in
8-bit bytes and will always be the same- except for platforms that
SQLite doesn't run on- any comparison with a system that doesn't have
8-bit bytes will have to be aware of that fact.

Changing the timestamp COULD be enough to have clients interrogate it-
for example, perform some rsync operations on the file, or check in a
prearranged place for a generation number.

Either store the version/generation number in a separate file or hack
the database to store it in the header. IIRC there are a few unused
bytes there.

Your OS might have other freshness checks (unix, for example, can use
inodes if you always rename()), but even if they don't, there are other
mechanisms:

* Encoding the generation number/version into the filename
* Update some EA (attribute)
* Update the file's mtime to some value THAT IT WASN'T before (by
recording all used mtimes)

You still haven't said what platform you're developing for.

> So the question is still: is it always safe if 2 clients are
> simultaneously reading this version info from the database (or, for
> that matter- though it's not sqlite specific- if 2 clients are
> accessing/running the executable file)?

The answer is still one of "usually", "probably" or "maybe".

By using the methods I suggest the answer is "yes".

> > you said "I need to access a database on the network (x), [how do I]
> > do this safely (y)".
> 
>  I didn't. You're rephrasing my post to suit your means, and I think
>  the point is degenerating in the process. And as I said already I
>  wasn't trying to be rude, and I'm sorry if it seemed so. But if you
>  insist on interpreting it as such I suppose I can't stop you.

I'm showing you how I read your original post. Is this critically
different than what you're saying? If so, how?



Re: Re[4]: [sqlite] Accessing Database on Network

2005-08-02 Thread Mrs. Brisby
On Tue, 2005-08-02 at 17:36 +0200, djm wrote:
> Hello,
> 
> Thanks very much for the detailed and helpful response. I'll certainly
> have a look at the book you recommended.. thanks for the tip.
> 
> Sunday, July 31, 2005, 4:47:11 PM, you wrote:
> 
> > As a result, any piece of software that relies on these semantics is
> > bound to be surprised.
> 
> But in this case the piece of software is sqlite, and I'm not familiar
> enough with its locking and other internals to know exactly how safe
> it is to do what. As far as I can tell it's just reading and writing
> sql commands that are of primary interest, and it's up to sqlite to
> make this as safe as it can, and to state under which conditions it
> can't. The information on the website was pretty vague. It shouldn't
> really be a prerequisite to be intimately familiar with various
> operating systems and/or filesystems to use sqlite or another
> database on a network effectively (not that such knowledge wouldn't
> be very desirable, and I will read the book you recommended).

I disagree with that: see below.

> > As a classical example: AFS deals with this problem by copying the file
> > locally (completely!) before allowing the client to read it. Once the
> > last client closes it, any changes are pushed up blindly. An AFS enabled
> > SQLite is bound to produce corrupt databases if there is more than one
> > writer (!)
> 
> What do you mean "pushed up blindly".

I mean that AFS will actually upload the file overwriting anything else
there.

> > NFS caches blocks together before sending them. This violates the
> > single-byte atomicity of UNIX's own filesystem. As a result, no amount
> > of journalling will help- but at least rename() is still atomic. So with
> > NFS, one should write the new database - completely from scratch, and
> > rename it over the original. Clients should NEVER update a shared file
> > on NFS. Ever.
> 
> How is that different to the above? Isn't the local copy above copied
> over the original on the server?

No. The cache isn't write through in the case of NFS (with -osync), and
the lock rpc causes the read-cache to be invalidated anyway.

> > .. you really shouldn't expect any writers to function on SQLite in
> > a reliable manner...
> 
> Fair enough. I had presumed this already. However is it always 100%
> safe to do simultaneous reads (with all sqlite versions (3 and 2) and
> on all os's it runs on)? And does what one reasonably presumes is a
> read (non write) in terms of sql commands (something that shouldn't
> change the data in the database) always translate to an open file in
> read_only mode for the os?

No, none of those things are guaranteed. If there's even a single writer
it can be unsafe (consider AFS pushing up a corrupt journal that clients
then notice, download, and use to trash their local copy).

If the file is never written to, then why is it on the network (besides
for distribution)?


> >> > If there's never any writers, why bother keeping it on the network?
> >> 
> >> I do have reasons, most of which are based on particular customer
> >> requirements.
> 
> > Customers never say "the database has to be stored on the network so
> > it's slower."
> 
> > They might put something else there- and that's my question: what is the
> > exact requirement that makes you want to keep it on the network _IF_
> > there aren't any writers?
> 
> My customer wants the app installed cleanly on the network (server) and
> all clients to be oblivious to any updates to app or database (presuming
> they're not running the app while it's being updated). The app should
> also work for the other customers just running it locally.

Okay: that's what's important. That update procedure is completely
unsafe UNLESS you can guarantee that the sqlite database will be
overwritten atomically. Most operating systems don't have such an
operation- the closest thing being rename().

If you're willing to do that, why not download the database
periodically? Why not store the database via HTTP or otherwise check the
mtime of it, and download it anew if the size/mtime doesn't match?

This'll be a lot safer AND provides a mechanism by which corrupt
databases can otherwise be transparently updated.

> > And don't just say "I do have reasons" when you're asking a question.
> > It's really rude:
> 
> > http://www.catb.org/~esr/faqs/smart-questions.html#id3002514
> 
> Sorry if I sounded so but I didn't mean to be rude. I just thought that
> stating the motivation behind my question would be wasting bandwidth, since
> the question was valid (and in my opinion interesting) whatever the
> motivation. However I didn't ask "how to use x to do y". I asked "can I
> do x" and you asked "why don't I do y instead". And I said that I'd
> still like to know, even out of curiosity, if it's safe to do x ;-)

you said "I need to access a database on the network (x), [how do I] do
this safely (y)".

The answer is "it's almost never safe" but nobody wants a short answer-
they want to know why- with 

Re: [sqlite] Multi-threading.

2005-07-30 Thread Mrs. Brisby
On Sat, 2005-07-30 at 20:29 +0200, Jan-Eric Duden wrote:
> Mrs. Brisby wrote:
> 
> >On Sat, 2005-07-30 at 14:30 +0200, Jan-Eric Duden wrote:
> >  
> >
> >>Win9X doesn't support the API async file operations.
> >>WinNT/2K/XP does support it.
> >>
> >>
> >
> >It supports everything it needs to:
> >
> >http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/waitformultipleobjects.asp
> >
> >
> >it doesn't support async file NOTIFICATIONS, (afaik), but those aren't
> >necessary...
> >
> >  
> >
> How should waitformultipleobjects help with async file io?

It doesn't. Read below.

> For async file io on Win32 you would open a file with CreateFile and 
> pass the  FILE_FLAG_OVERLAPPED flag.
> After that, functions like WriteFile and ReadFile work asynchronously. 
> Unfortunately, on Win9X/ME it is not supported for files on disk.
> For example, see the docs of WriteFile/ReadFile/WriteFileEx/ReadFileEx:
> *"Windows 95/98/Me: *For asynchronous write operations, /hFile/ can be a 
> communications resource opened with the FILE_FLAG_OVERLAPPED flag by 
> *CreateFile*, or a socket handle returned by *socket* or *accept*. You 
> cannot perform asynchronous write operations on mailslots, named pipes, 
> or *disk files*. "

No you don't. It's not just that Windows 3.11/95/98/ME don't support
FILE_FLAG_OVERLAPPED - it's that their filesystem isn't reentrant: the
kernel cannot context-switch in the middle of a filesystem operation.

You use a large read-size (bigger than the disk block) and it can
context-switch in between disk operations. Threads look like a win here!

You can also use the disk-block reads. The operating system you're
talking about is uniprocessor only. Process control returns to your
program just as fast as a thread-switch can occur, and you've saved a
context-switch.

Use WaitForMultipleObjects to check your event pump after each ReadFile
or WriteFile operation.

[[ as a side note: no, it's not really that easy. If the disk is asleep,
spun down, or if you run two programs you need to do extra work:
* wake up the disk
* lock the underlying physical disk
Because your code is simpler, these things aren't such a big deal. ]]


I will personally advocate splitting processes into a worker-master
situation- I just don't recommend pool-threading. That's a lie. I
recommend against pool-threading.



Re: [sqlite] Accessing Database on Network

2005-07-30 Thread Mrs. Brisby
On Thu, 2005-07-28 at 11:28 +0200, djm wrote:
> Hello,
> 
> The documentation suggests that its unsafe to use SQLite when the
> database file is on a windows network server and various other
> machines may want to simultaneously access it. If however none of
> these machines change the data in the databse (all accesses are just
> queries) is it then completely safe?
> 
> Presumably only querying the database (and not inserting/altering etc)
> translates internally in sqlite to opening the database in read_only
> mode, which is presumably safe to do in several clients concurrently?
> If so from what version of windows/sqlite can I be confident?

Understanding what underlying operations aren't safe is helpful.

If there's never any writers, why bother keeping it on the network?

Have your program "check the network server" for a newer version of the
file and copy it down if necessary.

This will reduce network traffic because you won't be checking for locks
every query...



Re: [sqlite] Multi-threading.

2005-07-30 Thread Mrs. Brisby
On Sat, 2005-07-30 at 14:30 +0200, Jan-Eric Duden wrote:
> Win9X doesn't support the API async file operations.
> WinNT/2K/XP does support it.

It supports everything it needs to:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/waitformultipleobjects.asp


it doesn't support async file NOTIFICATIONS, (afaik), but those aren't
necessary...

> Regarding threads:
> 
> >Threads complicate things. People think they "understand threads" long
> >before they actually do. Goto complicates things. People think they
> >"understand goto" long before they actually do.
> 
> That's a completely normal property of human nature. You should get used to 
> it.
> This behavior is not limited to the issue of using thread or goto.
> It applies to everything that needs some skill.
> Most people just don't know what they can do or can't do until they fail.
> Even if they fail they think they succeeded.
> To make things worse, once you taught someone to know better, then someone 
> else will take his/her place.

Agreed.



Re: [sqlite] Multi-threading.

2005-07-29 Thread Mrs. Brisby
On Fri, 2005-07-29 at 14:29 -0700, Robert Simpson wrote:
> Threads are a tool ... sortof like a chainsaw.  Used properly, they can save 
> you time and accomplish your goal.  In the hands of an expert, you can even 
> make some impressive art.  Well, perhaps the latter only applies to 
> chainsaws, but you get the idea.  In the hands of the ignorant or 
> inexperienced, you can easily lop off an arm or a leg, or getting back to 
> threading, seriously undermine the integrity of your code.  Threading, like 
> chainsaws, can be dangerous if not used properly.

Agreed.


> > 20 years ago, people said the exact same thing about goto- the places
> > where it can help are in programs that don't often use them, but the
> > places where it DOESN'T HELP- and actually hurts, are the ones using
> > goto.
> 
> I used goto's back when I was programming on the Vic-20 and I only had 3k of 
> RAM to work with -- every byte was valuable back then.  I also couldn't 
> understand the various branches of my code a week later when I went to debug 
> it, so I had a love-hate relationship with gotos.

Like chainsaws, gotos aren't useful for every task.

> I don't use them primarily because they, like threads, are easily abused and 
> can potentially make your code extremely hard to follow.  Case in point, one 
> of the older programmers in my office uses goto's liberally (on one project 
> we discovered a 1:9 ratio of goto's in his code -- meaning 1 goto for every 
> 9 lines of code on average).  His code is unintelligible.  He has labels 
> inside his for() loops and goto's jumping in and out of them.  He's got code 
> leapfrogging all over creation, and its the sickest ugliest thing I've ever 
> witnessed.  Who in hell even knows if the stuff works -- its all but 
> impossible to understand without running it.

Agreed.

> > Now, if you're feeling like you're on a high horse, go ahead: point out
> > a single common platform where threads are the necessity.
> 
> That argument is silly.  There exist no *platforms* where multithreading is 
> a necessity.  The platform doesn't dictate whether threading is useful or 
> warranted for a given application.  The programmer decides that.

Hence the point. Look at the parent and grandparent. There are many
people posting on this thread that _believe_ that threads are a
necessity.

> The real problem many coders encounter when writing multithreaded 
> applications is that some of the underlying libraries they may be using may 
> include optimizations that make the library inherently unsafe in a 
> multithreaded environment.  For example, a library that uses a global 
> variable to store a handle to a database file, and all instances of the 
> object simply refer back to that global handle.  You could code all the 
> synchronization objects you like and make sure all the object instances 
> never cross a thread boundary, but in the end if the underlying library's 
> objects are all sharing a global variable unbeknownst to you, then your 
> code is still going to have a potentially lethal race condition.  I think 
> *this* is the major issue people are concerned about when they come to the 
> SQLite list asking if SQLite is threadsafe.

Right. But they don't ask that. They say "I found a bug in SQLite when
using threads".



Re: [sqlite] Multi-threading.

2005-07-29 Thread Mrs. Brisby
On Fri, 2005-07-29 at 16:22 -0600, K. Haley wrote:
> Mrs. Brisby wrote:
> 
> >Now, if you're feeling like you're on a high horse, go ahead: point out
> >a single common platform where threads are the necessity.
> >
> >Here's some hints:
> >
> >* it's not windows
> >* it's not unix
> >* it's not java (closures)
> >* it's not lisp (closures)
> >* it's not forth (coroutines)
> >* it's not macos (see unix)
> >* it's not gtk+ (closures/event pump)
> >
> Argh...  gtk+ uses threads on windows.  The g_io_channel async api is
> implemented this way.  This is done because windows has no async file io
> api.  There may be other places/platforms where glib/gtk+ uses threads.

Err, Windows most certainly does have an async API.



Re: [sqlite] Multi-threading.

2005-07-29 Thread Mrs. Brisby
On Thu, 2005-07-28 at 07:42 -0500, Dennis Jenkins wrote:
> Mrs. Brisby wrote:
> 
> >meanwhile, two threads attempting to multiplex access to a single sqlite
> >handle are a) bound to be confused, and b) need those resource locks
> >anyway.
> >
> >  
> >
> (background: I've been using threads on win32 since 1998.  Sometimes to 
> spread load across multiple CPUs, but usually just because I find it 
> simpler than state logic to do everything in one thread.  I'm not 
> entering into this debate on threads vs. non-threads.)

I find I use goto a lot. I find it's simpler than lots of if/break
scenarios.

I think you'd be surprised to find out that you might actually agree
with me :)


> I just want to know why anyone would code using multiple threads 
> accessing the same SQLITE connection object ( or handle or whatever you 
> want to call it).  I allocate a separate connection object for each 
> thread and have ZERO troubles with locking so long as I use transactions 
> properly.  Assuming a multi-threaded (with in the same process) 
> environment, what benefits are there to use a single (global?) SQLITE 
> object (protected by a mutex, semaphore, critical section, whatever)?  
> It seems so much more complicated and brittle.

The reason they would write their code this way is because they don't
know how to use threads. They think threads work differently than they
do.

And folks, this here is my whole point. Coming from the position of
someone who knows how to use threads: why would anyone choose to use them
incorrectly? The only reasonable answer is that they simply don't know
how to use threads.

Threads complicate things. People think they "understand threads" long
before they actually do. Goto complicates things. People think they
"understand goto" long before they actually do.

They think they know the dangers of threads, and they think they know
the dangers of goto.

And yet, there are folks who vehemently avoid "goto" as if it were a
plague, who simply don't understand why people would just as
aggressively reject "threads".



Re: [sqlite] Multi-threading.

2005-07-29 Thread Mrs. Brisby
On Thu, 2005-07-28 at 08:44 -0400, Eli Burke wrote:
> Mrs. Brisby wrote:
> > My PII-350 easily handles 2 million local deliveries per hour, and it
> > fork()s for each one.
> > 
> > As has been mentioned: If you think threads are going to make your
> > program fast, you don't know what you're doing.
> 
> Like it or not, the world is bigger than just Unix. Some of us have to
> write code
> for other platforms, or even cross-platform. You can preach until you
> turn green
> about the dangers of threads in the hands of all the rest of us dumb
> idiots, but that
> doesn't change the fact that sometimes they are an appropriate (or the
> only) solution.

Like it or not, you still don't have any idea what you are doing. They
[threads] are never the only solution, and they are rarely appropriate.

I'm certain I'm not being entirely clear on just how rare it is: Threads
so rarely help that the programs that they CAN help don't often use
them, and the programs that it DOESN'T HELP- nay actually hurts, are the
ones using threads.

20 years ago, people said the exact same thing about goto- the places
where it can help are in programs that don't often use them, but the
places where it DOESN'T HELP- and actually hurts, are the ones using
goto.

Now, if you're feeling like you're on a high horse, go ahead: point out
a single common platform where threads are the necessity.

Here's some hints:

* it's not windows
* it's not unix
* it's not java (closures)
* it's not lisp (closures)
* it's not forth (coroutines)
* it's not macos (see unix)
* it's not gtk+ (closures/event pump)

That said: Writing kernel processes for the linux kernel requires a lot
of thread-think in that you have to keep all of your memory access
private. I generally suspect, however, that the people having problems
with threads _there_ aren't the same people who are having problems with
threads _here_.


> Mrs. Brisby wrote:
> > Maybe people think that SQLite should serialize its own internals. Maybe
> > that's why people keep saying this.
> [snip]
> > meanwhile, two threads attempting to multiplex access to a single sqlite
> > handle are a) bound to be confused, and b) need those resource locks anyway.
> 
> I don't think anyone is asking for SQLite to be serialized, or for it to
> work properly
> when a db handle is reused by multiple threads. Yes, there are a lot of
> questions about
> threads on the mailing list. Maybe it is because Kervin Pierre is
> right-- the documentation
> RE threads is poor.

SQLite doesn't have anything to do with threads. That's the problem.

What people miss is that they either think memory-chunk serialization is
automatic, or automatically provided by SQLite.

But SQLite isn't different from the majority of third-party libraries
out there, and if _THIS_ is what is tripping people up, then it should
be all the more evident that thread-think makes people dumb.

[[ personally: I don't think this is what's tripping people up. _I_
think that they really don't know how to use threads in the first
place. ]]


> It's no big secret that Dr. Hipp is in the "threads are bad" camp, and
> so getting helpful
> information means spending hours reading through old mailing list posts,
> sorting through the
> chaff, trying to figure out what behavior applies to the current
> software version. (a number
> of significant threading issues have been resolved since 3.0 was
> released). There is a short
> article in the wiki on threads, but you have to know to look for it.

No. In order to use threads with SQLite you need to know how to use
threads. In order to use threads with ImageMagick, you need to know how
to use threads. In order to use threads with PostgreSQL you need to know
how to use threads. In order to use threads with libmodplug you need to
know how to use threads.

Notice a common theme? All of these libraries require the user do their
own serialization, and yet people continue to ask "how do I make
OpenLDAP work with threads" when they really mean "how do I use
threads."

There isn't any grey area here. Threads are complicated- so complicated
that they screw up otherwise promising programmers into thinking that
they're necessary and/or easy, when they're clearly neither.


> Consider the lone FAQ entry on threading. It contains two helpful bits
> of advice. 1) Don't
> reuse db handles between threads. 2) Don't reuse db handles after
> fork(). Now imagine if
> that information was actually in a useful place, namely in the
> sqlite3_open API reference.
> Perhaps that would cut down on some new user confusion?

I doubt it. It's worth a shot though.



Re: [sqlite] Multi-threading.

2005-07-27 Thread Mrs. Brisby
On Wed, 2005-07-27 at 13:31 -0400, Andrew Piskorski wrote:
> On Tue, Jul 26, 2005 at 10:21:22PM -0400, Mrs. Brisby wrote:
> 
> > That's incorrect. Threading increases development time and produces less
> > stable applications. In fairness: it's the skill level of the engineer
...
> When people - even smart and knowledgeable people - say "thread", they
> typically mean one of two related but quite distinct concepts, and
> they often don't say precisely which they mean:
[snip]

I don't think there's any confusion here: We certainly are referring to
shared memory concurrent multiprogramming. If someone was accidentally
referring to something else, then they can be safely excused.

This discussion was never academic, but practical. There aren't any
transactional memory systems being used here.

Javascript, LISP, Perl, TCL and Erlang aren't relevant, because people
using SQLite on those platforms with those languages aren't having the
multi-threading problems that spawned this discussion. People using
those languages to do real work aren't running into the problems that
we're talking about.

You brought up some excellent examples of "highly successful" programs
that use threads or thread-think. I commend that. It's also completely
irrelevant.

It's not that nobody can use threads safely- it's that those who can
rarely choose to use threads, because the skillset that comes with knowing
how to use threads safely is ancillary to the skillset that comes from
knowing how to use "goto" safely, and how to write serialized
event-driven applications easily.

I keep using terms like "thread-think" and "process-think" and
"event-think": It's evident that one can write event-driven programs
with threads (Amoeba), or processes-think programs with threads (perl)
or process-think programs with events (GNU Pth) or event-think programs
with processes (yay signals).

thread-think is the idea of having two basic constructs, "UP" and
"DOWN"- without these two constructs (and having their accepted meaning)
no memory access nor IPC is atomic.

process-think considers all memory access atomic, but not necessarily
IPC.

meanwhile, event-think considers memory access AND IPC access atomic.
This is as close to "transactional memory" as exists and is useful. It's
also the mindset that I suggest to new programmers, and the one I myself
find myself using often.


Sometimes it _does_ matter. Sometimes, with the aid of a profiler,
someone can find out exactly how to squeeze a little bit more
performance out of this application.

But: nobody on this list having problems with threads is using threads
for that reason. Or rather, if they are, they're not complaining.





Re: [sqlite] Multi-threading.

2005-07-27 Thread Mrs. Brisby
On Wed, 2005-07-27 at 09:08 +0100, Paul Smith wrote:
> At 03:21 27/07/2005, Mrs. Brisby wrote:
> >On Mon, 2005-07-25 at 09:48 -0500, Jay Sprenkle wrote:
> > > The theory has been proposed that threads aren't better than separate
> > > processes, or application implemented context switching. Does anyone
> > > have an experiment that will prove the point either way? It will have
> > > to be OS specific though, since I'm sure not all thread
> > > implementations are equal.
> 
> This page might be interesting.. http://john.redmood.com/osfastest.html

My PII-350 easily handles 2 million local deliveries per hour, and it
fork()s for each one.

As has been mentioned: If you think threads are going to make your
program fast, you don't know what you're doing.

Unit testing and a profiler will give much greater performance gains
than trying to emulate piece of shit applications like MailEngine.



Re: [sqlite] Multi-threading.

2005-07-27 Thread Mrs. Brisby
On Tue, 2005-07-26 at 23:20 -0400, Kervin L. Pierre wrote:
> We get all those threads related questions because SQLite
> is difficult to use with threads and has sparse sometimes
> conflicting documentation on the subject.

Maybe people think that SQLite should serialize its own internals. Maybe
that's why people keep saying this.

It shouldn't. Those serialization steps are expensive and not always
necessary. A programmer using thread-think should be able to create the
appropriate locks WHEN appropriate-

e.g. a "gui application with an SQLite worker thread" won't need any
locks on sqlite's data structures because only one thread will be
accessing them.

meanwhile, two threads attempting to multiplex access to a single sqlite
handle are a) bound to be confused, and b) need those resource locks
anyway.


> Don't get me wrong, again I am not complaining, it is *way*
> better than the other options out there.  But you can't
> expect someone to magically figure out an API based on
> one or two sentence definitions and comments in a header
> file without asking a few questions and making a few
> mistakes.

I _NEVER_ said that I expect people to figure out an API through magical
means.

I _would_ however expect people trying to use a third-party library to
already have isolated their problem _to_ that third party library.

I resent people saying: "I'm writing a program using threads and SQLite
doesn't work with it," when what they should be saying is "I don't know
how to use threads."



Re: [sqlite] Multi-threading.

2005-07-26 Thread Mrs. Brisby
On Mon, 2005-07-25 at 09:00 +0100, Ben Clewett wrote:
> I like what you say about computer science being a Science.  This is 
> exactly my point.  A science is a collection of theories which hold true 
> within a domain until otherwise disproven.  For instance Isaac Newton's 
> law of gravity and Einstein's law.  Both are true within a fixed domain. 
>   Both are used today.  Neither is truly correct.  There will be 
> another more complex theory in time.

You have that backwards. They remain theory until proven. Not the other
way around.

> This is the same with Threading.  There may be places where this is 
> useful.  There may be places where it should not be used.  Both are 
> theories within computer science which hold true in a limited domain. 
> So when I said this was my opinion, I should have stated this is my 
> theory within a fixed domain.  However when you stated that I was wrong, 
> I don't think this is compatible of your statement that Computer Science 
> is a Science.  Nothing in any science is either right or wrong.  That's 
> a Religion.

You have that backwards. Science most certainly has right and wrong. We
also find it useful to examine things that aren't known to be right or
wrong, but it is that that makes science different than religion.

> I don't want to spend ages on the argument, I can see there are 
> passionate views.  I only want to explore the idea that threading is a 
> viable strategy for a modern high level language and can produce stable 
> applications with low development time.

That's incorrect. Threading increases development time and produces less
stable applications. In fairness: it's the skill level of the engineer
that does this, but then, they _did_ choose to use a threaded development
model.

http://www.google.com/search?q=problems+with+multithreading

Maybe you're special. Most people aren't special and as soon as you
throw threads into the equation they can't find their ass with both
hands.


> For instance a call to a TCP blocking Wait.  It's perfectly possible for 
> one thread to cycle round handling GUI events, then checking on the port 
> for new packets.
> 
> But an alternate method is to partition the code space into two 
> autonomous threads using their own variable set.  Two simple sets of 
> code which are not coupled, and remain highly cohesive and encapsulated. 

That's why we have the select() system call. That's why GTK uses glib to
encapsulate file descriptors (including those of sockets) so that event
programming can allow the GUI events to be processed "immediately" and
then handle "other things" without worrying about threads.

If you use a language like Java or Javascript (or lisp or perl or etc,
etc) you get something called a "closure" - and they make using
event-programming-think a real pleasure, because you don't have to worry
about locking, or surprises because race conditions simply won't occur.

http://www.kegel.com/c10k.html

Should be read by anyone thinking they need threads "for performance
reasons".

>   Communicating through a thread-safe object.  Such code is fast to 
> write, easy to read, robust and does the job.  I can complete this in c# 
> in ten minutes, and it will not break.  With large amounts of time, 
> better methods may be available.  But this theory is viable within its 
> domain.

This is speculative, and doesn't provide anything useful to anyone
reading this thread.

>  > I wasn't about to consider Windows a modern GUI system :)
> 
> Are you saying it's not modern, or not a GUI?  It may not be perfect and 
> it is very complex.  But it's not about to go away, and it's installed 
> on modern computers.  More importantly, my views on threads remain.  If 
> you give the GUI its own thread, you have implemented a simple solution 
> to ensure that the correct cursor and mouse events are honoured.  Then a 
> worker thread is free to do whatever it likes without being coupled to 
> the GUI.  Simple robust coding using thread-safe objects.

It's a joke, and it's moot anyway. As Richard pointed out, Windows
doesn't demand threads from its programmers either.

> I am also interested in peoples views that threading is slower.  Since 
> all processors are likely to be hyperthreaded, multicore, or both.  I 
> can see no place where this will be true in the future.

Threading is often slower, but if you're worried about the performance
drops that _threads_ cause, you wouldn't use C#.

Threading is _harder_to_program_. It makes buggier programs. Fewer
people can follow and debug it as the program gets larger.

And worst of all: people are continually encouraged to use it for
exactly the opposite reason.


>  > Java uses references, not pointers.
> 
> Is there any important difference?  My point was not about language.  It 
> was a question to Dr Hipp about what he feels is missing from the Java 
> pointer (reference) model.  Perhaps I should have explained better.

I brought this up.

Pointers have their 

Re: [sqlite] Multi-threading.

2005-07-26 Thread Mrs. Brisby
On Mon, 2005-07-25 at 09:48 -0500, Jay Sprenkle wrote:
> The theory has been proposed that threads aren't better than separate
> processes, or application implemented context switching. Does anyone
> have an experiment that will prove the point either way? It will have
> to be OS specific though, since I'm sure not all thread
> implementations are equal.

Define "better".

It's evident that many bugs are more common, and some kinds of bugs only
appear in threading applications. It's also debatable which is easier to
program.

I think the only useful measure might be context-switch performance.

But note that if two processes (A and B) are doing rapid context
switches between each other, if their pages are all COW the page tables
don't need to be updated. Your overhead would at most be comparison
time. This behavior occurs quite frequently on unix systems.

The question one postulates at this point is whether comparing page
pointers is more expensive than locks/system calls. The answer seems to
be "usually not", although I have indeed run into cases where the answer
is "yes".

On other systems, consider _three_ processes: two that COW their page
tables and a third that has its own address space. Rapid context
switches between all three occur if the third process sits between each
of the other two (A->c->B->c->A->etc.) - such as under Windows NT
kernels, which move many operations like network I/O onto a separate
"process" complete with its own address space.

On those systems, locks are definitely cheaper.

However, note that even on those systems, workloads like that are
generally better optimized by moving both A's and B's tasks into a
single event-driven loop (and letting c use the other processor).


As a side note, many unix applications can be developed for a single
event-driven loop and then with a few well placed fork()s be able to
take advantage of multiprocessor systems immediately. If you take memory
from your stack- you could even use pthread_create().

In this situation, however, I'd like to point out that we're not using a
thread-programming-think even if the underlying technology might be
called a thread.




RE: [sqlite] Multi-threading.

2005-07-22 Thread Mrs. Brisby
On Fri, 2005-07-22 at 17:23 -0400, D. Richard Hipp wrote:
> On Fri, 2005-07-22 at 14:00 -0700, Tim McDaniel wrote:
> > It is hard, if not impossible, with modern GUI
> > systems to write any relatively complex app that is both performant and
> > graphically responsive without using threads.
> > 
> 
> Not true.
> 
> The event model works just fine for GUI programming.
> The bulk of my consulting practice (for the past 13
> years) has been writing high-performance GUIs running
> in front of numerically intensive scientific and
> engineering applications.  Most of these run on 
> windows - at least over the past 5 years.  None
> of them have ever used more than a single thread.

I wasn't about to consider Windows a modern GUI system :)



RE: [sqlite] Multi-threading.

2005-07-22 Thread Mrs. Brisby
On Fri, 2005-07-22 at 14:00 -0700, Tim McDaniel wrote:
> > History has demonstrated that programmers building 
> > multithreaded applications tend to produce buggier code, and 
> > code that touches more pages than a non-threaded version. As 
> > a result, the non-threaded version is easier to write, safer, 
> > and runs faster.
> 
> So, what's your point?  That writing things the easy way leads to safer,
> less buggy, faster code?  That's hardly a point.  The original poster
> presented one of the more compelling reasons for multi-threading in
> modern apps, the GUI.  It is hard, if not impossible, with modern GUI
> systems to write any relatively complex app that is both performant and
> graphically responsive without using threads.

I have no problems writing GTK applications without threads. I suspect
the toolkits you are familiar with are inadequate.

Most GUI toolkits rely heavily on callback interfaces. Closures and
signal-based event notification are much more beneficial than threads:
they produce fewer bugs, and are easier to program still.

But this isn't an argument for closures and signal-based event
notification. This is a general argument against threads.

> At least for the short term, Moore's Law is slowing down, we might as
> well start calling it Moore's Dream.  All main CPUs are going multicore,
> even game consoles, and one of the only realistic ways to take advantage
> of that is through multi-threading.  Saying it is hard doesn't change
> reality.

The other "realistic" way is multiple processes, which is the point I
argue for.

Threads can be useful, but they're only faster than processes if they
touch less pages.

Threads require more memory (synchronization primitives, etc) and many
languages that encourage threading provide no mechanism to isolate those
privileges onto a few pages.

Those page hits tend to cause MORE page-table access than in
similarly-written programs that use processes.

I use threads sparingly, and when I do, I try to keep all memory access
on the stack so that I touch no heap memory (or as little as possible)
and therefore need few synchronization primitives.

I tend to not see very many programmers doing this. I suspect this is
because programmers believe "threads are faster and cheaper, so everyone
should use them."

In general, threads don't solve problems because the problems that they
CAN solve, people rarely face, and the problems that they are ATTEMPTING
to solve can be solved in better, cleaner ways.


> > > As another user also mentioned, a Windows system works 
> > better with few 
> > > processes with many threads.
> > 
> > Windows uses threads because x86 page tables are expensive to 
> > load. It doesn't help: the system-call method Windows uses 
> > eats any benefit that it has, again producing net-zero.
> 
> This being THE reason Windows emphasizes threads over processes is hard
> to swallow.

What? That threads are cheaper than processes? If Windows has another
reason for using threads instead of processes I'd love to hear it.

With all of the downsides to threads, the only reasonable one is that
they _may_ be cheaper than processes.

Unfortunately, that textbook _may_ is a reality _isn't_ in most cases.

Nevertheless, my point to this statement was lost: Windows could
optimize process creation, but it could also optimize system calls.
Those steps would make the parent-threads' argument moot (not to mix
metaphors).


> > > I am also interested in your comments on Pointers and GoTo.  I note 
> > > that Java is 100% pointers.  Apart from basic types, all 
> > object access 
> > > is by pointer.
> > 
> > Java uses references, not pointers.
> 
> This is purely semantic nit picking.

That's your opinion. Other languages have both. When they do, the
distinction is more important.


> > > Using Exceptions correctly, I have never felt the need for a GoTo.
> > > Exceptions do the same as GoTo, except, maybe, in a slightly more 
> > > developed and useful way.
> > 
> > Exceptions are slower than goto. They are also less 
> > straightforward when deeply nested (long chains of throws XYZ 
> > come to mind...)
> > 
> 
> I would agree that exceptions are not a good replacement for gotos.
> However, having been a professional C++ programmer for over 10 years, I
> have never needed a goto.  This probably stems more from the fact that
> with C++/Java/C# you don't really need gotos, but with C/Basic/etc there
> are arguably things that you can't do, or would be quite hard to do,
> without gotos.

I'm glad I gave you an opportunity to post your CV. Meanwhile, while C
doesn't _need_ gotos:

for (i = 0; i < n; i++)
    for (j = 0; j < m; j++)
        if (q(i,j)) goto TAIL;
TAIL:

looks better than the alternative. Just because you don't _need_ a
condom doesn't mean it's not a good idea.

I think that avoiding goto is good for the same reason that avoiding
threads are good. Once you've gotten used to knowing why they're bad is
when you can begin to 

Re: [sqlite] Multi-threading.

2005-07-22 Thread Mrs. Brisby
On Wed, 2005-07-20 at 17:26 +0100, Ben Clewett wrote:
> Dr Hipp,
> 
> I am just playing devils advocate here because I have completed much
> Java programming in a multi-threaded application. :)
> 
> I understand the problems of multi-threading.  I am reminded that it
> took nearly 20 years of development to get multi-processor support in a
> modern OS stable.  Much success for this can be attributed to Semaphore
> Flags.  With CPU hardware support to ensure that the semaphore it's self
> cannot be controlled by more than one process.

ITC in 1970 supported multiple threads trivially.

> Multi-thread applications suffer the same problems.  Without semaphore
> flags or 20 years of development.  A novice programmer can happily
> create a second thread and quickly create terrible applications.



> However the need for multi-threads is compelling.  Especially in a GUI
> environment.  For instance a Mail reader.  Where one thread is needed to
> ensure the GUI is drawn correctly and respond to GUI events.  Another to
> download and dispatch mail.  (My Thunderbird has 10 threads.  This may
> be a bit of overkill :)

No. Threads are not a need. They allow you to use blocking system calls
in parallel without extra page table loads.

History has demonstrated that programmers building multithreaded
applications tend to produce buggier code, and code that touches more
pages than a non-threaded version. As a result, the non-threaded version
is easier to write, safer, and runs faster.


> As another user also mentioned, a Windows system works better with few
> processes with many threads.

Windows uses threads because x86 page tables are expensive to load. It
doesn't help: the system-call method Windows uses eats any benefit that
it has, again producing net-zero.

> I believe the issue is not whether to use threads, but to use them
> correctly.  Which is not a hard thing to do with a little support.



> This is where Java (and .NET) work well.  If you use them correctly.
> They provide thread-safe objects.  Which have been designed to use
> semaphore flags internally.  If the programmer uses these thread-safe
> objects correctly, they will not encounter thread issues.  For instance,
> all communication between threads should be exclusively through these
> thread-safe objects.

Java uses threads poorly. They're expensive to set up, and many Java
programmers yield to non-blocking methods as Java closures tend to be
easier to program, and faster too.

> Further, Java and .NET provide Sycronisation methods.  The defining of a
> method to be synchronised automatically creates the locks to ensure
> thread safe access.

...

> I am also interested in your comments on Pointers and GoTo.  I note that
> Java is 100% pointers.  Apart from basic types, all object access is by
> pointer.

Java uses references, not pointers.

> Using Exceptions correctly, I have never felt the need for a GoTo.
> Exceptions do the same as GoTo, except, maybe, in a slightly more
> developed and useful way.

Exceptions are slower than goto. They are also less straightforward when
deeply nested (long chains of throws XYZ come to mind...)

> These are just my opinions :)

They are wrong.



Re: [sqlite] Is there a new sqlite specifically design for WinCE.

2005-05-16 Thread Mrs. Brisby
Don't reply to existing messages with a new topic. People won't see your
message.

On Mon, 2005-05-16 at 08:24 -0400, steven frierdich wrote:
> Is there a new Sqlite designed specifically for WinCE? I am using a 
> Sqlite 3.0.7 version that was design for WinCE. Is there a newer one?
> Thanks
> Steve

I don't believe there is. If you found a 3.0.7 for WinCE, chances are it
was distributed with patches (if there were any) which can probably be
applied to later versions.



Re: [sqlite] InnoDB Transactions

2005-05-16 Thread Mrs. Brisby
On Sun, 2005-05-15 at 19:48 -0700, Dan Kennedy wrote:
> Anyone have a quick summary of InnoDB algorithm for comparison? I
> suspect whatever trick they use won't work with SQLite because
> there is no server process, but it could be interesting anyway.

There are several ways the extra fsync() might be avoided:

1. using an internal journal, and an operating system guarantee that
earlier write() occurs before later write() when sent to disk at fsync
()-time. I do not know of any operating system that makes this guarantee
[and keeps it!]

2. write changes to "changeblobs" instead of journals and have sqlite
logically merge the changeblobs when loading (thus, original db is
modified less frequently).

However, because multiple files have to be searched, this slows down
reads quite a bit (unless the index were redesigned). This is a
tradeoff- writes _might_ be faster, but reads certainly aren't.

interestingly enough, lucene does this. actually, many real databases do
this because it's very easy and allows you to use structures that are
less mutable in the database, and thus often simplifies implementation.

this wouldn't require database-structure changes (actually), but it
would mean that the database would have to be recovered/vacuumed before
it would be a single file again, and thus usable on older sqlites...

... however, hooking into the changeblob system makes single-master
replication very simple and efficient (without mucking about in the
users' code)


3. block-level versioning. use a buddy system to keep 2 copies of each
block. changes to a block write out to the new spot. each block would
have a chain to the next "changed block" known about [for recovery], and
a block bitmap would be used to maskout the unused blocks.

during recovery, the last changed block (in the header) is scanned and
each change in that set is checked against the bitmap. 

in this way, space is traded; speed remains "about the same", although
because the space is bigger, the kernel would need to be given tricks to
know when a block could be evicted safely. This is similar to the
problem faced by systems like UML and might require kernel patches to
avoid hurting the VMM.

if you allow (say) 8 copies of each block, you can detect when two
simultaneous writers are about to trample over each other VERY easily,
although you limit yourselves to 4 copies.

It is NOT a simple modification of this to generalize to "n copies" of
each block, as you then need a serialized index recording how many
copies of each block there are.



RE: [sqlite] backup open database

2005-05-01 Thread Mrs. Brisby
Alrighty: embedded hardware _does_ make sense.

Some options:

0. gzip the dumped database. don't save indexes, etc. Rebuild the
database on startup. You can even diff against older copies if your
flash has a [significantly] limited number of write cycles.

this is probably the best/easiest method, but if you don't have control
of the startup routine, or can't guarantee some code is run at shutdown,
it probably won't work for you.

1. Examine the sqlite source code: write a routine that locks the main
db exclusive, checks for the -journal file, and as necessary
unlocks/waits until it can get a lock with the -journal file missing.
Use this, then copy the database normally.

This requires some coding, and gaining understanding of what sqlite uses
various files for. It is the least invasive on your existing system,
even if it's probably the hardest to get right.

2. Use a ramdisk, and put LVM on it. Or if this is not a linux or
freebsd-based embedded box, alter your operating system to allow you to
temporarily defer all filesystem operations "somehow".

Fork the filesystem using LVM's snapshots (freebsd has a similar
invention, ms windows has shadow copies, but i don't know as to their
availability in the embedded world). If you can defer filesystem
operations, simply copy the database+journal to another directory while
all other filesystem operations are suspended.

Use sqlite's open on the database+journal to recover it on the forked
volume, then close: copy the resulting db to flash.

This is actually really simple, and could be made very general for
saving all kinds of configuration data. I use a method very similar to
this for router configuration as I can simply do a snapshot and cpio
directly into flash (skipping my temporaries as I do an
O_EXCL->link->unlink to guarantee atomicity), then delete the snapshot.

It does (however) take advantage of the fact that sqlite is crashproof.
It'll do its recovery later.

On Sun, 2005-05-01 at 10:37 +0800, Damian Slee wrote:
> hi,
> i had have a look, i may be able to use this.  this will be in an embedded 
> hardware application, so i don't have a command line, only what i code in. 
> 
> i was hoping to do a binary copy of the db from a ram file system to flash 
> memory for permanent storage.  flash is really slow, so a copy would be a lot 
> quicker than a series of .dump Inserts.  which really requires sqlite running 
> on a flash file system as well.  a copy is really all i want, but knowing 
> when everything is commited, or locking out other threads without them having 
> to close the db.
> 
> thanks,
> 
> damian
> 
>  
> 
> 
> 
> 
> From: Mrs. Brisby [mailto:[EMAIL PROTECTED]
> Sent: Sun 1/05/2005 1:07 AM
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] backup open database
> 
> 
> 
> Why is it inappropriate to use:
> 
> sqlite originaldb ".dump" | sqlite backupdb
> 
> 
> On Sun, 2005-05-01 at 00:28 +0800, Damian Slee wrote:
> > hi all,
> > i want to make a copy of a sqlite3 database file while it is open.  is 
> > there anyway that i can tell programatically that any caches/journals are 
> > flushed out? or is there any way to get and exclusive lock i guess, before 
> > making a backup copy?
> >
> > my proposed application would have quite a few threads which may have the 
> > same sqlite db open for read or write (rarer), but i dont want to shut down 
> > the application to definitely know it is safe to copy.
> >
> >
> > thanks,
> >
> > damian
> 
> 
> 
> 



Re: [sqlite] backup open database

2005-04-30 Thread Mrs. Brisby
Why is it inappropriate to use:

sqlite originaldb ".dump" | sqlite backupdb


On Sun, 2005-05-01 at 00:28 +0800, Damian Slee wrote:
> hi all,
> i want to make a copy of a sqlite3 database file while it is open.  is there 
> anyway that i can tell programatically that any caches/journals are flushed 
> out? or is there any way to get and exclusive lock i guess, before making a 
> backup copy?
>  
> my proposed application would have quite a few threads which may have the 
> same sqlite db open for read or write (rarer), but i dont want to shut down 
> the application to definitely know it is safe to copy.
>  
> 
> thanks,
>  
> damian



Re: [sqlite] using cross-link table

2005-03-26 Thread Mrs. Brisby
On Sat, 2005-03-26 at 10:29 -0500, Jason Jobe wrote:
>   // to find people whose great-grandfather's name is Fred
> Well, this is probably correct sql but hopefully you get the idea. So
> part of the question is:
> 
>   1. How efficient is this?

Not at all. It doesn't work. If you meant:

select p1.* from person as p1, person as p2, person as p3
where p3.name='Fred' and
p3.parent_rowid = p2.rowid and p2.parent_rowid = p1.rowid;

then it's still not very good- as fast as three integer lookups and one
string lookup. An index would help that string lookup. More so if it
could be unique.




>   2. Is there an ordering that would make it much more efficient?

Ordering?

>   3. Would it be better to write the custom function "related-to" to use
> its own index?

Not likely.

>   4. Can a custom function leverage the sqlite data structures is some
> low level ways to gain efficiency?

Sort of.

SQL doesn't have trees. It looks like you're willing to commit to the
cost of maintaining transitive closure of graphs, but I don't believe
you have to. See:
http://www.intelligententerprise.com/001020/celko.jhtml



Re: [sqlite] Prepared Statement Interface

2005-03-26 Thread Mrs. Brisby
On Sat, 2005-03-26 at 15:15 -0600, Eric Bohlman wrote:
>   #define SEX "'M'"
> > sqlite3_bind_text(stmt, index, SEX, sizeof(SEX), 
> > SQLITE_TRANSIENT);

[note the extra ' in SEX]

> Change sizeof to strlen here (and in the corresponding statement below); 
> sqlite3_bind_text wants the length of the string, not the size of the 
> pointer to it.

sizeof(SEX)-1 is fine.



Re: [sqlite] Handling database updates in my program updates

2005-01-29 Thread Mrs. Brisby
None of this is necessary if you select a durable schema.

Whenever you think you need to "add a field" - add a whole new table and
use joins anywhere you need access to the new field.

You can't "delete" a field, but deleting a field usually means losing
data anyway.

You can't change the nature of a field without changing the domain that
the data exists in. Keeping strict 1NF tables can really help avoid
this.


Your "update procedure" wouldn't be necessary at all.



On Sun, 2005-01-30 at 11:44 +1300, Murray Moffatt wrote:
> I'm creating a shareware program using Visual Basic and SQLite.  I'm using  
> the SQLiteDB wrapper.
> 
> I want to plan how I am going to handle updates to my program that involve  
> modifing the database, and thus I'd like some advice as I'm sure others  
> have faced this problem before.  I need to be able to send out an update  
> to existing users which keeps all of their existing data in tact.  I  
> imagine that most of the updates will involve adding new fields.
> 
> At the moment my plan is to have a Control table in the database that  
> includes a field that holds the version number of the database.  Then when  
> my program starts up it can check this version against its own version and  
> if the database version is lower then it can tell the user that they need  
> to upgrade their database and automatically run an Update program that  
> will be included with all update releases.
> 
> The Update program will have an empty copy of the latest database (i.e.  
> tables all set up but no actual data), and will import all the records  
>  from the old database into the new one.  Then rename the old database  
> (thus keeping a copy in case something goes wrong) and copy the new one in  
> its place.
> 
> One problem I thought of is what happens if a user skips an update and  
> thus is upgrading from, say, version 1 to version 3.  Rather than create a  
> convoluted system where that Update program can convert from any version  
> to any other, I would like to make the Update program as generic as  
> possible, i.e. it reads the old database and matches the fields in it with  
> the fields in the new database and copies that data.  In this way it won't  
> expect to see certain fields in certain versions, instead it just sees a  
> Name field in the old database and copies the data found in that field  
> into the Name field in the new database.  Obviously all new fields will be  
> empty.
> 
> Does this sound like a logical way of handling database updates?  Does  
> anyone have any suggestions for a better method, or possible pitfalls in  
> my idea?
> 
> One assumption I'm making is that if I upgrade the verson of SQLite that  
> is used by my program then I assume that newer versions will always be  
> able to read databases created by older versions.  Is this correct?



Re: [sqlite] How to create and uses a stored procedure for Sqlite

2005-01-26 Thread Mrs. Brisby
Steve:

You really should point out if you're familiar with other environments,
and ask questions that represent your eventual goals.

You're fishing right now, and your questions are so loaded it's not even
funny.

I'm going on a hunch here, but I suspect you might be familiar with [a
particular vendor] SQL server because you keep mentioning triggers and
stored procedures for things that I understand aren't easy to do
_without_ triggers and stored procedures on that system.

SQLite is _not_ a replacement for that package. SQLite is a way for
programs that contain most of their business logic _outside_ the SQL
server to carry a full relational system and use it for _storage only_.

SQLite is a replacement (or possibly a supplement) for systems like
MySQL or Postgres which are attracting _new users_ to store data, and
possibly constraints for that data inside this remote, blackbox, thing
they keep calling "SQL Server".

But like most MySQL and Postgres applications, the business logic for
SQLite applications is NOT going to exist in the SQL server. It's going
to exist [almost entirely] inside your application.

This is why people keep mentioning TCL, or C [or perl, or python or
ruby]- that's where your business logic has to go.

I suggested you use the SQLite "REPLACE" command for something that you
might've just used:

BEGIN; SELECT ...; UPDATE ...; COMMIT;

Both are _almost_ the same on SQLite, and most SQL-beginners have no
problem using that (if they don't know about REPLACE), but you said a
_single_statement_.

A single_statement has meaning on that other SQL server, but not here.
When using SQLite I even recommend that you use multiple queries (but
not a BEGIN-SELECT-UPDATE-COMMIT in a tight loop) as the query
analyzer/optimizer in SQLite still leaves much to be desired.

And should: SQLite is designed to be small and fast in "most cases".
Programs that use SQLite typically have less than 10% of their source
being SQL. The remainder "business logic" gets profiled. Rarely is it
one of that 10% of SQL, but then, 10 minutes of thought for hand SQL
optimization isn't so bad when you get to use a _normal_ profiler for
the other 90% of your code.

If you're looking for a way to recode business logic written in some
stored procedure-language, you really need to let the list know this.
There are quite a few really smart people on this list that are actually
familiar with your systems and have probably even done something quite
similar to what you're doing now.

Please take a moment to explain what your current project is so that we
can help you better.


On Tue, 2005-01-25 at 17:11 -0500, Steve Frierdich wrote:
> Anyone know of any links or have example on how to create a stored 
> procedure and how to use that stored procedure in code on a sqlite database?
> Thanks
> Steve
> 



Re: [sqlite] CHECKING fOR A STRING IN A COLUMN IN A TABLE AND IF STRING IS NOT IN THE COLUMN ADD IT

2005-01-25 Thread Mrs. Brisby
On Tue, 2005-01-25 at 09:34 -0500, Steve Frierdich wrote:
> Does anyone have any code they can send me that shows how to check for a 
> string in a column in a table, and if the string is not there on how to 
> insert the string in the column in the table in a new row. Then finally 
> on how to update the table?

You should look at the sqlite documentation for REPLACE.

> And if possible on how to maybe write a trigger for the above operation 
> and use the trigger in a Visaul C++ or a Window's 32 program?

I don't understand why you think you need a trigger for this.



Re: [sqlite] $ in table and column names

2005-01-18 Thread Mrs. Brisby
On Mon, 2005-01-10 at 18:00 -0500, D. Richard Hipp wrote:
> A user has reported a bug saying that SQLite does
> not allow the '$' in the middle of identifiers
> (without quoting).  The bug reports says that
> statements like this:
> 
> CREATE TABLE ex$1( col$abc INTEGER );
> 
> are legal and work fine in other database engines.
> This seems very odd to me.  Can anybody else
> confirm the use of '$' in the middle of identifier
> names in other database engines?

I find quoting to be satisfactory, and use it whenever attributes are
called for that aren't of the set [a-zA-Z]+

Nevertheless, I think it should be useful to understand why someone
would want this. I presume most people on this list are familiar with $-
expansion in most modern interpretive languages (esp. considering the
strong ties to php, tcl, and perl the list members seem to have).

Fortran allows the dollar-sign in variable names, and it's common! Many
Fortran programmers use the dollar-sign in the absence of real
structures and classes- I can imagine this making porting Fortran code
easier.

VMS uses the dollar-sign in a similar way- to separate classes (think:
SYS$CURRENCY or COB$CARDREADER :) )

Many REXX programmers use the dollar-sign as part of an identifier to
represent a kind of "global" scope (e.g. $.foo and EXPOSE $.) - some
REXX programmers end up "thinking" SQL in the same way.

Now a dissent: Some SQL interfaces (Informix comes to mind) use the
dollar-sign BECAUSE it's not allowed in SQL to mean "this follows is
SQL". I cannot imagine allowing the dollar-sign _in_ identifiers would
mess this up, but it might screw up Really Stupid Parsers[tm].

I don't think it would make many TCL or PERL programmers happy to see $
being what (in their mind) could be overloading. It might introduce
confusion. Simply requiring it be quoted could help alleviate it. Then
again, those programmers aren't likely to take advantage of it.

Again: I'm fine with quoting. I don't particularly like the idea of
barewords changing meanings, but I confess to taking advantage of it.



Re: [sqlite] sql tables. . .?

2005-01-08 Thread Mrs. Brisby
On Sat, 2005-01-08 at 00:11 -0500, John Richard Moser wrote:
> BTW, I was thinking that my package hierarchy would pretty much be lain
> out like
> 
> Category->SubCategory->Group->Data

Why would you think this? Did you read a book on relations or databases
(as already suggested) yet?

> Where Data would be a table of package name, version, files, etc, some
> of which would have to be tables.  This places a table (data) inside a
> table (group) inside a table (subcategory) inside a table (category)
> logically.

No, they would not have to.

Go get a book on the relational model (just about any decent book on
databases ought to at least have a primer on it).

What you want is called a "join".


> SQL databases, can they handle this literal hierarchy, or do I not have
> the ability to embed tables in tables?

Most SQL engines do not support tree-structures inherently, although
there are several very good ways of getting at it for most applications.

However, the application you describe does not appear to require a tree
structure. It can trivially be satisfied using a few relations.


> All content of all messages exchanged herein are left in the
> Public Domain, unless otherwise explicitly stated.

It may be your right to put your comments in the public domain, but it
is not your right to put my comments in the public domain.



Re: [sqlite] sql tables. . .?

2005-01-08 Thread Mrs. Brisby
On Sat, 2005-01-08 at 00:04 -0500, John Richard Moser wrote:
> What can be in an SQL table?

Tuples.

> Can tables contain tables?

Thankfully not.


>   I want to be able to search for a given
> package and find what files are in it, search for a file and find what
> packages supply it. . . what's the best way to do this?

CREATE TABLE data (package TEXT, file TEXT);

SELECT file FROM data WHERE package='desired';

SELECT package FROM data WHERE file='desired';


Note that indexes are probably desired.


> All content of all messages exchanged herein are left in the
> Public Domain, unless otherwise explicitly stated.

It may be your right to put your comments in the public domain, but it
is not your right to put my comments in the public domain.




Re: [sqlite] Precompiled statements in the TCL interface

2004-12-31 Thread Mrs. Brisby
On Fri, 2004-12-31 at 08:30 -0500, D. Richard Hipp wrote:
> The issue with the third approach is deciding when to clear the
> precompiled statement cache.  Precompiled statements use memory
> and we do not want them to hang around forever.

Why not? Programs that "generate" SQL are often foolish. Penalize them.

Perhaps that's too draconian.

Is it possible for the compile command to tell anything about what's
calling it (filename, line number; like perl's caller() function). If
so, simply reserve one cache entry per exec per file+line.



Re: [sqlite] PHP mods was: Changing table and security

2004-12-23 Thread Mrs. Brisby
On Thu, 2004-12-23 at 04:27 -0500, Randall Randall wrote:
> >>
> >> Arg! Just invites people to crash your web site.
> >>
> > *their* web site... think about phpbb or similar phorums
> 
> The unsaid part here, though quite off-topic, is that PHP bulletin
> boards are often changed with "mods" which alter the source of the
> PHP files, modify the database, etc.  The more of these you apply,
> the more likely that you've completely destroyed your installation,
> but the first few are likely to work great, and in the meantime a
> given mod can't assume that the original board's schema still applies.
> 
> I don't usually work with PHP, but a disproportionate amount of the
> time I have spent has been in applying and repairing such mods for
> clients.

No. The unsaid part is that the authors of "phpbb and similar phorums"
actually like corrupting databases and crashing web sites. I can't
believe they're stupid.

Each "mod" should create its own tables, say:

CREATE TABLE user_birthday (username TEXT, birthday TEXT);

and then when they want to get the "email" out of the regular "user"
table, they'll want a:

SELECT username,email,birthday FROM user JOIN user_birthday USING
(username);

and be done with it. There's absolutely NO REASON WHATSOEVER for two
modules to be borking around the database.

Even if they weren't clever enough for that, they might just have a:

CREATE TABLE userdata (username TEXT, key TEXT, value TEXT);

to go ahead and treat their MySQL as the nested associative array that
they've been doing all that nasty wrapping to get this whole time.


The only times I've EVER needed an ALTER TABLE are:
* to change the "DEFAULT" of a column
* to add an additional constraint
* to remove a constraint on data I'm no longer using.

That's it. SQLite already does all of these things just fine (albeit:
with triggers).
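
A minimal sketch of that trigger approach, using Python's sqlite3 module; the orders table and the non-negative-qty rule are invented for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (item TEXT, qty INTEGER)")

# Emulate an added CHECK-style constraint with a trigger:
# reject negative quantities at insert time.
con.execute("""
    CREATE TRIGGER orders_qty_nonneg BEFORE INSERT ON orders
    WHEN NEW.qty < 0
    BEGIN
        SELECT RAISE(ABORT, 'qty must be non-negative');
    END
""")

con.execute("INSERT INTO orders VALUES ('widget', 3)")       # accepted
try:
    con.execute("INSERT INTO orders VALUES ('widget', -1)")  # rejected
    rejected = False
except sqlite3.DatabaseError:
    rejected = True
```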

Other "needs" for ALTER TABLE are done during development, and I have no
problem taking the database offline during development to rebuild the
schema...


So tell your friends, tell your neighbors: Stop writing stupid SQL.



Re: [sqlite] OT: Reply-To Munging Considered Useful

2004-07-23 Thread Mrs. Brisby
On Fri, 2004-07-23 at 10:25, D. Richard Hipp wrote:
> Dan Keeley wrote:
> > Or, why dont we have a vote, and once it's decided, tuff luck thats how 
> > it stays? :)
> > 
> 
> Better:  Somebody please write me a simple, secure, mail handler to
> replace qmail/ezmlm that lets each user decide for themselves whether
> they want a Reply-To back to the mailing list or unmunged headers.
> 
> I'll be happy to supply volunteers with a detailed specification of
> what I am looking for in a mail system.

I can't. Your mailing list messages prevent me from replying to you. :)

ezmlm _will_ do this, btw. Simply create sublists; one with munging and
one without.



Re: [sqlite] syncronizing

2004-05-22 Thread Mrs. Brisby
On Sat, 2004-05-22 at 13:47, Puneet Kishor wrote:
> On May 22, 2004, at 12:31 PM, nathanvi wrote:
> 
> > In my office we have to share a databse (a list of addresses).
> > A lot of people has a notebook.
> > So my idea was:
> > - a file.sqlite on the server in which all people can add, delete,
> > select contacts
> > - when someone have to go away, he can copy this file on his notebook
> > and modify it when he is outside office
> >
> > The problem is when he return in office because i would like the copy
> > and the original in the server are syncronized...
> > Is it possible?
> > Is there an utility which does it?
> 
> I don't know of any utility that does synchronization, but I do know 
> that synchronization is not a simple problem. You have to make rules 
> for conflict-resolution when source and target are different, and just 
> looking at the modified date stamp might not be enough. For your 
> specific problem you might be better off with a tool suited for 
> something like this... Palm Desktop or other such address book type 
> programs come to mind. A database may not be the best tool for such a 
> thing. That said, of course, one could create a program that does such 
> a synchronization, but it may not be worth the trouble.
> 
> Sorry for no encouraging answer vis a vis SQLite.

I have a similar problem; for the synchronization step of my software,
the database is dumped to a text format, sorted on an object id and
version column, and then I use diff to produce a patch that I then
simply upload.

I then use rsync to download updates.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] Implementing LISTEN

2004-05-11 Thread Mrs. Brisby
The hard part is to do it portably.

I use a multicast socket that writes the change-event. Then I use a
select() call on the multicast socket in other clients- if they see the
change event, they incorporate the new data into their cache.

If you want it reasonably seamless, create triggers that evaluate a
dummy-function that calls your notify: (e.g. SELECT foo() FROM table
LIMIT 1)
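
A rough sketch of that trigger-plus-dummy-function scheme, using Python's sqlite3 module with a plain list standing in for the multicast socket; the notify_change name and table are made up:

```python
import sqlite3

events = []  # stand-in for writing change events to the multicast socket

con = sqlite3.connect(":memory:")
con.create_function("notify_change", 2,
                    lambda op, rowid: events.append((op, rowid)))

con.execute("CREATE TABLE t (x TEXT)")
# The trigger evaluates the dummy function once per inserted row.
con.execute("""
    CREATE TRIGGER t_notify AFTER INSERT ON t
    BEGIN
        SELECT notify_change('insert', NEW.rowid);
    END
""")

con.execute("INSERT INTO t VALUES ('hello')")
con.execute("INSERT INTO t VALUES ('world')")
```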



On Sun, 2004-05-09 at 18:39, Chris Waters wrote:
> Hi,
> 
> I need the ability to tell if changes have been made in the database. Kind
> of like the LISTEN capability in Postgresql. In particular I need to know
> if:
> 
> * A row was inserted.
> * A row was deleted.
> * A row was modified.
> 
> I would like to implement this through a callback that an application could
> register. The callback would specify the operation that was performed, the
> affected table and maybe the rowid affected.
> 
> If the rowid is reported then for operations that affect a large number of
> rows there would be a performance penalty to call the callback per row. For
> my application this is not a problem, but perhaps when the callback is
> specified the application could indicate whether it wants per row, or per
> table notification.
> 
> I am looking for ideas on where I should hook into the source code to add
> this functionality. From what I have seen so far it looks like I could
> change the implementation of the Delete and MakeRecord VM instructions to do
> the callback.
> 
> I haven't thought through the timing implications yet. Ideally when a row is
> inserted, code in the callback could query the newly inserted row. I am not
> 100% sure at what point the row becomes visible for query.
> 
> I would welcome any insight since this is my first attempt (of many I have
> lots of features I want :-) to modify the sqlite code.
> 
> Thanks,
> 
> Chris.
> 
> 
> 





Re: [sqlite] Password-function in SQLite?

2004-04-26 Thread Mrs. Brisby
You can create a new function to do this quite painlessly. Poke around
the wiki for information on creating a new function, or read the source.

But beware: consider creating one that uses SHA1 or MD5 as a hashing
function instead of your local systems' crypt() as not all crypt() are
created equal.
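
A sketch of such a hash function using Python's sqlite3 and hashlib; the sha1_hex name and users schema are invented, and a real deployment would also want salting:

```python
import hashlib
import sqlite3

con = sqlite3.connect(":memory:")
# Portable one-way hash; unlike crypt(), hashlib behaves the same everywhere.
# Note: an unsalted hash is only a sketch, not real password storage.
con.create_function(
    "sha1_hex", 1,
    lambda s: hashlib.sha1(s.encode("utf-8")).hexdigest())

con.execute("CREATE TABLE users (name TEXT, pwhash TEXT)")
con.execute("INSERT INTO users VALUES ('alice', sha1_hex('secret'))")

row = con.execute(
    "SELECT name FROM users WHERE pwhash = sha1_hex('secret')").fetchone()
```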

On Mon, 2004-04-26 at 03:09, Hermann Kuffner wrote:
> Is there any function in SQLite to mask input values like
> 'password('anything')' in MySQL?
> I would need this, because of the sqlite-db-files can be opened and read
> by any text-editor and sometimes I must store connection-data(ip's,
> passwords etc.) in the db.
> 
> Thanks,
> 
> hermann
> 
> 
> 
> 





Re: [sqlite] row size limit

2004-04-19 Thread Mrs. Brisby
On Mon, 2004-04-19 at 20:36, D. Richard Hipp wrote:
> Mrs. Brisby wrote:
> >>
> >>The linked-list structure of overflow storage is part of the problem.
> >>But the fact that SQLite uses synchronous I/O is also a factor.  In
> >>order to make BLOBs fast in SQLite, I would have to change to a different
> >>indexing technique for overflow storage *and* come up with some kind
> >>of cross-platform, asynchronous disk read mechanism.
> > 
> > 
> > D.R.Morrison (1968)'s PATRICIA would certainly be faster for indexing
> > large objects.
> > 
> 
> A key feature of SQLite 3.0 (needed to support internationalization)
> is the ability of users to specify their own comparison functions then
> have SQLite use that comparison function to order indices.  PATRICIA
> does not support user-defined comparison functions.  Keys in PATRICIA
> must occur in memcmp() order, as far as I am aware.

Why not fold the strings at insert time to keep your indexing simple?
You can still get internationalization, but require that the user supply
a function which performs this folding: á -> a, for example.
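
A rough sketch of insert-time folding with Python's unicodedata; the fold() rule shown (NFD, strip combining marks, lowercase) is only a crude stand-in for a real user-supplied folding function:

```python
import sqlite3
import unicodedata

def fold(s):
    # Crude folding: decompose, drop combining marks, lowercase,
    # so that e.g. 'á' folds to 'a'.
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed
                   if not unicodedata.combining(c)).lower()

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE names (name TEXT, name_folded TEXT)")
con.execute("CREATE INDEX names_folded ON names (name_folded)")
for name in ("Ångström", "Raúl"):
    con.execute("INSERT INTO names VALUES (?, ?)", (name, fold(name)))

# Comparisons against the folded column can use plain memcmp ordering.
hit = con.execute("SELECT name FROM names WHERE name_folded = ?",
                  (fold("ANGSTROM"),)).fetchone()
```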

> > Asynchronous read isn't necessary, but vectored reads are. Consider
> > readv() POSIX 1003.1-2001 -- in fact, you could probably make
> > result-fields return a struct iovec * that would "point" to the value
> > within the database.
> > 
> 
> readv() doesn't help, actually.  BLOBs are stored in 1k blocks scattered
> all over the file.  readv() reads a continguous range of bytes - it
> puts those bytes into scattered buffers but the bytes must originate
> from a contiguous region of the file.  I'd still have to do 1024
> sequential readv()s in order to extract a 1MB blob.

My brain fizzled out there for a moment. I don't know where I was.

Sadly you're right. While POSIX 1003.1-2001 does define aio_read(), it
is still a portability nightmare.

On systems where context switches are cheap, one could use fork() or
POSIX threads to populate a number of pipes, but I doubt this would buy
much (if anything)... ever.

Wouldn't it be nice if poll() actually did something interesting with
regular files? :)

Looks to me like you can either make two I/O policies (or more), sort
your reads/seeks, OR move the blobs into another file :)






Re: [sqlite] row size limit

2004-04-19 Thread Mrs. Brisby
On Mon, 2004-04-19 at 07:04, D. Richard Hipp wrote:
> Darren Duncan wrote:
> > 
> > I think the simple answer is that SQLite uses a linked list which can't 
> > know where a page is until reading the next one, but other databases use 
> > something other than a linked list; they would trade a bit of complexity 
> > for speed. -- Darren Duncan
> > 
> 
> The linked-list structure of overflow storage is part of the problem.
> But the fact that SQLite uses synchronous I/O is also a factor.  In
> order to make BLOBs fast in SQLite, I would have to change to a different
> indexing technique for overflow storage *and* come up with some kind
> of cross-platform, asynchronous disk read mechanism.

D.R.Morrison (1968)'s PATRICIA would certainly be faster for indexing
large objects.

For those of you without google, PATRICIA is called "Crit-bit trees" by
DJB, and "supports the following operations at high speed:
  * See whether a string x is in the tree.
  * Add x to the tree.
  * Remove x from the tree.
  * Find the lexicographically smallest string in the tree larger
than x, if there is one."

It essentially works by storing a compressed pointer to the first
unequal bit in the key. This means comparisons aren't necessary to
traverse the nodes (but they can be needed to add nodes!)

http://cr.yp.to/critbit.html
http://www.csse.monash.edu.au/~lloyd/tildeAlgDS/Tree/PATRICIA/



Asynchronous read isn't necessary, but vectored reads are. Consider
readv() POSIX 1003.1-2001 -- in fact, you could probably make
result-fields return a struct iovec * that would "point" to the value
within the database.


Finally, it may be worthwhile to finally allow some portions of the
database to be stored outside the main file and only store indexes in
the main file. This can give huge performance increases for large blobs
and wouldn't (necessarily) require a file format change if older
programs were prepared for the fact that they might not be able to do
anything useful to the value returned :)








RE: [sqlite] escaping " and % in sql statements

2004-01-27 Thread Mrs. Brisby
Calling sqlite_create_function() overrides builtin functions.


On Tue, 2004-01-27 at 07:36, Drew, Stephen wrote:
> So how does SQLite distinguish between the built-in:
> 
> { "like",   2, SQLITE_NUMERIC, likeFunc   },
> 
> and a user-defined function such as 
> 
> { "like",   2, SQLITE_NUMERIC, my_like_function   },
> 
> then?
> 
> -Original Message-
> From: Mrs. Brisby [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, January 27, 2004 12:27 PM
> To: [EMAIL PROTECTED]
> Cc: D. Richard Hipp; [EMAIL PROTECTED]
> Subject: Re: [sqlite] escaping " and % in sql statements
> 
> 
> On Mon, 2004-01-26 at 09:08, [EMAIL PROTECTED] wrote:
> > "D. Richard Hipp" <[EMAIL PROTECTED]> writes:
> > 
> > > As an interim workaround, you could define your own
> > > "like()" function using the sqlite_create_function() API
> > > that implemented built-in escapes.  Any user-defined
> > > function named "like()" is used to implement the LIKE
> > > keyword of SQL.
> > 
> > That seems pretty dangerous, since no one would expect that function name to
> > exist as a public symbol in a library.  I would suggest changing the name of
> > the internal like() function to sqlite_like() or some equivalent, to avoid
> > potential problems.  Being able to overload internal functions could be a nice
> > feature (which should be documented) but it seems the names should be clearly
> > associated with sqlite.
> 
> I do not think you understand.
> 
> SQLite has no "deep magic" allowing it to figure out what you call your
> own symbols: The built-in like() function is a static symbol in func.c
> called "likeFunc", and your code has nothing at all to do with this.
> 
> You make a call like this:
> 
> sqlite_create_function(p, "like", 2, my_like_func, 0);
> 
> after defining your own function; in this example it's called
> "my_like_func" - but you can certainly call it whatever you like,
> including sqlite_like, if you're so inclined.
> 
> 
> 
> 





Re: [sqlite] escaping " and % in sql statements

2004-01-27 Thread Mrs. Brisby
On Mon, 2004-01-26 at 09:08, [EMAIL PROTECTED] wrote:
> "D. Richard Hipp" <[EMAIL PROTECTED]> writes:
> 
> > As an interim workaround, you could define your own
> > "like()" function using the sqlite_create_function() API
> > that implemented built-in escapes.  Any user-defined
> > function named "like()" is used to implement the LIKE
> > keyword of SQL.
> 
> That seems pretty dangerous, since no one would expect that function name to
> exist as a public symbol in a library.  I would suggest changing the name of
> the internal like() function to sqlite_like() or some equivalent, to avoid
> potential problems.  Being able to overload internal functions could be a nice
> feature (which should be documented) but it seems the names should be clearly
> associated with sqlite.

I do not think you understand.

SQLite has no "deep magic" allowing it to figure out what you call your
own symbols: The built-in like() function is a static symbol in func.c
called "likeFunc", and your code has nothing at all to do with this.

You make a call like this:

sqlite_create_function(p, "like", 2, my_like_func, 0);

after defining your own function; in this example it's called
"my_like_func" - but you can certainly call it whatever you like,
including sqlite_like, if you're so inclined.
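
A sketch of the override using Python's sqlite3 module (the modern wrapper around the same API). The case-sensitive matcher here is invented and supports only the '%' wildcard:

```python
import re
import sqlite3

def case_sensitive_like(pattern, value):
    # SQLite calls like(P, V) to evaluate "V LIKE P".  This invented
    # replacement is case-sensitive and supports only the '%' wildcard.
    if pattern is None or value is None:
        return None
    regex = "^" + ".*".join(re.escape(part)
                            for part in pattern.split("%")) + "$"
    return 1 if re.match(regex, value) else 0

con = sqlite3.connect(":memory:")
builtin = con.execute("SELECT 'Hello' LIKE 'h%'").fetchone()[0]     # 1

con.create_function("like", 2, case_sensitive_like)
overridden = con.execute("SELECT 'Hello' LIKE 'h%'").fetchone()[0]  # now 0
```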





RE: [sqlite] Subject: Re: [sqlite] Let us make SQLite more powerful

2003-12-30 Thread Mrs. Brisby
On Tue, 2003-12-30 at 06:46, Steve O'Hara wrote:
> My understanding is that logically, by their very nature, joins are always
> likely to be slower than single table queries - I thought that this was a
> given downside to normalisation or am I incorrect?  Maybe the downside is
> not pronounced in SQLite but I'm sure that you see it in other RDBMS (if you
> crank up the results set volume enough).

Not necessarily; JOIN is an algebraic operation. Most RDBMS can handle
JOINs efficiently - especially with the aid of indexes. Some RDBMS even
translate "multi-occurrence columns" into JOIN operations transparently.

Some database engines cannot optimize all forms of JOIN, nor can they
optimize all syntax for a join, but these are deficiencies in the
database engine, not in SQL itself.


> > > > If you think this is easier to read, then consider creating VIEWs to
> > > > store intermediate queries.
> > >
> > > I have never considered this as I am not a power user of SQLite.
> > > Sorry...!
> >
> > VIEWs are quite normal in other SQL providers. That said, you may want
> > to become more familiar with SQL before you start suggesting extensions
> > to it- at the very least, be aware that syntactical additions may very
> > well be fine, but you will face criticism if your reasoning is flawed.
> >
> 
> In SQLite, VIEWs are simply an expedient for expressing your query - there
> is no performance upside to using them opposed to a fully specified SQL
> command.

Agreed. They were covered because another reason for the "new syntax"
was because they're "easier to read" (something I happen to disagree
with).

> > If you have a table which has a value that can return multiple values
> > you MUST alter the API of the callback function or of the fetch
> > function. Otherwise how is older code supposed to deal with a table
> > having multiple values in an attribute?
> >
> 
> Why?
> Most systems will return you the whole column value with delimiter
> character(s) separating the sub-fields.
> I think the alterations to the API will not be in the callback at all, but
> WILL be in the query processor and index engine.
> We are talking about changing the search engine to delineate certain field
> values when querying.  It's easier for the indexer, which simply creates
> multiple postings for the same record but with different terms (each
> sub-field).

No, we weren't talking about that. That's one solution that I suggested
(using a user-defined function to split out various portions of the
value). We were talking about adding new syntax to SQL(ite) that
performs implicit joins making it possible to return and store multiple
_values_ within a single column/row.


> I fully understand your point about normalisation and fully understanding
> issues before pronouncing on them.
> However, I think that your not seeing the full picture of the usefulness of
> multi-occurance columns with the simple example mentioned previously.



> Take a trivial names and address scenario - each person can have multiple
> addresses, phone numbers, emails, fax numbers.
> In a normalised world where we have a table for each of these, you soon run
> into a complicated query if you want to find a person who has a particular
> address, a particular phone number, a particular email etc.  Don't we have
> to do unions of a number of joins?
> Whereas, in a multi-occurrence schema it's simply
>   select * from names where address='vghg' and phone='776' and email='jhjh'
> etc.

and how is the callback called for each _phone_ where:
select * from names where address='yghg';

How many times is the callback called? For which rows? How do we tell which
inner-structure we are in? Is the callback going to get a folded string?
an incomplete one? or just some dummy blob to be passed to another
sqlite function?

These are very important questions that _must_ be resolved. If they
aren't- then compatibility is broken.

Thus I suggested moving it entirely into functions, that is:

select * from names where any_equals('address','yghg');

where any_equals() is a function that splits/checks values as
appropriate.

besides not making any changes to the callback API, this method is
future proof, although as you might have noticed, it cannot presently
take advantage of indexes (sqlite doesn't presently index functions,
IIRC).

if the question is simple enough- and all your operations are like
any_equals() you can optimize this to use the LIKE operator which
doesn't require any new functions AND can utilize indexes (my other
suggestion).






Re: [sqlite] Subject: Re: [sqlite] Let us make SQLite more powerful

2003-12-29 Thread Mrs. Brisby
On Sat, 2003-12-27 at 07:16, Yogesh Vachhani wrote:
> > Why exactly do you think the JOIN operator is the wrong tool for
> > this?
> 
> I feel JOIN operations are taxing on the Process as well as on RAM
> and HDD (in form of temporary files)

All right then. SQLite doesn't produce temporary files for JOIN, and
your query _can_ be expressed relationally as a JOIN, so it should never
be any faster than a JOIN operation- except due to parsing. Can you
verify that parsing is taking "too long" here?


> > If you think this is easier to read, then consider creating VIEWs to
> > store intermediate queries.
> 
> I have never considered this as I am not a power user of SQLite.
> Sorry...!

VIEWs are quite normal in other SQL providers. That said, you may want
to become more familiar with SQL before you start suggesting extensions
to it- at the very least, be aware that syntactical additions may very
well be fine, but you will face criticism if your reasoning is flawed.


> > > So one can see that one would not lose anything by using SQLite's
> > > extended storage model (if implemented). Even though the LineItems
> > > information is embedded into the Invoices table, one can still write
> > > a simple query to access it as if it were in an independent table.
> > 
> > Backwards compatibility gets lost.
> > API changes so programs need to be altered.
> I did say at the start that we should extend the existing
> functionality so that those who are comfortable with the existing one will
> keep on using this one and others can try out the other one!
> 
> Will this not lead to a loss of BACKWARD compatibility?!

If you have a table which has a value that can return multiple values
you MUST alter the API of the callback function or of the fetch
function. Otherwise how is older code supposed to deal with a table
having multiple values in an attribute?


> > That's an idea
> > anyway, it can already be done with SQLite by supplying your own
> > function. Additionally, you could add lists like this yourself
> > (pick a terminator, and make a function that indexes)
> 
> By the way does  anyone know whether SQLite supports User Defined
> Functions? If such a thing is possible then all such functionality
> can be implemented outside SQLite in a separate DLL. What do you all
> think?

Yes, SQLite supports "user-defined" functions. The Wiki has a great deal
of documentation about this...

> > Generally speaking, I think that trying to store an unknown number
> > of distinct values in one record column is a bad idea, and it
> > violates level 1 normalization.  You really should use a separate
> > table for the phone numbers and/or addresses, one record per
> > instance.
> 
> It does not violate some of the normalization rules but then how many
> follow this practically. In fact I have noticed that in practical
> implementations many times a developer has to avoid normalization
> rules to implement some peculiar requirements of their customers

I do not think you know what you are talking about. "level 1
normalization" comes from graph theory and describes a manner in which
every graph can be translated into another graph as utilizing no nested
functions - which are called "level 1 normal forms".

I don't think the responder quite spoke correctly either. The structure
you desire _can_ be folded down (normalized) into the view normal people
have of SQL: You aren't suggesting that anything be made available that
isn't presently available- but you want to write these systems without
learning SQL, or because you believe giving syntactical hints will make
queries run faster.

I do think that this reasoning is erroneous. Profile, don't speculate,
and get the very basics of relational calculus down before you decide
SQL is too primitive to support the data structures you need (hint:
there _are_ many structures that map very poorly to SQL. the one you
selected however, maps very _nicely_ to SQL).





Re: [sqlite] BLOB support and SQLite 3.0

2003-12-15 Thread Mrs. Brisby
On Mon, 2003-12-15 at 09:37, D. Richard Hipp wrote:
>(2) The ".dump" command will only show the binary data through
>the first \000 character.  If the binary data contains no
>\000 characters, the ".dump" command might segfault.

Add an SQL function like TOCHAR or something like that. Then make dump's
output look like this:

INSERT INTO foo VALUES('Hello'||TOCHAR(32)||'World!');
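
A sketch of such a function with Python's sqlite3; the TOCHAR name comes from the example above (later SQLite versions ship a similar built-in char(), but the idea is the same):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# TOCHAR(n) returns the character with code n, so a dump can spell
# awkward bytes as concatenations instead of raw literals.
con.create_function("TOCHAR", 1, chr)

value = con.execute(
    "SELECT 'Hello' || TOCHAR(32) || 'World!'").fetchone()[0]
```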







RE: [sqlite] Let us make SQLite more powerful

2003-12-15 Thread Mrs. Brisby
All I was demonstrating was how to use a single field. It's obviously
better to use a second table and build a relationship using JOIN.

I don't see where "multi-occurrence" fields can do something that
JOIN-ed relationships can't.

SELECT LastName FROM Contacts JOIN Phones USING (phoneID) WHERE
PhoneNumber='2064814442';

or perhaps:

SELECT LastName FROM Contacts WHERE phoneID IN (SELECT phoneID FROM
Phones WHERE PhoneNumber='2064814442');

Thus it would seem that this can simply be implemented in terms of
syntax (automatic table creation, rewriting queries for implicit joins,
etc).
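
Both formulations can be checked to return the same rows; a sketch with Python's sqlite3 and invented sample data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Contacts (LastName TEXT, phoneID INTEGER);
    CREATE TABLE Phones (phoneID INTEGER, PhoneNumber TEXT);
    INSERT INTO Contacts VALUES ('Calder', 1), ('Smith', 2);
    INSERT INTO Phones VALUES (1, '2064814442'), (2, '5551212');
""")

# Explicit join...
join_rows = con.execute("""
    SELECT LastName FROM Contacts JOIN Phones USING (phoneID)
    WHERE PhoneNumber = '2064814442'
""").fetchall()

# ...and the equivalent subquery form.
subq_rows = con.execute("""
    SELECT LastName FROM Contacts WHERE phoneID IN
      (SELECT phoneID FROM Phones WHERE PhoneNumber = '2064814442')
""").fetchall()
```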

You'll note I never said that this feature is "uncommon" - merely
unnecessary, kludgy, and queries written in this manner are difficult to
read, and as at least you've noticed, difficult to port. It would be
nice for the engine to do it automatically for you.

But it would make the engine more complex- for a direction that isn't
standard, that doesn't add the ability to answer questions that
previously could not, nor the ability to answer questions faster than
before. That is, as near as I can tell, it only provides the ability to
answer questions with less typing.

Even if BASIS is slower for these implicit joins, that doesn't mean
other systems are (PostgreSQL isn't, for example). What it does mean is
that BASIS has a crappy query optimizer.

That said, I really would like to see user-defined "structures"- but I
suspect this will have to wait for SQLite 3.0 or whenever we get
non-null-terminating values...


On Mon, 2003-12-15 at 07:25, Steve O'Hara wrote:
> Here's my tuppence worth;
> 
> I've been working with BASIS for donkeys years and it is one of the few
> RDBMS that fully supports multi-occurrence fields.
> To search for a value in such a field you simply say
> SELECT LastName FROM Contacts
> WHERE Phones = "2064814442"
> This means search in any occurrence within Phones
> 
> If you want a specific entry then you say
> SELECT LastName FROM Contacts
> WHERE Phones(3) = "2064814442"
> 
> Interestingly, if you have a number of fields that have matched values e.g.
> Phone, Extension etc you can do this
> SELECT LastName FROM Contacts
> WHERE Phones="2064814442" WITH Extension="234"
> This means find records where the same occurrence of Phones and Extension
> meets the criteria.
> 
> Although BASIS is fully relational, this feature is very commonly used
> instead of creating linked lookup tables and doing a join, because of the
> poorer performance of joins. The techniques shown here by Mrs Brisby are
> fine but because they use LIKE, they don't use an index and so will be very
> slow for large databases.
> 
> However, there are some disadvantages to this stuff;
> 
>   Multi-occurrence fields are not common to RDBMS
> 
>   Sorting can be interesting (you might have to do an "exploding" sort) e.g.
> SELECT LastName FROM Contacts
> WHERE Phones = "2064814442" ORDER BY EVERY Phones
>    or
> SELECT LastName FROM Contacts
> WHERE Phones = "2064814442" ORDER BY Phones(1)
> 
>   More complexity in SQLite
> 
> Steve
> 
> 
> 
> 
> -Original Message-
> From: Mrs. Brisby [mailto:[EMAIL PROTECTED]
> Sent: 12 December 2003 13:56
> To: Yogesh Vachhani
> Cc: [EMAIL PROTECTED]
> Subject: Re: [sqlite] Let us make SQLite more powerful
> 
> 
> On Fri, 2003-12-12 at 08:09, Yogesh Vachhani wrote:
> > For example, if the Phones column is a collection, one can find all
> > rows that contain the phone number 206-481-4442 by specifying the
> > following SELECT statement:
> >
> > SELECT LastName
> > FROM Contacts
> > WHERE Phones(ANY) = "2064814442"
> 
> Why not encode Phones as ":phonenumber:"
> that way you can already do:
> SELECT LastName FROM Contacts WHERE Phones LIKE '%:phonenumber:%'
> 
> (btw, your quoting is wrong)
> 
> > If one always store fax numbers in the third element of the Phones
> > collection, one could find all rows that contain a particular fax
> > number
> > like this:
> >
> > SELECT LastName
> > FROM Contacts
> > WHERE Phones(3) = "2064814442"
> 
> SELECT LastName FROM Contacts WHERE Phones LIKE ':%::%::phonenumber:%';
> 
> 
> > For example, lets return to the contact database, which contains a
> > Phones column whose
> > data type is PhoneStructure, so the Phones column contains the
> > columns PhoneType and PhoneNum. The following query could returns
> > just the 

Re: [sqlite] Let us make SQLite more powerful

2003-12-12 Thread Mrs. Brisby
On Fri, 2003-12-12 at 08:09, Yogesh Vachhani wrote:
> For example, if the Phones column is a collection, one can find all
> rows that contain the phone number 206-481-4442 by specifying the
> following SELECT statement:
> 
> SELECT LastName 
>   FROM Contacts 
>   WHERE Phones(ANY) = "2064814442"

Why not encode Phones as ":phonenumber:"
that way you can already do:
SELECT LastName FROM Contacts WHERE Phones LIKE '%:phonenumber:%'

(btw, your quoting is wrong)

> If one always store fax numbers in the third element of the Phones
> collection, one could find all rows that contain a particular fax
> number
> like this:
> 
> SELECT LastName
>   FROM Contacts
>   WHERE Phones(3) = "2064814442"

SELECT LastName FROM Contacts WHERE Phones LIKE ':%::%::phonenumber:%';
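
A sketch of both queries over the colon-delimited encoding, with Python's sqlite3; the phone values and the "fax in the third slot" convention are invented:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Contacts (LastName TEXT, Phones TEXT)")
# Each number is wrapped in colons; by convention the third slot is the fax.
con.execute(
    "INSERT INTO Contacts VALUES ('Calder', ':111::222::2064814442:')")
con.execute(
    "INSERT INTO Contacts VALUES ('Smith', ':2064814442::333::444:')")

# Match any slot:
any_hit = [r[0] for r in con.execute(
    "SELECT LastName FROM Contacts WHERE Phones LIKE '%:2064814442:%'")]

# Match only the third slot:
third_hit = [r[0] for r in con.execute(
    "SELECT LastName FROM Contacts WHERE Phones LIKE ':%::%::2064814442:%'")]
```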


> For example, lets return to the contact database, which contains a
> Phones column whose
> data type is PhoneStructure, so the Phones column contains the
> columns PhoneType and PhoneNum. The following query could returns
> just the phone number 206-402-9400 and not the phone type for that
> number:
> 
> SELECT Phones(*)!PhoneNum
>   FROM Contacts
>   WHERE LastName = "Calder"

This didn't require any special syntax.

> The following query would return the phone numbers for every contact
> who has a fax number:
> 
> SELECT Phones(*)!PhoneNum
>   FROM Contacts
>   WHERE Phones(ANY).PhoneType = "fax"

Instead of encoding phone numbers as :phonenumber:, you could also use
:type=phonenumber:

> Suppose that we put the PhoneType and PhoneNum in a separate table
> instead of in a PhoneStructure. To tie rows in the Phones table to a
> contact in the Contacts table, we then add a RefTo collection column
> called PhoneRef to the Contacts table. This RefTo column contains the
> Row IDs of one or more rows in the Phones table. To access phone
> numbers in the Phones table, one uses an identical query (except for
> minor name changes for clarity):
> 
> SELECT PhoneRef(*)!PhoneNum
>   FROM Contacts
>   WHERE PhoneRef(ANY).PhoneType = "fax"

This is redundant.


> One can query data in a collection or structure-valued column just as
> one can query data in a conventional table. To the query interface, a
> collection of structures should appear as a logical table with its
> parent table name automatically prefixed. So, for instance, a query
> to find all LineItems in the Invoices table would look like this:
> 
> SELECT PartNo, QtyOrdered, UnitPrice, QtyOnHand, EditDate
>   FROM Invoices.LineItems

Why exactly do you think the JOIN operator is the wrong tool for this?
If you think this is easier to read, then consider creating VIEWs to
store intermediate queries.


> So one can see that one would not lose anything by using SQLite's
> extended storage model (if implemented). Even though the LineItems
> information is embedded into the Invoices table, one can still write
> a simple query to access it as if it were in an independent table.

Backwards compatibility gets lost.
API changes so programs need to be altered.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] Spiders vandalizing the wiki

2003-11-25 Thread Mrs. Brisby
Google won't submit forms. Robots can't read.

Require a challenge before allowing submissions- whether it be an
email-based challenge, or an image that contains distorted text. I don't
know if there are any blind users of SQLite but they would probably
prefer the former.

On Tue, 2003-11-25 at 07:48, D. Richard Hipp wrote:
> In the past couple of days, I've been having problems with
> spiders vandalizing the Wiki at http://www.sqlite.org/cvstrac/wiki.
> The damage (so far) has been relatively minor and easy to fix.
> But I've been monitoring these spiders for a while and notice
> that they are becoming increasingly aggressive.
> 
> If you have any suggestions on what to do about them, I'd
> like to hear from you.
> 
> I'm guessing that these spiders are coming from spammers looking
> to harvest email addresses.  Last nights attack came from
> 61.51.123.205.  No reverse DNS is available.  Can anybody
> provide any insight into where the attacks are coming from?
> 
> I've added some spider detection logic to cvstrac which
> shuts down large portions of the website to spiders.  I
> used to have problems with spiders pulling out every historical
> version of every source code file every night.  I've solved
> that now.  But I've been leaving the wiki open to spiders
> because I *want* Google spiders to visit the wiki.
> 
> My next defense will be to try to make the wiki read-only
> to spiders.  That will prevent the vandalism I'm been seeing
> for the past couple of days, assuming my spider detection
> continues to work.  But if the attackers start doing a better
> job of disguising their spiders, my detection might fail
> and this defense would become ineffective.
> 
> Any other ideas?





Re: [sqlite] double email

2003-11-23 Thread Mrs. Brisby
On Sat, 2003-11-22 at 16:23, Bert Verhees wrote:
> > It seems to me that all the people send theiur ewmail double, and that it
> > is a failure in the list software configuration.
> >
> > This email has only one TO: sqlite-users <[EMAIL PROTECTED]>
> > And no CC or BCC, thus, if this is posted twice, it is the list software

> Only one conclusion, the list software is configured wrong. It is annoying
> because I am reading from small bandwith and high price for this moment

Some mailing list software supports a "me-too" option. This scans the
To/Cc/Bcc headers and explicitly avoids forwarding to any listed users
that are subscribed. You are probably used to this behavior, even though
it is wrong.

This is a relatively risky thing- it can prevent certain users from
seeing a particular post- by simply putting into the headers that the
user already saw it.

You should instead reply privately to any user you see doing this and
say "please don't post to me _and_ the list. I'm subscribed already."





Re: [sqlite] Improving concurrency in SQLite

2003-11-23 Thread Mrs. Brisby
On Sat, 2003-11-22 at 12:02, D. Richard Hipp wrote:
> It is more complicated than that.  In order to maintain isolation
> (the "I" in "ACID") the second commit must fail if at any time during
> the transaction it read a value that was written by the first commit,
> or if the first commit read a value that was written by the second
> commit.  Figuring out what the first commit has read or written is
> likely to be difficult since the first commit has already occurred
> and (presumably) its journal has already been deleted.

Not necessarily; you could keep a hash of the "expected row" when
committing changes. If the hash doesn't match, fail the commit.

A reader will also keep a journal- even though they're not making
changes. A journal that contains hashes of every row it returned to the
user [perhaps some optimizations can be done here- maybe it should only
kick into this mode under certain circumstances] - these hashes are
checked when we do ANY update/insert/delete on that process- not just
the hashes of the rows for update.
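As a toy illustration (Python, and purely hypothetical — this is the shape of the idea, not anything SQLite implements):

```python
import hashlib

# Each transaction journals a hash of every row it returned to the
# user; commit fails if any of those rows changed underneath it.
def row_hash(row):
    return hashlib.sha1(repr(row).encode()).hexdigest()

class Transaction:
    def __init__(self, db):
        self.db = db              # db: dict mapping rowid -> tuple
        self.read_hashes = {}     # hashes of every row this txn read
        self.writes = {}          # pending changes, applied at commit

    def read(self, rowid):
        row = self.db[rowid]
        self.read_hashes[rowid] = row_hash(row)
        return row

    def write(self, rowid, row):
        self.writes[rowid] = row

    def commit(self):
        # Fail the commit if any row we read no longer matches its hash.
        for rowid, h in self.read_hashes.items():
            if row_hash(self.db[rowid]) != h:
                return False
        self.db.update(self.writes)
        return True

db = {1: ("JOHN", 5)}
t1, t2 = Transaction(db), Transaction(db)
name, n = t1.read(1)
t2.read(1)                 # second txn reads the same row
t1.write(1, (name, n + 1))
ok1 = t1.commit()          # first committer wins
ok2 = t2.commit()          # second sees a changed row and fails
```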





Re: [sqlite] Improving concurrency in SQLite

2003-11-22 Thread Mrs. Brisby
This is more rambling than anything else. I hope I'm reasonably coherent
before coffee... 

Instead of keeping an "undo" journal- consider using a "changes"
journal-- each writer could use its own journal; then only at commit
would each writer actually start making changes to the database.

DEFER_WRITES isn't necessary; the changes journal could have information
in it diverting certain tables/rows/pages for THIS WRITER ONLY so that
selects can get information about earlier inserts/deletes/updates. No
other writer needs access to this information until committing.

Note that these changes will contain not only the values they are
storing, but some kind of hash of the original value. This is used to
detect whether or not two processes will UPDATE the same value. To make
this simple, COMMIT should be allowed to fail (with no changes made).

ROWID needs special assistance- There wasn't any mention of this on
concurrency.html but what happens if two writers INSERT at the same
time? Would they get the same ROWID? What if the user calls
last_insert_rowid() or whatever?

Ideally, ROWID should be allocated as soon as we SEE the INSERT. It
shouldn't matter much if we just lost a ROWID-- they're relatively cheap
on 32-bit platforms... If the database format changes, they're even
cheaper on 64-bit platforms...





On Sat, 2003-11-22 at 09:09, D. Richard Hipp wrote:
> Feedback is requested on the paper found at:
> 
> http://www.sqlite.org/concurrency.html
> 
> Your analysis and ideas will help to make SQLite a
> better database engine.  Thanks.





Re: [sqlite] In-memory DB performance tuning

2003-11-18 Thread Mrs. Brisby
On Tue, 2003-11-18 at 00:55, Arthur C. Hsu wrote:
> Hello,
> 
> I have an in-memory DB and try to perform inserts to tables inside.  Seems
> that there are some performance bottlenecks for continuous inserts.  The row
> insertion speed will drop dramatically after the first 6000 rows are
> inserted.  That is
> 
> Time Elapsed   Rows inserted   Rows inserted per 30 seconds
> ------------   -------------   ----------------------------
> 30             7920            7920
> 60             10711           2791
> 90             13147           2436
> 120            14944           1797
> 150            16598           1654
> 180            17878           1280
> 210            19609           1731
> 240            20711           1102
> 
> I've already tune my SQL to precompiled execution (using sqlite_compile &&
> sqlite_step).  
> I've modified MAX_PAGES   to 1048576, MAX_BYTES_PER_ROW to 17646 and
> TEMP_STORE to 3, where my projected in-mem DB size is 50M.
> I've tried using PRAGMA cache to add cache size, it helps a little bit but
> not much (say, from 1500 inserts per 30 seconds  to 1510).
> 
> Any clues that I can further squeeze the performance?  Or the limitation is
> by design?  I just can't realize why the first 6000 rows are amazing fast
> but later the speed drops down so dramatically.

It would really help if we could see an example of the queries/code that
generate this problem, including the CREATE TABLE/INDEX definitions.

And yes, this does mean including data (or a generator) that can exploit
this problem.

You also didn't specify when your transactions are being created or
removed.

Your benchmark should also include how much memory is actually being
used. You expect it to be approximately 50M, but you haven't explained
why this is the case.

I'll share my suspicions, but I'm not prepared to stand by them until I
see the above: Allocating that much space for cache means you could be
using more than three times the amount of ram that you expect- possibly
even more.





RE: [sqlite] Jumping to a record

2003-11-11 Thread Mrs. Brisby
You need to register a new function with-state to do this. Something
like this:

void f(sqlite_func *ctx, int argc, const char **argv)
{
    int *state = (int *)sqlite_user_data(ctx);
    if (atoi(argv[0]) == atoi(argv[1])) *state = 1;
    sqlite_set_result_int(ctx, *state);
}

static int state = 0;
sqlite_create_function(db, "f", 2, f, &state);


Could be used to evaluate an additional where clause:

WHERE f(ROWID, 12345)=1
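The same trick in Python's sqlite3 wrapper (a sketch — the table and values are invented, and note it relies on the WHERE clause being evaluated in table-scan order):

```python
import sqlite3

# A stateful SQL function that latches to 1 once it has seen the target
# ID, so every row from that point in the scan onward passes the filter.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE NewsData (ID INTEGER PRIMARY KEY, Zip TEXT)")
db.executemany("INSERT INTO NewsData VALUES (?, ?)",
               [(10, 'HX1 1AA'), (20, 'HX1 2BB'), (30, 'HX1 3CC')])

state = {'seen': 0}
def f(a, b):
    if a == b:
        state['seen'] = 1
    return state['seen']

db.create_function("f", 2, f)
rows = db.execute("SELECT ID FROM NewsData "
                  "WHERE f(ID, 20)=1 ORDER BY Zip").fetchall()
```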


On Tue, 2003-11-11 at 19:11, Bronislav Klucka wrote:
> > Hi,
> >
> > I have a set of records selected, for example, with:
> >
> > "select * from NewsData where (upper(Zip)>='HX1' and
> > upper(Zip)<'HX2') order by Zip,Address,No"
> >
> > There is an INTEGER PRIMARY KEY field in the database called ID
> >
> > Given that I know I have a record with an ID of 12345, how would
> > it be possible to "jump" to that
> > record, and still maintain my original select parameters and order?
> >
> > I guess that this is down to a select within a select, but I'm
> > not sure how to go about it. Well, I'll
> > qualify that - anything that I've tried hasn't worked yet!
> 
> define "jump" to that record, and still maintain my original select
> parameters and order".
> Be more specific..
> 
> Bronislav Klucka
> 
> 





Re: [sqlite] database table is locked

2003-11-07 Thread Mrs. Brisby
On Sat, 2003-11-08 at 07:20, Thiago Mello wrote:
> Hi Ben Carlyle, 
> 
> First of all, thanks for your help!
> 
> I can't bring these two operations together because I need the result of the
> SELECT in an if condition.

You cannot do the UPDATE inside of a SELECT callback. You do not need
the results of a SELECT for an UPDATE. Not for this one.

I think you mean:

UPDATE TABLE1 SET number=number+1 WHERE id="JOHN";

Or maybe you meant:

UPDATE TABLE1 SET number=number+1 WHERE name="JOHN";

Either way, it seems an awful lot like you have a "bigger goal" that you're
trying to solve that you're not explaining here for whatever reason--
instead you assumed you needed to UPDATE inside the SELECT. You can't
and if you show us what you're really trying to do, I'll bet that you
don't either...
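To make the point concrete (a minimal Python sketch; the table and values are invented):

```python
import sqlite3

# The increment is one self-contained statement: UPDATE reads the
# current value of `number` itself, so no prior SELECT (and no callback
# re-entry) is needed.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE TABLE1 (id TEXT, name TEXT, number INTEGER)")
db.execute("INSERT INTO TABLE1 VALUES ('JOHN', 'John Smith', 1)")
db.execute("UPDATE TABLE1 SET number=number+1 WHERE id='JOHN'")
n = db.execute("SELECT number FROM TABLE1 WHERE id='JOHN'").fetchone()[0]
```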





Re: [sqlite] Performance problem

2003-11-06 Thread Mrs. Brisby
On Thu, 2003-11-06 at 19:00, [EMAIL PROTECTED] wrote:
> How would you handle the lack of ordering associated with hash tables? 
> Sqlite can currently use indices for three main tests: equals, less than, 
> and greater than. While hash-tables are good at finding equal-to in 
> constant time it usually means linear time (a table-scan) to test for less 
> than or greater than. Do you have a solution to this problem?

You presume that a defining characteristic of a hash-table is that the
function be non-linear. Consider an n->3 bit function:
f(x) -> x mod 8
A very linear function that makes it perfectly easy to iterate _in_order_
over all values of the table.
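Sketched in Python (assuming keys in the range 0..7, where x mod 8 is the identity and therefore order-preserving):

```python
# With an order-preserving hash, the buckets themselves are sorted, so
# walking them in bucket order yields every stored key in key order --
# range predicates (<, >) need not fall back to a full unordered scan.
def f(x):
    return x % 8

buckets = {b: [] for b in range(8)}
for key in [5, 1, 7, 2, 0]:
    buckets[f(key)].append(key)

# In-order iteration over all stored keys:
in_order = [k for b in range(8) for k in buckets[b]]
```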





Re: [sqlite] Protection of db file

2003-11-03 Thread Mrs. Brisby
You really need to specify _who_ you want to protect the database from;
what kind of access, etc.

I think the answer you're looking for is the SQLite "client" provides
access control to SQLite- not the other way around- that is, you must
control all clients allowed access to the database in order to provide
mandatory access.


On Mon, 2003-11-03 at 03:04, thomas_b_hansen2003 wrote:
> I need a good advice regarding protecting the sqlite db file. 
> 
> I know that sqlite do not support "User Accounts" but someone must 
> have had the same need. 
> 
> If several users must have online access to the db file, how can I 
> do this.
> 
> Encryption is not my favorite solution.
> 
> I hope some of you have a good idea.
> 
> thomas
> 
> 





Re: [sqlite] Making sqlite support unicode?

2003-10-29 Thread Mrs. Brisby
On Mon, 2003-10-27 at 10:27, [EMAIL PROTECTED] wrote: 
> On Sun, 26 Oct 2003 23:36:55 -0500
> "Mrs. Brisby" <[EMAIL PROTECTED]> wrote:
> > It's good to use null-terminated in many cases; especially in collating
> > and sorting. It helps to understand that in those cases you stop
> > processing _after_ you see the terminator (and treat the terminator as
> > it is: zero.)
> 
> Collating involves with length. If data length is known prior to scanning
> data, in some cases you can skip it if it doesn't match without scanning
> data body. It helps to understand that in those cases you stop processing
> _before_ you see the terminator or anything else.

No it cannot. How are the following tokens collated?

aaa
aab



> > UTF-16 is NOT used in HFS+. HFS+ still uses ASCII with some "tricks".
> > UFS is what's "preferred" in MacOS X, and it doesn't use UTF-16 either.
> > UTF-16 isn't what we're talking about anyway, it's UCS16.
> MacOS X uses "Unicode" as its native encoding. In Unicode encoding
> the most used in MacOS X is UTF-16. Only to call BSD API it uses
> UTF-8. It's kind of hybrid, but UTF-8 is just used for compatibility to
> Unix parts in MacOS X, and other non-Unix pieces in MacOS X, which
> is why MacOS X is Mac, is using UTF-16 internally, including Carbon,
> Cocoa and ATSUI.
> 
> For HFS+, from Apple's Technical Note TN2078 (Migrating to FSRefs & long
> Unicode names from FSSpecs):
> http://developer.apple.com/technotes/tn2002/tn2078.html

See the parts in http://developer.apple.com/technotes/tn/tn1150.html
regarding the HFS wrapper; this was what I thought I was remembering;
my memory on this subject is admittedly spotty.

I don't recall this information being quite so easy to google; thanks for
correcting me on this.






Re: [sqlite] Making sqlite support unicode?

2003-10-26 Thread Mrs. Brisby
On Sun, 2003-10-26 at 14:26, Wayne Venables wrote:
> >Further: I always read statements like "Microsoft C/C++ is the largest
> >most popular language platform in the world" as foolish sentiment. These
> >people obviously don't know what they're talking about and need a good
> >healthy dose of some reality.
> 
> A subtle barb fired in my direction.  I think statements like that are even 
> more foolish and even more out of touch with reality!  Oh sure, TRON is the 
> most used OS in the world.  Does sqlite even compile on TRON?  How many 
> developers program for it?  Windows is installed on 96% of all desktop 
> computers and somewhere around 30% of servers.  That's a very large number 
> of machines, but you dismiss that pretty easily.  And Visual Basic is 
> probably still the most popular programming language in the world! (and 
> yes, that should make you shudder!)

If you're targeting a large platform- however large- even monopolistic
large, you are NOT WRITING PORTABLE CODE. I can dismiss however large
Windows is simply by stating that it is not portable. SQLite is rather
portable: It builds on many different platforms. It even _works_ on many
different platforms.

I have no idea how you got the idea that "portable code" meant runs on
Windows, but you should drop it.

On your other points:

1. SQLite does compile on TRON with very little help. TRON is nearly
POSIX complete (although not certified, AFAIK).

2. The largest number of developers are presently mobilized for unixish
platforms (POSIX). Windows claims to have a POSIX abstraction layer that
doesn't comply with POSIX.1, but because fewer people know this each
year, my statistics may be skewed by this.

3. The popularity of Visual Basic is greatly exaggerated. Using google:
"Visual Basic" programmer resume: 61,200
"C" programmer resume: 165,000
"C++" programmer resume: 92,500
"Java" programmer resume: 112,000
"Javascript" programmer resume: 48,900
"SQL" programmer resume: 84,000

I have no idea how you got the ideas that you have on this subject, but
it really surprises me whenever someone wants to talk about the
popularity of a language. You can find metrics to support whatever you
like, and I have no idea which ones you are using.

I'd probably believe that more dollars are spent on Visual Basic
Training every year, but I also wouldn't consider that a fair indicator
of how many people are actually using it, or how many commercial
programs (or better quality) are being turned out in a year. I wouldn't
suggest that has anything to do with how long it's been around or how
many people that actually HAVE USED IT and know some other programming
language and yet still use Visual Basic.


> But, for the record, I spend 99% of my time developing for unix in a 
> programming language that really knows only ASCII (with some 
> exceptions).  Here's a hint: this language will, in it's next major 
> version, be a very large source of new sqlite users.  The other 1% of the 
> time is spent developing in a USC-16 platform that isn't Windows.  Even if 
> I'm not in the Microsoft camp, I can acknowledge that it has some significance.

I have no idea what you're talking about. Do you?


> >Too many rude users talk about how inconvenient their life now is
> >because here is this wonderful and rather free toolkit that decided to
> >make the life of the author easier- and most of it's users or potential
> >users easier, at the expense of their own.
> 
> BTW, all I had asked is if anyone had done the work of making sqlite 
> unicode-aware (I did ask for UCS16, however).  I hadn't seen anything to 
> indicate it did anything but straight ASCII.  Someone pointed out that it 
> did handle UTF-8 w/ the appropriate #define and that is certainly good 
> enough for my task.  If I hadn't of asked, I wouldn't have known.

No you said:

"Unfortunately that still means there is a performance hit converting
all data in and out of the library from UTF-8 to UCS16.  A large number
of operating systems and programming languages store strings natively as
UCS16."

And I happily gave you suggestions that would be useful in minimizing
your "performance hit" as long as you weren't writing portable code.
Knowing how to avoid a "performance hit" - especially one that a user
hasn't bothered to profile yet is a good thing to know- a good thing to
have in the archives, so that when someone DOES end up profiling and
sees "oh boy, I'm spending a lot of time converting this into UTF-8 and
back again!" they'll know what they can do.

In some rather weak backpedaling, you tried to "justify" your large
number of operating systems and languages by listing three- only one of
which was actually correct:

"Win32 is unicode (UCS16).  Writing C/C++ w/ Win32 generally involves
using wide char strings.  Visual Basic natively stores strings as
unicode.  Java natively stores strings as unicode.  I'd say that covers
a lot more than 1%.  And if you're 

Re: [sqlite] [OT] Digest Attachments: why use them?

2003-10-24 Thread Mrs. Brisby
On Fri, 2003-10-24 at 12:48, M. Fioretti wrote:
> On Fri, Oct 24, 2003 10:02:36 at 10:02:36AM +0100, Steve O'Hara ([EMAIL PROTECTED]) 
> wrote:
> > I get some 50 emails per day from various sources and I can't
> > afford to have SQLite traffic adding to this list.  However, I do want to
> > get a daily digest of what's happening and to contribute if there is a topic
> > on which I think I can usefully comment.
> 
> Since you brought this up: may I ask you (and other list digest users)
> what is exactly that one gains from list digests, or, in other words,
> why this functionality exists in the first place?

Many users want to keep up with list-traffic but do not ordinarily need
to post to individual messages.

Some users filter messages by hand- to make sure they don't miss
anything.

Some users prefer the compact space in their INBOX.

Some MUAs store messages one-per-file. Grouping them reduces the number
of files (inodes) on the system and in some cases, can speed folder
lookups/searches [[ this is one of those retarded, my MUA isn't broken,
my operating system isn't broken arguments, but I've actually seen it.
]]





Re: [sqlite] Moving the SQLite mailing list

2003-10-18 Thread Mrs. Brisby
On Sat, 2003-10-18 at 10:00, D. Richard Hipp wrote:
> The SQLite mailing list is moving off of yahoogroups and onto
> the same server that hosts the SQLite homepage.  To subscribe
> to the new mailing list, send an email to
> 
>   [EMAIL PROTECTED]

Did you perhaps mean <[EMAIL PROTECTED]> ?

If not, then please change the headeradd file's record of List-Subscribe
to be:

List-Subscribe: 

As this behavior confuses my email client.


Anyway, I'm trying both; we'll see which one responds :)

