Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-27 Thread Don V Nielsen
I'm sorry gentlemen, but the argument has gotten thick and petulant.

Every complaint and response is resolving down to a mainframe line of
thought (thank God), which few today are willing to accept. That is, the
SQLite software is kept compatible with its root. How many System 370 Cobol
programs can run on today's hyper-tech mainframes? All of them. SQLite
was inspired by a need and built at a time when PC's and O/S's were more
primitive. It has some flaws from then that are still with us today. Why?
Because of compatibility. It is more important for this product to be
compatible with its origin because people and machines are dependent on it
being that way.

The error system is what it is because it worked back then. Efforts have
been made to improve things as far as giving the developer more information
to work with and figure things out. The developer knows their version of
SQLite and their operating system(s). It's the developer's responsibility
to match what SQLite provides given the values available in the environment
that it exists in. If the developer's application is going to run atop
Linux, and Windows, and Android, it is the developer's job to create their
application in a way that is sensitive to them.

SQLite is capable of running anywhere. It is not responsible for
knowing exactly where it is being run. It doesn't function at that layer.
If the codes are not enough, the amalgamation is out there. Get a copy and
build into it a new layer of error interpretation logic and have it return
what is needed by the O/S that is specific to the application's needs and
wants.

If I'm wrong, I'm sorry. But I got the feeling the original post (the very
first) was a tantrum, and nothing anyone does to soothe the situation
is working. It is only getting worse.

Again, I apologize for losing my control.

On Wed, Sep 27, 2017 at 12:12 PM, Guy Harris  wrote:

> On Sep 27, 2017, at 10:00 AM, Keith Medcalf  wrote:
>
> > On Wednesday, 27 September, 2017 10:39, Guy Harris wrote:
> >
> >> On Sep 27, 2017, at 6:58 AM, Keith Medcalf  wrote:
> >
> >>> Well, the terminology is correct.  These *ARE* I/O Errors.  The
> >>> system attempted I/O.  It failed.  Hence the term I/O Error.
> >
> >> Just don't call it a "disk I/O error".
> >
> > Well, maybe.  However the I/O that had the error was associated with a
> > disk operation (as opposed to a "Video I/O Error", or a "Cardpunch I/O
> > Error", "Printer I/O Error", etc.).
>
> Actually, if it had occurred on my machine, it wouldn't have been
> associated with a disk operation; should the application check where the
> data is stored and say "flash memory I/O" error if appropriate? :-)
>
> The point is that the *disk* isn't particularly relevant to some possible
> errors - the problem isn't with the *disk*, which reported no error, the
> problem is with something in the *file system*, such as the amount of space
> available, the permissions on files, etc..
>
> >>> It is irrelevant whether the error was caused because the heads on
> >>> the tape drive need cleaning, access was denied to spool storage, the
> >>> disk was full, someone yanked the cable out of the disk drive, or the
> >>> card reader got jammed up.
> >
> >> I.e., SQLITE_IOERR is equivalent to -1 as a return from various UN*X
> >> system calls, so that, when a program sees it, it needs to get
> >> further error information, such as an errno value, to deal with the
> >> error and, if necessary, to report it.
> >
> > Yes.  An I/O operation of some sort was attempted.  That I/O operation
> > involved some sort of "disk" access.  That operation failed with an error.
>
> ...and the next step is to determine what the exact error was.
>
> >> So it *is* relevant to what to do next.
> >
> > Well, in the same sort of way as the message from attempting to send
> > Snail mail "Mail Undeliverable" is relevant to what to do next.  You know
> > that the error was related to the delivery of the postal item just as the
> > "Disk I/O Error" indicates that an I/O operation that involved a disk
> > operation failed with an error.
> >
> > In both cases you need to query for the underlying error condition in
> > order to determine what to do.
>
> Well, in the first case, the postal service may well say more than just
> "Mail undeliverable", such as "no such person at that address", "no such
> address", etc..
>
> > So in that sense it is relevant to what to do next -- you need to query
> > for more particulars.  This is opposed to say a "Syntax Error" in which it
> > is pretty clear that the error is a mis-formed statement.
>
> Yes, but even in *that* case, it's often possible to say, for example,
> "there's no operator between the operands "foo" and "bar"" rather than just
> "syntax error".
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> 

Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-27 Thread Guy Harris
On Sep 27, 2017, at 10:00 AM, Keith Medcalf  wrote:

> On Wednesday, 27 September, 2017 10:39, Guy Harris  wrote:
> 
>> On Sep 27, 2017, at 6:58 AM, Keith Medcalf  wrote:
> 
>>> Well, the terminology is correct.  These *ARE* I/O Errors.  The
>>> system attempted I/O.  It failed.  Hence the term I/O Error.
> 
>> Just don't call it a "disk I/O error".
> 
> Well, maybe.  However the I/O that had the error was associated with a disk 
> operation (as opposed to a "Video I/O Error", or a "Cardpunch I/O Error", 
> "Printer I/O Error", etc.).

Actually, if it had occurred on my machine, it wouldn't have been associated 
with a disk operation; should the application check where the data is stored 
and say "flash memory I/O" error if appropriate? :-)

The point is that the *disk* isn't particularly relevant to some possible 
errors - the problem isn't with the *disk*, which reported no error, the 
problem is with something in the *file system*, such as the amount of space 
available, the permissions on files, etc..

>>> It is irrelevant whether the error was caused because the heads on
>>> the tape drive need cleaning, access was denied to spool storage, the
>>> disk was full, someone yanked the cable out of the disk drive, or the
>>> card reader got jammed up.
> 
>> I.e., SQLITE_IOERR is equivalent to -1 as a return from various UN*X
>> system calls, so that, when a program sees it, it needs to get
>> further error information, such as an errno value, to deal with the
>> error and, if necessary, to report it.
> 
> Yes.  An I/O operation of some sort was attempted.  That I/O operation 
> involved some sort of "disk" access.  That operation failed with an error.

...and the next step is to determine what the exact error was.

>> So it *is* relevant to what to do next.
> 
> Well, in the same sort of way as the message from attempting to send Snail 
> mail "Mail Undeliverable" is relevant to what to do next.  You know that the 
> error was related to the delivery of the postal item just as the "Disk I/O 
> Error" indicates that an I/O operation that involved a disk operation failed 
> with an error.
> 
> In both cases you need to query for the underlying error condition in order 
> to determine what to do.

Well, in the first case, the postal service may well say more than just "Mail 
undeliverable", such as "no such person at that address", "no such address", 
etc..

> So in that sense it is relevant to what to do next -- you need to query for 
> more particulars.  This is opposed to say a "Syntax Error" in which it is 
> pretty clear that the error is a mis-formed statement.

Yes, but even in *that* case, it's often possible to say, for example, "there's 
no operator between the operands "foo" and "bar"" rather than just "syntax 
error".
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-27 Thread Keith Medcalf
On Wednesday, 27 September, 2017 10:39, Guy Harris  wrote:

>On Sep 27, 2017, at 6:58 AM, Keith Medcalf  wrote:

>> Well, the terminology is correct.  These *ARE* I/O Errors.  The
>> system attempted I/O.  It failed.  Hence the term I/O Error.

> Just don't call it a "disk I/O error".

Well, maybe.  However the I/O that had the error was associated with a disk 
operation (as opposed to a "Video I/O Error", or a "Cardpunch I/O Error", 
"Printer I/O Error", etc.).

>> It is irrelevant whether the error was caused because the heads on
>> the tape drive need cleaning, access was denied to spool storage, the
>> disk was full, someone yanked the cable out of the disk drive, or the
>> card reader got jammed up.

>I.e., SQLITE_IOERR is equivalent to -1 as a return from various UN*X
>system calls, so that, when a program sees it, it needs to get
>further error information, such as an errno value, to deal with the
>error and, if necessary, to report it.

Yes.  An I/O operation of some sort was attempted.  That I/O operation involved 
some sort of "disk" access.  That operation failed with an error.

>So it *is* relevant to what to do next.

Well, in the same sort of way as the message from attempting to send Snail mail 
"Mail Undeliverable" is relevant to what to do next.  You know that the error 
was related to the delivery of the postal item just as the "Disk I/O Error" 
indicates that an I/O operation that involved a disk operation failed with an 
error.

In both cases you need to query for the underlying error condition in order to 
determine what to do.  So in that sense it is relevant to what to do next -- 
you need to query for more particulars.  This is opposed to say a "Syntax 
Error" in which it is pretty clear that the error is a mis-formed statement.

In both cases only the underlying error code from the "Operating System" can 
assist you in what to do next.  In the case of Snail Mail, the underlying error 
code of "No Such Address" entails a completely different response than 
"Delivery Vehicle Exploded and Your Message Burned Up en-route to Delivery" or 
"Delivery Location Not Found -- Destroyed by Hurricane".  Similarly the return 
code from the OS in the case of an I/O is relevant to determining the next step 
in recovery -- "Device Not Found" is different from "Filesystem is Corrupt" 
which is different from "Access is Denied" which is different from "General 
Failure Reading ..." (who is General Failure and why is he trying to read my 
files ... I should hope that such attempts fail :) )







Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-27 Thread Guy Harris
On Sep 27, 2017, at 6:58 AM, Keith Medcalf  wrote:

> Well, the terminology is correct.  These *ARE* I/O Errors.  The system 
> attempted I/O.  It failed.  Hence the term I/O Error.

Just don't call it a "disk I/O error".

> It is irrelevant whether the error was caused because the heads on the tape 
> drive need cleaning, access was denied to spool storage, the disk was full, 
> someone yanked the cable out of the disk drive, or the card reader got jammed 
> up.

I.e., SQLITE_IOERR is equivalent to -1 as a return from various UN*X system 
calls, so that, when a program sees it, it needs to get further error 
information, such as an errno value, to deal with the error and, if necessary, 
to report it.

So it *is* relevant to what to do next.
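
The pattern described above (a bare failure indication first, then a separate query for the reason) can be sketched in Python, where a failed system call raises OSError and the errno value travels with the exception. This is only an illustration under stated assumptions: try_open is a hypothetical helper name, not part of SQLite or of any library, and the second failure is expected to be EISDIR on POSIX systems but may differ elsewhere.

```python
import errno
import os
import tempfile


def try_open(path, flags):
    """Return None on success, or the symbolic errno name on failure."""
    try:
        fd = os.open(path, flags)
    except OSError as exc:
        # The exception alone only says "the call failed" (the -1 case);
        # exc.errno carries the reason.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Two calls that both fail, for two different reasons.
    missing = try_open(os.path.join(tmp, "no-such-file"), os.O_RDONLY)
    directory = try_open(tmp, os.O_WRONLY)

print(missing, directory)
```

Both calls "failed", but only the errno name tells the caller whether to create the file or to pick a different path.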


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-27 Thread Keith Medcalf

Well, the terminology is correct.  These *ARE* I/O Errors.  The system 
attempted I/O.  It failed.  Hence the term I/O Error.  It is irrelevant whether 
the error was caused because the heads on the tape drive need cleaning, access 
was denied to spool storage, the disk was full, someone yanked the cable out of 
the disk drive, or the card reader got jammed up.
 
The program attempted to perform an I/O operation (of some kind).
That operation failed.

Now it is up to you, the application programmer, to figure out what to do.  
There are quite a few facilities available to help you do this.  SQLite itself 
has Extended error codes that can help point to where the trouble is.  You can 
ask the Operating System for its abend code.  You can sacrifice chickens or 
babies or perhaps read the tea leaves.

Personally I think we need a reversion to the old days when there were only 
four status codes:  OK, What?, How?, and Where?

This is far more effective than niggling over what an error code means.  It 
means there was an error.  Full-stop end of sentence, paragraph, page, chapter, 
section, story and book.  There are more than adequate ways of determining the 
nature and localization of the error.  Use them.  Love them.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-----Original Message-----
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of Jens Alfke
>Sent: Tuesday, 26 September, 2017 21:49
>To: SQLite mailing list
>Subject: Re: [sqlite] bug: failure to write journal reported as "disk
>I/O error"
>
>
>
>> On Sep 26, 2017, at 3:17 PM, Guy Harris <g...@alum.mit.edu> wrote:
>>
>> It shows a whole bunch of codes, none of which are "something that
>distinguishes EIO from other errors such as EFBIG, EDQUOT, etc.".
>>
>> I'm not asking for something that indicates what xXYZZY method
>reported the error.  I'm asking for something that indicates what the
>underlying problem causing the I/O error is, to the extent that
>information is available from the OS, i.e. *why* did the I/O
>operation not succeed?
>
>Yes, you’re right — I hadn’t looked at the definitions of those
>extended codes, and they seem … um, not super useful. As a client of
>SQLite, I want to know what specifically went wrong, not which
>internal bit of SQLite reported the error.
>
>—Jens


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Nico Williams
On Tue, Sep 26, 2017 at 01:37:42PM -0700, Jens Alfke wrote:
> > On Sep 26, 2017, at 1:17 PM, Guy Harris  wrote:
> > A user wouldn't know what to do with "you've exceeded your stored data 
> > quota”?
> 
> A Turkish or Chinese user likely wouldn’t. (SQLite’s error messages
> are not localized.) And there are plenty of messages that are much
> less understandable to a lay user than the one you picked out.

They could be.  And regardless, more detail in the error _code_ is
better for the application developer.

EIO is definitely an I/O error.  Could be all sorts of things.  E.g.,
you're using iSCSI and the network is timing out.

ENOSPC is very, very different.  Reporting ENOSPC as an I/O error means
that the app or the user must now use df(1) or strace(1) or similar to
work it out, when SQLite3 could just have reported that the FS is full.
Ditto EDQUOT.

EROFS is also very different.

And so on.

These are ancient error codes.
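
A minimal sketch of keeping these causes apart, assuming the application has the OS errno value in hand. The category strings and the classify name are hypothetical and chosen only for illustration; they are not anything SQLite provides.

```python
import errno

# Hypothetical categories, keyed by symbolic errno name so the table
# does not depend on the numeric values of any one operating system.
_CAUSES = {
    "EIO": "device I/O error",
    "ENOSPC": "file system full",
    "EDQUOT": "quota exceeded",
    "EROFS": "read-only file system",
    "EFBIG": "file too large",
    "EACCES": "permission denied",
    "EPERM": "permission denied",
}


def classify(os_errno):
    """Map an errno value to a coarse, human-meaningful cause."""
    name = errno.errorcode.get(os_errno)
    return _CAUSES.get(name, "other error")


print(classify(errno.EIO))
print(classify(errno.ENOSPC))
```

Only EIO lands in the "device I/O error" bucket; a full file system or a read-only mount gets its own description.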

> > The *number* might annoy the support staff; right off the top of
> > your head, what's the error number for "file system quota exceeded"
> > or "I/O error"?  (No cheating by looking it up in a man page or
> > include file!)
> 
> On the contrary, error numbers are a lot easier for support. They’re
> independent of locale, they don’t get re-worded from one version of
> the app to the next, and they’re very short and easy to dictate over
> the phone. Of course, these shouldn’t be the primary error information
> given to the user! But the user-level error message should be
> something specific to the application, like “an unexpected database
> error occurred (19)” instead of "Abort due to constraint violation”.
> The number would appear only for support purposes.

As long as you can resolve them to symbolic names and/or messages.
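
As a rough sketch of that idea using Python's sqlite3 module: on Python 3.11 and later, exceptions carry sqlite_errorcode and sqlite_errorname attributes, so the number shown to the user and the symbolic name kept for support can come from the same place. The support_tag helper and the message wording are hypothetical, and older interpreters simply get None for both values.

```python
import sqlite3


def support_tag(exc):
    """Return (numeric code, symbolic name) for a sqlite3 exception."""
    # These attributes exist on Python 3.11+; fall back to None otherwise.
    code = getattr(exc, "sqlite_errorcode", None)
    name = getattr(exc, "sqlite_errorname", None)
    return code, name


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(x TEXT UNIQUE)")
con.execute("INSERT INTO t VALUES ('a')")
try:
    con.execute("INSERT INTO t VALUES ('a')")  # violates UNIQUE
except sqlite3.IntegrityError as exc:
    code, name = support_tag(exc)
    shown = "An unexpected database error occurred"
    if code is not None:
        # The low 8 bits are the primary result code.
        shown += f" ({code & 0xFF})"
finally:
    con.close()

print(shown)  # what the user sees
print(name)   # what goes in the log for support
```

The user-facing text stays short and locale-independent, while the log keeps a name that support can look up without memorizing numbers.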

Nico


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Jens Alfke


> On Sep 26, 2017, at 3:17 PM, Guy Harris  wrote:
> 
> It shows a whole bunch of codes, none of which are "something that 
> distinguishes EIO from other errors such as EFBIG, EDQUOT, etc.".
> 
> I'm not asking for something that indicates what xXYZZY method reported the 
> error.  I'm asking for something that indicates what the underlying problem 
> causing the I/O error is, to the extent that information is available from 
> the OS, i.e. *why* did the I/O operation not succeed?

Yes, you’re right — I hadn’t looked at the definitions of those extended codes, 
and they seem … um, not super useful. As a client of SQLite, I want to know 
what specifically went wrong, not which internal bit of SQLite reported the 
error.

—Jens


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Guy Harris
On Sep 26, 2017, at 3:11 PM, Simon Slavin  wrote:

> On 26 Sep 2017, at 10:53pm, Guy Harris  wrote:
>> 
>> I *would* suggest an additional API to get a *separate* extended error 
>> code, so that if, for example, a write() fails and that failure is turned 
>> into SQLITE_IOERR, you can get something that distinguishes EIO from other 
>> errors such as EFBIG, EDQUOT, etc..
> 
> You know about this, right ?
> 
> 
> 
> 

Yes.  I do.

You know about this, right?

https://www.sqlite.org/rescode.html#ioerr_access

It shows a whole bunch of codes, none of which are "something that 
distinguishes EIO from other errors such as EFBIG, EDQUOT, etc.".

I'm not asking for something that indicates what xXYZZY method reported the 
error.  I'm asking for something that indicates what the underlying problem 
causing the I/O error is, to the extent that information is available from the 
OS, i.e. *why* did the I/O operation not succeed?
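
To illustrate the distinction, here is a small sketch of how an extended result code splits apart. The numeric layout (primary code in the low 8 bits, sub-code above it) and the handful of values listed are taken from sqlite3.h; the describe helper and the abbreviated table are hypothetical. For the codes listed, the sub-code names the operation that reported the failure, not the underlying reason.

```python
SQLITE_IOERR = 10

# A few SQLITE_IOERR_* sub-codes (value >> 8), as defined in sqlite3.h.
IOERR_OPERATIONS = {
    1: "READ",
    2: "SHORT_READ",
    3: "WRITE",
    4: "FSYNC",
    13: "ACCESS",
}


def describe(extended):
    """Split an extended result code into (primary, operation name)."""
    primary = extended & 0xFF   # broad category
    sub = extended >> 8         # which operation reported it
    if primary != SQLITE_IOERR:
        return primary, None
    return primary, IOERR_OPERATIONS.get(sub)


print(describe(778))   # 778 is SQLITE_IOERR_WRITE
```

A failed write comes back as "WRITE" whether the cause was a full disk, an exceeded quota, or a failing device.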



Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Simon Slavin


On 26 Sep 2017, at 10:53pm, Guy Harris  wrote:
> 
> I *would* suggest an additional API to get a *separate* extended error code, 
> so that if, for example, a write() fails and that failure is turned into 
> SQLITE_IOERR, you can get something that distinguishes EIO from other errors 
> such as EFBIG, EDQUOT, etc..

You know about this, right ?





Simon.


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Guy Harris
On Sep 26, 2017, at 2:22 PM, Jens Alfke  wrote:

>> On Sep 26, 2017, at 1:57 PM, Guy Harris  wrote:
>> 
>> Which means "for stuff that would be shown to the user, for the user to 
>> read, either localize your error messages, or make sure your API returns 
>> error codes that the application can turn into localized error messages".
> 
> Um, that’s what I said.
> 
>> And none of this argues against presenting to the user, in their native 
>> language, a message saying "you've exceeded your file system quota", if that 
>> is, in fact, what happened.
> 
> This thread isn’t about filesystem quotas. Why do you keep bringing them up 
> as an example?

Because the thread brings up the general question of folding multiple types of 
errors into a single error code, and because it's an example of an error you 
*would* want to show to the user, just as SQLITE_FULL is.

>> *don't* tell the user anything that might convince them that their disk is 
>> failing if you didn't get EIO or the equivalent on some other OS - and don't 
>> tell them something that, when relayed to tech support, would lead the 
>> support person to believe that, either.
> 
> As we’ve been saying, error messages produced by SQLite are not meant to be 
> shown to end users, for all the reasons previously discussed.
> 
> SQLite’s error numbers ought to be sufficiently detailed once you enable 
> extended error codes, and/or get the OS errno. The original set of error 
> codes is inadequate to be sure, for historical reasons, but compatibility 
> rules out breaking that API; that’s why the extended error codes exist.

Yes, which is why I wasn't suggesting changing the error codes.

I *would* suggest an additional API to get a *separate* extended error code, 
so that if, for example, a write() fails and that failure is turned into 
SQLITE_IOERR, you can get something that distinguishes EIO from other errors 
such as EFBIG, EDQUOT, etc..  I would also suggest that the documentation say 
that, if you don't have to run on a version of SQLite that doesn't support the 
new API, the new API be used by applications and libraries running atop SQLite 
in their error-reporting code, rather than, for example, just using 
sqlite3_errstr().
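
One way to picture the shape of such an interface, sketched in Python rather than as an actual SQLite API: the coarse code stays as it is for existing callers, and the OS-level cause is carried alongside it as a separate value. StorageError and write_journal are hypothetical names invented for this illustration.

```python
import errno
import os
import tempfile


class StorageError(Exception):
    """Hypothetical error: a coarse code plus a separate OS-level cause."""

    def __init__(self, coarse, os_errno):
        self.coarse = coarse        # unchanged, for existing callers
        self.os_errno = os_errno    # the additional, separate detail
        super().__init__(f"{coarse}: {os.strerror(os_errno)}")


def write_journal(path, data):
    try:
        with open(path, "wb") as fh:
            fh.write(data)
    except OSError as exc:
        # Keep the coarse code, but do not throw the reason away.
        raise StorageError("IOERR", exc.errno) from exc


with tempfile.TemporaryDirectory() as tmp:
    # The parent directory does not exist, so the write must fail.
    target = os.path.join(tmp, "missing-dir", "db-journal")
    try:
        write_journal(target, b"x")
    except StorageError as err:
        coarse = err.coarse
        cause = errno.errorcode.get(err.os_errno)

print(coarse, cause)
```

Old code that only checks the coarse value keeps working; new code can report the cause instead of a generic message.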


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Guy Harris
On Sep 26, 2017, at 2:08 PM, Scott Robison  wrote:

> There are physical errors and there are logical errors. If an error is
> generated from write, it's not unreasonable to classify it as an
> "output error". From read as an "input error".

"Output error", yes, although it'd be useful to provide more information.

"Disk I/O error", no; it'd be unreasonable to classify "out of file system free 
space", "over quota", "permission error", "file bigger than 2GB-1 bytes", etc. 
as "disk I/O errors".

> There is a lot of sqlite source code that already exists and has been
> written to work with the current interface. That's probably one of the
> reasons why extended errors were created, to provide finer
> granularity. Regardless of whether it is ideal or not, changing sqlite
> in a way that would break existing code is unlikely to happen.

I was not suggesting that.  I didn't suggest adding SQLITE_OVERQUOTA or 
SQLITE_WRITE_PERMISSION_ERROR.

> Ultimately it doesn't matter when error codes were added to a given
> operating system or which predates what. A decision was made in the
> past. The options are to live with decisions that were made in the
> past (one I've seen espoused multiple times in this mailing list),
> come up with an approach that allows old code to work but exposes new
> information (probably the genesis of extended error codes), or break
> older code (which I've not seen done deliberately).

I'm advocating a better version of the second of those choices than the 
"here's the raw operating system error code" version that's currently provided. 
 (sqlite3_system_errno() also has the problem that if SQLITE_IOERR is provided 
for something *other* than a failure that provides a system errno value, it 
doesn't do the job.)

> That being said, I don't know any non-technical users who are going to
> panic that IOERR means their hard drive is dying specifically because
> of that text being displayed. Panic perhaps, but not that a hard drive
> is about to die. Most people I know don't have that level of
> understanding to correlate IO / ERR / hard drive failure rates.

They don't treat "disk I/O error" as an indication that their disk is having a 
problem?  That doesn't need an understanding of hard drive failure rates.

I have no reason to dismiss the original writer's notion that "disk I/O error" 
might "[scare] the hell out of the poor sysadmin who suspects a filesystem 
corruption might be going on".

> They
> just think the stupid program is broken and not letting them get their
> work done. As for the experienced technical people I know (or at least
> me), their first thought would be to investigate the problem, not to
> assume their hard drive is failing.

Less investigative work is needed if the software gives a more detailed error 
report.


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Guy Harris
On Sep 26, 2017, at 2:16 PM, Simon Slavin  wrote:

> On 26 Sep 2017, at 9:57pm, Guy Harris  wrote:
> 
>> On Sep 26, 2017, at 1:37 PM, Jens Alfke  wrote:
>> 
>>>> On Sep 26, 2017, at 1:17 PM, Guy Harris  wrote:
>>>> 
>>>> A user wouldn't know what to do with "you've exceeded your stored data 
>>>> quota”?
>>> 
>>> A Turkish or Chinese user likely wouldn’t. (SQLite’s error messages are not 
>>> localized.)
>> 
>> Which means "for stuff that would be shown to the user, for the user to 
>> read, either localize your error messages, or make sure your API returns 
>> error codes that the application can turn into localized error messages".
> 
> No.  It means that you should present /your/ error messages to your users, 
> not error messages generated by SQLite.  SQLite is a programmer’s tool.  Its 
> users are programmers, and that’s who its error messages are addressed to.  
> You should not be letting your users see error messages intended for you, and 
> you should not be making your users worry about what to do about them.

"You" in "either localize your error messages, *or* make sure your API returns 
error codes that the application can turn into localized error messages", 
refers to SQLite.  It ultimately doesn't *need* to have error messages - it 
could leave that entirely up to the application - but it provides them 
nonetheless.

And there's an "or" in my statement; providing a way to get error codes more 
fine-grained than SQLITE_IOERR - so that you don't say "disk I/O error" for 
errors that have nothing to do with a disk reporting an I/O error - is 
something that the application would need in order to provide an appropriate 
error to end users and to the people to whom the end user might report an 
error.  And, no, "that error occurred on this operation" is not the sort of 
fine-grained information to which I'm referring.

So just provide a way to get an indication of what *particular* type of error 
generated SQLITE_IOERR - permission error, quota error, actual disk I/O error, 
etc. - and recommend that this *always* be used for SQLITE_IOERR.


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Jens Alfke


> On Sep 26, 2017, at 1:57 PM, Guy Harris  wrote:
> 
> Which means "for stuff that would be shown to the user, for the user to read, 
> either localize your error messages, or make sure your API returns error 
> codes that the application can turn into localized error messages".

Um, that’s what I said.

> And none of this argues against presenting to the user, in their native 
> language, a message saying "you've exceeded your file system quota", if that 
> is, in fact, what happened.

This thread isn’t about filesystem quotas. Why do you keep bringing them up as 
an example?

> *don't* tell the user anything that might convince them that their disk is 
> failing if you didn't get EIO or the equivalent on some other OS - and don't 
> tell them something that, when relayed to tech support, would lead the 
> support person to believe that, either.

As we’ve been saying, error messages produced by SQLite are not meant to be 
shown to end users, for all the reasons previously discussed.

SQLite’s error numbers ought to be sufficiently detailed once you enable 
extended error codes, and/or get the OS errno. The original set of error codes 
is inadequate to be sure, for historical reasons, but compatibility rules out 
breaking that API; that’s why the extended error codes exist.

—Jens


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Guy Harris
On Sep 26, 2017, at 1:43 PM, Simon Slavin  wrote:

> On 26 Sep 2017, at 9:17pm, Guy Harris  wrote:
> 
>> The *number* might annoy the support staff; right off the top of your head, 
>> what's the error number for "file system quota exceeded" or "I/O error"?  
>> (No cheating by looking it up in a man page or include file!)
> 
> My support staff are allowed to look things up.

Just don't force them to ask, *before* they look it up, whether the user's 
running Linux or macOS or FreeBSD or Solaris or Windows.

> My users, when faced with a result which means "permission error" will 
> probably grant all permissions to all apps and all users because that’s the 
> simplest way to make a permission error message go away.  My users don’t 
> understand the Posix permission model, because they’re not computer experts, 
> they are financial sector specialists, or psychologists, or tailors.  I don’t 
> want them thinking about computer problems.  If they knew enough about 
> computer problems to fix a permission problem the right way, they wouldn’t be 
> paying me.

And, when faced with a result that says "disk I/O error", your users will 
probably think their disk is broken and take it in to be fixed.

So:

for errors where the user *can* perhaps fix the problem, such as "out 
of file system space" (which already has its own error) and "out of disk quota" 
(which doesn't, and which is different from "out of file system space"), tell 
the user what the problem is (and, at the application level, offer a suggestion 
such as "delete some of those cat videos you've saved");

for errors where the user probably *can't* fix the problem, tell them 
that there's a problem for which they need to talk to support, and tell them 
what to say to the support staff so that the support staff knows that, for 
example, a disk hasn't gone bad.

(And there are places where "you don't have permission to do that" *is* the 
appropriate thing to tell the user, e.g. if they're trying to open a document 
to which they haven't been given read permission, or trying to write to a 
document to which they haven't been given write permission, etc..  I suspect 
your support staff have better things to do with their time than explain to a 
user that they're not allowed to read somebody else's private files.)
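
The split described above could look roughly like this at the application level, assuming the application knows the OS errno behind the failure. The policy table, the message wording, and the advise function are all hypothetical; a real application would choose its own wording and localize it.

```python
import errno

# Hypothetical policy: errors the user can plausibly fix themselves.
USER_FIXABLE = {
    errno.ENOSPC: "The disk is full. Free some space and try again.",
    errno.EACCES: "You do not have permission to access this document.",
}
# EDQUOT is looked up defensively in case a platform does not define it.
if hasattr(errno, "EDQUOT"):
    USER_FIXABLE[errno.EDQUOT] = (
        "You have exceeded your storage quota. "
        "Delete some files and try again."
    )


def advise(os_errno):
    """Pick a message for the user based on the underlying cause."""
    if os_errno in USER_FIXABLE:
        return USER_FIXABLE[os_errno]
    # Everything else goes to support, with a symbolic name to quote.
    name = errno.errorcode.get(os_errno, f"errno {os_errno}")
    return f"Something went wrong. Please contact support and quote: {name}"


print(advise(errno.ENOSPC))
print(advise(errno.EIO))
```

Quoting the symbolic name rather than the number means support does not need to ask which operating system the user is running.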


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Simon Slavin


On 26 Sep 2017, at 9:57pm, Guy Harris  wrote:

> On Sep 26, 2017, at 1:37 PM, Jens Alfke  wrote:
> 
>>> On Sep 26, 2017, at 1:17 PM, Guy Harris  wrote:
>>> 
>>> A user wouldn't know what to do with "you've exceeded your stored data 
>>> quota”?
>> 
>> A Turkish or Chinese user likely wouldn’t. (SQLite’s error messages are not 
>> localized.)
> 
> Which means "for stuff that would be shown to the user, for the user to read, 
> either localize your error messages, or make sure your API returns error 
> codes that the application can turn into localized error messages".

No.  It means that you should present /your/ error messages to your users, not 
error messages generated by SQLite.  SQLite is a programmer’s tool.  Its users 
are programmers, and that’s who its error messages are addressed to.  You 
should not be letting your users see error messages intended for you, and you 
should not be making your users worry about what to do about them.

If your software wants to react to a SQLite result code by presenting one of 
its own error messages to its users, that’s fine.

Simon.


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Scott Robison
There are physical errors and there are logical errors. If an error is
generated from write, it's not unreasonable to classify it as an
"output error", and one generated from read as an "input error".

There is a lot of sqlite source code that already exists and has been
written to work with the current interface. That's probably one of the
reasons why extended errors were created, to provide finer
granularity. Regardless of whether it is ideal or not, changing sqlite
in a way that would break existing code is unlikely to happen.

Ultimately it doesn't matter when error codes were added to a given
operating system or which predates what. A decision was made in the
past. The options are to live with decisions that were made in the
past (one I've seen espoused multiple times in this mailing list),
come up with an approach that allows old code to work but exposes new
information (probably the genesis of extended error codes), or break
older code (which I've not seen done deliberately).

I'm not trying to tell you that your point is invalid. It makes sense
in many ways. Short of a time machine I doubt anything will change
(though those decisions are above my pay grade).

That being said, I don't know any non-technical users who are going to
panic that IOERR means their hard drive is dying specifically because
of that text being displayed. Panic perhaps, but not that a hard drive
is about to die. Most people I know don't have that level of
understanding to correlate IO / ERR / hard drive failure rates. They
just think the stupid program is broken and not letting them get their
work done. As for the experienced technical people I know (or at least
me), their first thought would be to investigate the problem, not to
assume their hard drive is failing.


On Tue, Sep 26, 2017 at 2:17 PM, Guy Harris  wrote:
> On Sep 26, 2017, at 1:05 PM, Simon Slavin  wrote:
>
>> On 26 Sep 2017, at 8:47pm, Guy Harris  wrote:
>>
>>> On Sep 26, 2017, at 8:22 AM, Jens Alfke  wrote:
>>>
>>>> The basic error code is SQLITE_IOERR, which just means "Some kind of disk 
>>>> I/O error occurred" according to the comment. Which is true in this case; 
>>>> an I/O operation returned an error.
>>>
>>> But the *disk* didn't - the *operating system* did, so if SQLITE_IOERR 
>>> really means "Some kind of disk I/O error occurred", it's *not* the right 
>>> error to return for a *permission* error.
>>
>> Those error codes were devised in a day when OS error codes were more simple.
>
> EDQUOT was introduced in 1982, with 4.2BSD; when was SQLITE_IOERR devised?
>
>> Also please note that those error codes are addressed to programmers.  Your 
>> users should never see the text explanation of the number.  Because your 
>> users wouldn’t know what to do about them.
>
> A user wouldn't know what to do with "you've exceeded your stored data 
> quota"?  If so, your site has failed to explain to the users that they've 
> been given a quota, limiting the amount of space on the server that they can 
> use, and that if they exceed their quota, they either need to delete stuff 
> they no longer need, move stuff they might *someday* need but don't need 
> *now* to some archival medium, or ask their system administrator to increase 
> their quota?
>
>> At most the user can be shown the number returned so they can quote it in a 
>> support call.
>
> The *number* might annoy the support staff; right off the top of your head, 
> what's the error number for "file system quota exceeded" or "I/O error"?  (No 
> cheating by looking it up in a man page or include file!)
>
> And, yes, there needs to be *some* way to get the underlying problem reported 
> to somebody in a position to do something about it - where "the underlying 
> problem" includes "what did the OS say?" as much as it includes "what SQLite 
> operation got the error?".



-- 
Scott Robison


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Guy Harris
On Sep 26, 2017, at 1:37 PM, Jens Alfke  wrote:

>> On Sep 26, 2017, at 1:17 PM, Guy Harris  wrote:
>> 
>> A user wouldn't know what to do with "you've exceeded your stored data 
>> quota”?
> 
> A Turkish or Chinese user likely wouldn’t. (SQLite’s error messages are not 
> localized.)

Which means "for stuff that would be shown to the user, for the user to read, 
either localize your error messages, or make sure your API returns error codes 
that the application can turn into localized error messages".

And none of this argues against presenting to the user, in their native 
language, a message saying "you've exceeded your file system quota", if that 
is, in fact, what happened.
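
A minimal sketch of the "error codes that the application can turn into
localized error messages" idea, in Python. The category names, the MESSAGES
catalog and the localized_message() helper are all hypothetical and are not
part of SQLite; the library would only need to hand back a stable code, and
the application owns the wording.

```python
# Hypothetical message catalog owned by the application, keyed first by
# locale and then by a platform-independent error category.
MESSAGES = {
    "en": {
        "quota_exceeded": "You have exceeded your file system quota.",
        "other": "An unexpected database error occurred.",
    },
    "de": {
        "quota_exceeded": "Sie haben Ihr Dateisystem-Kontingent überschritten.",
        "other": "Ein unerwarteter Datenbankfehler ist aufgetreten.",
    },
}


def localized_message(category, locale):
    """Return the message for `category` in `locale`.

    Unknown locales fall back to English; unknown categories fall back
    to the generic "other" message of the chosen catalog.
    """
    catalog = MESSAGES.get(locale, MESSAGES["en"])
    return catalog.get(category, catalog["other"])
```

The point of the design is that nothing here depends on the English text
SQLite produces; only the category has to be reported reliably.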

> And there are plenty of messages that are much less understandable to a lay 
> user than the one you picked out.

"I got a permission error trying to write to the journal" isn't something you'd 
directly say to the lay user, but *don't* tell the user anything that might 
convince them that their disk is failing if you didn't get EIO or the 
equivalent on some other OS - and don't tell them something that, when relayed 
to tech support, would lead the support person to believe that, either.

I.e., Richard Kreckel is 100% correct when he says that "disk I/O error" is an 
inappropriate message for a permission error - the *disk* had no problem, the 
*OS* had a problem when the disk returned file system data that, among other 
things, indicated that the user didn't have permission to do something.  
Replacing the disk and restoring from a backup probably won't fix that problem 
(unless the user had that permission when the backup was done).

>> The *number* might annoy the support staff; right off the top of your head, 
>> what's the error number for "file system quota exceeded" or "I/O error"?  
>> (No cheating by looking it up in a man page or include file!)
> 
> On the contrary, error numbers are a lot easier for support. They’re 
> independent of locale,

But the error reported by sqlite3_system_errno() isn't independent of the OS on 
which the user is running, so *that* error wouldn't be easy for support.  You'd 
need a platform-independent error code, meaning, in this case, one supplied by 
SQLite, not by the OS.


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Simon Slavin


On 26 Sep 2017, at 9:17pm, Guy Harris  wrote:

> The *number* might annoy the support staff; right off the top of your head, 
> what's the error number for "file system quota exceeded" or "I/O error"?  (No 
> cheating by looking it up in a man page or include file!)

My support staff are allowed to look things up.

My users, when faced with a result which means "permission error" will probably 
grant all permissions to all apps and all users because that’s the simplest way 
to make a permission error message go away.  My users don’t understand the 
Posix permission model, because they’re not computer experts, they are 
financial sector specialists, or psychologists, or tailors.  I don’t want them 
thinking about computer problems.  If they knew enough about computer problems 
to fix a permission problem the right way, they wouldn’t be paying me.

Simon.


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Jens Alfke


> On Sep 26, 2017, at 1:17 PM, Guy Harris  wrote:
> 
> A user wouldn't know what to do with "you've exceeded your stored data quota”?

A Turkish or Chinese user likely wouldn’t. (SQLite’s error messages are not 
localized.) And there are plenty of messages that are much less understandable 
to a lay user than the one you picked out.

> The *number* might annoy the support staff; right off the top of your head, 
> what's the error number for "file system quota exceeded" or "I/O error"?  (No 
> cheating by looking it up in a man page or include file!)

On the contrary, error numbers are a lot easier for support. They’re 
independent of locale, they don’t get re-worded from one version of the app to 
the next, and they’re very short and easy to dictate over the phone. Of course, 
these shouldn’t be the primary error information given to the user! But the 
user-level error message should be something specific to the application, like 
“an unexpected database error occurred (19)” instead of "Abort due to 
constraint violation”. The number would appear only for support purposes.
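
As a rough sketch of that split, assuming the application uses Python's
standard sqlite3 module: the technical text goes to a log for programmers,
and the user sees an application-level sentence with only the number
attached. The user_facing_message() helper and the logger name are
hypothetical; sqlite_errorcode is only available on Python 3.11 and later,
so the number is simply omitted when it is missing.

```python
import logging
import sqlite3

log = logging.getLogger("app.db")  # hypothetical logger name


def user_facing_message(exc):
    """Turn a sqlite3 exception into a message suitable for end users."""
    # The full SQLite text is for programmers, so it goes to the log only.
    log.error("SQLite error: %s", exc)
    # sqlite_errorcode exists on Python 3.11+; older versions yield None.
    code = getattr(exc, "sqlite_errorcode", None)
    # Keep the low 8 bits, i.e. the primary result code (19 for a
    # constraint violation), as the number to quote to support.
    suffix = "" if code is None else f" ({code & 0xFF})"
    return "An unexpected database error occurred" + suffix + "."
```

An application would call this from its `except sqlite3.Error as exc:`
handler and display the returned string instead of str(exc).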

I say this as someone who’s worked on a number of end-user GUI applications 
over the years.

—Jens


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Guy Harris
On Sep 26, 2017, at 1:05 PM, Simon Slavin  wrote:

> On 26 Sep 2017, at 8:47pm, Guy Harris  wrote:
> 
>> On Sep 26, 2017, at 8:22 AM, Jens Alfke  wrote:
>> 
>>> The basic error code is SQLITE_IOERR, which just means "Some kind of disk 
>>> I/O error occurred” according to the comment. Which is true in this case; 
>>> an I/O operation returned an error.
>> 
>> But the *disk* didn't - the *operating system* did, so if SQLITE_IOERR 
>> really means "Some kind of disk I/O error occurred", it's *not* the right 
>> error to return for a *permission* error.
> 
> Those error codes were devised in a day when OS error codes were more simple.

EDQUOT was introduced in 1982, with 4.2BSD; when was SQLITE_IOERR devised?

> Also please note that those error codes are addressed to programmers.  Your 
> users should never see the text explanation of the number.  Because your 
> users wouldn’t know what to do about them.

A user wouldn't know what to do with "you've exceeded your stored data quota"?  
If so, your site has failed to explain to the users that they've been given a 
quota, limiting the amount of space on the server that they can use, and that 
if they exceed their quota, they either need to delete stuff they no longer 
need, move stuff they might *someday* need but don't need *now* to some 
archival medium, or ask their system administrator to increase their quota?

> At most the user can be shown the number returned so they can quote it in a 
> support call.

The *number* might annoy the support staff; right off the top of your head, 
what's the error number for "file system quota exceeded" or "I/O error"?  (No 
cheating by looking it up in a man page or include file!)

And, yes, there needs to be *some* way to get the underlying problem reported 
to somebody in a position to do something about it - where "the underlying 
problem" includes "what did the OS say?" as much as it includes "what SQLite 
operation got the error?".


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Simon Slavin


On 26 Sep 2017, at 8:47pm, Guy Harris  wrote:

> On Sep 26, 2017, at 8:22 AM, Jens Alfke  wrote:
> 
>> The basic error code is SQLITE_IOERR, which just means "Some kind of disk 
>> I/O error occurred” according to the comment. Which is true in this case; an 
>> I/O operation returned an error.
> 
> But the *disk* didn't - the *operating system* did, so if SQLITE_IOERR really 
> means "Some kind of disk I/O error occurred", it's *not* the right error to 
> return for a *permission* error.

Those error codes were devised in a day when OS error codes were more simple.  
Also please note that those error codes are addressed to programmers.  Your 
users should never see the text explanation of the number.  Because your users 
wouldn’t know what to do about them. At most the user can be shown the number 
returned so they can quote it in a support call.

Can you find out which extended result code is returned?

That will let us know what’s really going on.

Simon.


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Guy Harris
On Sep 26, 2017, at 8:22 AM, Jens Alfke  wrote:

> The basic error code is SQLITE_IOERR, which just means "Some kind of disk I/O 
> error occurred” according to the comment. Which is true in this case; an I/O 
> operation returned an error.

But the *disk* didn't - the *operating system* did, so if SQLITE_IOERR really 
means "Some kind of disk I/O error occurred", it's *not* the right error to 
return for a *permission* error.

And, on UN*X, a write() call can return ENOSPC; a write() is an I/O operation, 
and "returns -1 with errno set to ENOSPC" is an error, but that presumably gets 
reported as SQLITE_FULL, not as SQLITE_IOERR.

Sadly, the name chosen for that error code

1) suggests an "I/O error" in the sense of "a device reported an error 
trying to read or write it"

and

2) is probably part of the API and thus unchangeable.

However, if SQLITE_IOERR is returned for *anything* other than, on UN*X, an EIO 
errno:

1) The documentation should *really really really really really* avoid 
calling it an "I/O error", as "I/O error" has a connotation of "the device 
reported an error" (which is what EIO signifies) rather than "an I/O operation 
got some sort of error, not necessarily an error from the device from which we 
were trying to read data or to which we were trying to write data".

2) The documentation should tell people *always* to use 
sqlite3_system_errno() after an SQLITE_IOERR and report the error based on 
*that*, not just by reporting an "I/O error".  Yes, that means writing 
platform-dependent code; if you want to allow platform-independent code to be 
written atop SQLite, stuff the platform dependency inside SQLite, by providing 
some API to get errors such as, for example, "permission denied" or "disk quota 
exceeded" or "an actual disk I/O error occurred" rather than "write() got some 
error other than ENOSPC".  (Yes, you *can* get "permission denied", e.g. in an 
NFSv2/NFSv3 write to a file to which you had write permission when you opened 
it but to which you no longer have write permission, and, yes, if, for example, 
you're in the remote file system group at Apple, with a home directory on an 
NFS server, you can have an SQLite database being accessed over NFS.)
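
To make the second suggestion concrete, here is a minimal sketch in Python of
what such a platform-independent classification might look like on UN*X. The
category names and classify_os_error() are hypothetical and not part of any
SQLite API. The errno value is assumed to have been obtained from
sqlite3_system_errno() at the C level, since Python's standard sqlite3 module
does not expose that call.

```python
import errno

# Hypothetical platform-independent categories; not defined by SQLite.
PERMISSION_DENIED = "permission denied"
OUT_OF_SPACE = "file system full"
QUOTA_EXCEEDED = "disk quota exceeded"
DEVICE_IO_ERROR = "device I/O error"
OTHER = "other error"

# Build the table by name, because not every platform defines every
# errno constant (EDQUOT, for example, may be absent).
_CATEGORIES = {}
for _name, _category in [
    ("EACCES", PERMISSION_DENIED),
    ("EPERM", PERMISSION_DENIED),
    ("ENOSPC", OUT_OF_SPACE),
    ("EDQUOT", QUOTA_EXCEEDED),
    ("EIO", DEVICE_IO_ERROR),
]:
    _value = getattr(errno, _name, None)
    if _value is not None:
        _CATEGORIES[_value] = _category


def classify_os_error(os_errno):
    """Map a UN*X errno value to a platform-independent category."""
    return _CATEGORIES.get(os_errno, OTHER)
```

Only EIO ends up in the "device I/O error" bucket; a permission or quota
failure during write() is reported as what it is.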

> If you want more detailed info, use extended error codes by calling 
> sqlite3_extended_result_codes() or sqlite3_extended_errcode(). Then you’ll 
> get a more specific error; in your situation probably SQLITE_IOERR_ACCESS.

Perhaps, in that particular code path, the permission problem would show up in 
an xAccess method call, so that this would happen to be able to give you a 
better error.

However, what matters isn't "what operation got the error?", it's "what 
non-file-system-full error did you get?", and the extended error code won't 
help for errors other than ENOSPC and EIO returned by write().


Re: [sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread Jens Alfke


> On Sep 25, 2017, at 4:39 AM, KRECKEL Richard (AREVA) 
>  wrote:
> 
> Remove the write permission of a SQLite database's journal file. Then, try 
> write-accessing the database. The error reported is "disk I/O error". (This 
> happened to me when two users tried to share a DB and had their umask set 
> wrong.)

The basic error code is SQLITE_IOERR, which just means "Some kind of disk I/O 
error occurred” according to the comment. Which is true in this case; an I/O 
operation returned an error.

If you want more detailed info, use extended error codes by calling 
sqlite3_extended_result_codes() or sqlite3_extended_errcode(). Then you’ll get 
a more specific error; in your situation probably SQLITE_IOERR_ACCESS.
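
For readers who reach SQLite through Python rather than C, a minimal sketch
of reading the extended code follows. It assumes Python 3.11 or later, where
exceptions from the standard sqlite3 module carry sqlite_errorcode and
sqlite_errorname; describe_sqlite_error() and demo() are hypothetical helper
names. The demo provokes a constraint violation only because that is easy to
trigger in memory; an I/O failure would be inspected the same way.

```python
import sqlite3


def describe_sqlite_error(exc):
    """Return (primary_code, extended_code, name) for a sqlite3 exception.

    sqlite_errorcode and sqlite_errorname exist only on Python 3.11+;
    on older interpreters every field comes back as None.
    """
    extended = getattr(exc, "sqlite_errorcode", None)
    name = getattr(exc, "sqlite_errorname", None)
    # The primary result code is the low 8 bits of the extended code.
    primary = None if extended is None else extended & 0xFF
    return primary, extended, name


def demo():
    """Provoke an error in an in-memory database and describe it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
        conn.execute("INSERT INTO t VALUES (1)")
        conn.execute("INSERT INTO t VALUES (1)")  # duplicate primary key
    except sqlite3.Error as exc:
        return describe_sqlite_error(exc)
    finally:
        conn.close()
    return None
```

Masking with 0xFF recovers the basic code (SQLITE_IOERR for any of the
SQLITE_IOERR_* values), so code written against the basic codes keeps
working while the extra detail stays available.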

—Jens


[sqlite] bug: failure to write journal reported as "disk I/O error"

2017-09-26 Thread KRECKEL Richard (AREVA)
Remove the write permission of a SQLite database's journal file. Then, try 
write-accessing the database. The error reported is "disk I/O error". (This 
happened to me when two users tried to share a DB and had their umask set wrong.)



The error message reported by SQLite is inappropriate. A "permission denied" 
would be much better and guide the user towards fixing the problem (instead of 
scaring the hell out of the poor sysadmin who suspects a filesystem corruption 
might be going on.)



I'm using SQLite 3.19.3.



All my best,

-rbk.