Argument list too long while copying

2012-08-23 Thread Jan Stary
On current/amd64 I created an MSDOS filesystem on a CF card
inserted into a USB card reader

# newfs_msdos -F 32 -L SHERLOCK sd3i 

and tried to copy some files to it

# mount /dev/sd3i /mnt
# cp * /mnt
cp: /mnt/202.the-hounds-of-baskerville.avi: Argument list too long
cp: /mnt/201.a-scandal-in-belgravia.avi: Argument list too long
cp: /mnt/102.the-blind-banker.avi: Argument list too long

These three files are not copied fully.
The other files are copied alright.

On another run, this happens with _other_ files.
Indeed, it is a problem with the card:

sd3(umass1:1:0): Check Condition (error 0x70) on opcode 0x2a
SENSE KEY: Media Error
ASC/ASCQ: Peripheral Device Write Fault

However, it is not the case that the argument list is too long:

# ls -1
101.study-in-pink.avi
101.study-in-pink.srt
102.the-blind-banker.avi
102.the-blind-banker.srt
103.the-great-game.avi
103.the-great-game.srt
104.extras.mkv
201.a-scandal-in-belgravia.avi
201.a-scandal-in-belgravia.srt
202.the-hounds-of-baskerville.avi
202.the-hounds-of-baskerville.srt
203.the-reichenbach-fall.avi
203.the-reichenbach-fall.srt

Somehow, the wrong error condition is reported:
E2BIG instead of a failed write(2).

The code that gives the warning messages is
the following block in /usr/src/bin/cp/utils.c:


	int skipholes = 0;
	struct stat tosb;
	if (!fstat(to_fd, &tosb) && S_ISREG(tosb.st_mode))
		skipholes = 1;
	while ((rcount = read(from_fd, buf, MAXBSIZE)) > 0) {
		if (skipholes && memcmp(buf, zeroes, rcount) == 0)
			wcount = lseek(to_fd, rcount, SEEK_CUR) == -1 ? -1 : rcount;
		else
			wcount = write(to_fd, buf, rcount);
		if (rcount != wcount || wcount == -1) {
			warn("%s", to.p_path);
			rval = 1;
			break;
		}
	}


If I am reading this right, getting the warn() message means
that either the lseek() or the write() must have failed;
but both of them set errno, and not to E2BIG.
So why does the warn() report E2BIG?


Jan



Re: Argument list too long while copying

2012-08-23 Thread Ted Unangst
On Thu, Aug 23, 2012 at 21:17, Jan Stary wrote:
> On current/amd64 I created a MSDOS filesystem on a CF card
> inserted into a USB card reader
>
> # cp * /mnt
> cp: /mnt/202.the-hounds-of-baskerville.avi: Argument list too long
> cp: /mnt/201.a-scandal-in-belgravia.avi: Argument list too long
> cp: /mnt/102.the-blind-banker.avi: Argument list too long
>
> However, it is not the case that the argument list is too long:
>
> Somehow, the wrong error condition is reported:
> E2BIG instead of a failed write(2).
> but both of them set errno, and not to E2BIG.
> So why does the warn() report E2BIG?

Because that is what the msdosfs code in the kernel says.  You are
trying to put too much stuff in the root directory of the filesystem.
I suspect if you create a subdirectory, you'll have more success,
assuming the filesystem is large enough.
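
For illustration: warn(3) appends strerror(errno) to the formatted text, so cp prints whatever error value the kernel returned from the failed write(2). A minimal sketch, assuming a hypothetical helper format_warning() that is not part of cp and that leaves out the program-name prefix warn(3) adds:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of cp): build the "path: message" text
 * that warn("%s", path) would print after the program name, for a given
 * errno value. Returns whatever snprintf() returns.
 */
int
format_warning(char *out, size_t outlen, const char *path, int err)
{
	assert(out != NULL && outlen > 0 && path != NULL);
	return snprintf(out, outlen, "%s: %s", path, strerror(err));
}
```

Called with E2BIG, this produces the "Argument list too long" text seen in the cp output on common libc implementations, regardless of which system call set the error.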



Re: Small change to let mg handle localized characters

2012-08-23 Thread Geoff Steckel

On 08/23/2012 03:50 PM, Stefan Sperling wrote:

> On Thu, Aug 23, 2012 at 08:58:53PM +0200, Eivind Evensen wrote:
>
>> Since version 1.10 of lib/libc/gen/ctype_.c, I've been
>> unable to use localized characters in mg properly (they're printed
>> as an octal value only).
>>
>> I've been using the below change to regain support for printing them
>> normally.
>>
>> Best regards, Eivind Evensen
>>
>>
>> Index: main.c
>> ===
>> RCS file: /data/openbsd/src/usr.bin/mg/main.c,v
>> .
>> Eivind
>
> This kind of change has been proposed before.
> In my opinion it is not the right way of solving this problem.
>
> It won't work correctly with multi-byte files (like UTF-8). E.g. typing
> backspace to delete one character will delete one byte instead of the
> entire character, which messes up the display. To properly support multi-byte
> encodings mg needs to learn the difference between a byte and a character.
>
> The locales mechanism and wchar_t are only useful for applications that do
> not care about details of character encodings, and which only need to deal
> with a single character set at a time. It is not very useful for editors
> because they need to handle files in various encodings and be aware of
> the current encoding in use.
>
> Some applications in base (less and tmux, for example) have special
> support code for UTF-8. This could be done for mg as well, so that
> it can support single-byte character sets (ASCII, latin1) and also
> UTF-8 (but no other multi-byte character set). You'd activate the
> special UTF-8 mode if nl_langinfo(CODESET) returns UTF-8.
>
> To properly support arbitrary multi-byte character sets (UTF-8, UTF-16,
> special asian language encodings etc) mg needs iconv which we don't have
> in base. I have some work-in-progress iconv code but it's not ready for
> the tree yet and I'm not actively working on it at the moment.
> If you want to help out with this let me know.
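
For illustration, the two points in the quoted message (switching on a UTF-8 mode when nl_langinfo(CODESET) returns UTF-8, and treating backspace as deleting a character rather than a byte) could be sketched as below. All three function names are hypothetical and are not taken from mg, less or tmux; the backward scan assumes the buffer holds well-formed UTF-8.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: nonzero if the codeset name is exactly "UTF-8". */
int
codeset_is_utf8(const char *codeset)
{
	return codeset != NULL && strcmp(codeset, "UTF-8") == 0;
}

/*
 * Hypothetical helper: pick up LC_CTYPE from the environment and report
 * whether the locale's codeset is UTF-8.
 */
int
locale_is_utf8(void)
{
	setlocale(LC_CTYPE, "");
	return codeset_is_utf8(nl_langinfo(CODESET));
}

/*
 * Hypothetical helper: given a cursor at byte offset pos in buf, return
 * the offset where the character just before the cursor starts.
 * UTF-8 continuation bytes have the bit pattern 10xxxxxx, so the scan
 * steps back over them until it reaches a lead byte (or offset 0).
 */
size_t
utf8_prev_char_start(const unsigned char *buf, size_t pos)
{
	assert(buf != NULL);
	if (pos == 0)
		return 0;
	pos--;
	while (pos > 0 && (buf[pos] & 0xC0) == 0x80)
		pos--;
	return pos;
}
```

A backspace handler in UTF-8 mode would then delete the bytes from utf8_prev_char_start(buf, pos) up to pos, instead of the single byte at pos - 1.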

Using iconv in an editor is EXTREMELY dangerous without complex precautions.
Given a file containing characters not valid in the current locale,
it will at minimum prevent viewing the file.
If the file is written out, the file is destroyed.
IMnsHO, that is fatally flawed.

Returning an error for an impossible character translation is specified
in the archaic version of the Unicode standard I read.
That is not useful in an editor.

Geoff Steckel



Re: Small change to let mg handle localized characters

2012-08-23 Thread Stefan Sperling
On Thu, Aug 23, 2012 at 05:32:51PM -0400, Geoff Steckel wrote:
> Using iconv in an editor is EXTREMELY dangerous without complex precautions.
> Given a file containing characters not valid in the current locale,
> it will at minimum prevent viewing the file.

An editor needs to convert between character sets.
How else are you going to display a latin1 file in a UTF-8 locale,
for example?

If the current character set of the locale cannot display your file
because conversion from file source encoding to output encoding fails,
tough, you'll have display problems. What else is an application
supposed to do in this case? It's being asked to do something impossible.

BTW, vim links to libiconv. For some bizarre reason emacs links to
libossaudio instead ;)

> If the file is written out, the file is destroyed.
> IMnsHO, that is fatally flawed.

Well, yes, using a character set conversion API in stupid ways can
munge data. How does that relate to anything I was saying?



Re: Small change to let mg handle localized characters

2012-08-23 Thread Geoff Steckel

On 08/23/2012 06:55 PM, Stefan Sperling wrote:

> On Thu, Aug 23, 2012 at 05:32:51PM -0400, Geoff Steckel wrote:
>
>> Using iconv in an editor is EXTREMELY dangerous without complex precautions.
>> Given a file containing characters not valid in the current locale,
>> it will at minimum prevent viewing the file.
>
> An editor needs to convert between character sets.
> How else are you going to display a latin1 file in a UTF-8 locale,
> for example?
>
> If the current character set of the locale cannot display your file
> because conversion from file source encoding to output encoding fails,
> tough, you'll have display problems. What else is an application
> supposed to do in this case? It's being asked to do something impossible.
>
> BTW, vim links to libiconv. For some bizarre reason emacs links to
> libossaudio instead ;)
>
>> If the file is written out, the file is destroyed.
>> IMnsHO, that is fatally flawed.
>
> Well, yes, using a character set conversion API in stupid ways can
> munge data. How does that relate to anything I was saying?

As long as iconv is only used to display data, not to change file
contents, you're perfectly right.

A real example is a L***x editor using iconv. Open a 5000 line file,
change line 100, line 500 contains a non-conforming character,
file is truncated there.

Not pretty.

Another real example. Bring up line containing non-conforming character.
Line appears blank.

I agree that it takes a great deal of care to implement a multi-character
set editor such that it works on all useful files while displaying in
a particular locale's character set.
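
One possible precaution against the truncation and blank-line cases above, sketched for the display path only: when iconv(3) reports EILSEQ or EINVAL, emit a placeholder and skip one input byte instead of abandoning the rest of the text. The helper name convert_lossy() and the '?' placeholder are assumptions for illustration; this is not how any particular editor is known to behave, and writing the substituted text back to disk would still alter the file.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: convert in[0..inlen) from encoding `from` to
 * encoding `to`, storing a NUL-terminated result in out. Each input byte
 * that cannot be converted is replaced by one '?' and conversion goes on.
 * Returns 0 if all input was consumed, -1 if the converter could not be
 * opened or out was too small.
 */
int
convert_lossy(const char *to, const char *from,
    const char *in, size_t inlen, char *out, size_t outlen)
{
	iconv_t cd;
	char *inp = (char *)in;	/* iconv() does not modify the input bytes */
	char *outp = out;
	size_t inleft = inlen, outleft;

	assert(in != NULL && out != NULL);
	if (outlen == 0)
		return -1;
	outleft = outlen - 1;	/* reserve room for the terminating NUL */

	cd = iconv_open(to, from);
	if (cd == (iconv_t)-1)
		return -1;

	while (inleft > 0) {
		if (iconv(cd, &inp, &inleft, &outp, &outleft) != (size_t)-1)
			continue;	/* everything converted, inleft is now 0 */
		if (errno != EILSEQ && errno != EINVAL)
			break;		/* E2BIG: output buffer is full */
		if (outleft == 0)
			break;
		*outp++ = '?';		/* placeholder for the offending byte */
		outleft--;
		inp++;
		inleft--;
		iconv(cd, NULL, NULL, NULL, NULL);	/* reset conversion state */
	}
	iconv_close(cd);
	*outp = '\0';
	return inleft == 0 ? 0 : -1;
}
```

With this approach a line holding one non-conforming byte is still shown, with a placeholder in its place, and the lines after it remain reachable.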

Geoff Steckel



Re: Small change to let mg handle localized characters

2012-08-23 Thread Mike Belopuhov
On Fri, Aug 24, 2012 at 12:55 AM, Stefan Sperling s...@openbsd.org wrote:
> For some bizarre reason emacs links to libossaudio instead ;)


yeah, an sndio backend is yet to be written...
another reason to migrate to the superior pulse-audio framework!