[dev] [vis] dw near the end of the line

2016-02-13 Thread Random832

Recently, the question of the correctness of vim's behavior of 2dw on
the first of three lines of one word each came up on the vim mailing
list (it turns out that it's not correct according to POSIX, but is
shared with traditional vi).

At that time, I wasn't able to build vis to see what it does. I've since
figured out my build problem, and tested vis's behavior in this
situation.

When you delete the last word of a line in vis with the dw command, it
always deletes the newline and all following spaces and newlines
(i.e. placing the content of the next non-blank line on the current
one). This behavior differs from most other vi clones, matching only
elvis-tiny. Is this behavior intended?




[dev] Re: st: keys and infocmp

2015-12-14 Thread Random832
Greg Reagle  writes:

> Hello.  If there are any man pages or articles or FAQs about this topic
> that would be good to read, please refer to them.
>
> Running Xubuntu 12.04 and the latest st on a ThinkPad laptop, these are
> the results I get, correlated with the results of infocmp.  I got the
> output from the keys by running cat and hitting the keys.
>
> st, TERM is st-256color
>
> | home | end   | insert | delete | up   | down | left | right |
> | ^[[H | ^[[4~ | ^[[4h  | ^[[P   | ^[[A | ^[[B | ^[[D | ^[[C  |
> | home | kc1, kend | smir   | dch1   | cuu1 |  |  | cuf1  |
>
> Why do the escape sequences produced by down and left arrow keys have no
> match in infocmp?  Why does home key not produce khome (\E[1~) escape
> sequence?

The character sequences in the terminfo entry are meant to match
those which are sent when keypad mode (tput smkx) is
enabled. Try "tput smkx; cat".

I note that you called out down and left. I suspect this is
because you've incorrectly matched up and right against cuu1 and
cuf1, which are control sequences (sent to the terminal), not
input sequences (sent by the keys). You should be looking
exclusively at terminfo strings whose name begins with "k"
(khome, kend, kich1, kdch1, kcuu1, kcud1, kcub1, kcuf1).




[dev] Re: Bug in join.c

2015-12-14 Thread Random832
Mattias Andrée  writes:
> I think this patch should be included. But I don't see
> how it is of substance. It will never occur with two's
> complement or ones' complement. Only, signed magnitude
> representation. Any sensible C compiler for POSIX
> systems will only use two's complement; otherwise
> int[0-9]*_t cannot be implemented.

I had assumed that comparing an unsigned value with a negative
number resulted in a comparison that is unconditionally false,
rather than converting one to the type of the other. Maybe
that's because I've gotten too used to non-C languages that
don't have fixed-size integers.

Sorry for the confusion.
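
To make the conversion concrete, here is a minimal sketch (the helper
name is invented for illustration). When a size_t is compared with -1,
the -1 is converted to size_t first and becomes SIZE_MAX, so the
comparison is not unconditionally false:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the int -1 on the right-hand side is converted
 * to size_t (i.e. SIZE_MAX) before the comparison takes place. */
int
size_t_equals_minus_one(size_t len)
{
	return len == -1;
}
```

A size_t that was assigned -1 therefore still compares equal to -1,
although compilers warn about the mixed-sign comparison.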




[dev] Bug in join.c

2015-12-14 Thread Random832
I was going through sbase checking with -Wall -Wextra -pedantic
-Werror, and among a bunch of noise errors relating to
signed/unsigned comparisons, I found one with actual substance:
the result of getline is being converted to size_t before
comparing to -1 to check for error.

diff --git a/join.c b/join.c
index 1a08927..6828cf4 100644
--- a/join.c
+++ b/join.c
@@ -261,7 +261,8 @@ static int
 addtospan(struct span *sp, FILE *fp, int reset)
 {
 	char *newl = NULL;
-	size_t len, size = 0;
+	ssize_t len;
+	size_t size = 0;
 
 	if ((len = getline(&newl, &size, fp)) == -1) {
 		if (ferror(fp))
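
For reference, a minimal sketch of the getline pattern with the result
kept in a signed ssize_t. The function name count_lines is made up for
this example and is not part of sbase:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: count the lines in fp; returns -1 on a read error. */
long
count_lines(FILE *fp)
{
	char *line = NULL;
	size_t size = 0;
	ssize_t len;	/* signed, so the -1 error value is representable */
	long n = 0;

	while ((len = getline(&line, &size, fp)) != -1)
		n++;
	free(line);
	/* -1 means either EOF or an error; ferror tells them apart */
	return ferror(fp) ? -1 : n;
}
```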


I also couldn't quite figure out if this line of tail.c is
correct or not.

n = MIN(llabs(estrtonum(numstr, LLONG_MIN + 1, MIN(LLONG_MAX, SIZE_MAX))),
        SIZE_MAX);




[dev] Re: [sbase] Portability

2015-11-26 Thread Random832
Dimitris Papastamos  writes:
> sbase should only contain code that runs on POSIX systems (with some
> minor exceptions) and fallback implementations for non-standardized
> interfaces that can be implemented portably on top of POSIX interfaces.

So there's no place for fallback implementations _of_ POSIX interfaces
on top of either older POSIX interfaces or non-standard ones?

Anyway, here's a patch for some data type issues that came up - more to
do with compiling with all warnings, though the fact that clock_t is
unsigned on OSX helped catch one of them.

diff --git a/du.c b/du.c
index 41e4380..3dc3545 100644
--- a/du.c
+++ b/du.c
@@ -25,7 +25,7 @@ printpath(off_t n, const char *path)
 	if (hflag)
 		printf("%s\t%s\n", humansize(n * blksize), path);
 	else
-		printf("%ju\t%s\n", n, path);
+		printf("%jd\t%s\n", (intmax_t)n, path);
 }
 
 static off_t
diff --git a/split.c b/split.c
index f15e925..ee24556 100644
--- a/split.c
+++ b/split.c
@@ -48,7 +48,7 @@ int
 main(int argc, char *argv[])
 {
 	FILE *in = stdin, *out = NULL;
-	size_t size = 1000, n;
+	off_t size = 1000, n;
 	int ret = 0, ch, plen, slen = 2, always = 0;
 	char name[NAME_MAX + 1], *prefix = "x", *file = NULL;
 
@@ -69,7 +69,7 @@ main(int argc, char *argv[])
 		break;
 	case 'l':
 		always = 0;
-		size = estrtonum(EARGF(usage()), 1, MIN(LLONG_MAX, SIZE_MAX));
+		size = estrtonum(EARGF(usage()), 1, MIN(LLONG_MAX, OFF_MAX));
 		break;
 	default:
 		usage();
diff --git a/time.c b/time.c
index 4af0352..60a8c8d 100644
--- a/time.c
+++ b/time.c
@@ -36,7 +36,7 @@ main(int argc, char *argv[])
 	if ((ticks = sysconf(_SC_CLK_TCK)) <= 0)
 		eprintf("sysconf _SC_CLK_TCK:");
 
-	if ((r0 = times()) < 0)
+	if ((r0 = times()) == (clock_t)-1)
 		eprintf("times:");
 
 	switch ((pid = fork())) {
@@ -52,7 +52,7 @@ main(int argc, char *argv[])
 	}
 	waitpid(pid, &status, 0);
 
-	if ((r1 = times()) < 0)
+	if ((r1 = times()) == (clock_t)-1)
 		eprintf("times:");
 
 	if (WIFSIGNALED(status)) {
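
The reason for the time.c change, as a small sketch (the helper name is
an assumption for illustration): where clock_t is unsigned, a "< 0"
test can never be true, so the failure value has to be compared
explicitly.

```c
#include <time.h>

/* Hypothetical helper: times() reports failure as (clock_t)-1.
 * This comparison works whether clock_t is signed or unsigned. */
int
times_failed(clock_t r)
{
	return r == (clock_t)-1;
}
```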


[dev] [sbase] Portability

2015-11-25 Thread Random832
I downloaded and built sbase for my OSX system to test the cal program,
and noticed (and fixed locally) several issues.

Before posting any patches, I wanted to ask - philosophically speaking,
how much effort should sbase put towards supporting systems that don't
support the latest-and-greatest POSIX functions? Three functions were
missing (utimensat, clock_gettime, and fmemopen), and fmemopen in
particular required an extensive implementation, which I found online
(https://github.com/NimbusKit/memorymapping) rather than writing myself.

Also, if these are added should they go in libutil or a new "libcompat"?




[dev] Re: [farbfeld] announce

2015-11-18 Thread Random832
FRIGN  writes:
> I guess a better way to do that would be to use greyscale-farbfeld
> files

There doesn't appear to be such a thing, unless you mean just have R=G=B
and A=65535. Which, to me, seems to suck about as much as using ASCII
for a header that can be parsed with fscanf.

I think it'd be more elegant to _only_ have a "grayscale" format, and
store RGBA images as a quartet of these.




[dev] Re: [farbfeld] announce

2015-11-17 Thread Random832
FRIGN  writes:
> Hello fellow hackers,
>
> I'm very glad to announce farbfeld to the public, a lossless image
> format as a successor to "imagefile" with a better name and some
> format-changes reflecting experiences I made since imagefile has
> been released.

(snip description of format)

How is this better than PPM?




[dev] Re: a suckless hex editor

2015-11-13 Thread Random832
Greg Reagle  writes:
> I agree that it is a "poor man's" hex editor.  I am having fun with it, even 
> if
> it is a toy.  I don't have the desire to write a sophisticated hex editor
> (besides they already exist).
>
> I like that the small shell script can turn any editor into a hex editor.  
> BTW,
> if od is replaced with hexdump -C or xxd or GNU od -tx1z, then the ascii will
> be in the dump too.

It being in the dump isn't really "enough" - in a real hex editor, you
can make changes on the ASCII side and expect them to be reflected in
the hex side (and ultimately the binary file), whereas using xxd [etc]
means the ASCII side is static and is ignored when read back in.

This does have its place, though... It's basically an editor-portable
version of the recipe that vim provides for using xxd to "edit" binary
files. Which is itself a compelling enough use case for xxd to be
included with vim in the first place (as far as I know xxd has no other
vim-related purpose). But it's not a hex editor.




Re: [dev] st: selecting text affects both primary and clipbaord

2015-02-20 Thread random832
On Fri, Feb 20, 2015, at 12:38, sta...@cs.tu-berlin.de wrote:
> * k...@shike2.com 2015-02-20 17:39
> > I agree here, it shouldn't modify the CLIPBOARD selection. Sometimes
> > it is good to have different things in both selections. If nobody
> > complains about it I will apply your patch.
>
> I'd leave it as is, in order not to break scripts which expect to read
> something from CLIPBOARD.
>
> In other programs, you might have the choice to send something to
> selection or CLIPBOARD by different means. In st, however, you don't.
> Thus, the current behaviour seems to me more consistent and intuitive.

Another reason to leave it as-is: in other applications it is
reasonable to select text for some purpose other than copying it
(e.g. to delete or replace it), and people will not want their clipboard
obliterated in that case. In a terminal emulator, however, the only
thing you can do with selected text is copy it.

PuTTY on MS Windows puts selected text immediately in the clipboard and
apparently no-one has ever objected to this behavior - if anyone had I'm
sure they would have added it to the dozens of configurable options it
already has.



Re: [dev] surf questions

2015-01-23 Thread random832
On Thu, Jan 22, 2015, at 16:47, Raphaël Proust wrote:
> When you have a vertical line in a text, indicating where the
> character you type will appear: it's called a caret.

Or, more relevantly to a (mostly) read-only application like a web
browser, to enable you to precisely position it with the keyboard to
begin a selection for copying. The arrow keys move the caret rather than
scrolling the page.



Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way

2015-01-10 Thread random832
On Fri, Jan 9, 2015, at 18:39, FRIGN wrote:
> C3B6 is 'ö' and makes sense to allow specifying it as \50102 (in the pure
> UTF-8-sense of course, nothing to do with collating).

Why would someone want to use the decimal value of the UTF-8 bytes,
rather than the unicode codepoint?

Why are you using decimal for a syntax that _universally_ means octal?

UTF-8 is an encoding of Unicode. No-one actually thinks of the character
as being C3B6 - it's 00F6, even if it happens to be encoded as C3 B6
or F6 00 or whatever. Nobody thinks of UTF-8 sequences as a single
integer unit.

The sensible thing to do would be to extend the syntax with \u00F6 (and
\U0001 for non-BMP characters), the way many other languages have
done. This also avoids repeating the mistake of variable-length
escapes - \u is exactly 4 digits, and \U is exactly 8.
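
A rough sketch of what fixed-length \u and \U parsing could look like.
The function names are invented for this example, and this is not code
from sbase's tr:

```c
#include <stddef.h>

/* Hypothetical helper: parse "\uXXXX" (exactly 4 hex digits) or
 * "\UXXXXXXXX" (exactly 8) at s. Returns the number of bytes consumed
 * and stores the code point in *cp, or returns 0 if malformed. */
size_t
parse_u_escape(const char *s, unsigned long *cp)
{
	size_t i, ndigits;
	unsigned long v = 0;

	if (s[0] != '\\' || (s[1] != 'u' && s[1] != 'U'))
		return 0;
	ndigits = (s[1] == 'u') ? 4 : 8;
	for (i = 0; i < ndigits; i++) {
		char c = s[2 + i];
		unsigned d;

		if (c >= '0' && c <= '9')
			d = c - '0';
		else if (c >= 'a' && c <= 'f')
			d = c - 'a' + 10;
		else if (c >= 'A' && c <= 'F')
			d = c - 'A' + 10;
		else
			return 0;	/* also catches a premature NUL */
		v = v * 16 + d;
	}
	*cp = v;
	return 2 + ndigits;
}

/* Encode cp as UTF-8 into out (room for 4 bytes). Returns the byte
 * count, or 0 for surrogates and values above U+10FFFF. */
size_t
utf8_encode(unsigned long cp, unsigned char *out)
{
	if (cp < 0x80) {
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;
	if (cp < 0x10000) {
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp <= 0x10FFFF) {
		out[0] = (unsigned char)(0xF0 | (cp >> 18));
		out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char)(0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}
```

With this, \u00F6 parses to code point 0xF6 and encodes to the two
bytes C3 B6.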

> Well, probably I misunderstood the matter. Sometimes this stuff gets
> above my head. ;)
> At the end of the day, you want software to work as expected:
>
> GNU tr:
> $ echo ελληνική | tr [α-ω] [Α-Ω]
> ®
>
> our tr:
> $ echo ελληνικη | ./tr [α-ω] [Α-Ω]
> ΕΛΛΗΝΙΚΗ

And that's fine. I think POSIX actually _requires_ it to work the way
yours does, and GNU fails to comply. As a data point, OSX and FreeBSD
both work the same way as sbase for this test case.

GNU actually has a history of being behind the curve on UTF-8/multibyte
characters, so it's not a great example of what POSIX requires. Cut is
another notable command with the same problem.



Re: [dev] [sbase] [PATCH-UPDATE] Rewrite tr(1) in a sane way

2015-01-10 Thread random832
On Sat, Jan 10, 2015, at 16:47, Markus Wichmann wrote:
> You wanted to be Unicode compatible, right? Because in that case I
> expect [:alpha:] to be the class of all characters in General Category L
> (that is, Lu, Ll, Lt, Lm, or Lo). That includes a few more characters
> than just A-Z and a-z. And I don't see you add any other character to
> that class later.

Note that translating between [:upper:] and [:lower:] requires using the
toupper and tolower mapping, rather than just dumping the character
classes side by side. Otherwise you'll run into something like ß that
is in [:lower:] and has no counterpart in [:upper:], or into classes
that are in a different order.



Re: [dev] [sbase] [PATCH-UPDATE] Rewrite tr(1) in a sane way

2015-01-10 Thread random832


On Sat, Jan 10, 2015, at 19:11, Ian D. Scott wrote:
> On Sat, Jan 10, 2015 at 06:56:45PM -0500, random...@fastmail.us wrote:
> Actually, ẞ, capital of ß, was added in Unicode 5.1.  There are probably
> other letters with this issue, however.

My main point was that you've got to be careful that the order of the
classes matches the counterparts with each other, which there is not
otherwise a guarantee of. A naive interpretation's main problem is that
ß puts everything after it off by one.



Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way

2015-01-09 Thread random832
On Fri, Jan 9, 2015, at 18:08, FRIGN wrote:
 
> This is madness. If you want the bytes to be collated,

I don't see where you're getting that either of us want the bytes to be
collated. I don't even know what you mean by collated, since collating
is not what tr does, except when ordering ranges.

> you just write the
> literal \50102.

Even if octal values could be more than three digits, I have no idea
what you think 50102 is. Read as octal, its decimal value is 20546 and
its hex value is 0x5042, which has nothing to do with character U+00F6,
whose UTF-8 representation is 0xC3 0xB6. I just realized what you're
doing: 0xC3B6 has the _decimal_ value 50102. I have no idea why you
would think _that_ is a representation people would want to use. If
you're so pro-unicode, make it accept \u00F6 - that's a valid extension.
But reusing the syntax POSIX uses for three-digit octal literals, for
arbitrarily long decimal literals that aren't even unicode code points,
makes no sense at all. In what universe is that intuitive?
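
The arithmetic can be checked mechanically. A tiny sketch (the wrapper
name is made up) using strtol to read the same digits in different
bases:

```c
#include <stdlib.h>

/* Hypothetical wrapper: interpret a digit string in the given base. */
long
digits_in_base(const char *digits, int base)
{
	return strtol(digits, NULL, base);
}
```

digits_in_base("50102", 8) gives 20546 (0x5042), while
digits_in_base("C3B6", 16) gives 50102.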

> POSIX often is a solution to a problem that doesn't exist
> in the first place when you just use UTF-8.
>
> > They have nothing to do with UTF-8.
>
> That's exactly the point. Collating elements are depending on the current
> locale which is too much of a mess to deal with.

Huh?

> So when the Spanish ll collates before m and after l in a given
> locale, we don't give a fuck.
> So please give me the point why you are torturing me with this
> information.

Because collating elements are the thing POSIX forbids which you appear
to have _misinterpreted_ as forbidding multibyte characters. Otherwise I
have _no idea_ what in POSIX you interpret as preventing reasonable
behavior with UTF-8 multibyte characters.

> I stated that I did not implement collating elements into this tr(1) at
> the beginning and that it's a POSIX-nightmare to do so, bringing harm
> to anybody who is interested in a consistent, usable tool.

tl;dr:

Collating elements = POSIX forbids them = You don't want them anyway.
Multibyte characters = POSIX allows/requires them = You like them too.
What is the problem?
I don't know what you want to do that you think POSIX doesn't allow.



Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way

2015-01-09 Thread random832


On Fri, Jan 9, 2015, at 17:48, FRIGN wrote:
> Did you read what I said? I explicitly went away from POSIX in this
> regard, because no human would write tr '\303\266o' 'o\303\266'.

POSIX doesn't require people to write it, it just requires that it
works. POSIX has no problem with also allowing a literally typed
multibyte character to refer to itself. It's basically saying that if
someone _does_ write '\303\266o' 'o\303\266', you have to treat it the
same as öo oö, and not as the individual bytes.

> The reason why POSIX prohibits collating elements is only because they
> are inhibited by their own overload of different character sets and
> locales. Given assuming a UTF-8-locale is a very sane way to go (see
> Plan 9), this limit can easily be thrown off and makes life easier.

I don't think you're understanding the difference between
multi-character collating elements and multibyte characters.

Multi-character collating elements are things like ch in some Spanish
locales. They have nothing to do with UTF-8.



Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way

2015-01-09 Thread random832
On Fri, Jan 9, 2015, at 16:44, Nick wrote:
> Quoth FRIGN:
> >  - UTF-8: not allowed in POSIX, but in my opinion a must. This
> >    finally allows you to work with UTF-8 streams without
> >    problems or unexpected behaviour.
>
> I fully agree (unsurprisingly). Anything that relies on the POSIX
> behaviour to do weird things involving multibyte characters is
> insane.

Er... http://pubs.opengroup.org/onlinepubs/009696899/utilities/tr.html
has very little mention of the issue one way or another, but does use
the term characters rather than bytes in all relevant places, and
talks about multi-byte characters in a tone that suggests they should
be supported properly when LC_CTYPE has them.

The only _questionable_ bits are some of the language surrounding the
use of octal sequences:

    "For single characters: Multi-byte characters require multiple,
    concatenated escape sequences of this type, including the leading
    '\' for each byte."

I read this as meaning that multi-byte characters are supported, and in
fact that tr '\303\266o' 'o\303\266' means that \303\266 [two escape
sequences representing one multi-byte character] and o will be swapped -
and that it is not possible to specify multibyte characters with octal
values in a dash-separated range specification (but they can be
included as literals).
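
As a sketch of that reading (helper names invented here, and valid
UTF-8 assumed): expanding the octal escapes in '\303\266o' yields three
bytes, but only two characters.

```c
#include <stddef.h>

/* Hypothetical helper: expand \ooo octal escapes (1 to 3 digits) in src
 * into raw bytes in dst; all other bytes are copied unchanged.
 * dst needs room for strlen(src) bytes. Returns the bytes written. */
size_t
unescape_octal(const char *src, unsigned char *dst)
{
	size_t n = 0;

	while (*src) {
		if (src[0] == '\\' && src[1] >= '0' && src[1] <= '7') {
			unsigned v = 0;
			int k;

			src++;
			for (k = 0; k < 3 && *src >= '0' && *src <= '7'; k++)
				v = v * 8 + (unsigned)(*src++ - '0');
			dst[n++] = (unsigned char)v;
		} else {
			dst[n++] = (unsigned char)*src++;
		}
	}
	return n;
}

/* Count UTF-8 characters: every byte that is not a continuation byte
 * (10xxxxxx) starts a new character. Assumes valid UTF-8 input. */
size_t
utf8_count(const unsigned char *s, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if ((s[i] & 0xC0) != 0x80)
			count++;
	return count;
}
```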

Or is it possible that FRIGN misinterpreted the prohibition on
multi-character collating elements?



Re: [dev] problem report for sbase/cal

2014-12-15 Thread random832
On Mon, Dec 15, 2014, at 11:47, Greg Reagle wrote:
> January 2015 is supposed to start on a Thursday.

January 2014 started on a Wednesday - maybe it's worth investigating
whether cal -3 that spans two years isn't using the correct year for
some of the months.
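
A small sketch to check both weekdays and the year borrow that a
three-month view needs in January. The function names are made up, and
this is not the code in cal.c:

```c
/* Day of week for a Gregorian date, 0 = Sunday (Sakamoto's method). */
int
day_of_week(int y, int m, int d)
{
	static const int t[] = { 0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4 };

	if (m < 3)
		y -= 1;
	return (y + y / 4 - y / 100 + y / 400 + t[m - 1] + d) % 7;
}

/* Step (year, month) back one month, borrowing from the year in January. */
void
prev_month(int *y, int *m)
{
	if (*m == 1) {
		*m = 12;
		(*y)--;
	} else {
		(*m)--;
	}
}
```

day_of_week(2015, 1, 1) is 4 (Thursday) and day_of_week(2014, 1, 1) is
3 (Wednesday), which matches the symptom of a month being drawn with
the wrong year.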



Re: [dev] Object-Oriented C for interface safety?

2014-11-27 Thread random832
On Thu, Nov 27, 2014, at 07:27, koneu wrote:
> Greetings.
>
> The two things that really make OO languages worthwhile in my opinion
> are polymorphism and inheritance. Doing polymorphism and data/code
> hiding in C is easy enough with const function pointers. You can just
> define public interfaces in their own header like
>
> struct interface {
>   void * const this;
>   int (* const get_foo)(void *this);
>   void (* const set_foo)(void *this, int foo);
>   char * (* const get_bar)(void *this);
>   void (* const set_bar)(void *this, char *bar);
> };
>
> and implement them in classes like
>
> struct class {
>   int foo;
>   char *bar;
> };

In general when this is done in real life, you do it the other way
around, so you only need one copy of the interface structure per class.
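
A minimal sketch of that arrangement, with invented names: one table of
function pointers per class, and each object holding only a pointer to
it.

```c
struct object;

/* One table of function pointers per class. */
struct vtable {
	int (*get_foo)(const struct object *self);
	void (*set_foo)(struct object *self, int foo);
};

/* Every instance carries only a pointer to its class's table. */
struct object {
	const struct vtable *vt;
	int foo;
};

static int
obj_get_foo(const struct object *self)
{
	return self->foo;
}

static void
obj_set_foo(struct object *self, int foo)
{
	self->foo = foo;
}

/* The single shared table for this class. */
static const struct vtable object_vtable = { obj_get_foo, obj_set_foo };

void
object_init(struct object *o)
{
	o->vt = &object_vtable;
	o->foo = 0;
}
```

Calls then go through the table, e.g. o.vt->set_foo(&o, 42), and all
instances of the class share the same vt pointer.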



Re: [dev] [sbase] style

2014-11-20 Thread random832


On Wed, Nov 19, 2014, at 16:44, k...@shike2.com wrote:
> > C90, or any version of standard C, does not have a concept of system
> > headers, other than giving implementations permission to place their
> > own implementation-defined files in places searched by #include
> > <h-char-sequence>.
>
> At this point I was talking about POSIX of course.  C90 doesn't give
> implementations permission to place their own implementation-defined
> files.  If your program relies on that, and includes some of these
> implementation headers, then your program is not C90 compliant,
> and the behaviour is undefined (from C90 point of view, not from
> POSIX point of view).

Er, by permission I meant it doesn't make the _implementation_
non-compliant.

And implementation-defined is not the same as undefined.

>   - Each header declares and defines only those
>   identifiers listed in its associated section: If the header includes
>   another header then it will break this rule.

I think this is meant as a statement that strictly conforming programs
may not rely on them defining anything else. Most of these identifiers
are reserved, and a strictly conforming program therefore cannot do
anything with them without including the header they are documented as
being defined in.



Re: [dev] why avoid install?

2014-11-20 Thread random832
On Thu, Nov 20, 2014, at 14:40, Markus Wichmann wrote:
> Not always. One thing that reliably gets on people's nerves here is
> shared libraries. And those aren't protected with that ETXTBSY thing.
>
> The reason is that the MAP_DENYWRITE flag became the irrecoverable
> source of a DoS attack and had to be removed from the syscall. It can
> still be used in the kernel, which is why overwriting a running binary
> will fail, but it can't be used in userspace (or rather, is ignored),

Why not give ld-linux.so a capability that allows it? Wait, no, that
wouldn't solve it for dlopen().

Why not allow it for files that have execute permission? What are the
details of the DoS attack?



Re: [dev] why avoid install?

2014-11-19 Thread random832
On Wed, Nov 19, 2014, at 09:55, Dimitris Papastamos wrote:
> Regarding your question on cp -f then the answer is not quite.
>
> cp -f will try to unlink the destination if it fails to open it for
> whatever reason.

And if the target is running and writing to a running binary is a
problem, opening it will fail with [ETXTBSY], meaning it will be
unlinked. You can argue about whether that is the purpose or something
else (permission errors within a directory you own) is the purpose, but
it will certainly solve that problem.
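
Roughly, the fallback being discussed looks like the following sketch.
The function name is invented and this is not the sbase
implementation:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical sketch of "cp -f" destination handling: try to open the
 * existing file for writing; if that fails for any reason (ETXTBSY,
 * EACCES, ...), unlink it and create a fresh file instead.
 * Returns a file descriptor, or -1 on failure. */
int
open_dest_force(const char *path, mode_t mode)
{
	int fd = open(path, O_WRONLY | O_TRUNC);

	if (fd != -1)
		return fd;
	/* a missing destination is fine, it just gets created below */
	if (unlink(path) == -1 && errno != ENOENT)
		return -1;
	return open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
}
```

Because the old name is unlinked rather than overwritten, a process
still executing the old binary keeps its inode untouched.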



Re: [dev] [sbase] style

2014-11-19 Thread random832
On Wed, Nov 19, 2014, at 13:51, k...@shike2.com wrote:
 
> > system headers should come first, then a newline, then libc headers
> > then a newline then local headers.
>
> I usually do just the inverse, first libc headers and later system
> headers.
>
> > the libc headers are guaranteed to work regardless of the order of
> > inclusion but need to come after the system headers.  From what I
>
> Are you sure about that?  I know that C90 guarantees that any
> standard header will not include any other standard header (although
> it is sad that a lot of compilers ignore this rule), but I have never
> read anything about dependencies between standard and system headers.

C90, or any version of standard C, does not have a concept of system
headers, other than giving implementations permission to place their
own implementation-defined files in places searched by #include
<h-char-sequence>.

POSIX does not, as far as I can tell, allow systems to require headers
to be included in any certain order.

I have no idea what the categories system headers and libc headers
refer to in the post you are replying to, what operating system he is
using (certainly not POSIX - I think when I saw the post I got the vague
impression he was talking about Plan 9), or which category the standard
C headers or POSIX headers might fall into. There are such
order-dependencies on some non-POSIX unix systems (I once had to move
sys/types.h above socket.h to get a program to compile on 2.11BSD), and
it may or may not make sense to order headers in line with those as a
matter of tradition.

In general, both standards require all headers to declare, for example,
any typedefs that are present in the signature, without implying the
inclusion of any other header that also defines the same types, and
leaving it up to the implementation to determine how to accomplish this.
For example, unistd.h cannot require that sys/types.h be included first
just because it uses off_t which is also found in sys/types.h; the
author of the header files has to figure out how to make them both
define off_t without any conflict if both are included.

I couldn't find the guarantee you mentioned, that one header shall not
include another header, and I can't think of how doing so would affect
the behavior of any strictly conforming program.



Re: [dev] Patches to fix . for insert and change commands

2014-11-18 Thread random832
On Tue, Nov 18, 2014, at 17:59, Stephen Paul Weber wrote:
> I've written up patches to make it so that I, a, A, s, ce, etc can be
> repeated properly with .  -- not sure if I'm doing this the Right Way,
> but it seems to work in my tests.  Feedback appreciated.  Patches
> attached.

Haven't looked at your patch, but vim stores the inserted keystrokes
(not text - it'll happily let you repeat an inserted sequence of
backspaces that deleted over the beginning of the insertion region,
arrows that moved the cursor, etc) in a read-only register named with
the period character. Pasting it with ^R. or ^A in insert-mode plays
back the keystrokes and adds them to the text which will be in the
register the next time you leave insert mode. I don't know offhand if
this register is used for the . command or not.



Re: [dev] fsbm

2014-11-08 Thread random832
On Fri, Nov 7, 2014, at 02:03, k...@shike2.com wrote:
> I disagree, check the size before calling strcpy. If you want to
> avoid security risk you also have to check the output of strlcpy
> to detect truncations, so you don't win anything. In both cases
> you have to add a comparison, so it is better to use strcpy that
> is standard.

There are numerous scenarios where an overflow has security implications
but a truncation does not. For example, if an attacker can supply any
string, they could supply the shorter one to begin with, and therefore
don't benefit from truncation.
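
To illustrate the distinction with standard C only, here is a sketch
(helper name invented) of a copy that cannot overflow and merely
reports truncation. A caller for whom truncation is harmless can ignore
the result:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst (capacity cap > 0), always
 * NUL-terminating. Returns 1 if the whole string fit, 0 if it was
 * truncated. It never writes past dst[cap - 1]. */
int
copy_bounded(char *dst, size_t cap, const char *src)
{
	int n = snprintf(dst, cap, "%s", src);

	return n >= 0 && (size_t)n < cap;
}
```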



Re: [dev] fsbm

2014-11-08 Thread random832
On Fri, Nov 7, 2014, at 05:11, Dimitris Papastamos wrote:
> It is generally unlikely that the string has been validated to
> be an integer before getting to atoi().  With atoi() you cannot
> distinguish between an invalid integer and 0.
>
> Generally speaking, it should never be used.

What if you don't care?



Re: [dev] c++-style comments [was fsbm]

2014-11-06 Thread random832
On Thu, Nov 6, 2014, at 12:34, Louis Santillan wrote:
> In a color syntax highlighting editor, doSomething(); takes on normal
> highlighting when enabled, and takes on comment colored highlighting
> when disabled.  Visually, that's slightly improved over something like
>
> #ifdef DEBUG
> doSomething();
> #endif

In the editor *I* use, it has comment colored highlighting for #if 0,
and for the #else of #if 1, and the same for anything with #if 0 && and
#if 1 ||.



Re: [dev] c++-style comments [was fsbm]

2014-11-06 Thread random832
On Thu, Nov 6, 2014, at 16:47, Sylvain BERTRAND wrote:
> Linus T. does let closed source modules live (even so the GNU GPLv2 gives
> legal power to open the code, or block binary blob distribution, like
> what happens with mpeg video or 3D texture compression),

There's a significant amount of debate over what constitutes an 'arms
length' interaction between two pieces of code and what makes them
effectively a single piece of code. GNU takes the position that sharing
the same address space in any way is the latter, and that normal
interaction through files/pipes/sockets is the former (because it would
be politically inconvenient for them to push too far) so long as it's
not a specially defined protocol that only exists for that single pair
of programs. The kernel people as far as I know take the position that
sharing the same address space is okay so long as they only use certain
approved APIs intended for use by modules - and that userspace-kernel
interaction via normal system calls is always okay. None of this has
been examined by a court.



Re: [dev] [sbase] [PATCH 1/2] Fix symbolic mode parsing in parsemode

2014-11-03 Thread random832
On Sun, Nov 2, 2014, at 17:24, Michael Forney wrote:
> I found quite a lot of bugs, so I ended up pretty much rewriting as I
> followed the spec¹.

How about +X? I noticed there were no test cases for that.

+X acts like +x provided either the file is a directory or the file
already has at least one execute bit set. The function doesn't seem to
be able to know if the file is a directory. Same for =X, and -X is
identical to -x.
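
A sketch of the +X rule as described, assuming the caller can pass the
full st_mode (including the file type bits). The function name and
signature are invented for illustration and differ from parsemode:

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: apply "+X" for the classes selected by who (a
 * mask of S_IXUSR | S_IXGRP | S_IXOTH). Execute bits are only added if
 * the file is a directory or already has at least one execute bit. */
mode_t
apply_plus_X(mode_t st_mode, mode_t who)
{
	const mode_t xbits = S_IXUSR | S_IXGRP | S_IXOTH;

	if (S_ISDIR(st_mode) || (st_mode & xbits))
		st_mode |= who & xbits;
	return st_mode;
}
```

So a regular file with mode 0644 is left alone, one with 0744 becomes
0755, and a directory with 0644 becomes 0755.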



Re: [dev] [sbase] [PATCH 1/4] tar: Don't crash when get{pw,gr}uid fails

2014-11-01 Thread random832
On Sat, Nov 1, 2014, at 16:57, Dimitris Papastamos wrote:
> On Sat, Nov 01, 2014 at 08:36:37PM +, Michael Forney wrote:
> > -   snprintf(h->uname, sizeof h->uname, "%s", pw->pw_name);
> > -   snprintf(h->gname, sizeof h->gname, "%s", gr->gr_name);
> > +   snprintf(h->uname, sizeof h->uname, "%s", pw ? pw->pw_name : "");
> > +   snprintf(h->gname, sizeof h->gname, "%s", gr ? gr->gr_name : "");
>
> The patches look good, thanks!
>
> Just a small clarification on this one, do other tar implementations
> do the same here?

Yes. I looked at heirloom (both tar and cpio), 4.4BSD pax, GNU tar, and
star. Heirloom prints an error, none of the rest do, and all seem to put
in an empty string (or do nothing and the field is initialized earlier
with null bytes).
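
The behaviour in question, as a standalone sketch (the helper name is
an assumption): a failed lookup, i.e. a NULL passwd pointer, yields an
empty name field instead of a crash.

```c
#include <pwd.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: fill a fixed-size header field with the user
 * name, or with an empty string when the lookup failed (pw == NULL). */
void
fill_uname(char *field, size_t size, const struct passwd *pw)
{
	snprintf(field, size, "%s", pw ? pw->pw_name : "");
}
```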



Re: [dev] [sbase] [PATCH 1/4] tar: Don't crash when get{pw,gr}uid fails

2014-11-01 Thread random832
On Sat, Nov 1, 2014, at 18:01, Michael Forney wrote:
> It looks like GNU tar does¹, but BSD tar uses the string
> representation of the UID/GID.
>
> ¹ http://git.savannah.gnu.org/cgit/tar.git/tree/src/names.c#n66

I didn't think to look at a modern BSD (the relevant function is
name_uid in pax/cache.c). Either way, any tar should be able to cope
with either output, assuming no system has the pathological case of a
user account with a numeric name different from its uid, but a blank
string seems to be more POSIX-correct.

There's another tar (possibly actually called bsdtar) in
contrib/libarchive that I couldn't make heads or tails of (it uses some
kind of modular design and I couldn't find the real implementation of
everything).



Re: [dev] [PATCH] [st] Use inverted defaultbg/fg for selection when bg/fg are the same

2014-10-27 Thread random832
On Mon, Oct 27, 2014, at 08:20, FRIGN wrote:
> There's simply no reason to break consistency for some quirky irc-gag.

But there's no compelling reason in the first place to visualize
selection by inverting the colors. If you want consistency it can be
achieved by having an actual selection color pair, or by _always_ using
the default colors, but that's bikeshed painting. The reason why someone
might have same-fg-as-bg on their screen is beside the point - having no
way to make that text visible is a usability issue.



Re: [dev] [PATCH] [st] Use inverted defaultbg/fg for selection when bg/fg are the same

2014-10-27 Thread random832
On Mon, Oct 27, 2014, at 10:54, Martti Kühne wrote:
> This may sound trivial, but.
> How about you paste it somewhere else?

Requires having another window already open that can accept arbitrary
text (and not attempt to execute it as commands).



Re: [dev] SGI Irix look (4Dwm)

2014-10-22 Thread random832
On Wed, Oct 22, 2014, at 14:01, Peter Hofmann wrote:
> I'm pretty sure that most people on the list will agree on this being
> just plain crazy. :-) It's a hack, it's ugly and it's anything but
> suckless.
>
> I won't go into further detail. This causes many, many problems. The
> only reason why dwm-vain still has these kind of borders is that I don't
> find the time to either turn dwm into a reparenting WM or write a
> reparenting WM from scratch.

Why not just draw the title in a separate window? This is how I always
assumed a suckless window manager with title bars would act.



Re: [dev] [st][PATCH] Add support for utmp in st

2014-10-13 Thread random832
On Sun, Oct 12, 2014, at 14:32, k...@shike2.com wrote:
> If the user doesn't like the key assignment in st he is free to change
> it in his config.h (maybe we could add it to the FAQ).

That doesn't mean that the question of what the default should be is not
worth discussing.

> > You didn't comment on the prior/next/find/select issue, either.
>
> I don't understand what you mean here.

The fact that you claim that matching the key codes to the labels on the
keys are so important, yet you send the key codes associated with the
VT220 Prior/Next/Find/Select keys when the user presses
PageUp/PageDown/Home/End.

> Quoting from this page:
>
>   It is unwise to conflict with certain variables that are
>   frequently exported by widely used command interpreters and
>   applications:

That is not from the section where COLUMNS and LINES are defined (scroll
further down the page). The table which the sentence you pasted is
attached to also includes other variables that are _definitely_ defined
by the standard, like TZ and HOME. LINES and COLUMNS are, for that
matter, defined in the same section that TERM is defined in. I think we
have been talking at cross purposes, though... I was not saying,
precisely, that it was the terminal emulator's responsibility to _set_
them, merely to ensure they are _not_ set to values inherited from a
different terminal, which you appeared to be rejecting.
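
A minimal sketch of why stale values matter, assuming a program that treats COLUMNS and LINES as overrides and otherwise asks the tty driver. The helper names parse_dim and term_size and the 80x24 fallback are hypothetical, chosen only for illustration; TIOCGWINSZ is the usual ioctl on Linux and the BSDs but is not itself specified by POSIX.

```c
#include <stdlib.h>
#include <sys/ioctl.h>
#include <termios.h>
#include <unistd.h>

/* Parse a positive decimal dimension; return 0 if unset or invalid. */
static int
parse_dim(const char *s)
{
    char *end;
    long v;

    if (s == NULL || *s == '\0')
        return 0;
    v = strtol(s, &end, 10);
    if (*end != '\0' || v <= 0 || v > 10000)
        return 0;
    return (int)v;
}

/* Fill in *cols and *rows for the terminal on fd.
 * COLUMNS/LINES win when set; otherwise ask the tty driver. */
static void
term_size(int fd, int *cols, int *rows)
{
    struct winsize ws;

    *cols = parse_dim(getenv("COLUMNS"));
    *rows = parse_dim(getenv("LINES"));
    if ((*cols == 0 || *rows == 0) && ioctl(fd, TIOCGWINSZ, &ws) == 0) {
        if (*cols == 0)
            *cols = ws.ws_col;
        if (*rows == 0)
            *rows = ws.ws_row;
    }
    if (*cols == 0)
        *cols = 80; /* last-resort default (an assumption) */
    if (*rows == 0)
        *rows = 24;
}
```

With this lookup order, a COLUMNS value inherited from a different terminal silently beats the real window size, which is why unsetting the variables in the child is the safer default.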



Re: [dev] [st][PATCH] Add support for utmp in st

2014-10-13 Thread random832


On Mon, Oct 13, 2014, at 14:38, k...@shike2.com wrote:
  On Sun, Oct 12, 2014, at 14:32, k...@shike2.com wrote:
  That doesn't mean that the question of what the default should be is not
  worth discussing.
 
 Default configuration was discussed here some time ago, and suckless
 developers agreed with current configuration.  Both options, Backspace
 generates BACKSPACE and Backspace generates DELETE have advantages and
 problems, and usually emulators have some way of changing between them.
 Xterm uses 3 resources for it: backarrowKeyIsErase, backarrowKey and
 ptyInitialErase, and has an option in the mouse menu: Backarrow.
 Putty has an option to select BACKSPACE or DELETE.  But for example,
 vt(1) in Plan 9 always generates BACKSPACE, and it is not configurable.
 
 If the user wants another configuration the suckless way is config.h.
 
 But why do you think DELETE is better than BACKSPACE?
 
  The fact that you claim that matching the key codes to the labels on the
  keys is so important, yet you send the key codes associated with the
  VT220 Prior/Next/Find/Select keys when the user presses
  PageUp/PageDown/Home/End.
 
 What ASCII codes are they supposed to send? (Home sends Home, not
 Find).

What is Home?


  That is not from the section where COLUMNS and LINES are defined (scroll
  further down the page). The table which the sentence you pasted is
  attached to also includes other variables that are _definitely_ defined
  by the standard, like TZ and HOME. LINES and COLUMNS are, for that
  matter, defined in the same section that TERM is defined in. 
 
 I just have seen them. You are right, they are defined by the standard,
 but the standard doesn't define how they must be updated (this was
 the part I knew, shells are not forced to set them).
 
  I think we
  have been talking at cross purposes, though... I was not saying,
  precisely, that it was the terminal emulator's responsibility to _set_
  them, merely to ensure they are _not_ set to values inherited from a
  different terminal, which you appeared to be rejecting.
 
 Ok, I understand you now.  Yes, I agree with you on this point, and st
 already unsets LINES, COLUMNS and TERMCAP.  If the user needs them he
 must set them in some way (maybe in his profile if he runs a login
 shell), or use some program that sets them.  As we have already said,
 it is impossible for the terminal to set them each time the size is
 changed.
 
 We could say something similar for TERM, but it is impossible for the
 system to put this variable in a terminal emulator (the system takes it
 from /etc/ttys or /etc/inittab on real terminals), so I think the
 terminal must set it.
 
 For the original problem, the incorrect setting of SHELL to utmp,
 this is the patch:
 
 diff --git a/st.c b/st.c
 index c61b90a..bcf96e9 100644
 --- a/st.c
 +++ b/st.c
 @@ -1146,7 +1146,7 @@ die(const char *errstr, ...) {
  
  void
  execsh(void) {
 -    char **args, *sh;
 +    char **args, *sh, *prog;
      const struct passwd *pw;
      char buf[sizeof(long) * 8 + 1];
  
 @@ -1158,13 +1158,15 @@ execsh(void) {
          die("who are you?\n");
      }
  
 -    if (utmp)
 -        sh = utmp;
 -    else if (pw->pw_shell[0])
 -        sh = pw->pw_shell;
 +    sh = (pw->pw_shell[0]) ? pw->pw_shell : shell;
 +    if(opt_cmd)
 +        prog = opt_cmd[0];
 +    else if(utmp)
 +        prog = utmp;
      else
 -        sh = shell;
 -    args = (opt_cmd) ? opt_cmd : (char *[]){sh, NULL};
 +        prog = sh;
 +    args = (opt_cmd) ? opt_cmd : (char *[]) {prog, NULL};
 +
      snprintf(buf, sizeof(buf), "%lu", xw.win);
  
      unsetenv("COLUMNS");
 @@ -1172,7 +1174,7 @@ execsh(void) {
      unsetenv("TERMCAP");
      setenv("LOGNAME", pw->pw_name, 1);
      setenv("USER", pw->pw_name, 1);
 -    setenv("SHELL", args[0], 1);
 +    setenv("SHELL", sh, 1);
      setenv("HOME", pw->pw_dir, 1);
      setenv("TERM", termname, 1);
      setenv("WINDOWID", buf, 1);
 @@ -1184,7 +1186,7 @@ execsh(void) {
      signal(SIGTERM, SIG_DFL);
      signal(SIGALRM, SIG_DFL);
  
 -    execvp(args[0], args);
 +    execvp(prog, args);
      exit(EXIT_FAILURE);
  }
  
 Guys, what do you think about it?
 
 Regards,
 
 


-- 
Random832



Re: [dev] [st][PATCH] Add support for utmp in st (DISREGARD LAST)

2014-10-13 Thread random832
Sorry, I accidentally hit shift-enter, and apparently that makes my email
client send.

On Mon, Oct 13, 2014, at 14:38, k...@shike2.com wrote:
 But why do you think DELETE is better than BACKSPACE?

Because that is the character sent by the key in this position with this
expected function (i.e. the <x] key above enter) on DEC terminals. It's
also, for better or worse, what modern Linux systems have standardized
on (I haven't done a survey of BSD, Solaris, etc).

 What ASCII codes are they supposed to send? (Home sends Home, not
 Find).

What is Home?

{ XK_Home,  ShiftMask,    "\033[2J",     0,  -1,  0},
{ XK_Home,  ShiftMask,    "\033[1;2H",   0,  +1,  0},
{ XK_Home,  XK_ANY_MOD,   "\033[H",      0,  -1,  0},
{ XK_Home,  XK_ANY_MOD,   "\033[1~",     0,  +1,  0},
{ XK_End,   ControlMask,  "\033[J",     -1,   0,  0},
{ XK_End,   ControlMask,  "\033[1;5F",  +1,   0,  0},
{ XK_End,   ShiftMask,    "\033[K",     -1,   0,  0},
{ XK_End,   ShiftMask,    "\033[1;2F",  +1,   0,  0},
{ XK_End,   XK_ANY_MOD,   "\033[4~",     0,   0,  0},

Find is ESC[1~, Select is ESC[4~. No other codes here are from DEC
terminals.
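
For reference, a small lookup sketch of the six VT220 editing-keypad sequences, using the legends as printed on the DEC keyboard. The struct and function names here are hypothetical, and the table is written from memory of the DEC documentation rather than taken from st.

```c
#include <stddef.h>
#include <string.h>

struct vtkey {
    const char *seq;  /* sequence sent by the key */
    const char *name; /* legend on the VT220 editing keypad */
};

static const struct vtkey vt220_edit[] = {
    { "\033[1~", "Find" },
    { "\033[2~", "Insert Here" },
    { "\033[3~", "Remove" },
    { "\033[4~", "Select" },
    { "\033[5~", "Prev Screen" },
    { "\033[6~", "Next Screen" },
};

/* Return the VT220 legend for an input sequence, or NULL if the
 * sequence is not one of the editing-keypad codes. */
static const char *
vt220_name(const char *seq)
{
    size_t i;

    for (i = 0; i < sizeof(vt220_edit) / sizeof(vt220_edit[0]); i++)
        if (strcmp(seq, vt220_edit[i].seq) == 0)
            return vt220_edit[i].name;
    return NULL;
}
```

Feeding it the st entries above shows the mismatch: "\033[1~" comes back as Find and "\033[4~" as Select, while "\033[H" has no editing-keypad meaning at all.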



Re: [dev] [st][PATCH] Add support for utmp in st

2014-10-12 Thread random832
On Sun, Oct 12, 2014, at 03:48, k...@shike2.com wrote:
 And the profile runs in the same tty that st opens.  St by default
 executes a non login shell, so profile is not loaded, but utmp executes
 a login shell (because it creates the utmp session, so it is more
 logical for it to execute a login shell). 

Why shouldn't a non-login shell have a utmp session? And if this option
is to use a login shell, rather than merely using utmp, then I don't
think it should be a compile-time option - just because someone
sometimes wants a login shell (which could be done before, if desired,
by running e.g. sh -l) doesn't mean they always want one.
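
To make the login-shell distinction concrete: the traditional convention is that a shell treats itself as a login shell when its argv[0] begins with '-'. The sketch below only builds that argv[0]; login_argv0 is a hypothetical helper, not anything in st or utmp.

```c
#include <stdio.h>
#include <string.h>

/* Build the argv[0] a login shell conventionally receives:
 * '-' followed by the basename of the shell path.
 * Returns 0 on success, -1 if buf is too small. */
static int
login_argv0(const char *path, char *buf, size_t len)
{
    const char *base = strrchr(path, '/');
    int n;

    base = base ? base + 1 : path;
    n = snprintf(buf, len, "-%s", base);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would then do something like execv(path, (char *[]){buf, NULL}). Whether to do this is a per-invocation decision, which is the argument against fixing it at compile time.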

 I work with systems where BACKSPACE deletes the previous
 character, and it is really painful that you cannot generate a BACKSPACE
 character with some terminal emulators. The position of the st developers
 is very clear on this topic: st must generate the correct ASCII
 value for each key.

What character does DEL generate? I'm assuming it generates either
DELETE or Remove Here, and either way it's going to be equally painful
that you can't generate one of the sequences that a VT220 does.
Meanwhile, the VT220 has no key that generates Backspace.

You didn't comment on the prior/next/find/select issue, either.

  Their meaning is defined in the standard. The method of obtaining
  default values is not, but that means it's the implementation's
  responsibility, not that it doesn't mean anything at all.
 
 Can you point to the part of POSIX where they are defined? I'm
 sorry, but they are not standard (although they are common), and
 there are even some shells (dash for example) that don't set them.

http://pubs.opengroup.org/onlinepubs/007908799/xbd/envvar.html


  But unsetting it, along with the initial call to cresize, should be fine
  on most systems, so maybe I've been too harsh about this.
 
 I'm sorry, but this is a work of the shell, because it is not possible
 for a terminal (a real one, not emulated) to set variables.

A real terminal has a fixed size, which is known in termcap/terminfo. If
a terminal supports multiple sizes, you would historically have had to
alter the variables manually.



Re: [dev] [st][PATCH] Add support for utmp in st

2014-10-11 Thread random832
On Sat, Oct 11, 2014, at 04:07, k...@shike2.com wrote:
 Value of erase key for example, or in general the configuration
 of line kernel driver

These can't come from the profile either; since st opens a new tty that
is not the same device the user logged in on.

 (stty(1)). Backspace key in st generates
 BACKSPACE, but almost all terminals generate DELETE instead
 (read FAQ for more details).

It's not clear why the position of the key and the intent of the user
typing it is less important than the label of the key.

What's st's position on prior/next/find/select vs pgup/pgdn/home/end?
And speaking of the editing keypad keys, the code that most terminals
send for the del key is that for the VT220 remove here key. The delete
key above enter on the VT220 is labeled <x], so it clearly has a meaning
of delete left despite sending ^?, and thus establishes an association
between ^? and delete left long before Linux. The fact that the PC
keyboard key in this position is labeled Backspace is the historical
accident.

 I also have some additional
 configuration, for example 'tput smkx' (set keypad on)
 or `tabs`. About the three variables you mention, TERM is the only
 one that a terminal must set; LINES and COLUMNS are shell stuff
 and are not even standard.

Their meaning is defined in the standard. The method of obtaining
default values is not, but that means it's the implementation's
responsibility, not that it doesn't mean anything at all.

But unsetting it, along with the initial call to cresize, should be fine
on most systems, so maybe I've been too harsh about this.



Re: [dev] [st][PATCH] Add support for utmp in st

2014-10-10 Thread random832
On Tue, Sep 23, 2014, at 01:18, Roberto E. Vargas Caballero wrote:
 St runs an interactive shell and not a login shell, which means
 that the profile is not loaded. The default terminal configuration
 on some systems is not correct for st, but since the profile is
 not loaded there is no way to have a script configure the
 correct values.

What exactly does "terminal configuration" mean here? TERM, LINES, and
COLUMNS? Shouldn't st itself be responsible for setting these? They
certainly don't belong in the profile.

What is utmp doing, exactly, and why does st want to run the user's
default shell instead of the SHELL that's passed in to st's environment
by its parent? Is it appropriate to be setting SHELL to utmp? Why set
SHELL at all? What program does utmp execute, and is it intentional that
utmp is not executed if the user specifies a command?



Re: [dev] Ideas for using sic

2014-10-01 Thread random832
On Wed, Oct 1, 2014, at 12:57, q...@c9x.me wrote:
 On Mon, Sep 29, 2014 at 09:55:07PM -0700, Eric Pruitt wrote:
  rlwrap ./sic -h $IRC_HOST | tee -a irc-logs | grcat sic.grcat
 
 Hi,
 
 how does rlwrap deal with random text that gets inserted by sic
 when some data arrives on the channel?  This was my main problem
 with sic, to prevent that and enable multichannel I have written
 http://c9x.me/irc/.

It occurs to me that a line input program (that would work along the
lines of a mud client, with a separate editable input line from where
output goes, and maybe managing scrollback) would be a good candidate
for a "do one thing" utility.



Re: [dev] [RFC] Design of a vim like text editor

2014-09-25 Thread random832
On Thu, Sep 25, 2014, at 08:57, Raphaël Proust wrote:
 I actually have  my vimrc setting K as an upward J (i.e., join current
 line with the previous one) (although I haven't made the effort to
 make it work in visual mode because then I just use J):
 nnoremap K :.-,.join<CR>

Why not just map it to kJ?



Re: [dev] [RFC] Design of a vim like text editor

2014-09-24 Thread random832
On Wed, Sep 24, 2014, at 15:21, Marc André Tanner wrote:
  x should not delete the end of line character (but this might be solved 
  with the placement issue above)
 
 I (and a few others? Christian Neukirchen?) actually like the fact that 
 the newline is treated like a normal character.

You might consider an option like whichwrap [which can make vim delete
newline with x - well, not x, but 2x.] to enable and disable this
behavior.



Re: [dev] [RFC] Design of a vim like text editor

2014-09-24 Thread random832
On Wed, Sep 24, 2014, at 15:36, Marc André Tanner wrote:
  - 'J' in visual mode is not implemented
 
 Why would one use it?

To be able to select lines to be joined interactively instead of having
to count the lines by hand (since there's no J<movement>, only
<count>J). I do this all the time.



Re: [dev] [st] Understading st behaviour

2014-04-16 Thread random832
On Wed, Apr 16, 2014, at 4:19, Amadeus Folego wrote:
 It works! As I am using tmux just for the scrollback and paste
 capabilities I am not worried with losing sessions.
 
 Maybe I'll write a suckless multiplexer for this sometime.

Eh - multiplexing refers to the multiple session capability, not to
the scrolling.

The basic issue is that tmux provides three relatively distinct
features: scrolling, multiplexing, and detachability. A program
providing any one of these capabilities essentially has to be a terminal
emulator - you can take some shortcuts, like passing through the
keyboard, and passing through output rather than reinterpreting it, but
you've got to parse all output control sequences to know what's on the
screen. For scrolling, you need it in order to understand what has
scrolled off the screen and in order to restore the main screen when
you're done with scrolling. For multiplexing, you need it in order to
effectively switch between windows. For detaching, you need it to
restore the content when reattaching.

I've actually used a detaching program that doesn't track screen
contents (it discards all output while detached, and sends SIGWINCH or
control-L on reattach to make the program redraw itself) - it's not
pleasant to deal with for non-fullscreen programs. You could do
multiplexing the same way, in principle, but it's intractable for
scrolling.

A truly suckless design would have the three features in separate
programs. And since they all have to do essentially the same thing
(maintain their own idea of the screen state and redraw it on demand),
this functionality could be in a library. Or you could just have it in
the scrolling program and the other two programs don't care, which would
make it a somewhat unpleasant experience to try to use them without
being in conjunction with the scrolling program.

That's also three separate programs you have to control from the
keyboard.
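
As a rough illustration of the shared "idea of the screen state" that all three programs would need, here is a deliberately tiny model. It understands only printable ASCII, CR and LF, whereas anything real has to parse every output control sequence. All names and the fixed 4x8 grid are assumptions made for the sketch.

```c
#include <string.h>

enum { ROWS = 4, COLS = 8 }; /* tiny fixed size, for illustration only */

struct screen {
    char cell[ROWS][COLS];
    int x, y; /* cursor position */
};

static void
screen_init(struct screen *s)
{
    memset(s->cell, ' ', sizeof(s->cell));
    s->x = s->y = 0;
}

/* Move down one row, scrolling the grid when already on the last row.
 * The row that falls off the top is what a scrollback program would
 * have to capture. */
static void
screen_newline(struct screen *s)
{
    if (s->y < ROWS - 1) {
        s->y++;
        return;
    }
    memmove(&s->cell[0], &s->cell[1], sizeof(s->cell) - sizeof(s->cell[0]));
    memset(s->cell[ROWS - 1], ' ', COLS);
}

/* Feed program output into the model. */
static void
screen_feed(struct screen *s, const char *buf, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned char c = (unsigned char)buf[i];

        if (c == '\r') {
            s->x = 0;
        } else if (c == '\n') {
            screen_newline(s);
        } else if (c >= 0x20 && c < 0x7f) {
            if (s->x == COLS) { /* deferred wrap at the right margin */
                s->x = 0;
                screen_newline(s);
            }
            s->cell[s->y][s->x++] = (char)c;
        }
    }
}
```

Redrawing on reattach or on window switch is then just a matter of writing the grid back out and repositioning the cursor, which is why the same code could sit in a library behind all three tools.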



Re: [dev] What is bad with Python

2014-03-13 Thread random832
On Wed, Mar 12, 2014, at 15:04, FRIGN wrote:
 Impressive, but better use
  $ LD_TRACE_LOADED_OBJECTS=1 t
 instead of
  $ ldd t
 next time to prevent arbitrary code-execution[1] in case you're dealing
 with unknown binaries.

I don't know if it was here and you or somewhere else or someone else,
but someone said this before and I pointed out the problems with this
argument. It's even worse in this case because you propose using
LD_TRACE_LOADED_OBJECTS=1 t [which won't actually work, incidentally,
without . in PATH] instead of LD_TRACE_LOADED_OBJECTS=1
/lib/ld-linux.so.2 ./t - your proposed command doesn't actually prevent
the exploit (it actually makes it easier, by making it possible to
exploit with a mere statically-linked program rather than a fancy ELF
interpreter trick)

Also, wanting to do this with an unknown, untrusted executable is, in
practice, _incredibly rare_. And since this is an executable he just
built himself, it obviously doesn't apply here. The 'safe' command
[which, remember, you got wrong] is onerously long for a suggestion that
people should use every time. Maybe the best way forward is to make ldd
default to the safe way and require user confirmation (with a warning)
before the unsafe one.



Re: [dev] [sbase] move mknod(1) to ubase

2014-01-28 Thread random832
On Sat, Jan 25, 2014, at 17:46, Roberto E. Vargas Caballero wrote:
 Uhmm, it looks bad. If we want to be 100% POSIX compliant then we have to
 move mknod to ubase, and change the mknod system call of tar (and the next
 archivers that could be implemented in sbase) to a system("mknod ...").

The mknod utility isn't in POSIX either. POSIX permits tar
implementations to ignore block and character device entries:
http://pubs.opengroup.org/onlinepubs/7908799/xcu/pax.html



Re: [dev] portable photoshop-like lite application based on C?

2013-12-03 Thread random832
On Tue, Dec 3, 2013, at 9:50, Markus Teich wrote:
 Mihail Zenkov wrote:
  ldd /usr/bin/gimp-2.8
 
 Heyho,
 
 http://www.catonmat.net/blog/ldd-arbitrary-code-execution/

Considering that he probably _actually_ executes the very same gimp-2.8
binary all the time, your concern is misplaced. This attack is highly
situational, requiring the attacker to cause someone to encounter a
binary that they would not otherwise execute and to be curious about
what libraries it uses.

"Don't run ldd on an unknown binary you wouldn't execute" becomes "don't
run ldd ever on anything" - the cargo cult at its finest. I propose not
allowing untrusted binaries to be placed in /usr/bin in the first place.



Re: [dev] suckless shell prompt?

2013-11-26 Thread random832
On Mon, Nov 25, 2013, at 5:26, Martti Kühne wrote:
 Announcing a shell prompt and including git.h indeed makes no sense
 whatsoever. What part of git is useful when writing a shell
 interpreter? I'm sorry, I can't possibly imagine how this isn't
 apparent to you.

Do you understand the difference between a prompt and an interpreter?

This is a program that is meant to be called to print stuff before each
command you type (Several shells include the ability to call such a
program). Not a shell interpreter. This is _so_ blindingly obvious
that your failure to recognize it calls your ability to have basic
reading comprehension into question.



Re: [dev] suckless shell prompt?

2013-11-26 Thread random832
On Tue, Nov 26, 2013, at 12:09, Bryan Bennett wrote:
 And sending that email calls into question your ability to either read
 a full thread or to recognize human names.

In my defense, you'd already had it pointed out to you once and
continued in your misconception without even understanding the
correction. That you would then miraculously realize your mistake and
send a later email (which you have no reason to assume that I'd received
at the time I wrote my response) retracting it is not something I could
reasonably have been expected to predict.



Re: [dev] suckless shell prompt?

2013-11-22 Thread random832
On Thu, Nov 21, 2013, at 13:44, Martti Kühne wrote:
 Staring at the code in horror.
 Something about git and nyancat.
 Without running the code - I have trust issues from similar occasions
 - you're kidding, right?

The nyancat thing is clearly just a little joke. As for git... you can't
_possibly_ be serious about being horrified that a program written for
the specific purpose of displaying git repository information uses git.



Re: [dev] Suckless remote shell?

2013-11-07 Thread random832
On Tue, Nov 5, 2013, at 9:43, Szabolcs Nagy wrote:
 you don't have large file support,

The lack of large file support is entirely an artifact of the fact that
the lseek listed on that page uses an int instead of an off_t. The
existence of special APIs for large file support on e.g. Linux and
Solaris is an artifact of the fact that OSes made before a certain time
period used a 32-bit type for off_t. A modern OS does not need any more
system calls for large file support, since you can simply discard the
non-large-file-supporting versions of those system calls.
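
A small sketch of the point about off_t, assuming a glibc-style environment where _FILE_OFFSET_BITS selects a 64-bit off_t on 32-bit targets (on 64-bit systems it is already 64 bits wide). The function name and the choice of offset are hypothetical; the call itself is the ordinary lseek with no special large-file variant.

```c
#define _FILE_OFFSET_BITS 64 /* glibc: request a 64-bit off_t on 32-bit targets */
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/* Open (or create) path and seek just past the 2 GiB mark.
 * Returns the resulting offset, or -1 on failure. Nothing is written,
 * so the file does not actually grow. */
static off_t
seek_past_2gib(const char *path)
{
    off_t target = ((off_t)1 << 31) + 1;
    off_t got;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd == -1)
        return -1;
    got = lseek(fd, target, SEEK_SET);
    close(fd);
    return got;
}
```

With an int offset the target above is simply not representable; with a 64-bit off_t the single lseek interface is enough.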



Re: [dev] Mailing list behavior - was: Question about arg.h

2013-11-07 Thread random832
On Thu, Nov 7, 2013, at 11:42, Calvin Morrison wrote:
 Why do I top post? yes i am lazy! After being with gmail since it was
 in beta, I still don't have an option to god damned bottom-post by
 default!!

Top posting or bottom posting isn't an option, it's determined by
_where you click the mouse_. You're not supposed to just start typing
where the cursor drops, you're supposed to edit out the bits of the
quote that you're not replying to.

I'm sick of people blaming their email clients and other people taking
this at face value. What the hell would such an option _do_?



Re: [dev] st: bracketed paste mode

2013-09-19 Thread random832
On Thu, Sep 19, 2013, at 10:51, Nick wrote:
 To check, how does this work exactly? Does X send the escape code to
 any window when pasting with middle click, and those which don't
 understand it just ignore it? And then once st has done the
 appropriate stuff with the pasted text, vim (for example) will
 detect that and behave as though :paste is enabled for the duration
 of the paste?

The application has to request it be enabled with a private mode escape
sequence. I don't believe vim presently has any built-in support for it,
but I could be wrong - and you could probably hack it by putting the
mode in t_is or t_ti, putting the end escape sequence in :set paste, and
setting a keybinding for the start sequence.
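
For concreteness, these are the sequences as documented for xterm's bracketed paste mode, with a sketch of how an application might pick the pasted text out of its input. The macro and function names are hypothetical, and a real reader would have to cope with a paste split across several reads.

```c
#include <string.h>

#define PASTE_ENABLE  "\033[?2004h" /* application -> terminal */
#define PASTE_DISABLE "\033[?2004l"
#define PASTE_START   "\033[200~"   /* terminal -> application */
#define PASTE_END     "\033[201~"

/* If buf contains one complete bracketed paste, point *text at the
 * pasted bytes, store their length in *len and return 1; else 0. */
static int
find_paste(const char *buf, const char **text, size_t *len)
{
    const char *start = strstr(buf, PASTE_START);
    const char *end;

    if (start == NULL)
        return 0;
    start += strlen(PASTE_START);
    end = strstr(start, PASTE_END);
    if (end == NULL)
        return 0;
    *text = start;
    *len = (size_t)(end - start);
    return 1;
}
```

The application writes PASTE_ENABLE once at startup (and PASTE_DISABLE on exit); a terminal that does not know the mode ignores the request and never sends the markers.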



Re: [dev] [st] Implementing non-latin support...

2013-06-15 Thread random832
On Fri, Jun 14, 2013, at 23:22, Eon S. Jeon wrote:
 I'm not used to IRC, but I'll try to stay in the channel. It'll be nice
 to talk about this topic.
 
 By the way, would you give me some information about your patch? I
 started working on this, because I had not been able to find actual
 works.
 
 Well, instead, I found some mails posted by you in April. I kinda agree
 with what you were talking about. It does feel awkward to store utf8
 stream instead of code points, though I decided to bear it. lol

I stored utf8 because it already stores utf8; but then I ended up not
being able to actually come up with a solution for combining characters,
so what do I know?

There's a copy of my st.c attached to one of those emails, I think.

-- 
Random832



Re: [dev] [st] Implementing non-latin support...

2013-06-15 Thread random832
On Sat, Jun 15, 2013, at 0:35, Eon S. Jeon wrote:
 Thanks for your interest.
 
 Would you explain how you tested? I've done only a few tests: echo & vim.
 The cursor handling should be incomplete, because I used a very hacky
 method to work around the innate ASCII-ism structure.

For cursor behavior, generally what other terminals do is allow the
cursor to actually be in either of the two cells (and movement
commands can place it in either one), but they _draw_ it over the whole
character (moving the cursor from one half of a wide character to the
other therefore has no visual change). When e.g. horizontal movement in
something like a text editor goes one whole wide character at a time,
it's generally because the application is enforcing this by moving it
two columns explicitly.
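
A sketch of that drawing rule, assuming each cell carries a flag saying whether it is the left half of a wide character or the dummy right half. The flag names are made up for the example and are not taken from any particular terminal's source.

```c
enum { CELL_WIDE = 1, CELL_WDUMMY = 2 }; /* hypothetical cell flags */

struct cell {
    unsigned flags;
};

/* Given the logical cursor column x, compute the first column and the
 * width (in cells) of the box the cursor should be drawn over. */
static void
cursor_span(const struct cell *row, int x, int *start, int *width)
{
    *start = x;
    *width = 1;
    if ((row[x].flags & CELL_WDUMMY) && x > 0)
        *start = x - 1; /* right half: draw from the left half */
    if (row[*start].flags & CELL_WIDE)
        *width = 2;
}
```

The logical position x is left untouched, so movement commands still address either half; only the drawn rectangle is widened.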

What you should do is run the command "stty cbreak -echo; cat", then do
some typing (and pasting of wide characters), moving around the cursor
with arrows (which send single cursor movement escape sequences), and
type in other escape sequences for anything you're curious about. I've
attached a file I used as a test suite to discover the behavior of other
terminals. Note that you should do this outside of tmux if you use it;
tmux itself has some bugs in this area that can make it hard to
understand what's going on.
#8
Overwrite Tests:
EEEEEEEEEEEEE
EEEEEEEEEEEEE
EEEEEEEEEEEEE
EEEEEEEEEEEEE
EEEEEEEEEEEEE
ABCDEFGHIJK
Wrap tests:
123456789
123456789
123456789
123456789
Deletion Tests:
123
123
123
123
123
123
456
456
456
456
456
456
789
789
789
789
789
789
|
|
|
|
|
|
|
|
|
|



Re: [dev] [sbase] 64-bit type for split

2013-06-14 Thread random832
On Tue, Jun 11, 2013, at 13:35, Galos, David wrote:
 In my implementation of split, the ability to split files into rather
 large chunks is important. However, c89 does not provide a 64-bit int
 type by default. Although I could manually emulate 64-bit counting, a
 uvlong would be far cleaner. Is there a suckless-approved way of using
 such an integer in a c89 environment?

c89 provides whatever size types it wants to.

How exactly do you think you are going to be able to work with / create
files larger than whatever off_t type is provided by the environment
supports? Or are you limiting this to pure ansi instead of posix?



Re: [dev] [st] Implementing non-latin support...

2013-06-14 Thread random832


On Fri, Jun 14, 2013, at 17:24, esj...@lavabit.com wrote:
 I'm currently working on non-latin character support. I uploaded my
 progress to github.
 
 Github URL: https://github.com/esjeon/st/tree/stable-nonlatin
 (branch 'stable-nonlatin', meaning it's based on stable(?) release 0.4.1)
 ... and here's my test string: 한글 漢字 ひらがな
 
 Everything looks just okay. Basically, wide characters are displayed
 correctly, and can be selected and copied. I have not tested with input
 methods, because I don't use them.

I already had a wide character patch a few weeks ago, and did some
fairly extensive testing of what other terminals do with them in various
overwriting/insertion/deletion situations. Are you on IRC?



[dev] [sbase] changes to test

2013-05-30 Thread random832
I had partially implemented the test/[ command a while ago, and then got
distracted with other things and never came back to it - I remembered
about it when I saw this other sbase patch.

This version has binary operators (e.g. = != -gt -lt) implemented
(limited to the r, and properly handles being called as /bin/[ (previous
version required argv[0] to be == [ to invoke [ behavior, this one
simply checks the last character of argv[0])

It still only supports the POSIX single-test syntax, with no support for
the XSI ( ) -a -o operators.
/* See LICENSE file for copyright and license details. */
#include <inttypes.h> /* intmax_t, strtoimax */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>
#include "util.h"

static bool unary(const char *, const char *);
static bool binary(const char *, const char *, const char *);
static void usage(void);
bool is_bracket = false; /* true when invoked as [ */

int
main(int argc, char *argv[])
{
    bool ret = false, not = false;

    argv0 = argv[0];
    if(*argv0 && argv0[strlen(argv0)-1] == '[') {
        /* checks if argv[0] ends with [
         * for [ or /bin/[ etc */
        is_bracket = true;
        if(strcmp(argv[argc-1], "]") != 0)
            usage();
        argc--;
    }

    if(argc > 2 && !strcmp(argv[1], "!")) {
        not = true;
        argv++;
        argc--;
    }
    switch(argc) {
    case 2: /* single string: true if non-empty */
        ret = *argv[1] != '\0';
        break;
    case 3: /* unary operator and operand */
        ret = unary(argv[1], argv[2]);
        break;
    case 4: /* operand, binary operator, operand */
        ret = binary(argv[1], argv[2], argv[3]);
        break;
    default:
        usage();
    }
    if(not)
        ret = !ret;
    return ret ? EXIT_SUCCESS : EXIT_FAILURE;
}

bool
unary(const char *op, const char *arg)
{
    struct stat st;
    int r;

    /* operators are exactly two characters: '-' and a letter */
    if(op[0] != '-' || op[1] == '\0' || op[2] != '\0')
        usage();
    switch(op[1]) {
    case 'b': case 'c': case 'd': case 'f': case 'g':
    case 'p': case 'S': case 's': case 'u':
        if((r = stat(arg, &st)) == -1)
            return false; /* -e */
        switch(op[1]) {
        case 'b':
            return S_ISBLK(st.st_mode);
        case 'c':
            return S_ISCHR(st.st_mode);
        case 'd':
            return S_ISDIR(st.st_mode);
        case 'f':
            return S_ISREG(st.st_mode);
        case 'g':
            return st.st_mode & S_ISGID;
        case 'p':
            return S_ISFIFO(st.st_mode);
        case 'S':
            return S_ISSOCK(st.st_mode);
        case 's':
            return st.st_size > 0;
        case 'u':
            return st.st_mode & S_ISUID;
        }
        break; /* not reached: every inner case returns */
    case 'e':
        return access(arg, F_OK) == 0;
    case 'r':
        return access(arg, R_OK) == 0;
    case 'w':
        return access(arg, W_OK) == 0;
    case 'x':
        return access(arg, X_OK) == 0;
    case 'h': case 'L':
        return lstat(arg, &st) == 0 && S_ISLNK(st.st_mode);
    case 't':
        return isatty((int)estrtol(arg, 0));
    case 'n':
        return arg[0] != '\0';
    case 'z':
        return arg[0] == '\0';
    default:
        usage();
    }
    return false; /* should not reach */
}

bool
binary(const char *arg1, const char *op, const char *arg2)
{
    intmax_t iarg1, iarg2;

    /* string comparisons */
    if(!strcmp(op, "=") || !strcmp(op, "==")) {
        return strcmp(arg1, arg2) == 0;
    }
    if(!strcmp(op, "!=")) {
        return strcmp(arg1, arg2) != 0;
    }
    /* Note: this does not handle correctly if the values are both
     * out of range in the same direction, it will consider them
     * equal. */
    iarg1 = strtoimax(arg1, 0, 10);
    iarg2 = strtoimax(arg2, 0, 10);
    if(!strcmp(op, "-eq")) return iarg1 == iarg2;
    if(!strcmp(op, "-ne")) return iarg1 != iarg2;
    if(!strcmp(op, "-gt")) return iarg1 > iarg2;
    if(!strcmp(op, "-ge")) return iarg1 >= iarg2;
    if(!strcmp(op, "-lt")) return iarg1 < iarg2;
    if(!strcmp(op, "-le")) return iarg1 <= iarg2;
    usage();
    return false; /* should not reach */
}

void
usage(void)
{
    const char *ket = is_bracket ? " ]" : "";

    eprintf("usage: %s string%s\n"
            "       %s [!] [-bcdefghLnprSstuwxz] string%s\n"
            "       %s [!] string1 {=,!=} string2%s\n"
            "       %s [!] int1 -{eq,ne,gt,ge,lt,le} int2%s\n"
            , argv0, ket, argv0, ket, argv0, ket, argv0, ket);
}


Re: [dev] [sbase] changes to test

2013-05-30 Thread random832
On Thu, May 30, 2013, at 10:09, random...@fastmail.us wrote:
 This version has binary operators (e.g. = != -gt -lt) implemented
 (limited to the r

My client ate part of this sentence - It was limited to the range of
intmax_t.
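
For what it's worth, the out-of-range case can be detected rather than silently clamped. A minimal sketch, assuming a helper (parse_imax, a made-up name) that the comparison code could call instead of using strtoimax directly:

```c
#include <errno.h>
#include <inttypes.h>

/* Parse a decimal integer into *out. Returns 0 on success, -1 if the
 * string is empty, has trailing junk, or is outside intmax_t's range. */
static int
parse_imax(const char *s, intmax_t *out)
{
    char *end;

    errno = 0;
    *out = strtoimax(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE)
        return -1;
    return 0;
}
```

strtoimax returns INTMAX_MAX or INTMAX_MIN and sets errno to ERANGE on overflow, so two different huge operands would otherwise compare equal; checking errno lets the caller report an error instead.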



Re: [dev] [sbase] changes to test

2013-05-30 Thread random832
On Thu, May 30, 2013, at 15:31, Christoph Lohmann wrote:
 Please make this a diff or patch and include the manpage too. Just
 throwing out code pieces does not really keep maintainers motivated.

Okay - I'll get it in patch format later today, but it might be this
weekend before I have time to write a manpage - test has a _lot_ of
options.

-- 
Random832



Re: [dev] Re: Why HTTP is so bad?

2013-05-27 Thread random832


On Sun, May 26, 2013, at 9:21, Dmitrij Czarkoff wrote:
 May it owe to the fact that this particular IPC protocol is *the*
 protocol used for nearly all IPC in the system?

I have no idea what protocol you are talking about.



Re: [dev] Re: Why HTTP is so bad?

2013-05-26 Thread Random832

On 05/25/2013 12:55 AM, Strake wrote:
 Yes. Thus I can easily swap out any component, or insert mediators
 between components. For example, I could write my own fetcher to scrub
 the HTTP headers, or block ads; and I wouldn't need plug-ins to view
 PDFs or watch movies.

Why is the requirement that it conform to your IPC protocol* less 
onerous than requiring it to conform to a particular in-process API that 
would make it a plug-in?


*which has to handle navigation on both ends, A] what happens when you 
click a link in your viewer and B] what happens to your viewer when the 
user navigates away from it. Also, is the browser required to download 
the whole file before opening the viewer, or can for example a PDF 
viewer display the first page before the last page is downloaded? Also 
for large files (highly relevant to a movie viewer) with a file format 
that allows it, you could take advantage of range fetching, but in both 
of these cases the viewer has to speak HTTP and just be told a URL by 
the navigation component.




Re: [dev] upload via html?

2013-05-26 Thread Random832

On 05/25/2013 07:29 PM, Nicolas Braud-Santoni wrote:

 Well, SFTP requires you to create a user account. (I'm aware that it may
 not be one with which you can SSH in).
 Some people might not want this.

Everything runs as a user. You could use www-data, whatever anonymous 
FTP uses, or simply nobody. There's no fundamental reason you couldn't 
write an SFTP daemon that allows anonymous access.


However, this doesn't exist by default. Also, and this is something many 
people may not know, it's non-trivial to make an account that cannot be 
used for _port forwarding_ - simply making it impossible to log in with 
a shell [e.g. shell set to /bin/false] doesn't accomplish this.




Re: [dev] Re: Why HTTP is so bad?

2013-05-24 Thread random832
On Fri, May 24, 2013, at 16:02, Strake wrote:
 Yes. A web browser ought to have a component to fetch documents and
 start the appropriate viewer, as in mailcap. The whole monolithic web
 browser model is flawed.

And you spend a day on wikipedia or tvtropes and you've got two hundred
HTML viewers open?

You need _something_ monolithic to manage a linear (or, rather,
branching only when you choose to, via open new window or new tab)
browsing history, even if content viewers aren't part of it. When you
click a link within the appropriate viewer, it needs to be _replaced_
with the viewer for the content at the link you clicked on.

And if you don't like the way people normally browse a site like
wikipedia or tvtropes, then... well, you've missed the point of
hypertext, and what you're building isn't a web browser.



Re: [dev] upload via html?

2013-05-14 Thread random832
On Mon, May 13, 2013, at 23:20, Sam Watkins wrote:
 HTTP PUT with ranges would be useful, could mount filesystems over HTTP.

There's no standard HTTP directory listing.



[libutf] Re: [dev] [st][patch] not roll our own utf functions

2013-05-05 Thread Random832

On 05/05/2013 01:06 PM, Nick wrote:

Hmm, I'm not sure that's the right decision. Maybe include the
appropriate .c & .h file for libutf in the source tree? That's what
I do in a couple of projects. I don't have strong feelings about it,
but libutf is pretty reasonable and I'm not convinced it should be
avoided.


I also have to wonder what's the point of libutf at all if it's not 
going to be used as the UTF-8 library for suckless projects.




Re: [dev] [st] RFC halt function

2013-04-25 Thread random832
On Thu, Apr 25, 2013, at 14:15, Christoph Lohmann wrote:
 Nice  joke.  Try to implement a scrollback buffer without bugs and flaw‐
 lessly.
 
 
 Sincerely,
 
 Christoph Lohmann

The buffer's the easy part. What's hard is actually implementing
scrolling.

I've been tempted to hack in a way to have the mouse wheel transparently
tell tmux to scroll up and down through its modal scrolling feature.



Re: [dev] st: Large pile of code

2013-04-24 Thread random832
On Wed, Apr 24, 2013, at 9:32, Carlos Torres wrote:
 I like the separation of term.c from st.c,  I agree that makes reading
 st.c clearer.  I can't comment on the removal of forward declarations,
 typedefs and static vars though the resulting difference is legible as
 well.  (frankly code in alphabetical order makes me want to sort it
 according to code flow and surrounding context...) i think the choice
 of using the fontconfig utf8 functions was a good idea.   I frowned
 when you switched to 'gnu99' from 'c99' (i pictured a lot of flames on
 that)

If it _can_ be compiled in c99 mode, no reason it shouldn't be - then
people can compile it using LLVM/clang, tendra, pcc, etc.

How hard is it going to be to merge these changes with what changes have
been made to the main version since he branched off from it?



Re: [dev] [st] double-width usage

2013-04-23 Thread Random832

On 04/23/2013 03:07 AM, Christoph Lohmann wrote:

Hello comrades,

Here’s some RFC for people using double‐width characters in terminals in
their daily life.

Which applications do you use that handle double-width as you expect them?

Do these applications use the double-width for the layout?

Any Chinese or Japanese user?

If double-width characters would be drawn to fit the standard cell size of the
terminal (drawing them in half the font size) would this suffice your need?

This question implies that it is possible to simply increase the
average fontsize so the complex glyphs look good. Would this suffice
your need?


Naming the applications would be important so I can test st to their
compatibility.


Sincerely,

Christoph Lohmann


Did you see the st.c I posted a few days ago? The logic for double width 
is mostly complete in it. I just have to fix a few graphical glitches, 
and there are a couple of corner cases (mainly erasing a double-width 
character by overwriting it with a single-width one when background 
colors are involved) that different terminals handle differently. It's 
not clear which terminal we should follow, or whether it's even 
important to emulate one particular behavior.




Re: [dev] [st] [PATCH] 8bit-meta like xterm

2013-04-23 Thread Random832

On 04/23/2013 04:50 AM, Christoph Lohmann wrote:

I  am considering making this the default behaviour of st. Are there any
arguments against it?

I'm actually confused by what he means by "most of the apps I tried 
didn't recognize the escape sequence", because every app I've ever used 
recognizes it (it is simply prefixing a character with \033 to 
represent meta). No app I've ever used recognizes the 8-bit behavior, 
and an app expecting the 8th bit (which I've never seen) would not 
handle non-ASCII text input properly.




Re: [dev] [st] [PATCH] 8bit-meta like xterm

2013-04-23 Thread random832
On Tue, Apr 23, 2013, at 7:51, Otto Modinos wrote:

It means first of all vim. I have also tried mocp, and that didn't work
either. What apps are you using that work?



Huh? Vim doesn't have any keybindings that use meta.



Wait a minute... what exactly do you _expect_ meta to do? Using (for
example) meta-a to type 0xE1 (a with acute) is _not_, in fact, the
expected or intended behavior; it is a bug. And I don't think it will
even work with UTF-8 applications, and st is an exclusively UTF-8
terminal.
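
The two encodings of meta discussed in this thread can be sketched in C as follows. This is only an illustration: the helper names meta_esc, meta_8bit and utf8_valid_single are made up for this example and are not taken from st or xterm.

```c
#include <stddef.h>

/* Hypothetical helper: encode Meta+key as an ESC prefix ("\033" + key).
 * Writes two bytes into out and returns the length. */
size_t
meta_esc(unsigned char key, unsigned char out[2])
{
	out[0] = 0x1b;
	out[1] = key;
	return 2;
}

/* Hypothetical helper: encode Meta+key by setting the 8th bit.
 * Writes one byte into out and returns the length. */
size_t
meta_8bit(unsigned char key, unsigned char out[1])
{
	out[0] = (unsigned char)(key | 0x80);
	return 1;
}

/* Returns 1 if the single byte b is a complete UTF-8 sequence on its
 * own. That is only true for ASCII (0x00..0x7f). */
int
utf8_valid_single(unsigned char b)
{
	return b < 0x80;
}
```

With these, meta_8bit('a', buf) produces the byte 0xE1, which is not a valid UTF-8 sequence by itself, while meta_esc('a', buf) produces two plain ASCII bytes.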



What I expect meta to do is for example in irssi meta-a goes to the
next window with activity, meta-1 goes to window 1, etc.



--
Random832


Re: [dev] [st] [PATCH] 8bit-meta like xterm

2013-04-23 Thread random832
On Tue, Apr 23, 2013, at 10:30, Thorsten Glaser wrote:
 random...@fastmail.us dixit:
 
 Wait a minute... what exactly do you _expect_ meta to do? Using (for
 example) meta-a to type 0xE1 a with acute is _not_, in fact, the
 expected or intended behavior; it is a bug. And I don't think it will
 
 No, it is the intended behaviour.
 http://fsinfo.noone.org/~abe/typing-8bit.html

The fact that someone discovered it, _thought_ it was intended, and
showed other people how to do it does not mean that it actually was
intended.

 even work with UTF-8 applications, and st is an exclusively UTF-8
 terminal.
 
 XTerm handles that transparently: when in UTF-8 mode, Meta-d
 is still CHR$(ASC(d)+128) = ä, just U+00E4 instead of a
 raw '\xE4' octet.

If this were an intended feature why would it elevate latin-1 over other
unicode characters? This only proves my point.

 This is *extremely* useful – especially as it leads people
 away from national keyboard layouts towards QWERTY while
 retaining the ability to write business eMails, which require
 correct spelling.


And what the heck is wrong with national keyboard layouts that it's
useful to lead people away from them?



Re: [dev] [st] double-width usage

2013-04-23 Thread random832
On Tue, Apr 23, 2013, at 11:01, Silvan Jegen wrote:
 I saw, compiled and tested it but when using mutt only half of the
 (Japanese Kanji) characters would be drawn (so presumably only one
 character cell of a two-cell double character). If I wasn't at a
 conference I would deliver some screenshots but as things stand, I can
 only get back to you after I am back home in a few days.

Don't worry about screenshots, I know what it looks like. That is the
graphical glitch I was referring to (along with being left-aligned).

I'm considering possible ways to fix it - but all I can think of is to
rewrite the entire drawregion function to draw all backgrounds first and
then all characters.



[dev] [st] colors and attributes, general question

2013-04-23 Thread random832
I'm planning on reworking xdraws and drawregion to draw the background
and text as separate functions. To do this I need to understand some
things:

As I understand it, the behavior is to have all attribute effects on
color (e.g. bold brightening, italic/underline colors) affect only the
foreground and not the background when in normal mode, and affect only
the background and not the foreground in reverse (ATTR_REVERSE) mode. Is
this understanding correct?

As I understand it, the behavior for MODE_REVERSE is to use RGB color
inversion, and not bg/fg swapping (so yellow-on-red becomes
blue-on-cyan, not red-on-yellow) , on all colors _except_ for the
default bg/fg colors. Is this understanding correct?
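
The difference between the two kinds of reversal can be sketched as below. The Rgb and ColorPair types and both function names are invented for this illustration and do not come from st; the special case for the default bg/fg colors is deliberately left out.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } Rgb;      /* illustrative type */
typedef struct { Rgb fg, bg; } ColorPair;     /* illustrative type */

/* Invert each RGB channel of one color. */
static Rgb
invert(Rgb c)
{
	Rgb o;

	o.r = (uint8_t)(255 - c.r);
	o.g = (uint8_t)(255 - c.g);
	o.b = (uint8_t)(255 - c.b);
	return o;
}

/* Reverse video by inverting both colors (the MODE_REVERSE behavior as
 * described in the message above). */
ColorPair
reverse_by_inversion(ColorPair p)
{
	ColorPair o;

	o.fg = invert(p.fg);
	o.bg = invert(p.bg);
	return o;
}

/* Reverse video by swapping foreground and background. */
ColorPair
reverse_by_swap(ColorPair p)
{
	ColorPair o;

	o.fg = p.bg;
	o.bg = p.fg;
	return o;
}
```

For yellow (255,255,0) on red (255,0,0), inversion gives blue (0,0,255) on cyan (0,255,255), while swapping gives red on yellow.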

Is the above outlined behavior actually correct by the standards /
desirable / does it match the behavior of other terminals? config.def.h
says "Another logic would only make the simple feature too complex."
but I find that in making this change, supporting the current behavior
(my understanding as outlined above) is actually more complex (because
it requires me to duplicate attribute color mapping in both functions).

There is also, as far as I can tell, no support for brightening the
background for ATTR_BLINK.

-- 
Random832



Re: [dev] [st] colors and attributes, general question

2013-04-23 Thread random832
On Tue, Apr 23, 2013, at 11:53, Christoph Lohmann wrote:
 It’s  the  simple  way of doing all the brightening and reversing. St is
 keeping to what other terminals do. But since none of them keeps to  any
 standard colors or good behaviour is this what makes st being what it is
 – a simple terminal.

My point was, it's only the simple way when you've already got both
colors calculated because the function draws both. But if drawing
backgrounds is moved into a separate function, as I was planning to do,
that function would be simpler if it didn't have to think about the
effects that bold/italic/underline have on the foreground color. My
question was whether this behavior is like this because other terminals
do the same thing, or _only_ because it's simpler.



Re: [dev] [PATCH] Fix selecting clearing and BCE

2013-04-23 Thread random832
On Tue, Apr 23, 2013, at 14:34, Roberto E. Vargas Caballero wrote:
 From: Roberto E. Vargas Caballero k...@shike2.com
 
 The commit b78c5085f72 changed the st behaviour enabling BCE capability,
 that means erase regions using background color. Problem comes when you
 clear a region with a selection, because in this case the real mode of
 the
 Glyph is not the value of term.line[y][x], due in drawregion we had
 enabled
 the ATTR_REVERSE bit.

I don't understand the issue. How is this desired behavior? It looks
like your change makes it toggle the _real_ ATTR_REVERSE bit on the
selected region, making the selection appear to vanish, and it'll end up
in the wrong colors once the selection is actually removed.



Re: [dev] [PATCH] Fix selecting clearing and BCE

2013-04-23 Thread random832
On Tue, Apr 23, 2013, at 16:21, Roberto E. Vargas Caballero wrote:
 In drawregion you have:
 
 3172 bool ena_sel = sel.bx != -1;
 3173
 3174 if(sel.alt ^ IS_SET(MODE_ALTSCREEN))
 3175 ena_sel = 0;
 ...
 3190 if(ena_sel && *(new.c) && selected(x, y))
 3191 new.mode ^= ATTR_REVERSE;
 
 in selclear:
 
 937 sel.bx = -1;
 938 tsetdirt(sel.b.y, sel.e.y);
 
 in bpress:
 
 822 if (sel.snap != 0) {
 823 tsetdirt(sel.b.y, sel.e.y);
 824 draw();
 825 }
 
 
 
 It means when you select something you modify the attribute of the
 selected
 region.

That's not true. Only the attribute passed to xdraws() is altered - the
real attribute stored in the character cell is left alone.

Line 3189 (new = term.line[y][x];) makes a _copy_ of
the Glyph structure, and line 3191 only modifies the copy, not the
original.
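
A minimal sketch of the copy semantics in question. The Glyph layout, the ATTR_REVERSE value and the draw_mode function are simplified stand-ins for this example, not the real st definitions; the local variable is called copy here where the quoted code calls it new.

```c
typedef struct {
	char c[4];            /* UTF-8 bytes of the character */
	unsigned char mode;   /* attribute bits */
} Glyph;

enum { ATTR_REVERSE = 1 };    /* bit value assumed for this sketch */

/* Mimics the pattern at the quoted lines 3189-3191: take a copy of the
 * cell, toggle the bit on the copy, and use the copy for drawing. */
unsigned char
draw_mode(const Glyph *cell, int is_selected)
{
	Glyph copy = *cell;   /* struct assignment copies by value */

	if (is_selected)
		copy.mode ^= ATTR_REVERSE;
	return copy.mode;     /* what would be handed to the drawing code */
}
```

After calling draw_mode(&g, 1), g.mode is unchanged; only the returned value has the reverse bit toggled.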



Re: [dev] [PATCH] Fix selecting clearing and BCE

2013-04-23 Thread random832
On Tue, Apr 23, 2013, at 17:05, Roberto E. Vargas Caballero wrote:
 

What _exactly_ is the behavior you are observing? Are you sure it's not
_actually_ staying selected, rather than simply drawing that way?

If you click the mouse somewhere else, does the original selection go
away, or does it stay reversed? If it goes away, what is the problem?
Are you expecting it should go away immediately when it is erased? There
is some merit to the idea that the selection should go away if any
character within it is modified - maybe we should be talking about that.

-- 
Random832



Re: [dev] [PATCH] Fix selecting clearing and BCE

2013-04-23 Thread Random832

On 04/23/2013 05:27 PM, Roberto E. Vargas Caballero wrote:

It is very confusing to see a highlighted blank line, that really is selecting
the previous content of the line. If the selecting mark keeps in the screen
it is only some garbage in it. If you can find other terminal emulator with
this behaviour please let me know it.

Maybe the behavior is wrong - but if the problem is that it is _still 
selected_ (i.e. hilight goes away when you select something else), it's 
not something that can be solved with anything to do with visual 
attributes only.


That was why I was asking for clarification whether it is _still 
selected_, or just _still hilighted_.


I wasn't able to view the video or run st at the time when you posted 
the video... now I've run st and confirmed that the problem is that it 
is _still selected_. I can work on a patch to fix this today.


This really has nothing to do with the visual attribute, it's that the 
logic for removing the selection when its content changes (whether by 
erasing or by text being printed within it) is broken or missing.
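
One possible shape for the missing logic, as a sketch only. The Selection struct and the sel_contains and sel_touch names are hypothetical; the only detail borrowed from the quoted st code is that bx == -1 means "nothing selected". The sketch assumes the selection start is stored before its end.

```c
typedef struct {
	int bx, by;   /* start of selection; bx == -1 means no selection */
	int ex, ey;   /* end of selection (inclusive) */
} Selection;

/* Returns 1 if cell (x, y) lies inside the selection. */
int
sel_contains(const Selection *s, int x, int y)
{
	if (s->bx == -1)
		return 0;
	if (y < s->by || y > s->ey)
		return 0;
	if (y == s->by && x < s->bx)
		return 0;
	if (y == s->ey && x > s->ex)
		return 0;
	return 1;
}

/* To be called whenever cell (x, y) is erased or overwritten: drops the
 * selection if the modified cell was part of it. */
void
sel_touch(Selection *s, int x, int y)
{
	if (sel_contains(s, x, y))
		s->bx = -1;
}
```

A real implementation would also have to mark the affected lines dirty so the highlight is redrawn.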




[dev] Need help implementing combining characters

2013-04-21 Thread Random832
I've got the logic fully implemented (both for maintaining multiple 
characters in a single cell and for copying the selection) but I can't 
figure out how to make them draw correctly. I don't understand what the 
xdraws and drawregion functions are doing.


Current behavior is it draws the combining glyph in a cell of its own 
next to the base glyph.




Re: [dev] [st] Need help implementing combining characters

2013-04-21 Thread Random832

On 04/21/2013 12:45 PM, Carlos Torres wrote:


Maybe send out what you have and others can better grok what you 
intend, and see how it may fit?


I could do that, but I haven't actually modified the drawing code 
substantially yet, except to include the combining characters themselves 
in the buffer that drawregion passes to xdraws.


The general design I am using is to expand the 'c' array within the 
Glyph struct, and store multiple UTF-8 characters (zero-padded) in it.
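
A rough sketch of that design, under stated assumptions: MAX_COMB, glyph_set and glyph_append are names invented here, and the real Glyph struct has more fields than shown.

```c
#include <stddef.h>
#include <string.h>

#define UTF_SIZ  4   /* maximum bytes in one UTF-8 character */
#define MAX_COMB 2   /* assumed room for two 4-byte combining marks */

typedef struct {
	/* base character followed by combining characters, zero-padded */
	char c[UTF_SIZ * (1 + MAX_COMB)];
} Glyph;

/* Store a base character, discarding anything already in the cell.
 * Returns 0 on success, -1 if len is not a valid character length. */
int
glyph_set(Glyph *g, const char *utf8, size_t len)
{
	if (len == 0 || len > UTF_SIZ)
		return -1;
	memset(g->c, 0, sizeof g->c);
	memcpy(g->c, utf8, len);
	return 0;
}

/* Append a combining character after what is already stored.
 * Returns 0 on success, -1 if the cell has no room left. */
int
glyph_append(Glyph *g, const char *utf8, size_t len)
{
	size_t used = 0;

	while (used < sizeof g->c && g->c[used] != '\0')
		used++;
	if (len > sizeof g->c - used)
		return -1;
	memcpy(g->c + used, utf8, len);
	return 0;
}
```

Because the padding byte is zero, a cell filled this way cannot hold a NUL character itself, and the drawing code has to find the character boundaries again when it reads the array.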


I'm also having drawing issues with double-width characters that I don't 
know how to fix, so maybe it would be best if I just send what I have. 
Should I send it in the form of a patch or just my st.c?




Re: [dev] [st] wide characters

2013-04-15 Thread random832
On Mon, Apr 15, 2013, at 10:58, Martti Kühne wrote:
 On Sun, Apr 14, 2013 at 2:56 AM, Random832 random...@fastmail.us wrote:
  Okay, but why not work with a unicode code point as an int?
 
 -1 from me.
 It is utter madness to waste 32 (64 on x86_64) bits for a single
 glyph.

A. current usage is char[4]

B. int is 32 bits on x86_64. There's no I in LP64.

 According to a quick google those chars can become as wide as 6
 bytes,

No, they can't. I have no idea what your source on this is.

 and believe me you don't want that, as long as there are
 mblen(3) / mbrlen(3)...

I don't know how these functions are relevant to your argument.



Re: [dev] [st] wide characters

2013-04-15 Thread random832
On Mon, Apr 15, 2013, at 15:16, Strake wrote:
 On 15/04/2013, random...@fastmail.us random...@fastmail.us wrote:
  On Mon, Apr 15, 2013, at 10:58, Martti Kühne wrote:
  According to a quick google those chars can become as wide as 6
  bytes,
 
  No, they can't. I have no idea what your source on this is.
 
 In UTF-8 the maximum encoded character length is 6 bytes [1]

What on earth does that have to do with using an int to store the code
point *instead of* the raw UTF-8 bytes (which are used _now_)?

Also, this is out of date; the latest version of unicode (since 2003 at
the latest) limits code points to 0x10FFFF and therefore UTF-8 sequences
to four bytes. Unless your manpage is much older than mine, it states
this clearly and you misread it.
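
To illustrate the four-byte limit, here is a small encoder sketch. The function name utf8_encode is chosen for this example and is not st's or libutf's API; it takes the code point as a 32-bit integer, which is the representation being argued for above.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode code point cp as UTF-8 into out (room for 4 bytes).
 * Returns the number of bytes written, or 0 if cp is a surrogate or
 * lies above 0x10FFFF. */
size_t
utf8_encode(uint32_t cp, unsigned char out[4])
{
	if (cp < 0x80) {
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		if (cp >= 0xD800 && cp <= 0xDFFF)
			return 0;   /* surrogates are not characters */
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp <= 0x10FFFF) {
		out[0] = (unsigned char)(0xF0 | (cp >> 18));
		out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char)(0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}
```

The largest code point, 0x10FFFF, encodes to the four bytes F4 8F BF BF; nothing valid needs a fifth or sixth byte.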



Re: [dev] [st] wide characters

2013-04-15 Thread random832
On Mon, Apr 15, 2013, at 15:36, Thorsten Glaser wrote:
 Actually, wint_t is the standard type to use for this. One
 could also use wchar_t but that may be an unsigned short on
 some systems, or a signed or unsigned int.

Those systems aren't using wchar_t *or* wint_t for unicode, though.

The main reason for wint_t's existence is that wchar_t isn't guaranteed
to be able to represent a WEOF value distinct from all valid character
values. wchar_t can be used just fine for any actual character, but if
the system doesn't use unicode as its wchar type, wint_t could (for
example) be a signed 16-bit int to wchar_t's unsigned 8-bit.

You can use #if __STDC_ISO_10646__ to test whether the implementation
uses unicode for wchar_t (most modern systems do, though some may not
define this constant) - if so, then wchar_t is, naturally, guaranteed to
be able to represent at least the range 0 to 0x10FFFF, and wint_t that
plus WEOF (usually -1). They're usually both 32-bit signed ints.
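
A short sketch of that test. The two function names are invented for this example; __STDC_ISO_10646__ and WCHAR_MAX are the standard macros.

```c
#include <wchar.h>

/* Returns 1 if the implementation promises that wchar_t values are
 * Unicode code points, 0 if it makes no such promise. */
int
wchar_is_unicode(void)
{
#ifdef __STDC_ISO_10646__
	return 1;
#else
	return 0;
#endif
}

/* Returns 1 if wchar_t is wide enough for every code point up to
 * 0x10FFFF, whatever encoding it actually uses. */
int
wchar_holds_all_unicode(void)
{
	return (unsigned long)WCHAR_MAX >= 0x10FFFFUL;
}
```

On a system with a 16-bit wchar_t the second function returns 0, so code that wants one code point per value would need its own integer type there.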

MS Windows uses an unsigned short for both types due to various
historical reasons.



Re: [dev] [st] wide characters

2013-04-14 Thread Random832

On 04/14/2013 02:10 AM, Christoph Lohmann wrote:

Greetings.

On Sun, 14 Apr 2013 08:10:22 +0200 Random832 random...@fastmail.us wrote:

I am forced to ask, though, why character cell values are stored in
utf-8 rather than as wchar_t (or as an explicitly unicode int) in the
first place, particularly since the simplest way to detect a wide
character is to call the function wcwidth. What was the reason for this
design decision? It doesn't save any space, since on most systems
UTF_SIZ == sizeof(int) == sizeof(wchar_t).

That  design decision can change when I’m actually implementing the dou‐
ble‐width and double‐height support in st. The codebase is small  enough
to change such a type in less than 10 minutes. So no religion was intro‐
duced here.


The reason for my question about using codepoints instead of UTF-8 was 
because I thought it might make it easier to support combining 
diacritics, not wide characters. The two problems are broadly related 
because both of them affect the number of character cells occupied by a 
string.



And I don't know the st codebase well enough (or at all, really) to tell
at a glance what would have to be changed to be able to support a
double-width character cell, or to support wrapping to the next line if
one is output at the second-to-last column.

I hadn't yet the time to read all the double-width implementations in other
terminals so st would do the »right thing« in implementing all questionable
cases.

Double‐width characters are like BCE a design decision applications need
adapt to.

Some corner cases I haven't yet found a good answer to:
* Is there any standard for this except for setting the flag in
  terminfo and taking up two cells in the terminal?


I don't know if there's a standard. I can find nothing about character 
cell terminals in any UTR, and ECMA 48 is silent on the question of wide 
characters.


I don't know what terminfo flag you are referring to. I was talking 
about support for east asian characters, not VT100-style stretching of 
ASCII characters. I suspect the widcs/swidm/rwidm capabilities refer to 
the latter (though the only actual instance in the terminfo database is 
a swidm string on the att730).



Observed behavior in various terminals that do support them is:
* cursor position can be in either half of a double character, though 
the whole character is hilighted (all observed terminals)
* outputting one at the end of the line (i.e. where a pair of two narrow 
characters would be split across lines) fails entirely (xterm) or wraps 
to the next line leaving the last cell alone (vte, tmux, mlterm, kterm).
* outputting a narrow character on top of a wide character erases the 
entire wide character (xterm, tmux, mlterm, kterm) or erases only when 
in the left half (vte)


* deleting (e.g. with ESC [ P) part of a character has various different 
behaviors:
** on xterm and kterm, deleting either half of a character replaces the 
remaining half with a single-width blank space.
** tmux's behavior is very buggy: a vertical line drawn across a 
different part of the screen _after_ deleting different parts of wide 
characters on different lines ended up redrawing incorrectly after 
refreshing. As for the wide characters themselves, deleting the left 
half deletes the entire character and deleting the right half has no 
effect, but there is some hidden state involved - a sequence of two 
deletions will delete a single wide character. I suspect the right 
half is filled with some placeholder value that is not output to the 
host terminal, and they are deleted individually. This is consistent 
with all of my observations.
** on mlterm, deleting the left half of a character deletes the entire 
character; deleting the right half replaces it with two spaces.
** on vte, deleting the right half of a character replaces the _next_ 
character with a space. Deleting the left half replaces the present 
character with a space, but seems to leave some hidden state, since the 
cursor on this space is still double width.
* the xterm/kterm behavior seems the most rational, since it yields no 
visual glitches, always keeps the cursor in the same logical position, 
and a deletion always shifts characters right of it by the same amount.
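
The xterm/kterm deletion behavior described above could be modelled roughly as follows. This is a toy model, not code from any of the terminals mentioned: the Cell type, the NARROW/WIDE_LEFT/WIDE_RIGHT kinds and delete_cell are all assumptions made for the sketch.

```c
#include <stdint.h>

enum { NARROW, WIDE_LEFT, WIDE_RIGHT };   /* cell kinds (illustrative) */

typedef struct {
	uint32_t cp;   /* code point; unused for WIDE_RIGHT */
	int kind;
} Cell;

static const Cell BLANK = { ' ', NARROW };

/* Delete the cell at column x from a line of ncols cells. */
void
delete_cell(Cell *line, int ncols, int x)
{
	int i;

	if (x < 0 || x >= ncols)
		return;
	/* Blank the surviving half so no half character is left behind. */
	if (line[x].kind == WIDE_LEFT && x + 1 < ncols)
		line[x + 1] = BLANK;
	else if (line[x].kind == WIDE_RIGHT && x > 0)
		line[x - 1] = BLANK;
	/* Shift everything right of x left by exactly one cell. */
	for (i = x; i + 1 < ncols; i++)
		line[i] = line[i + 1];
	line[ncols - 1] = BLANK;
}
```

Deleting either half of a wide character leaves one blank narrow cell in its place, and the cells to the right always move by one column.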


I haven't made any detailed investigation into the actual set of 
characters that are considered wide (or combining) by each terminal and 
by various applications, (except tmux, which has a list of ranges in 
utf8.c). I also haven't investigated whether any of them have 
locale-dependent treatment of ambiguous characters (e.g. greek or 
cyrillic) which are wide in historical east asian fonts (except tmux, 
which does not)


mlterm does have an option that makes it work differently; the above 
results are with -Z enabled.



* If st has double-width default.
* What happens if the application does naive character
  counting? Will layouts break?


My experience is that layouts break

[dev] [st] wide characters

2013-04-13 Thread Random832
I don't mean as in wchar_t, I mean as in characters (generally in East 
Asian languages) that are meant to take up two character cells.


I am forced to ask, though, why character cell values are stored in 
utf-8 rather than as wchar_t (or as an explicitly unicode int) in the 
first place, particularly since the simplest way to detect a wide 
character is to call the function wcwidth. What was the reason for this 
design decision? It doesn't save any space, since on most systems 
UTF_SIZ == sizeof(int) == sizeof(wchar_t).
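
For reference, a minimal sketch of the wcwidth approach. The is_wide and cell_width names are made up here. It assumes wchar_t holds the code point and that the program has called setlocale(LC_CTYPE, "") with a UTF-8 locale, since wcwidth results depend on the locale. Note that wcwidth is an XSI function, hence the feature-test macro.

```c
#define _XOPEN_SOURCE 700
#include <wchar.h>

/* Returns 1 if wc occupies two character cells according to wcwidth. */
int
is_wide(wchar_t wc)
{
	return wcwidth(wc) == 2;
}

/* Number of cells to advance for wc. Characters that wcwidth reports
 * as non-printable (-1) are given one cell here, which is an arbitrary
 * choice for this sketch. */
int
cell_width(wchar_t wc)
{
	int w = wcwidth(wc);

	return w < 0 ? 1 : w;
}
```

A combining character would come back from cell_width as 0, which is where the wide-character and combining-character problems meet.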


And I don't know the st codebase well enough (or at all, really) to tell 
at a glance what would have to be changed to be able to support a 
double-width character cell, or to support wrapping to the next line if 
one is output at the second-to-last column.




Re: [dev] [st] wide characters

2013-04-13 Thread Random832

On 04/13/2013 07:07 PM, Aurélien Aptel wrote:

The ISO/IEC 10646:2003 Unicode standard 4.0 says that:

 The width of wchar_t is compiler-specific and can be as small as
8 bits. Consequently, programs that need to be portable across any C
or C++ compiler should not use wchar_t for storing Unicode text. The
wchar_t type is intended for storing compiler-defined wide characters,
which may be Unicode characters in some compilers.

utf-8 is rather straightforward to handle and process.


Okay, but why not work with a unicode code point as an int?



Re: [dev] [st] windows port?

2013-04-11 Thread random832
On Thu, Apr 11, 2013, at 10:59, Max DeLiso wrote:

My aim is to create a minimalist terminal emulator for windows. I want
a project whose relationship to the cmd/conhost/csrss triad is
analogous to the relationship between st and xterm/x. I'm going to try
and lift out of st all of the platform agnostic bits  which I am able
to, and generally use it as a reference for terminal emulation
routines.



If it doesn't work _with_ the cmd/conhost/csrss triad, what programs
are going to run in it? Cygwin, I suppose.

The problem, in general, with unix-ish terminal emulators on windows is
they don't work with applications designed to run in the console.


[dev] [sbase] cp and security

2011-06-23 Thread Random832

I've written most of cp, but one issue keeps bugging me.

I can't figure out how to get rid of race conditions within the
constraints that sbase is implemented in (POSIX 2001, no XSI
extensions).

If we were using POSIX 2008 or XSI extensions, I could use the *at()
functions, or at least fchdir(), to reliably solve this problem. As it
is, I'm left with two choices:

Emulate fchdir with a magic cookie struct containing an absolute path,
device, and inode number [stat(".") every time and panic if device and
inode number don't match the cookie]

Do nothing.

Any thoughts?