Re: Query regarding exec in mandocdb.c

2017-08-26 Thread Ted Unangst
Ted Unangst wrote:
> Ingo Schwarze wrote:
> > > this could just be memcmp.
> > 
> > I avoided that over quibbles about the argument type (off_t vs.
> > size_t), though i admit that database files larger than a Gigabyte
> > make no sense at all.
> > 
> > If you consider that an improvement, i'm not opposed to using
> > memcmp(3).  But i don't really have a good idea what to do if
> > st_size does happen to exceed SIZE_MAX.  Maybe just error out?
> 
> in that case, you have already failed to mmap it, so i wouldn't worry too
> much.

actually, maybe not. if it's 4GB + a bit, the truncation will occur in the
mmap call and you'll only map a part of the file. (and then, with the hand
loop, you'll read well past the end of mapped memory and crash. with memcmp,
it will truncate to the mapped length.) so if you want correctness, need to
check both stats and error out.



Re: Query regarding exec in mandocdb.c

2017-08-26 Thread Ted Unangst
Ingo Schwarze wrote:
> > this could just be memcmp.
> 
> I avoided that over quibbles about the argument type (off_t vs.
> size_t), though i admit that database files larger than a Gigabyte
> make no sense at all.
> 
> If you consider that an improvement, i'm not opposed to using
> memcmp(3).  But i don't really have a good idea what to do if
> st_size does happen to exceed SIZE_MAX.  Maybe just error out?

in that case, you have already failed to mmap it, so i wouldn't worry too
much.



Re: Query regarding exec in mandocdb.c

2017-08-26 Thread Ingo Schwarze
Hi Ted,

Ted Unangst wrote on Sat, Aug 26, 2017 at 02:35:01PM -0400:
> Ingo Schwarze wrote:

>> +if ((cp1 = mmap(NULL, sb1.st_size, PROT_READ, MAP_PRIVATE,
>> +fd1, 0)) == NULL) {
>> +say(MANDOC_DB, "");
>> +goto err;
>> +}
>> +if ((cp2 = mmap(NULL, sb2.st_size, PROT_READ, MAP_PRIVATE,
>> +fd2, 0)) == NULL) {
>> +say(tfn, "");
>> +goto err;
>> +}

> mmap returns MAP_FAILED (-1), not null, on failure.

Ooops, indeed.  The check is correct in read.c and dbm_map.c,
but i stumbled here.  Fixed now, thanks.

>> +for (i = 0; i < sb1.st_size; i++)
>> +if (cp1[i] != cp2[i])
>> +goto err;

> this could just be memcmp.

I avoided that over quibbles about the argument type (off_t vs.
size_t), though i admit that database files larger than a Gigabyte
make no sense at all.

If you consider that an improvement, i'm not opposed to using
memcmp(3).  But i don't really have a good idea what to do if
st_size does happen to exceed SIZE_MAX.  Maybe just error out?

Yours,
  Ingo



Re: Query regarding exec in mandocdb.c

2017-08-26 Thread Ted Unangst
Ingo Schwarze wrote:
> + if ((cp1 = mmap(NULL, sb1.st_size, PROT_READ, MAP_PRIVATE,
> + fd1, 0)) == NULL) {
> + say(MANDOC_DB, "");
> + goto err;
> + }
> + if ((cp2 = mmap(NULL, sb2.st_size, PROT_READ, MAP_PRIVATE,
> + fd2, 0)) == NULL) {
> + say(tfn, "");
> + goto err;
> + }

mmap returns MAP_FAILED (-1), not null, on failure.

> + for (i = 0; i < sb1.st_size; i++)
> + if (cp1[i] != cp2[i])
> + goto err;

this could just be memcmp.

> + if (cp1 != NULL)
> + munmap(cp1, sb1.st_size);
> + if (cp2 != NULL)
> + munmap(cp2, sb2.st_size);

may need adjustment after fixing mmap checks, too.



Re: Query regarding exec in mandocdb.c

2017-08-26 Thread George Brown
Thank you for the replies Ingo and the diffs!

George Brown


On 26 August 2017 at 17:04, Ingo Schwarze  wrote:
> Hi George,
>
> George Brown wrote on Thu, Aug 24, 2017 at 02:01:05PM +0100:
>
>> In mandocdb.c it appears cmp(1) and rm(1) are executed in a child
>> process. It seems that if the logic from these programs were duplicated
>> the pledge in mandocdb.c could be further restricted and even not bother
>> with forking.
>
> Done as well, see the commit below.
>
> Thanks again for the suggestion,
>   Ingo
>
>
> Log Message:
> ---
> Do not fork and exec cmp(1); instead, simply fstat(2), mmap(2), and
> compare the files directly, allowing a much stricter pledge(2), at
> very little cost: merely 15 additional lines of very simple code.
> Suggested by George Brown <321 dot george at gmail dot com> on misc@.
>
> Modified Files:
> --
> mandoc:
> mandocdb.c
>
> Revision Data
> -
> Index: mandocdb.c
> ===
> RCS file: /home/cvs/mandoc/mandoc/mandocdb.c,v
> retrieving revision 1.254
> retrieving revision 1.255
> diff -Lmandocdb.c -Lmandocdb.c -u -p -r1.254 -r1.255
> --- mandocdb.c
> +++ mandocdb.c
> @@ -19,8 +19,8 @@
>  #include "config.h"
>
>  #include 
> +#include 
>  #include 
> -#include 
>
>  #include 
>  #include 
> @@ -319,7 +319,7 @@ mandocdb(int argc, char *argv[])
> int   ch, i;
>
>  #if HAVE_PLEDGE
> -   if (pledge("stdio rpath wpath cpath fattr flock proc exec", NULL) == 
> -1) {
> +   if (pledge("stdio rpath wpath cpath", NULL) == -1) {
> warn("pledge");
> return (int)MANDOCLEVEL_SYSERR;
> }
> @@ -440,15 +440,6 @@ mandocdb(int argc, char *argv[])
>  * The existing database is usable.  Process
>  * all files specified on the command-line.
>  */
> -#if HAVE_PLEDGE
> -   if (!nodb) {
> -   if (pledge("stdio rpath wpath cpath fattr 
> flock", NULL) == -1) {
> -   warn("pledge");
> -   exitcode = (int)MANDOCLEVEL_SYSERR;
> -   goto out;
> -   }
> -   }
> -#endif
> use_all = 1;
> for (i = 0; i < argc; i++)
> filescan(argv[i]);
> @@ -2119,9 +2110,10 @@ dbprune(struct dba *dba)
>  static void
>  dbwrite(struct dba *dba)
>  {
> -   char tfn[33];
> -   int  status;
> -   pid_tchild;
> +   struct stat  sb1, sb2;
> +   char tfn[33], *cp1, *cp2;
> +   off_ti;
> +   int  fd1, fd2;
>
> /*
>  * Do not write empty databases, and delete existing ones
> @@ -2160,39 +2152,59 @@ dbwrite(struct dba *dba)
> say("", "&%s", tfn);
> return;
> }
> -
> +   cp1 = cp2 = NULL;
> +   fd1 = fd2 = -1;
> (void)strlcat(tfn, "/" MANDOC_DB, sizeof(tfn));
> if (dba_write(tfn, dba) == -1) {
> -   exitcode = (int)MANDOCLEVEL_SYSERR;
> say(tfn, "_write");
> -   goto out;
> -   }
> -
> -   switch (child = fork()) {
> -   case -1:
> -   exitcode = (int)MANDOCLEVEL_SYSERR;
> -   say("", " cmp");
> -   return;
> -   case 0:
> -   execlp("cmp", "cmp", "-s", tfn, MANDOC_DB, (char *)NULL);
> -   say("", " cmp");
> -   exit(0);
> -   default:
> -   break;
> -   }
> -   if (waitpid(child, , 0) == -1) {
> -   exitcode = (int)MANDOCLEVEL_SYSERR;
> -   say("", " cmp");
> -   } else if (WIFSIGNALED(status)) {
> -   exitcode = (int)MANDOCLEVEL_SYSERR;
> -   say("", "cmp died from signal %d", WTERMSIG(status));
> -   } else if (WEXITSTATUS(status)) {
> -   exitcode = (int)MANDOCLEVEL_SYSERR;
> -   say(MANDOC_DB,
> -   "Data changed, but cannot replace database");
> +   goto err;
> }
> +   if ((fd1 = open(MANDOC_DB, O_RDONLY, 0)) == -1) {
> +   say(MANDOC_DB, "");
> +   goto err;
> +   }
> +   if ((fd2 = open(tfn, O_RDONLY, 0)) == -1) {
> +   say(tfn, "");
> +   goto err;
> +   }
> +   if (fstat(fd1, ) == -1) {
> +   say(MANDOC_DB, "");
> +   goto err;
> +   }
> +   if (fstat(fd2, ) == -1) {
> +   say(tfn, "");
> +   goto err;
> +   }
> +   if (sb1.st_size != sb2.st_size)
> +   goto err;
> +   if ((cp1 = mmap(NULL, sb1.st_size, PROT_READ, MAP_PRIVATE,
> +   fd1, 0)) == NULL) {
> +   say(MANDOC_DB, "");
> +   

Re: Query regarding exec in mandocdb.c

2017-08-26 Thread Ingo Schwarze
Hi George,

George Brown wrote on Thu, Aug 24, 2017 at 02:01:05PM +0100:

> In mandocdb.c it appears cmp(1) and rm(1) are executed in a child
> process. It seems that if the logic from these programs were duplicated
> the pledge in mandocdb.c could be further restricted and even not bother
> with forking.

Done as well, see the commit below.

Thanks again for the suggestion,
  Ingo


Log Message:
---
Do not fork and exec cmp(1); instead, simply fstat(2), mmap(2), and
compare the files directly, allowing a much stricter pledge(2), at
very little cost: merely 15 additional lines of very simple code.
Suggested by George Brown <321 dot george at gmail dot com> on misc@.

Modified Files:
--
mandoc:
mandocdb.c

Revision Data
-
Index: mandocdb.c
===
RCS file: /home/cvs/mandoc/mandoc/mandocdb.c,v
retrieving revision 1.254
retrieving revision 1.255
diff -Lmandocdb.c -Lmandocdb.c -u -p -r1.254 -r1.255
--- mandocdb.c
+++ mandocdb.c
@@ -19,8 +19,8 @@
 #include "config.h"
 
 #include 
+#include 
 #include 
-#include 
 
 #include 
 #include 
@@ -319,7 +319,7 @@ mandocdb(int argc, char *argv[])
int   ch, i;
 
 #if HAVE_PLEDGE
-   if (pledge("stdio rpath wpath cpath fattr flock proc exec", NULL) == 
-1) {
+   if (pledge("stdio rpath wpath cpath", NULL) == -1) {
warn("pledge");
return (int)MANDOCLEVEL_SYSERR;
}
@@ -440,15 +440,6 @@ mandocdb(int argc, char *argv[])
 * The existing database is usable.  Process
 * all files specified on the command-line.
 */
-#if HAVE_PLEDGE
-   if (!nodb) {
-   if (pledge("stdio rpath wpath cpath fattr 
flock", NULL) == -1) {
-   warn("pledge");
-   exitcode = (int)MANDOCLEVEL_SYSERR;
-   goto out;
-   }
-   }
-#endif
use_all = 1;
for (i = 0; i < argc; i++)
filescan(argv[i]);
@@ -2119,9 +2110,10 @@ dbprune(struct dba *dba)
 static void
 dbwrite(struct dba *dba)
 {
-   char tfn[33];
-   int  status;
-   pid_tchild;
+   struct stat  sb1, sb2;
+   char tfn[33], *cp1, *cp2;
+   off_ti;
+   int  fd1, fd2;
 
/*
 * Do not write empty databases, and delete existing ones
@@ -2160,39 +2152,59 @@ dbwrite(struct dba *dba)
say("", "&%s", tfn);
return;
}
-
+   cp1 = cp2 = NULL;
+   fd1 = fd2 = -1;
(void)strlcat(tfn, "/" MANDOC_DB, sizeof(tfn));
if (dba_write(tfn, dba) == -1) {
-   exitcode = (int)MANDOCLEVEL_SYSERR;
say(tfn, "_write");
-   goto out;
-   }
-
-   switch (child = fork()) {
-   case -1:
-   exitcode = (int)MANDOCLEVEL_SYSERR;
-   say("", " cmp");
-   return;
-   case 0:
-   execlp("cmp", "cmp", "-s", tfn, MANDOC_DB, (char *)NULL);
-   say("", " cmp");
-   exit(0);
-   default:
-   break;
-   }
-   if (waitpid(child, , 0) == -1) {
-   exitcode = (int)MANDOCLEVEL_SYSERR;
-   say("", " cmp");
-   } else if (WIFSIGNALED(status)) {
-   exitcode = (int)MANDOCLEVEL_SYSERR;
-   say("", "cmp died from signal %d", WTERMSIG(status));
-   } else if (WEXITSTATUS(status)) {
-   exitcode = (int)MANDOCLEVEL_SYSERR;
-   say(MANDOC_DB,
-   "Data changed, but cannot replace database");
+   goto err;
}
+   if ((fd1 = open(MANDOC_DB, O_RDONLY, 0)) == -1) {
+   say(MANDOC_DB, "");
+   goto err;
+   }
+   if ((fd2 = open(tfn, O_RDONLY, 0)) == -1) {
+   say(tfn, "");
+   goto err;
+   }
+   if (fstat(fd1, ) == -1) {
+   say(MANDOC_DB, "");
+   goto err;
+   }
+   if (fstat(fd2, ) == -1) {
+   say(tfn, "");
+   goto err;
+   }
+   if (sb1.st_size != sb2.st_size)
+   goto err;
+   if ((cp1 = mmap(NULL, sb1.st_size, PROT_READ, MAP_PRIVATE,
+   fd1, 0)) == NULL) {
+   say(MANDOC_DB, "");
+   goto err;
+   }
+   if ((cp2 = mmap(NULL, sb2.st_size, PROT_READ, MAP_PRIVATE,
+   fd2, 0)) == NULL) {
+   say(tfn, "");
+   goto err;
+   }
+   for (i = 0; i < sb1.st_size; i++)
+   if (cp1[i] != cp2[i])
+   goto err;
+   goto out;
+
+err:
+   exitcode = (int)MANDOCLEVEL_SYSERR;
+   say(MANDOC_DB, 

Re: Query regarding exec in mandocdb.c

2017-08-26 Thread Ingo Schwarze
Hi George,

George Brown wrote on Thu, Aug 24, 2017 at 02:01:05PM +0100:

> In mandocdb.c it appears cmp(1) and rm(1) are executed in a child
> process.

Indeed, i never really liked that, yet i didn't spend much time
thinking about it either.

> It seems that if the logic from these programs were duplicated
> the pledge in mandocdb.c could be further restricted and even not
> bother with forking.

You have a point that the need for the "proc exec" pledges is ugly
in particular, and it would be nice to get rid of those pledges.

> Would such a change be pointless churn however?

At least the removal of the forking for "rm -rf" was not
pointless churn.  Simplification is never pointless.

> Both cmp(1) and rm(1)
> are simple programs and are pledge'd themselves. Not to mention the
> creation of the mandoc database is in itself a short lived process.
> 
> To be clear I'm not proposing a change (indeed I have no diff) but
> rather I am simply curious to the opinion of others in the OpenBSD
> community.

See below for what i committed so far.  I shall look into removing
the fork and exec for cmp(1) as well.  I'm not completely sure yet
that it will work out, but i suspect it won't make the code
substantially longer and only duplicate such a minimal and simple
part of cmp(1) that the duplication won't matter, and be outweighed
by the stricter pledge.

Thanks for reading the code!
  Ingo


Log Message:
---
No need to fork and exec rm(1) -rf, we know that we have exactly
one file and exactly one directory to remove.  While here, increase
the size of the buffer such that the file name actually fits.  
Minus 17 lines of code, no functional change.

Opportunity for simplification reported by George Brown <321 dot 
george at gmail dot com> on misc@.

Modified Files:
--
mandoc:
mandocdb.c

Revision Data
-
Index: mandocdb.c
===
RCS file: /home/cvs/mandoc/mandoc/mandocdb.c,v
retrieving revision 1.253
retrieving revision 1.254
diff -Lmandocdb.c -Lmandocdb.c -u -p -r1.253 -r1.254
--- mandocdb.c
+++ mandocdb.c
@@ -2119,7 +2119,7 @@ dbprune(struct dba *dba)
 static void
 dbwrite(struct dba *dba)
 {
-   char tfn[32];
+   char tfn[33];
int  status;
pid_tchild;
 
@@ -2193,26 +2193,9 @@ dbwrite(struct dba *dba)
}
 
 out:
+   unlink(tfn);
*strrchr(tfn, '/') = '\0';
-   switch (child = fork()) {
-   case -1:
-   exitcode = (int)MANDOCLEVEL_SYSERR;
-   say("", " rm");
-   return;
-   case 0:
-   execlp("rm", "rm", "-rf", tfn, (char *)NULL);
-   say("", " rm");
-   exit((int)MANDOCLEVEL_SYSERR);
-   default:
-   break;
-   }
-   if (waitpid(child, , 0) == -1) {
-   exitcode = (int)MANDOCLEVEL_SYSERR;
-   say("", " rm");
-   } else if (WIFSIGNALED(status) || WEXITSTATUS(status)) {
-   exitcode = (int)MANDOCLEVEL_SYSERR;
-   say("", "%s: Cannot remove temporary directory", tfn);
-   }
+   rmdir(tfn);
 }
 
 static int



Re: code replication (was: Re: Query regarding exec in mandocdb.c)

2017-08-26 Thread Raul Miller
On Sat, Aug 26, 2017 at 4:36 AM,   wrote:
> The greater the body of code is, the smaller our understanding, or at
> least our ability to grok the code.
>
> Even in the UNIX world, 'duckspeak' code -- just doing what seems right
> without realizing the longer-term implications -- is unfortunately very
> common.
>
> I don't think that we can really afford that in the modern world.

Could you be more specific?

What problem are you trying to solve?

Thanks,

-- 
Raul



code replication (was: Re: Query regarding exec in mandocdb.c)

2017-08-26 Thread leo_tck
Hi,

rauldmil...@gmail.com wrote:
> But replication also gives robustness in the face of failure, so it
> can also be a security asset. Still an issue, just not a security
> problem. (Or, a problem, but for people trying to defeat security.)

Yes, but especially in cases of untested, new ways of doing things.

In other cases I'm not so sure the risk isn't more evenly balanced (in
which case I'd of course argue for less replication, for its other
benefits -- see below).

> That said, replication is intrinsic in the nature of computer
> programming. Patterns are useful and, therefore, replicated. But even
> more than that we start with a [relatively] small set of primitive
> instructions and build up from there.

There's another angle to consider: code size and complexity. Not for the
computer (though the compiler often gets a great deal less confused if
the code contains lots of calls than if it contains lots of actual
logic!), but for us, the programmers.

The greater the body of code is, the smaller our understanding, or at
least our ability to grok the code.

Even in the UNIX world, 'duckspeak' code -- just doing what seems right
without realizing the longer-term implications -- is unfortunately very
common.

I don't think that we can really afford that in the modern world.

> But getting rid of all replication is an impossible rabbit hole that
> you really do not want to go down.

One can, indeed, overdo things. But imnsho efforts to reduce replication
(or even writing such repetetetive code in the first place!) could be
turned up a notch or two :)

--schaafuit.



Re: Query regarding exec in mandocdb.c

2017-08-26 Thread Raul Miller
"Replicated similar functionality" is indeed a security issue.

It's a security problem, sometimes - the whole buffer overflow being
replicated everywhere thing, for example.

But replication also gives robustness in the face of failure, so it
can also be a security asset. Still an issue, just not a security
problem. (Or, a problem, but for people trying to defeat security.)

That said, replication is intrinsic in the nature of computer
programming. Patterns are useful and, therefore, replicated. But even
more than that we start with a [relatively] small set of primitive
instructions and build up from there.

Unnecessary replication, on the other hand, is indeed something that's
not so good.

But getting rid of all replication is an impossible rabbit hole that
you really do not want to go down.

Thanks,

-- 
Raul


On Fri, Aug 25, 2017 at 8:13 PM,   wrote:
> [now I'm subscribed, might as well respond to some recent stuff from the
>  archives...]
>
> 321.geo...@gmail.com wrote:
>> In mandocdb.c it appears cmp(1) and rm(1) are executed in a child
>> process. It seems that if the logic from these programs were duplicated
>> the pledge in mandocdb.c could be further restricted and even not bother
>> with forking.
>>
>> Would such a change be pointless churn however? Both cmp(1) and rm(1)
>> are simple programs and are pledge'd themselves. Not to mention the
>> creation of the mandoc database is in itself a short lived process.
>>
>> To be clear I'm not proposing a change (indeed I have no diff) but
>> rather I am simply curious to the opinion of others in the OpenBSD
>> community.
>
> Okay, in that case, please forgive me if I go off on a little bit of a
> tangent.
>
> I've used UNIX for quite a while now. Not being satisfied with just
> using anything, I've (not deeply) poked at the luserspace internals
> quite a bit over time.
>
> Almost each time I read the source code of any UNIX program, whether it
> came w/ the system or not, I find duplicated functionality.
>
> As I see it, this is not just inefficient, but also a huge security
> issue: if the same operation is stated differently in many different
> places, how can we make sure that we squash all instances of a
> particular bad habit or bug?
>
> The only real solution that I've come up w/ over time is to put the
> actual logic in libraries and leave the programs to be luser interfaces
> to that logic.
>
> Perhaps something not quite so extreme is needed. I wouldn't know.
>
> It would certainly make it easier to execute the suggestion you make in
> the first paragraph of your message.
>
> --schaafuit.
>
> [so, the spacing issue does not appear today, but the subject lines
>  are fscked. g!]
>



RE: Query regarding exec in mandocdb.c

2017-08-25 Thread leo_tck
[now I'm subscribed, might as well respond to some recent stuff from the
 archives...] 

321.geo...@gmail.com wrote:
> In mandocdb.c it appears cmp(1) and rm(1) are executed in a child
> process. It seems that if the logic from these programs were duplicated
> the pledge in mandocdb.c could be further restricted and even not bother
> with forking.
>
> Would such a change be pointless churn however? Both cmp(1) and rm(1)
> are simple programs and are pledge'd themselves. Not to mention the
> creation of the mandoc database is in itself a short lived process.
>
> To be clear I'm not proposing a change (indeed I have no diff) but
> rather I am simply curious to the opinion of others in the OpenBSD
> community.

Okay, in that case, please forgive me if I go off on a little bit of a
tangent.

I've used UNIX for quite a while now. Not being satisfied with just
using anything, I've (not deeply) poked at the luserspace internals
quite a bit over time.

Almost each time I read the source code of any UNIX program, whether it
came w/ the system or not, I find duplicated functionality.

As I see it, this is not just inefficient, but also a huge security
issue: if the same operation is stated differently in many different
places, how can we make sure that we squash all instances of a
particular bad habit or bug?

The only real solution that I've come up w/ over time is to put the
actual logic in libraries and leave the programs to be luser interfaces
to that logic.

Perhaps something not quite so extreme is needed. I wouldn't know.

It would certainly make it easier to execute the suggestion you make in
the first paragraph of your message.

--schaafuit.

[so, the spacing issue does not appear today, but the subject lines
 are fscked. g!]