Can libc funcs be optimizable for no return value?

2022-06-06 Thread Luke Small
I benchmarked strlcpy built with cc two ways, calling it directly from libc
and compiling a copy of the source into the program, each “with and without
needing a return value”. With the libc strlcpy, ignoring the return value
didn’t change the runtime. With the copy built from source it did,
dramatically (roughly a 50% runtime difference over a loop of several runs
copying 15-20 or so characters), so the compiler evidently optimized work out.

Is it possible that there’s a way you haven’t considered, that could permit
the compiler to optimize the functions…

(or pre-optimize for the instances with no return value and when there are
different needs, it could choose between the two; if there aren’t dual
needs, then only write one to static programs; I don’t know if this can be
done trivially.)

…which could improve performance without breaking security of functions
which may or may not need a return value to function correctly?!

Or is all the extra runtime from returning from the libc wrapper?
-- 
-Luke
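
A rough sketch of the effect being described, not a measurement. The name copy_visible is hypothetical and the body only follows strlcpy-like semantics; it is not the libc source. The point is that when the body is visible in the same translation unit, the compiler is free to inline it and, if the caller ignores the result, may drop the final scan that exists only to compute the return value. A call into a separately compiled libc normally gives it no such opportunity.

```c
#include <stddef.h>

/*
 * Hypothetical local copy with strlcpy-like semantics (not the libc source).
 * Returns the length of src; copies at most dsize - 1 bytes and terminates.
 */
static size_t
copy_visible(char *dst, const char *src, size_t dsize)
{
	const char *s = src;
	size_t n = dsize;

	/* Copy as many bytes as fit, stopping at the terminating NUL. */
	if (n != 0) {
		while (--n != 0) {
			if ((*dst++ = *s++) == '\0')
				return (size_t)(s - src - 1);
		}
		*dst = '\0';	/* out of room: truncate and terminate */
	}

	/* This scan only matters if the caller looks at the return value. */
	while (*s++ != '\0')
		continue;
	return (size_t)(s - src - 1);
}

/* Caller that ignores the result: the scan above is dead work here. */
static void
copy_ignore_result(char *dst, const char *src, size_t dsize)
{
	(void)copy_visible(dst, src, dsize);
}
```

Whether a given compiler actually removes the scan depends on the optimization level and on whether it inlines the call, so the size of any speedup is not something this sketch can predict.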


Re: Picky, but much more efficient arc4random_uniform!

2022-05-21 Thread Luke Small
Marc, all you all have to do is say that you refuse to provide it.

I was asked to at least provide evidence for correctness. I did so; and I’d
say I did a stellar job aside from getting some kind of statistical program.

The following post has the source code for my test attached (and references
another post), in case you missed it in this ginormous thread:

https://marc.info/?l=openbsd-tech&m=165306234108572&w=2


Aside from multithreaded processes, if you provide this, folks might use
this instead of creating their own incorrect custom random number
generators to improve performance for smaller values.

That custom deterministic “arc4random()” earlier in the thread from Damien
was not very good by the way. I tested it.


> You're about one step away from getting into my spam rules.
>
> Whatever you say, the way you say it, *repeatedly* when all the people
> I know (me included) tell you to get your facts straight so you don't
> sound like a wacko, makes reading you a total waste of time.
>
-- 
-Luke


Re: Picky, but much more efficient arc4random_uniform!

2022-05-21 Thread Luke Small
Perhaps it was rude sending off list stuff to the list. Your email sounded
"less than friendly" and more of a professional challenge that you were
definitely in the works to produce; much like Damien Miller’s challenge to
prove correctness. So, whatever.

Aside from that unpleasantness:
I worked in __builtin_clz(upperbound-1) mentioned earlier in the thread;
instead of my binary search. It made the knuth sort simulation run even
faster. arc4random_uniform now takes 130% of the time mine takes for a
series of random numbers decreasing from 65535 to 2.

“__builtin_clz((upperbound - 1) | 1)” is only needed if upperbound can be 1
and that possibility is eliminated by first checking to see that upperbound
is >= 2.
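
As a small sketch of the bit-width calculation described above (the helper name bits_for_bound is made up for illustration, and __builtin_clz is a GCC/Clang builtin, assumed here to operate on a 32-bit unsigned int):

```c
#include <stdint.h>

/*
 * Number of random bits needed to cover 0 .. upper_bound - 1.
 * __builtin_clz(0) is undefined, so "| 1" keeps the argument nonzero.
 * Setting the low bit never moves the highest set bit of a nonzero value,
 * so it only changes the outcome when upper_bound is 1.
 */
static unsigned
bits_for_bound(uint32_t upper_bound)
{
	return 32u - (unsigned)__builtin_clz((upper_bound - 1) | 1);
}
```

For example, an upper bound of 5 needs 3 bits, 62 needs 6 bits, and 65536 needs 16 bits.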

static uint32_t
arc4random_uniform_small_unlocked(uint32_t upper_bound)
{
	static uint64_t rand_holder = 0;
	static size_t rand_bits = 0;

	static size_t upper_bound0 = 2;
	static size_t bitmask = 0x01;
	static size_t bits = 1;

	const size_t ub = upper_bound;
	size_t ret;

	if (ub != upper_bound0) {

		if (ub < 2) {

			/*
			 * reset static cache for if a process needs to fork()
			 * to make it manually fork-safe
			 */
			if (ub == 0) {
				rand_holder = 0;
				rand_bits = 0;
			}
			return 0;
		}

		bits = 32 - __builtin_clz(ub - 1);
		bitmask = ((size_t)1 << bits) - 1;

		upper_bound0 = ub;
	}

	do {
		if (rand_bits < bits) {
			rand_holder |= ((uint64_t)arc4random()) << rand_bits;

			/*
			 * rand_bits will be between 0 and 31 here
			 * so the 0x20 bit will be empty
			 * rand_bits += 32;
			 */
			rand_bits |= 32;
		}

		ret = rand_holder & bitmask;
		rand_holder >>= bits;
		rand_bits -= bits;

	} while (ret >= ub);


	return (uint32_t)ret;
}


> Luke,
>
> It's very bad etiquette to deliberately re-post a private, off-list comment
> to a public mailing list.
>

> Also, please fix your email client to respect the Mail-Followup-To: header,
> this is another lack of etiquette on your part.
>

I am either using gmail app on a phone or gmail.com, so I don't know if I
can help you there.


Re: Picky, but much more efficient arc4random_uniform!

2022-05-20 Thread Luke Small
Crystal: You can prove that for random, repetitive, correct, database
record name generation using small upperbounds, the demonstrated 1/3-1/2
runtime isn’t worth it for an upperbound like 26 - 92 in a business context
that fights for every last millisecond?

Bring it.

Prove the correctness of whatever algorithm you’re using while you’re at it.

…unless a lot of those processes are threaded of course. :/

On Fri, May 20, 2022 at 7:27 PM Crystal Kolipe 
wrote:

>
> I've actually written a program to demonstrate how pointless your chacha
> bit
> saving routine is :).  I just haven't posted it yet because I'm too busy
> with
> other, (more useful), things to bother finishing it off.
>
> Your thread on -tech has already wasted more bits than your random number
> routine would save in a very long time, ha ha ha.
>
-- 
-Luke


Re: Picky, but much more efficient arc4random_uniform!

2022-05-20 Thread Luke Small
I appreciate your response, Damien.

I did do the bare minimum of correctness testing and it was the post right
after Guenther was congratulated on proving incorrectness:

https://marc.info/?l=openbsd-tech&m=165259528425835&w=2

The post includes software to reproduce the results.



I wrote a heavily dynamically allocated program that tallies the results at
successive checkpoints (each eight times larger than the last) to show how
uniform the output remains at each stage.

This is an example of some output:

testing arc4random_uniform(5) and arc4random_uniform_small_unlocked(5):

                   256X     2048X    16384X   131072X  1048576X

   0 original       246      2053     16400    131115   1048306
         mine       263      2042     16312    131258   1046989

   1 original       234      2013     16337    130894   1047810
         mine       249      2022     16304    131161   1049511

   2 original       236      2061     16367    130802   1047430
         mine       257      2117     16597    130718   1046375

   3 original       284      2089     16444    131092   1050332
         mine       266      2058     16379    131190   1049877

   4 original       280      2024     16372    131457   1049002
         mine       245      2001     16328    131033   1050128



max-min values:

     original        50        76       107       655      2902
         mine        21       116       293       540      3753
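
The "max-min values" rows can be reproduced from the per-bucket counts with a few lines of C. This is only a sketch of that arithmetic; the helper name bucket_spread is hypothetical and is not taken from the attached test program. For the 256X column of the "original" rows (246, 234, 236, 284, 280) it gives 50.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest bucket count minus the smallest: the "max-min" figure. */
static uint64_t
bucket_spread(const uint64_t *counts, size_t n)
{
	uint64_t lo, hi;
	size_t i;

	if (n == 0)
		return 0;

	lo = hi = counts[0];
	for (i = 1; i < n; i++) {
		if (counts[i] < lo)
			lo = counts[i];
		if (counts[i] > hi)
			hi = counts[i];
	}
	return hi - lo;
}
```

A small spread at one sample size is only a sanity check; it cannot demonstrate uniformity for every upper bound.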


The program is set up to test values 2 through 50. This will create 224KiB
of output.
I suspected that you'd prefer that I didn't attach it.

Some progress output has been relegated to stderr so that you can easily
pipe it to a file.

I added some getrlimit and setrlimit calls to maximize memory limits if
necessary.
In the case that I created for you, this will not be needed.

It uses 1276K RAM in the current configuration.


> Just to bring this back to where we came in: even putting thread-safety
> aside (which you absolutely can't): it doesn't matter how much faster
> it is, your replacement function isn't useful until you do the work to
> demonstrate correctness.
>
> You have done literally zero so far, not even the bare minimum of
> testing the output. As a result your first version was demonstrated to
> be completely broken by literally the most basic of possible tests, a
> mere 10 lines of code.
>
> That you left this to others to do tells me that you fundamentally don't
> understand the environment in which you're trying to operate, and that
> you don't respect the time of the people you're trying to convince.
>
> Please stop wasting our time.
>
> -d

-- 
-Luke
#include 
#include 
#include 
#include 
#include 
#include 
#include 

extern char *malloc_options;


/*
cc arc4random_uniform_small.c -O2 -march=native -mtune=native -flto=full \
-static -p -fno-inline && ./a.out && gprof a.out gmon.out

cc arc4random_uniform_small.c -O2 && ./a.out
 */


/* 
 * uint32_t
 * arc4random_uniform(uint32_t upper_bound);
 * 
 * for which this function is named, provides a cryptographically secure:
 * uint32_t arc4random() % upper_bound;
 * 
 * this function minimizes the waste of randomly generated bits,
 * for small upper bounds.
 * 
 * 'bits' is the requested number of random bits and it needs to be
 * within the following parameters:
 * 
 *   1 << bits >= upper_bound > (1 << bits) >> 1
 * 
 * I make a possibly incorrect presumption that size_t is
 * at the very least a uint32_t, but probably a uint64_t for speed
 */
static uint32_t
arc4random_uniform_small_unlocked(uint32_t upper_bound)
{	
	static uint64_t rand_holder = 0;
	static uint64_t rand_bits = 0;
	
	static uint64_t upper_bound0 = 2;
	static uint64_t bits0 = 1;

	uint64_t bits = 16, i = 1, n = 32, ret, ub = upper_bound;

	if (ub != upper_bound0) {
		
		if (ub < 2) {
			
			/*
			 * reset static cache for if a process needs to fork()
			 * to make it manually fork-safe
			 */
			if (ub == 0) {
				rand_holder = 0;
				rand_bits = 0;
			}
			return 0;
		}
		
		/*
		 * binary search for ideal size random bitstream selection
		 */
		for (;;) {

			if ( ub > ((uint64_t)1 << bits) ) {

				if (n - i == 1) {
					bits = n;
					break;
				}

				i = bits;
				bits = (i + n) / 2;
				continue;
			}
			if ( ub <= ((uint64_t)1 << bits) >> 1 ) {

				if (n - i == 1) {
					bits = n;
					break;
				}

				n = bits;
				bits = (i + n) / 2;
				continue;
			}

			break;
		}
		upper_bound0 = ub;
		bits0 = bits;
	} else
		bits = bits0;
		
	const uint64_t bitmask = ((uint64_t)1 << bits) - 1;
		
	do {
		if (rand_bits < bits) {
			rand_holder |= ((uint64_t)arc4random()) << rand_bits;
			
			/* 
			 * rand_bits will be a number between 0 and 31 here
			 * so the 0x20 bit will be empty
			 * rand_bits += 32;
			 */ 
			rand_bits |= 32;
		}
		
		

Fwd: Picky, but much more efficient arc4random_uniform!

2022-05-20 Thread Luke Small
I appreciate your response, Damien.

I did do the bare minimum of correctness testing and it was the post right
after Guenther was congratulated on proving incorrectness:

https://marc.info/?l=openbsd-tech&m=165259528425835&w=2

The post includes software to reproduce the results.



I wrote a heavily dynamically allocated program that tallies the results at
successive checkpoints (each eight times larger than the last) to show how
uniform the output remains at each stage.

This is an example of some output:

testing arc4random_uniform(5) and arc4random_uniform_small_unlocked(5):

                   256X     2048X    16384X   131072X  1048576X

   0 original       246      2053     16400    131115   1048306
         mine       263      2042     16312    131258   1046989

   1 original       234      2013     16337    130894   1047810
         mine       249      2022     16304    131161   1049511

   2 original       236      2061     16367    130802   1047430
         mine       257      2117     16597    130718   1046375

   3 original       284      2089     16444    131092   1050332
         mine       266      2058     16379    131190   1049877

   4 original       280      2024     16372    131457   1049002
         mine       245      2001     16328    131033   1050128



max-min values:

     original        50        76       107       655      2902
         mine        21       116       293       540      3753


The program is set up to test values 2 through 50. This will create 224KiB
of output.
I suspected that you'd prefer that I didn't attach it.

Some progress output has been relegated to stderr so that you can easily
pipe it to a file.

I added some getrlimit and setrlimit calls to maximize memory limits if
necessary.
In the case that I created for you, this will not be needed.

It uses 1276K RAM in the current configuration.


> Just to bring this back to where we came in: even putting thread-safety
> aside (which you absolutely can't): it doesn't matter how much faster
> it is, your replacement function isn't useful until you do the work to
> demonstrate correctness.
>
> You have done literally zero so far, not even the bare minimum of
> testing the output. As a result your first version was demonstrated to
> be completely broken by literally the most basic of possible tests, a
> mere 10 lines of code.
>
> That you left this to others to do tells me that you fundamentally don't
> understand the environment in which you're trying to operate, and that
> you don't respect the time of the people you're trying to convince.
>
> Please stop wasting our time.
>
> -d

-- 
-Luke


arc4random_uniform_small.c
Description: Binary data


Re: Picky, but much more efficient arc4random_uniform!

2022-05-16 Thread Luke Small
Yeah, I see your point.

I suppose it depends on how conservative you want to be and whether you
want to supply options to people like getchar_unlocked when it isn’t
essential.

It could be made manually fork-safe if I could make a simple feature where
“arc4random_uniform_unlocked(0);” with a 0 upperbound could trigger a reset
of the static variables rand_bits and rand_holder which would be simple
enough and could be added to a man page. I certainly read the man pages.
(I’m surprised “man getchar” doesn’t “see also” to getchar_unlocked() by
the way.)

If you profile programs that call it a lot, arc4random() inside
arc4random_uniform() calls the expensive rekey function, which makes it take
more time. That’s why I can get around 2X-3X performance on a typical
repetitive small upperbound loop and an extra 20% improvement on a single
65536 Knuth shuffle loop, even though my function repeats a binary search
for the ‘bits’ size every single time and has misses, which demand pulling
in more data.

Otherwise my function would be particularly useful when there’s a loop with
small upperbound like: 26+26+10 which, if I recall correctly, is in identd,
which would call it frequently.

I have a project that I generate MANY random record names much like that
and I use fork()/fork()-exec() on many processes well before calling
randomizing functions. Maybe I’m not the only one.

I don’t want to even try multithreading, especially as often as you guys
say that it is too unpredictable. I believe you. Using lua scripts in
multi-worker-process redis calls to avoid race conditions is awkward enough
for me.


But otherwise it’s certainly your prerogative to not have it. At least you
can’t legitimately say that it adds too much extra code or is too complex.
It’s pretty simple to understand if you’re familiar with bitwise stuff.

On Mon, May 16, 2022 at 5:35 PM Stuart Henderson 
wrote:

> On 2022/05/16 15:13, Luke Small wrote:
> > If you’re not running a threaded program, my function wouldn’t be “less
> > safe.”
> >
> > I’d imagine that 99% of programs aren’t multithreaded.
>
> code is reused in different places. non threaded programs are sometimes
> turned into threaded programs and when that happens, sometimes
> non-thread-safe calls are missed. so i'd argue that it is still less
> safe.
>
> in some cases there might be benefits that would mean it's worth it,
> especially if the failure modes would be obvious such that they can
> be detected. really not seeing that here. (how often are you even
> calling arc4random_uniform to consider it slow?!)
>
> if the consequence is not a crash but instead subtly broken randomness,
> how long do you think it's going to take to notice and report/fix it?
> even *very* broken randomness in widespread software distributions
> has been known to sit around for a long time before it's noticed:
>
> - predictable rng in a popular os. *serious* bug. introduced 2006,
> discovered/reported nearly 2 years later.
>
> - non-fork-safe rng in a popular ssl library, introduced sometime before
> sept 2018, reported may 2019.
>
> --
-Luke


Re: Picky, but much more efficient arc4random_uniform!

2022-05-16 Thread Luke Small
No...am I incorrect, especially on OpenBSD?

Of course since you made such a remark, you seem like the kind of fellow
that would put the nail in the coffin for spite.

...now I sound like an asshole.

On Mon, May 16, 2022 at 4:00 PM Theo de Raadt  wrote:

> hey luke you know you sound like an asshole right?
>
>
> Luke Small  wrote:
>
> > If you’re not running a threaded program, my function wouldn’t be “less
> > safe.”
> >
> > I’d imagine that 99% of programs aren’t multithreaded.
> >
> > On Mon, May 16, 2022 at 1:01 PM  wrote:
> >
> > > > There is the specifically non-threadsafe call getchar_unlocked() on
> > > OpenBSD
> > > > which is presumably available for performance reasons alone, when
> > > getchar()
> > > > is a perfectly viable option and is even an ISO conforming function.
> > > What I
> > > > submitted could be such a higher performance non-threadsafe function.
> > > >
> > > > so, how about arc4random_uniform_unlocked() ?!
> > >
> > > getchar_unlocked is mandated by POSIX.
> > >
> > > OpenBSD has not yet invented an alternate function that only exists to
> > > give away safety for performance. It has only gone in the opposite
> > > direction if anything.
> > >
> > > --
> > -Luke
>


Re: Picky, but much more efficient arc4random_uniform!

2022-05-16 Thread Luke Small
If you’re not running a threaded program, my function wouldn’t be “less
safe.”

I’d imagine that 99% of programs aren’t multithreaded.

On Mon, May 16, 2022 at 1:01 PM  wrote:

> > There is the specifically non-threadsafe call getchar_unlocked() on
> OpenBSD
> > which is presumably available for performance reasons alone, when
> getchar()
> > is a perfectly viable option and is even an ISO conforming function.
> What I
> > submitted could be such a higher performance non-threadsafe function.
> >
> > so, how about arc4random_uniform_unlocked() ?!
>
> getchar_unlocked is mandated by POSIX.
>
> OpenBSD has not yet invented an alternate function that only exists to
> give away safety for performance. It has only gone in the opposite
> direction if anything.
>
> --
-Luke


Re: Picky, but much more efficient arc4random_uniform!

2022-05-15 Thread Luke Small
Yeah. It most likely won't go in. From past experience and advice, not
necessarily just from a perceived lack of merit.

However, many, if not all of the arguments are based upon non-facts and
misconceptions from earlier submissions or just not understanding what the
software is doing.

The only real thing that makes it not quite as good as the standard is mine
isn't threadsafe. If it could be accepted as a higher performance,
non-threadsafe call, it would perform better in many typical cases and
perhaps even give a safer return value, especially in large upper_bound
edge cases, I suspect.

There is the specifically non-threadsafe call getchar_unlocked() on OpenBSD
which is presumably available for performance reasons alone, when getchar()
is a perfectly viable option and is even an ISO conforming function. What I
submitted could be such a higher performance non-threadsafe function.

so, how about arc4random_uniform_unlocked() ?!

...other than making upper_bound a uint32_t instead of the submitted
uint64_t. That'd be somewhat of a problem.

-Luke
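
To make the thread-safety point concrete: the function-level statics are the shared state. The sketch below runs the same mask-and-reject loop with that cache held in a caller-supplied struct instead. All names here (bitpool, uniform_from_pool, demo_rand32) are hypothetical, and the xorshift generator is only a stand-in so the sketch runs anywhere; it is not cryptographically secure, and on OpenBSD the source would be arc4random(). This illustrates where the state lives and says nothing about the correctness questions raised in the thread.

```c
#include <stdint.h>

/* Caller-owned cache of unused random bits (hypothetical name). */
struct bitpool {
	uint64_t holder;	/* unused bits; the lowest bits are handed out first */
	unsigned nbits;		/* number of valid bits in holder */
};

/* Stand-in 32-bit source for illustration only; NOT secure. */
static uint32_t demo_state = 2463534242u;

static uint32_t
demo_rand32(void)
{
	uint32_t x = demo_state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	return demo_state = x;
}

/* Draw a value in 0 .. upper_bound - 1 using only the caller's pool. */
static uint32_t
uniform_from_pool(struct bitpool *p, uint32_t (*next32)(void),
    uint32_t upper_bound)
{
	unsigned bits;
	uint64_t mask, ret;

	if (upper_bound < 2)
		return 0;

	bits = 32u - (unsigned)__builtin_clz(upper_bound - 1);
	mask = ((uint64_t)1 << bits) - 1;

	do {
		if (p->nbits < bits) {
			/* nbits is at most 31 here, so 32 more bits fit in 64 */
			p->holder |= (uint64_t)next32() << p->nbits;
			p->nbits += 32;
		}
		ret = p->holder & mask;	/* take the low 'bits' bits */
		p->holder >>= bits;	/* discard them, accepted or not */
		p->nbits -= bits;
	} while (ret >= upper_bound);

	return (uint32_t)ret;
}
```

Each thread (or each child after fork()) would need its own struct bitpool, which moves the burden onto every caller.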


> You've had several developers tell you this is not going to go in. I'd
> suggest
> "read the room".
>
> If you want this for your own use, just keep it as a local diff. Nobody
> will
> know (or likely care).
>
> -ml
>


Re: Picky, but much more efficient arc4random_uniform!

2022-05-15 Thread Luke Small
I’m not trying to be rude, but you don’t realize what’s going on here:

uuu is a bitmask:

‘uuu’ (or (1 << bits)-1 ) in “ret = rand_holder & uuu;“ , only puts the
lower ‘bit’ quantity of bits of rand_holder into ret, then it right shifts
rand_holder afterward to trash them every time in the loop when it’s done.

So if bits is 8, uuu is going to be 0xff
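
A minimal sketch of that mask-and-shift step in isolation (the helper name take_low_bits is made up for illustration; 'bits' is assumed to be between 1 and 32):

```c
#include <stdint.h>

/* Copy out the low 'bits' bits of *holder, then shift them away. */
static uint32_t
take_low_bits(uint64_t *holder, unsigned bits)
{
	uint64_t mask = ((uint64_t)1 << bits) - 1;	/* bits = 8 gives 0xff */
	uint32_t ret = (uint32_t)(*holder & mask);

	*holder >>= bits;	/* the consumed bits are no longer in holder */
	return ret;
}
```

With a holder of 0xABCD12, taking 8 bits yields 0x12 and leaves 0xABCD behind.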

No, you aren't:
>
> > for (;;) {
> > if (rand_bits < bits) {
> > rand_holder |= ((uint64_t)arc4random()) <<
> > rand_bits;
> >
> > /*
> >  * rand_bits will be a number between 0 and 31
> here
> >  * so the 0x20 bit will be empty
> >  * rand_bits += 32;
> >  */
> > rand_bits |= 32;
> > }
> >
> > ret = rand_holder & uuu;
> > rand_holder >>= bits;
> > rand_bits -= bits;
> >
> > if (ret < upper_bound)
> > return ret;
> > }
>
> This isn't rejection sampling. This is reusing part of the rejected
> sample.

-- 
-Luke


Re: Picky, but much more efficient arc4random_uniform!

2022-05-15 Thread Luke Small
https://marc.info/?l=openbsd-tech&m=165259528425835&w=2

This one (which is strongly based upon my first of two versions) which I
submitted after Guenther correctly trashed version 2, doesn’t reuse any
part of the sample. It picks up a clean new bitfield upon failure.

I think Guenther didn’t, perhaps like yourself, realize I submitted this
later program. That’s why he said it wasn’t correct. It didn’t occur to me
at the time of responding to him: “correct correct correct.”

On Sun, May 15, 2022 at 7:47 PM Damien Miller  wrote:

> On Sat, 14 May 2022, Luke Small wrote:
>
> > Look at my code. I don’t even use a modulus operator. I perform hit and
> > miss with a random bitstream.
> >
> > How can I have a bias of something I don’t do? I return a bitstream which
> > meets the parameters of being a value less than the upper bound. Much
> like
> > arc4random_buf().
> >
> > If I use arc4random_uniform() repeatedly to create a random distribution
> of
> > say numbers less than 0x1000 or even something weird like 0x1300 will the
> > random distribution be better with arc4random_uniform() or with mine? For
> > 0x1000 mine will simply pluck 12 bits of random data straight from the
> > arc4random() (and preserve the remaining 20 bits for later) on the first
> > try, just like it’s arc4random_buf().
> >
> > arc4random_uniform() will perform a modulus of a 32 bit number which adds
> > data to the bitstream. Does it make it better? Perhaps it makes it harder
> > to guess the source bits.
> >
> > I don’t know; and I’m not going to pretend to be a cryptologist. But I’m
> > looking at modulo bias.
> >
> > I didn’t know what it was, before, but I basically “rejection sample”:
> >
> >
> https://research.kudelskisecurity.com/2020/07/28/the-definitive-guide-to-modulo-bias-and-how-to-avoid-it/
>
> No, you aren't:
>
> > for (;;) {
> > if (rand_bits < bits) {
> > rand_holder |= ((uint64_t)arc4random()) <<
> > rand_bits;
> >
> > /*
> >  * rand_bits will be a number between 0 and 31
> here
> >  * so the 0x20 bit will be empty
> >  * rand_bits += 32;
> >  */
> > rand_bits |= 32;
> > }
> >
> > ret = rand_holder & uuu;
> > rand_holder >>= bits;
> > rand_bits -= bits;
> >
> > if (ret < upper_bound)
> > return ret;
> > }
>
> This isn't rejection sampling. This is reusing part of the rejected
> sample.
>
> Think of it like this: you want to uniformly generate a number in the
> range [2:10] by rolling 2x 6-sided dice. What do you do when you roll
> 11 or 12? You can't just reroll one of the dice because the other die
> is constrained to have rolled either 5 or 6, and so proceeding with
> it would force the output to be in the range [6:11] for these ~5.6%
> of initial rolls. Your output is no longer uniform.
>
> BTW the existing code already implements the preferred approach of the
> article you quoted.
>
> -d

-- 
-Luke
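
The general point in the quoted reply, that a rejected sample must be discarded whole, can be checked exhaustively on a tiny model. The sketch below is an illustration under stated assumptions: 3-bit draws, an upper bound of 5, at most two draws, and function names that are invented here. It does not settle which of the posted versions, if any, behaves like the second function.

```c
#define UB 5u	/* target range 0..4, sampled from 3-bit draws (0..7) */

/* On a miss, redraw all three bits: outcomes 0..4 stay equally likely. */
static void
tally_full_redraw(unsigned counts[UB])
{
	unsigned a, b, v;

	for (v = 0; v < UB; v++)
		counts[v] = 0;
	for (a = 0; a < 8; a++) {		/* first draw */
		for (b = 0; b < 8; b++) {	/* second draw, used only on a miss */
			if (a < UB)
				counts[a]++;
			else if (b < UB)
				counts[b]++;
			/* two misses in a row are left out of this model */
		}
	}
}

/* On a miss, keep the top bit and redraw only the low two bits: biased. */
static void
tally_partial_reuse(unsigned counts[UB])
{
	unsigned a, r, v;

	for (v = 0; v < UB; v++)
		counts[v] = 0;
	for (a = 0; a < 8; a++) {
		for (r = 0; r < 4; r++) {
			/* reused high bit combined with two fresh low bits */
			unsigned second = (a & 4u) | r;

			if (a < UB)
				counts[a]++;
			else if (second < UB)
				counts[second]++;
		}
	}
}
```

In this model the full redraw tallies 11 for every value, while the partial reuse tallies 4, 4, 4, 4, 7: the value 4 is favored because every rejected draw already had its top bit set.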


Re: Picky, but much more efficient arc4random_uniform!

2022-05-15 Thread Luke Small
Do I really have to use specific terminology to make a point?

I'm not educated enough on chacha20 to know whether, like I pointed out,
choosing 5 bits from the middle of a 32 bit pseudorandom cipher output (or
even from the tail end of one and the beginning of another) is "correct."

...correct correct correct. Did I use that buzzword enough?

-Luke


On Sun, May 15, 2022 at 5:26 PM Philip Guenther  wrote:

> On Sun, 15 May 2022, Luke Small wrote:
> > The current implementation is nothing more than a naive arc4random() %
> > upper_bound which trashes initial arc4random() calls it doesn’t like,
> then
> > transforms over a desired modulus. The whole transformation by modulus of
> > perfectly decent random data seems so awkward. It’s not like it is used
> as
> > some majestic artistry of RSA it seems like an ugly HACK to simply meet a
> > demand lacking of something better.
>
> You fail to mention correctness at all or address the fact that your
> version isn't while the current one is.  Meanwhile, you talk about getting
> only just enough random data as if there's some sort of limited supply
> when there isn't.
>
> "My version may be wrong, but at least it doesn't look naive!"
>
> That is utterly the wrong attitude for OpenBSD code.
>
>
> Best wishes.
>
> Philip Guenther
>


Re: Picky, but much more efficient arc4random_uniform!

2022-05-15 Thread Luke Small
The current implementation is nothing more than a naive arc4random() %
upper_bound which trashes initial arc4random() calls it doesn’t like, then
transforms over a desired modulus. The whole transformation by modulus of
perfectly decent random data seems so awkward. It’s not like it is used as
some majestic artistry of RSA it seems like an ugly HACK to simply meet a
demand lacking of something better.

If you understand what I’ve done, it streams in a bitfield into an integer
type like it’s a buffer for just enough or slightly more data to meet the
demands of the upperbound and if it exceeds upperbound-1, it is trashed and
reads in a completely new bitfield to check. It relies on arc4random()
supplying good random data regardless of how many bits are in the bitfield.
If it does so, it should, and seems to, supply a good distribution across the
length of the bitfield, which may often be something like 6 bits for a common
26+26+10 upper_bound in /usr/src. It seems to me that it should be a pretty
good, if not superior, method; at least in the realm of cleaner results.
Perhaps it’s confusing what I’ve done with all the bitwise operators, but
it isn’t some random hacky thing I’ve cobbled together.

Or does arc4random() only provide decent random data 32 bits at a time; or
an even 8 bits at a time as arc4random_buf() would suggest?

All I would have to prove is that chacha20 provides good or superior random
bitfields regardless of how many bits are needed and regardless of whether
they begin at the beginning of a byte.

I don’t have the education for that, but “I got a ‘puter for Christmas.”
lol. I can perhaps run simulations if I have nothing better to do.


> I think I can say we know here uniformity is only *one* of the
> desirable properties of a secure random generator.
>
> But I don't think anybody else except Luke was talking about
> "improving".  The sole purpose of arc4random_uniform() is to give a
> good implementation of a random number function in a specific range
> using arc4random() as the source. This is needed because the naive
> implementation arc4random() % upper_bound is not uniform.
>
> -Otto
>
>
> --
-Luke
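
The non-uniformity of a bare modulus, and the reject-then-reduce fix, can both be seen by exhaustive counting if the generator word is shrunk to 8 bits. This is a sketch under that assumption, with hypothetical function names. To my understanding the in-tree arc4random_uniform() has the same reject-then-reduce shape on 32-bit words, but the code below is not copied from it.

```c
#include <stdint.h>

#define WORD_VALUES 256u	/* pretend the generator yields 8-bit words */

/* Tally word % upper_bound over every possible word (the naive approach). */
static void
tally_naive_mod(uint32_t upper_bound, unsigned *counts)
{
	uint32_t w, i;

	for (i = 0; i < upper_bound; i++)
		counts[i] = 0;
	for (w = 0; w < WORD_VALUES; w++)
		counts[w % upper_bound]++;
}

/* Same tally, but words below WORD_VALUES % upper_bound are rejected. */
static void
tally_reject_then_mod(uint32_t upper_bound, unsigned *counts)
{
	uint32_t min = WORD_VALUES % upper_bound;
	uint32_t w, i;

	for (i = 0; i < upper_bound; i++)
		counts[i] = 0;
	/* the accepted words form a whole number of cycles of upper_bound */
	for (w = min; w < WORD_VALUES; w++)
		counts[w % upper_bound]++;
}
```

With an upper bound of 6, the naive tally gives 43 for residues 0 to 3 and 42 for residues 4 and 5, because 256 = 42 * 6 + 4. After rejecting the 4 lowest words, every residue is hit exactly 42 times.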


Re: Picky, but much more efficient arc4random_uniform!

2022-05-15 Thread Luke Small
I have a feeling that making it threadsafe under any quantity of threads
and still perform is likely not feasible, but if I could; or if making a
nonthreadsafe variant was possible:

Creating a mathematical proof that somehow is “simple and probably correct”
enough, would be an impossible task even for a PhD mathematician.

How did someone prove the current implementation was cryptographically
sound? Did they run massive simulations which ran the gamut of the uint32_t
range which demanded tight tolerances over varying length runs?

How was rc4 cipher proven bad for pseudorandom numbers? I’d be willing to
bet it wasn’t from a mathematical proof; it was from bad data.

I’m guessing that large upperbounds approaching 2**32 don’t feed very
soundly in the current implementation using a modulus; although I suspect
that there isn’t much of a call for such things, I’m pretty sure I saw a
3,000,000,000 upperbound in the src partition tonight.

On Sun, May 15, 2022 at 3:15 AM Otto Moerbeek  wrote:

> On Sun, May 15, 2022 at 01:12:28AM -0500, Luke Small wrote:
>
> > This is version 1, which I was pretty sure was secure.
> >
> > I revamped it with a few features and implanted the binary search for
> 'bit'
> >
> > in most cases, which aren't intentionally worst-case, it's pretty darned
> > fast!
> >
> > This is a sample run of my program with your piece of code included:
> >
> >
> >  1  99319 100239
> >  2 100353  99584
> >  3 100032  99879
> >  4  99704 100229
> >  5  99530  99796
> >  6  99617 100355
> >  7 100490 100319
> >  8  99793 100114
> >  9 100426  99744
> > 10 100315 100558
> > 11  99279  99766
> > 12  99572  99750
> > 13  99955 100017
> > 14 100413 15
> > 15 100190 100052
> > 16 101071 100195
> > 17 100322 100224
> > 18  99637  99540
> > 19 100323  99251
> > 20  99841 100177
> > 21  99948  99504
> > 22 100065 100031
> > 23 100026  99827
> > 24  99836  99818
> > 25 100245  99822
> > 26 100088  99678
> > 27  99957  3
> > 28 100065  99961
> > 29 100701 100679
> > 30  99756  99587
> > 31 100220 100076
> > 32 100067 15
> > 33  99547  99984
> > 34 100124 100031
> > 35  99547 100661
> > 36  99801  99963
> > 37 100189 100230
> > 38  99878  99579
> > 39  99864  99442
> > 40  99683 14
> > 41  99907 100094
> > 42 100291  99817
> > 43  99908  99984
> > 44 100044 100606
> > 45 100065 100120
> > 46  99358 100141
> > 47 100152 100442
> > 48 10 100279
> > 49 100486  99848
>
> Sadly a sample run cannot be used to prove your implementation is
> correct.  It can only be used to show it is not correct, like Philip
> did.  To show your implementation produces uniform results in all
> cases, you need to provide a solid argumentation that is easy to
> follow. So far you failed to do that and I do not see it coming, given
> the complexity of your implementation.  The current implementation
> has a straightforward argumentation as it uses relatively simple
> mathematical properties of modulo arithmetic.
>
> It is also clear your code (as it uses statics) is not thread safe.
>
> So to answer your original question clearly: no, we will not accept this.
>
> -Otto
>
-- 
-Luke


Re: Picky, but much more efficient arc4random_uniform!

2022-05-15 Thread Luke Small
This is version 1, which I was pretty sure was secure.

I revamped it with a few features and implemented the binary search for
'bits'.

In most cases, those that aren't intentionally worst-case, it's pretty darned
fast!

This is a sample run of my program with your piece of code included:


 1  99319 100239
 2 100353  99584
 3 100032  99879
 4  99704 100229
 5  99530  99796
 6  99617 100355
 7 100490 100319
 8  99793 100114
 9 100426  99744
10 100315 100558
11  99279  99766
12  99572  99750
13  99955 100017
14 100413 15
15 100190 100052
16 101071 100195
17 100322 100224
18  99637  99540
19 100323  99251
20  99841 100177
21  99948  99504
22 100065 100031
23 100026  99827
24  99836  99818
25 100245  99822
26 100088  99678
27  99957  3
28 100065  99961
29 100701 100679
30  99756  99587
31 100220 100076
32 100067 15
33  99547  99984
34 100124 100031
35  99547 100661
36  99801  99963
37 100189 100230
38  99878  99579
39  99864  99442
40  99683 14
41  99907 100094
42 100291  99817
43  99908  99984
44 100044 100606
45 100065 100120
46  99358 100141
47 100152 100442
48 10 100279
49 100486  99848








newdata
time = 0.240530757 seconds

newdata_fast
time = 0.073868626 seconds

The original takes 325.620% of the runtime of newdata_fast






newdataTypableFilename
time = 0.236748989 seconds

newdataTypableFilename_fast
time = 0.115842333 seconds

The original takes 204.372% of the runtime of newdataTypableFilename_fast






rand_ideal
time = 0.235832820 seconds

rand_ideal_fast
time = 0.025300798 seconds

The original takes 932.116% of the runtime of rand_ideal_fast






rand_short_ideal
time = 0.236828684 seconds

rand_short_ideal_fast
time = 0.124025922 seconds

The original takes 190.951% of the runtime of rand_short_ideal_fast






rand_short_worst
time = 0.237142869 seconds

rand_short_worst_fast
time = 0.294156486 seconds

The original takes 80.618% of the runtime of rand_short_worst_fast






rand_worst
time = 0.237002775 seconds

rand_worst_fast
time = 0.377148420 seconds

The original takes 62.841% of the runtime of rand_worst_fast






shuffle
time = 0.003044472 seconds

shuffle_fast
time = 0.002590664 seconds

The original takes 117.517% of the runtime of shuffle_fast

If it crashed here, you are trying to profile.
Turn off pledge at the beginning of main().
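
For reference, the percentage lines in the output above are a plain ratio of the two timings. A minimal sketch of that arithmetic (the helper name runtime_percent is hypothetical and not taken from the benchmark program):

```c
/* "The original takes N% of the runtime of the fast version." */
static double
runtime_percent(double original_seconds, double fast_seconds)
{
	return 100.0 * original_seconds / fast_seconds;
}
```

A figure above 100 means the original was slower; the two worst-case rows, at about 81% and 63%, are the cases where the replacement was slower.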



On Sat, May 14, 2022 at 7:03 PM Philip Guenther  wrote:

> On Sun, 15 May 2022, Steffen Nurpmeso wrote:
> > Stuart Henderson wrote in
> ...
> >  |what's the perceived problem you're wanting to solve? and does that
> >  |problem actually exist in the first place?
> >
> > The problem is that if you have a low upper bound then modulo will "remove a
> > lot of randomization".  For example if you have a program which
> > generates Lotto numbers (1..49), then using _uniform() as it is will
> > generate many duplicates.
>
> Wut.  The *WHOLE POINT* of arc4random_uniform() is that it has uniform
> distribution.  Says right so in the manpage.  If an implementation of that
> API fails to do that, it's a broken implementation.
>
> The OpenBSD implementation achieves that.  NetBSD's implementation has the
> same core logic.  Here's my quick lotto pick demo program, doing 4900000
> picks, so they should all be somewhere near 100000:
>
> #include <stdio.h>
> #include <stdlib.h>
> #define LIMIT   49
> int
> main(void)
> {
> 	unsigned counts[LIMIT] = { 0 };
> 	int i;
> 	for (i = 0; i < 100000 * LIMIT; i++)
> 		counts[arc4random_uniform(LIMIT)]++;
> 	for (i = 0; i < LIMIT; i++)
> 		printf("%2d\t%7u\n", i+1, counts[i]);
> 	return 0;
> }
>
> And a sample run:
>
> : bleys; ./a.out
>
>  1   100100
>  2   100639
>  3    99965
>  4    99729
>  5    99641
>  6    99650
>  7   100299
>  8   100164
>  9    99791
> 10   100101
> 11   100657
> 12   100042
> 13    99661
> 14    99927
> 15    99426
> 16    99491
> 17    99646
> 18   100133
> 19   100013
> 20    99942
> 21    99873
> 22    99924
> 23    99567
> 24   100152
> 25   100688
> 26   100011
> 27   100481
> 28    99980
> 29   100406
> 30    99726
> 31    99808
> 32    99929
> 33   100050
> 34    99983
> 35   100048
> 36    99771
> 37    99906
> 38   100215
> 39   100261
> 40   100426
> 41    99847
> 42    99533
> 43   100368
> 44    99695
> 45   100041
> 46   100465
> 47    99875
> 48   100034
> 49    99920
> : bleys;
>
> Looks pretty good to me, with repeated runs showing the values bouncing
> around.
>
>
> ...
> > I was a bit interested when i saw Luke's message, but i am no
> > mathematician and, like i said, i in fact never needed the
> > functionality.  And the question i would have is how small
> > upper_bound has to be for the modulo problem to really kick in.
>
> If the implementation isn't broken, it's not a problem, period.
>
>
> > And even if, whether it 

Re: Picky, but much more efficient arc4random_uniform!

2022-05-14 Thread Luke Small
Look at my code. I don’t even use a modulus operator. I perform hit and
miss with a random bitstream.

How can I have a bias of something I don’t do? I return a bitstream which
meets the parameters of being a value less than the upper bound. Much like
arc4random_buf().

If I use arc4random_uniform() repeatedly to create a random distribution of
say numbers less than 0x1000 or even something weird like 0x1300 will the
random distribution be better with arc4random_uniform() or with mine? For
0x1000 mine will simply pluck 12 bits of random data straight from the
arc4random() (and preserve the remaining 20 bits for later) on the first
try, just like it’s arc4random_buf().

arc4random_uniform() will perform a modulus of a 32 bit number which adds
data to the bitstream. Does it make it better? Perhaps it makes it harder
to guess the source bits.

I don’t know; and I’m not going to pretend to be a cryptologist. But I’m
looking at modulo bias.

I didn’t know what it was, before, but I basically “rejection sample”:

https://research.kudelskisecurity.com/2020/07/28/the-definitive-guide-to-modulo-bias-and-how-to-avoid-it/
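
To make the modulo-bias point concrete, here is a minimal sketch (an
illustration, not code from this thread). It walks all 256 values of an
8-bit source, a small stand-in for the 32-bit arc4random() output, and
counts how often each result of "x % bound" appears, optionally skipping
the lowest source values first. Skipping values below 2^32 mod upper_bound
before taking the modulus is the approach arc4random_uniform() uses; the
names count_results, reject_below and SPACE are assumptions made up for
this example.

```c
#define SPACE 256u	/* 8-bit stand-in for the 2^32 output space */

/*
 * Count how many source values in [min, SPACE) land on each result of
 * "x % bound".  Passing min == 0 means nothing is rejected.
 */
void
count_results(unsigned bound, unsigned min, unsigned *counts)
{
	unsigned x, i;

	for (i = 0; i < bound; i++)
		counts[i] = 0;
	for (x = min; x < SPACE; x++)
		counts[x % bound]++;
}

/* Smallest source value that is kept; everything below it is retried. */
unsigned
reject_below(unsigned bound)
{
	return SPACE % bound;
}
```

With a bound of 6, the plain modulus gives results 0 through 3 a count of
43 and results 4 and 5 a count of 42. Rejecting the four lowest source
values leaves 252 values, exactly 42 per result.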

On Sat, May 14, 2022 at 6:14 AM Otto Moerbeek  wrote:

> On Sat, May 14, 2022 at 05:48:10AM -0500, Luke Small wrote:
>
> > arc4random_uniform_fast2 that I made, streams in data from arc4random()
> and
> > uses the datastream directly and uses it as a bit by bit right "sliding
> > window" in the last loop. arc4random_uniform() uses a modulus which is
> > simple to implement, but I wonder how cryptographically sound or even how
> > evenly it distributes. Adding a modulus seems sloppy without something
> > better. I did make arc4random_fast_simple() which merely takes an
> > upperbound. I integrated arc4random_uniform_fast_bitsearch() or whatever
> > the top function was into it which binary searches to find the correct
> size
> > bitfield (return value) needed to barely fit the upperbound while also
> > being able to discover every possible value below the upperbound. It
> isn't
> > as fast as arc4random_uniform_fast2 if it were used repeatedly after a
> > single use of arc4random_uniform_fast_bitsearch() , but it does exactly
> the
> > same thing and appears faster than repeatedly using arc4random_uniform()
> > and its wasteful use of arc4random() and calling the expensive rekeying
> > function more often.
> >
> > It may be interesting to determine even without looking at performance,
> > whether arc4random_fast_simple() creates a superior, secure use of
> the
> > chacha20 stream than arc4random_uniform() with the modulus. what exactly
> > does all that extra data from the modulus do to the random distribution?
> >
> > -Luke
>
> I don't follow you at all. Your blabbering does not even use the terms
> "uniform" and "modulo bias". I wonder even if you realize what they
> mean in this context.
>
> -Otto
>
> --
-Luke


Re: Picky, but much more efficient arc4random_uniform!

2022-05-14 Thread Luke Small
arc4random_uniform_fast2 that I made, streams in data from arc4random() and
uses the datastream directly and uses it as a bit by bit right "sliding
window" in the last loop. arc4random_uniform() uses a modulus which is
simple to implement, but I wonder how cryptographically sound or even how
evenly it distributes. Adding a modulus seems sloppy without something
better. I did make arc4random_fast_simple() which merely takes an
upperbound. I integrated arc4random_uniform_fast_bitsearch() or whatever
the top function was into it which binary searches to find the correct size
bitfield (return value) needed to barely fit the upperbound while also
being able to discover every possible value below the upperbound. It isn't
as fast as arc4random_uniform_fast2 if it were used repeatedly after a
single use of arc4random_uniform_fast_bitsearch() , but it does exactly the
same thing and appears faster than repeatedly using arc4random_uniform()
and its wasteful use of arc4random() and calling the expensive rekeying
function more often.

It may be interesting to determine even without looking at performance,
whether arc4random_fast_simple() creates a superior, secure use of the
chacha20 stream than arc4random_uniform() with the modulus. what exactly
does all that extra data from the modulus do to the random distribution?

-Luke


Picky, but much more efficient arc4random_uniform!

2022-05-13 Thread Luke Small
I made a couple of new versions of a new kind of arc4random_uniform-like
function and some example functions which use them. Instead of taking a
sufficiently large random number and reducing it by the modulus, I pick a
random number using arc4random() from a bitfield whose length is just
enough to hold every value below the modulus, and return the bitfield if
it is less than the value of the modulus.

In both versions, I retrieve a bitfield from a static uint64_t which is
refreshed from periodic arc4random() calls. The functions demand a bit
length. but I provide a convenient bitsize search function:
arc4random_uniform_fast_bitsearch()

in the first version, if the chosen return value isn't less than the
modulus, the entire bitfield is trashed and a completely new bitfield is
refreshed from the cache. This can be very inefficient for certain
upperbounds, since the functions demand that all values less than the
upperbound are possible. It can perform worse than
arc4random_uniform() even with more random number generation churn.
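
A rough sketch of that first variant follows. It is not the attached
arc4random_uniform_fast.c, which is not reproduced in this thread; the
names next_u32, take_bits, bits_for and uniform_bitfield are invented
here. next_u32() is a plain xorshift32 generator standing in for
arc4random() so the example is self-contained, and it is not suitable
for anything security related.

```c
#include <stdint.h>

/* Stand-in for arc4random(): xorshift32, NOT cryptographically secure. */
static uint32_t state = 2463534242u;

uint32_t
next_u32(void)
{
	state ^= state << 13;
	state ^= state >> 17;
	state ^= state << 5;
	return state;
}

static uint64_t cache;		/* unused random bits */
static unsigned cache_bits;	/* how many bits of cache are valid */

/* Take "bits" (1..32) random bits from the cache, refilling as needed. */
uint32_t
take_bits(unsigned bits)
{
	uint32_t out;

	if (cache_bits < bits) {
		cache |= (uint64_t)next_u32() << cache_bits;
		cache_bits += 32;
	}
	out = (uint32_t)(cache & ((UINT64_C(1) << bits) - 1));
	cache >>= bits;
	cache_bits -= bits;
	return out;
}

/* Number of bits needed to represent every value below upper_bound. */
unsigned
bits_for(uint32_t upper_bound)
{
	unsigned bits = 1;

	while (bits < 32 && (UINT32_C(1) << bits) < upper_bound)
		bits++;
	return bits;
}

/* Draw fresh bit fields until one is below upper_bound (must be >= 1). */
uint32_t
uniform_bitfield(uint32_t upper_bound)
{
	unsigned bits = bits_for(upper_bound);
	uint32_t r;

	do {
		r = take_bits(bits);
	} while (r >= upper_bound);
	return r;
}
```

For an upper bound of 26 this draws 5 bits at a time and accepts 26 of
the 32 possible fields. Each rejected field is discarded whole, which
keeps the argument for uniformity simple. The sliding-window variant,
which keeps some of the rejected bits, would need its own argument and is
not sketched here.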

in the second version, if the chosen return value isn't less than the
modulus, the bitfield is shifted right, losing the least significant bit
and a new random-value bit from the least significant bit of the cache is
put into the most significant bit of the bitfield and it tries again. For
significant expenditures of effort, this version is always more efficient
than arc4random_uniform() and slightly to much more efficient than my first
version.


The performance boost comes from less pseudorandom number generation by not
trashing perfectly good random data and preserving it so that rekeying is
performed less.


I suspect that the first version may be secure, but I'm not sure what you
think of the second version, for being secure. Is there a way to test how
well distributed and secure it is?!

Perhaps it's even better than the typical use case of arc4random_uniform
since it feeds it a bitstream instead of performing a modulus operation on
it!

I pledge("stdio") and unveil(NULL, NULL) at the beginning, so you know it
doesn't do anything but malloc() an array, do stuff on the array and print
stuff; if you don't profile, anyway.

What do you think of having such a thing in OpenBSD?




newdata() generates random a-z characters performing arc4random_uniform(26);

newdataTypableFilename() generates random filename typable characters
performing arc4random_uniform(92);

I perform other tests including presumed worst-cases on large numbers and
numbers which will have a lot of misses.




-Luke


arc4random_uniform_fast.c
Description: Binary data


clang performance issue which is much worse on openbsd

2021-11-06 Thread Luke Small
https://bugs.llvm.org/show_bug.cgi?id=50026

I reported it to the llvm people. it is two slightly different quicksort
algorithms which perform radically differently. The one which you could
assume would take more time, performs MUCH better.

I made a custom quicksort algorithm which outperforms qsort by A LOT for
sorting an array of around 300 randomly created unsigned characters, which
is what I use it for.

One guy said there's a 10% difference for sorting 3 million characters on
freebsd, but there's about 40% performance difference on OpenBSD. maybe
it's also how the OpenBSD team modified clang to prevent rop chain stuff or
something? I'm using a Westmere-based Intel server.

-Luke


sort_test2.c
Description: Binary data


Re: if calloc() needs nmemb and size, why doesn't freezero()?

2021-02-19 Thread Luke Small
I used the verbiage: “malloc(3)” as a general all-encompassing manpage
which includes malloc(), calloc(), freezero(), etc.

Sorry for the confusion.

> In malloc(3):
>> > “If you use smaller integer types than size_t for ‘nmemb’ and ‘size’,
>> then
>> > multiplication in freezero() may need to be cast to size_t to avoid
>> integer
>> > overflow:
>> > freezero(ptr, (size_t)nmemb * (size_t)size);”
>> > Or maybe even: freezero(ptr, (size_t)nmemb * size);
>>
>> This is bad advice.  The product of two size_t values can exceed
>> SIZE_MAX, at which point you would get integer overflow.  This is
>> why the malloc(3) man page warns against it.  Note that on 64-bit
>> platforms like amd64, size_t is already 64-bit so casting to unsigned
>> long long or uint64_t is not effective.
>>
>> On OpenBSD, calloc(3) and reallocarray(3) check for integer overflow
>> for you, which is why they are preferred over malloc(nmemb * size).
>> You can examine the code yourself:
>> http://cvsweb.openbsd.org/src/lib/libc/stdlib/reallocarray.c?rev=1.3
>>
>>  - todd
>>
> --
> -Luke
>
-- 
-Luke


Re: if calloc() needs nmemb and size, why doesn't freezero()?

2021-02-19 Thread Luke Small
I agree it can overflow. But if you use the same variables with the same
values plugged into

ptr = calloc(nmemb, size);

as you use in

freezero(ptr, (size_t)nmemb * size);

If it can overflow, it will have done it already in calloc().
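
A small sketch of what the cast changes, under stated assumptions: it
uses unsigned int rather than int so that the narrow case wraps instead
of being undefined behaviour, and the function names are invented for
this example.

```c
#include <stddef.h>

/* Multiply in unsigned int first; the result wraps before it is widened. */
size_t
narrow_product(unsigned int nmemb, unsigned int size)
{
	return nmemb * size;
}

/* Widen one operand first; the multiplication itself is done in size_t. */
size_t
wide_product(unsigned int nmemb, unsigned int size)
{
	return (size_t)nmemb * size;
}
```

On a platform with 32-bit unsigned int and 64-bit size_t, passing 65536
for both arguments makes narrow_product() wrap to 0 while wide_product()
returns 4294967296. Casting only one operand is enough, because the other
one is converted to size_t before the multiplication.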


On Fri, Feb 19, 2021 at 12:23 PM Todd C. Miller  wrote:

> On Fri, 19 Feb 2021 10:38:13 -0600, Luke Small wrote:
>
> > In malloc(3):
> > “If you use smaller integer types than size_t for ‘nmemb’ and ‘size’,
> then
> > multiplication in freezero() may need to be cast to size_t to avoid
> integer
> > overflow:
> > freezero(ptr, (size_t)nmemb * (size_t)size);”
> > Or maybe even: freezero(ptr, (size_t)nmemb * size);
>
> This is bad advice.  The product of two size_t values can exceed
> SIZE_MAX, at which point you would get integer overflow.  This is
> why the malloc(3) man page warns against it.  Note that on 64-bit
> platforms like amd64, size_t is already 64-bit so casting to unsigned
> long long or uint64_t is not effective.
>
> On OpenBSD, calloc(3) and reallocarray(3) check for integer overflow
> for you, which is why they are preferred over malloc(nmemb * size).
> You can examine the code yourself:
> http://cvsweb.openbsd.org/src/lib/libc/stdlib/reallocarray.c?rev=1.3
>
>  - todd
>
-- 
-Luke


Re: if calloc() needs nmemb and size, why doesn't freezero()?

2021-02-19 Thread Luke Small
>
> > In the manpage you could succinctly state:
> >
> > In malloc(3):
> > “If you use smaller integer types than size_t for ‘nmemb’ and ‘size’,
> then

> multiplication in freezero() may need to be cast to size_t to avoid
> integer overflow:
> > freezero(ptr, (size_t)nmemb * (size_t)size);”
> > Or maybe even: freezero(ptr, (size_t)nmemb * size);
>
> That is incorrect.


If it’s functionally incorrect, then tell me how the following cast acts
equivalently to intermediate storage or at least calls operations which act
upon uint64_t and makes the function return what is obviously intended on
your OpenBSD:

uint64_t bufferToTime(const u_char buf[])
{

return (( (( (( (( (( (( ((
  (uint64_t) buf[7]) << 8)
 | buf[6]) << 8)
 | buf[5]) << 8)
 | buf[4]) << 8)
 | buf[3]) << 8)
 | buf[2]) << 8)
 | buf[1]) << 8)
 | buf[0];
}

Because it works.
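
As a sketch of why the single cast is enough: once buf[7] is converted to
uint64_t, every later shift and OR in the chain has a uint64_t operand,
so the whole expression is evaluated in 64 bits. The function below has
the same shape as bufferToTime() above, with unsigned char in place of
u_char so it builds outside BSD; the name le64_decode is made up for this
example.

```c
#include <stdint.h>

/* Decode 8 little-endian bytes; only the first operand is cast. */
uint64_t
le64_decode(const unsigned char b[8])
{
	return ((((((((((((((
	    (uint64_t)b[7]) << 8)
	    | b[6]) << 8)
	    | b[5]) << 8)
	    | b[4]) << 8)
	    | b[3]) << 8)
	    | b[2]) << 8)
	    | b[1]) << 8)
	    | b[0];
}
```

Feeding it the bytes 0x01 through 0x08 in order yields
0x0807060504030201, so the high bytes are not lost.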
-- 
-Luke


Re: if calloc() needs nmemb and size, why doesn't freezero()?

2021-02-19 Thread Luke Small
malloc(3) already speaks to programmers who might use int multiplication
and tells them to test for int multiplication overflow in malloc(), so
you presume that they are already prepared to use something smaller than
size_t, when you could have just said:
“only use size_t variables for integer types.” and cut out the int
multiplication overflow test example.

In the manpage you could succinctly state:

In malloc(3):
“If you use smaller integer types than size_t for ‘nmemb’ and ‘size’, then
multiplication in freezero() may need to be cast to size_t to avoid integer
overflow:
freezero(ptr, (size_t)nmemb * (size_t)size);”
Or maybe even: freezero(ptr, (size_t)nmemb * size);

Or:

void freeczero(void *ptr, size_t nmemb, size_t size)
{
	freezero(ptr, nmemb * size);
}

I suspect that freezero() is already little more than:

void freezero(void *ptr, size_t size)
{
explicit_bzero(ptr, size);
free(ptr);
}

On Fri, Feb 19, 2021 at 12:51 AM Otto Moerbeek  wrote:

> On Thu, Feb 18, 2021 at 03:24:36PM -0600, Luke Small wrote:
>
> > However, calloc(nmemb, size) may have been called using smaller int
> > variable types which would overflow when multiplied. Where if the
> variables
> > storing the values passed to nmemb and size are less than or especially
> > equal to their original values, I think it’d be good to state that:
> >
> > freezero(ptr, (size_t)nmemb * (size_t)size);
> > is guaranteed to work, but
> > freezero(ptr, nmemb * size);
> > does not have that guarantee.
>
> Lets try to make things explicit.
>
> The function c() does the overflow check like calloc does.
> The function f() takes a size_t.
>
> #include <limits.h>
> #include <stdio.h>
>
> #define MUL_NO_OVERFLOW (1UL << (sizeof(size_t) * 4))
>
> void c(size_t nmemb, size_t size)
> {
> 	if ((nmemb >= MUL_NO_OVERFLOW || size >= MUL_NO_OVERFLOW) &&
> 	    nmemb > 0 && SIZE_T_MAX / nmemb < size)
> 		printf("Overflow\n");
> 	else
> 		printf("%zu\n", nmemb * size);
> }
>
> void f(size_t m)
> {
> 	printf("%zu\n", m);
> }
>
> int
> main()
> {
> 	int a = INT_MAX;
> 	int b = INT_MAX;
> 	c(a, b);
> 	f(a * b);
> }
>
> Now the issue is that the multiplication in the last line of main()
> overflows:
>
> $ ./a.out
> 4611686014132420609
> 1
>
> because this is an int multiplication only after that the promotion to
> size_t is done.
>
> So you are right that this can happen, *if you are using the wrong
> types*. But I would argue that feeding anything other than either
> size_t or constants to calloc() is already wrong. You *have* to
> consider the argument conversion rules when feeding values to calloc()
> (or any function). To avoid having to think about those, start with
> size_t already for everything that is a size or count of a memory
> object.
>
> -Otto
>
-- 
-Luke


Re: if calloc() needs nmemb and size, why doesn't freezero()?

2021-02-18 Thread Luke Small
I had a drawn-out email describing passing by value and the
function’s need to only perform size_t multiplication overflow checking, but
not only do you not care, I don’t think it’s worth my time to merely succeed
in angering you. I love your work!

On Thu, Feb 18, 2021 at 7:10 PM Theo de Raadt  wrote:

> Luke Small  wrote:
>
> > However, calloc(nmemb, size) may have been called using smaller int
> > variable types which would overflow when multiplied.
>
> In which case the allocation would not have succeeded.


> > Where if the variables
> > storing the values passed to nmemb and size are less than or especially
> > equal to their original values, I think it’d be good to state that:
>
> Huh?
>
> > freezero(ptr, (size_t)nmemb * (size_t)size);
> > is guaranteed to work, but
> > freezero(ptr, nmemb * size);
> > does not have that guarantee.
>
> I hope I never run any software by you.
>
-- 
-Luke


Re: if calloc() needs nmemb and size, why doesn't freezero()?

2021-02-18 Thread Luke Small
However, calloc(nmemb, size) may have been called using smaller int
variable types which would overflow when multiplied. Where if the variables
storing the values passed to nmemb and size are less than or especially
equal to their original values, I think it’d be good to state that:

freezero(ptr, (size_t)nmemb * (size_t)size);
is guaranteed to work, but
freezero(ptr, nmemb * size);
does not have that guarantee.

On Thu, Feb 18, 2021 at 3:42 AM Otto Moerbeek  wrote:

> On Wed, Feb 17, 2021 at 11:05:49AM -0700, Theo de Raadt wrote:
>
> > Luke Small  wrote:
> >
> > > I guess I always thought there'd be some more substantial overflow
> mitigation.
> >
> > You have to free with the exact same size as allocation.
>
> Small correction: the size may be smaller than the original. In that
> case, only a partial clear is guaranteed, the deallocation will still
> be for the full allocation. Originally we were more strict, but iirc
> that was causing too many headaches for some. See
> https://cvsweb.openbsd.org/src/lib/libc/stdlib/malloc.c?rev=1.221
>
> But the point stands: nmemb * size does not overflow, since the
> original allocation would have overflowed and thus failed.
>
> -Otto
>
> >
> > nmemb and size did not change.
> >
> > The math has already been checked, and regular codeflows will store the
> > multiple in a single variable after successful checking, for
> > reuse.
> >
> > > Would it be too much hand-holding to put in the manpage that to avoid
> potential
> > > freezero() integer overflow,
> > > it may be useful to run freezero() as freezero((size_t)nmemb *
> (size_t)size);
> >
> > Wow, Those casts make it very clear you don't understand C, if you do
> > that kind of stuff elsewhere you are introducing problems.
> >
> > Sorry no you are wrong.
> >
>
-- 
-Luke


Re: if calloc() needs nmemb and size, why doesn't freezero()?

2021-02-17 Thread Luke Small
Am I incorrect to presume that if the values are successfully used in
calloc(), that (size_t)nmemb * (size_t)size will not overflow?
Isn't the storage capacity of size_t greater than the amount of addressable
space? If it is, calloc() will throw an "out of memory" or other error
before you'll ever reach putting freezero((size_t)nmemb * (size_t)size);
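
For reference, a sketch of the kind of overflow test that calloc(3) and
reallocarray(3) apply before allocating, written with the portable
SIZE_MAX from <stdint.h>. The helper name mul_overflows is an assumption
for this example, not a libc function.

```c
#include <stddef.h>
#include <stdint.h>

/* sqrt(SIZE_MAX + 1): if both factors are below this, no overflow. */
#define MUL_NO_OVERFLOW ((size_t)1 << (sizeof(size_t) * 4))

/* Return 1 if nmemb * size would not fit in size_t, 0 otherwise. */
int
mul_overflows(size_t nmemb, size_t size)
{
	return (nmemb >= MUL_NO_OVERFLOW || size >= MUL_NO_OVERFLOW) &&
	    nmemb > 0 && SIZE_MAX / nmemb < size;
}
```

If this test passes for a given nmemb and size, the same two values
multiplied as size_t later cannot overflow either, which is why a second
check at free time adds nothing.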

-Luke


On Wed, Feb 17, 2021 at 2:36 PM Luke Small  wrote:

> if the nmemb and size values being passed to calloc() are of a larger
> integer datatype, they will have been truncated when passed to the function
> there as well.
>
> Perhaps you need something larger than size_t in the entire malloc manpage
> series?
>
>
>
> -Luke
>
>
> On Wed, Feb 17, 2021 at 2:25 PM Theo de Raadt  wrote:
>
>> >  > Would it be too much hand-holding to put in the manpage that to
>> avoid potential
>> >  > freezero() integer overflow,
>> >  > it may be useful to run freezero() as freezero((size_t)nmemb *
>> (size_t)size);
>> >
>> >  Wow, Those casts make it very clear you don't understand C, if you do
>> >  that kind of stuff elsewhere you are introducing problems.
>>
>> If nmemb or size are of a type greater than size_t, those casts serve
>> only one
>> purpose -- truncating the high bits before performing multiply, which
>> results in
>> an incorrect size.
>>
>>
>>
>>


Re: if calloc() needs nmemb and size, why doesn't freezero()?

2021-02-17 Thread Luke Small
if the nmemb and size values being passed to calloc() are of a larger
integer datatype, they will have been truncated when passed to the function
there as well.

Perhaps you need something larger than size_t in the entire malloc manpage
series?



-Luke


On Wed, Feb 17, 2021 at 2:25 PM Theo de Raadt  wrote:

> >  > Would it be too much hand-holding to put in the manpage that to avoid
> potential
> >  > freezero() integer overflow,
> >  > it may be useful to run freezero() as freezero((size_t)nmemb *
> (size_t)size);
> >
> >  Wow, Those casts make it very clear you don't understand C, if you do
> >  that kind of stuff elsewhere you are introducing problems.
>
> If nmemb or size are of a type greater than size_t, those casts serve only
> one
> purpose -- truncating the high bits before performing multiply, which
> results in
> an incorrect size.
>
>
>
>


Re: if calloc() needs nmemb and size, why doesn't freezero()?

2021-02-17 Thread Luke Small
I guess I’ve been doing it wrong all this time.

Perhaps you can tell me how the following doesn't return a 0-255 value.

uint64_t bufferToTime(const u_char buf[])
{

return (( (( (( (( (( (( ((
  (uint64_t) buf[7]) << 8)
 | buf[6]) << 8)
 | buf[5]) << 8)
 | buf[4]) << 8)
 | buf[3]) << 8)
 | buf[2]) << 8)
 | buf[1]) << 8)
 | buf[0];
}

On Wed, Feb 17, 2021 at 12:05 PM Theo de Raadt  wrote:

> Luke Small  wrote:
>
> > I guess I always thought there'd be some more substantial overflow
> mitigation.
>
> You have to free with the exact same size as allocation.
>
> nmemb and size did not change.
>
> The math has already been checked, and regular codeflows will store the
> multiple in a single variable after successful checking, for
> reuse.
>
> > Would it be too much hand-holding to put in the manpage that to avoid
> potential
> > freezero() integer overflow,
> > it may be useful to run freezero() as freezero((size_t)nmemb *
> (size_t)size);
>
> Wow, Those casts make it very clear you don't understand C, if you do
> that kind of stuff elsewhere you are introducing problems.
>
> Sorry no you are wrong.
>


Re: if calloc() needs nmemb and size, why doesn't freezero()?

2021-02-17 Thread Luke Small
I guess I always thought there'd be some more substantial overflow
mitigation.

Would it be too much hand-holding to put in the manpage that to avoid
potential freezero() integer overflow,
it may be useful to run freezero() as freezero((size_t)nmemb *
(size_t)size);

-Luke


On Wed, Feb 17, 2021 at 11:04 AM Theo de Raadt  wrote:

> Luke Small  wrote:
>
> > if calloc() and recallocarray() needs  nmemb and size, why doesn't
> > freezero()?
> >
> > Should there be a freeczero(size_t nmemb, size_t size) ?
>
> Performing the nmemb*size overflow detection a second time provides
> no benefit.
>
>
>


if calloc() needs nmemb and size, why doesn't freezero()?

2021-02-17 Thread Luke Small
if calloc() and recallocarray() needs  nmemb and size, why doesn't
freezero()?

Should there be a freeczero(size_t nmemb, size_t size) ?

-Luke


Re: Could somebody please put unveil() in ftp(1)?

2020-06-02 Thread Luke Small
tiny logical error on line 651 in main.c
-Luke


On Tue, Jun 2, 2020 at 12:38 PM Luke Small  wrote:

> with  -uNp flags
> -Luke
>
>
> On Tue, Jun 2, 2020 at 12:33 PM Luke Small  wrote:
>
>> forgot something.
>> -Luke
>>
>>
>> On Tue, Jun 2, 2020 at 12:06 PM Luke Small  wrote:
>>
>>> I have a ftp folder diff. I altered:
>>> extern.h fetch.c main.c
>>> -Luke
>>>
>>


diff
Description: Binary data


Re: Could somebody please put unveil() in ftp(1)?

2020-06-02 Thread Luke Small
with  -uNp flags
-Luke


On Tue, Jun 2, 2020 at 12:33 PM Luke Small  wrote:

> forgot something.
> -Luke
>
>
> On Tue, Jun 2, 2020 at 12:06 PM Luke Small  wrote:
>
>> I have a ftp folder diff. I altered:
>> extern.h fetch.c main.c
>> -Luke
>>
>


diff
Description: Binary data


Re: Could somebody please put unveil() in ftp(1)?

2020-06-02 Thread Luke Small
forgot something.
-Luke


On Tue, Jun 2, 2020 at 12:06 PM Luke Small  wrote:

> I have a ftp folder diff. I altered:
> extern.h fetch.c main.c
> -Luke
>


diff
Description: Binary data


Re: Could somebody please put unveil() in ftp(1)?

2020-06-02 Thread Luke Small
I have a ftp folder diff. I altered:
extern.h fetch.c main.c
-Luke


diff
Description: Binary data


Re: pkg_add mirror latency testing, setting program

2016-08-23 Thread Luke Small
You can only tell the fastest latency for a download by testing it at your
location. It is very fast.

On Tue, Aug 23, 2016 at 5:25 AM <li...@wrant.com> wrote:

> Tue, 23 Aug 2016 09:09:38 +0000 Luke Small <lukensm...@gmail.com>
> [...]
> > It downloads the ANNOUNCEMENT file from each mirror, which is both small
> > and has the same name for every release.
> [...]
>
> Hi Luke,
>
> Two comments, first everyone knows their geographic location, time zone
> and respectively closest mirror, and then second see the mirrors status
>
> mirmon - status of OpenBSD mirrors
> [http://spacehopper.org/mirmon/]
>
> Kind regards,
> Anton
>


pkg_add mirror latency testing, setting program

2016-08-23 Thread Luke Small
I had one before that read openbsd.org/ftp.html (which is insecure because
it sets a mirror from data fetched over an unencrypted connection), but I
changed it to read the /etc/examples/pkg.conf file so that there is a more
secure method. I pledged it. I tried to do pledge and setuid, but a glitch
that presumably is fixed in -current (I run 5.9) prevents me from doing so
(I use kqueue with NOTE_EXIT). I commented out the setuid part. I fork() and
pipe to a process early in the execution for privilege separation. I do
more fine-grained comment and installpath editing on /etc/pkg.conf: any
comments not immediately preceding installpath values remain in the
file.

The option -s allows a floating-point timeout for a test. I like to put
-s .3 with a 5Mb/s download.
The option -n limits the number of mirrors written.

It downloads the ANNOUNCEMENT file from each mirror, which is both small
and has the same name for every release.
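
The measure-then-rank idea can be sketched like this. It is a simplified
illustration, not an excerpt of the program below: the struct
mirror_time and the helpers elapsed() and by_seconds() are made-up names,
and it uses struct timespec, whereas the program below uses struct
timeval.

```c
#include <stdlib.h>
#include <time.h>

struct mirror_time {
	const char *mirror;	/* hypothetical mirror URL */
	double seconds;		/* measured download time */
};

/* Seconds elapsed from a to b. */
double
elapsed(struct timespec a, struct timespec b)
{
	return (double)(b.tv_sec - a.tv_sec) +
	    (double)(b.tv_nsec - a.tv_nsec) / 1e9;
}

/* qsort(3) comparison: fastest mirror first. */
int
by_seconds(const void *a, const void *b)
{
	const struct mirror_time *x = a, *y = b;

	return (x->seconds > y->seconds) - (x->seconds < y->seconds);
}
```

A caller would take one timestamp with clock_gettime(CLOCK_MONOTONIC, ...)
before starting the download and one after it finishes, store
elapsed(before, after) for that mirror, and finally qsort() the array with
by_seconds.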

I suspect it still wouldn't be accepted in any form for the base, but is
there any reason I couldn't make a package, even though I have no idea how?
It doesn't even require any libraries or dependencies. Optimizing it is
ridiculous: it spends a monumental time on its executions of ftp and hardly
any time on anything else. Making the -s value small makes it execute
pretty fast.
/*
 * Copyright (c) 2016 Luke N. Small
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */


/*
 * indent pkg_ping2.c -bap -br -ce -ci4 -cli0 -d0 -di0 -i8 -ip -l79 -nbc -ncdb -ndj -ei -nfc1 -nlp -npcs -psl -sc -sob
 */


#define EVENT_NOPOLL
#define EVENT_NOSELECT

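#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <sys/utsname.h>
#include <err.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>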

struct mirror_st {
	char *label;
	char *mirror;
	double diff;
};

static int
ftp_cmp(const void *a, const void *b)
{
	struct mirror_st **one;
	struct mirror_st **two;

	one = (struct mirror_st **) a;
	two = (struct mirror_st **) b;

	if ((*one)->diff < (*two)->diff)
		return -1;
	if ((*one)->diff > (*two)->diff)
		return 1;
	return 0;
}

static int
label_cmp(const void *a, const void *b)
{
	struct mirror_st **one;
	struct mirror_st **two;
	int8_t temp;

	one = (struct mirror_st **) a;
	two = (struct mirror_st **) b;

	/* list the USA mirrors first, it will subsort correctly */
	temp = !strncmp("USA", (*one)->label + strlen((*one)->label) - 3, 3);
	if (temp != !strncmp("USA", (*two)->label + strlen((*two)->label) - 3, 3)) {
		if (temp)
			return -1;
		return 1;
	}
	return strcmp((*one)->label, (*two)->label);
}


static double
get_time_diff(struct timeval a, struct timeval b)
{
	long sec;
	long usec;
	sec = b.tv_sec - a.tv_sec;
	usec = b.tv_usec - a.tv_usec;
	if (usec < 0) {
		--sec;
		usec += 1000000;
	}
	return sec + ((double) usec / 1000000.0);
}

__dead void
manpage(char *a)
{
	errx(1, "%s [-n maximum_mirrors_written] [-s timeout (floating-point)]", a);
}

int
main(int argc, char *argv[])
{
	if (pledge("stdio wpath cpath rpath proc exec id getpw", NULL) == -1)
		err(EXIT_FAILURE, "pledge");
	pid_t ftp_pid, write_pid;
	int parent_to_write[2];
	char letter;
	double s;
	int kq, i, pos, num, c, n, array_max, array_length;
	FILE *input, *pkgRead;
	struct utsname name;
	struct mirror_st **array;
	struct kevent ke;




	pkgRead = fopen("/etc/pkg.conf", "r");

	array_max = 300;

	if ((array = calloc(array_max, sizeof(struct mirror_st *))) == NULL)
		errx(1, "calloc failed.");

	s = 5;
	n = 5000;

	if (uname(&name) == -1)
		err(1, NULL);

	if (argc > 1) {
		if (argc % 2 == 0)
			manpage(argv[0]);

		for (pos = 1; pos < argc; ++pos) {
			if (strlen(argv[pos]) != 2)
				manpage(argv[0]);

			if (!strcmp(argv[pos], "-s")) {
				++pos;
				c = -1;
				i = 0;
				while ((letter = argv[pos][++c]) != '\0') {
					if (letter == '.')
						++i;

					if (((letter < '0' || letter > '9')
						&& letter != '.') || i > 1) {

						if (letter == '-')
							errx(1, "No negative numbers.");
						errx(1, "Incorrect floating point format.");
					}
				}
				errno = 0;
				strtod(argv[pos], NULL);
				if (errno == ERANGE)
					err(1, NULL);
				if ((s = strtod(argv[pos], NULL)) > 1.0)
					errx(1, "-s should <= 1");
			} else if (!strcmp(argv[pos], "-n")) {
				++pos;
				if (strlen(argv[pos]) > 3)
					errx(1, "Integer should be <= 3 digits long.");
				c = -1;
				n = 0;
while 

Re: remove kevent perm check

2016-05-13 Thread Luke Small
That seems a bit excessive to crash the program when all you may want to do
is track the exit of a child. Does the pledge proc flag dictate that you
can't do wait() as well?


Re: [PATCH] uname, arch/machine -> %c, %a update in PKG_PATH

2016-02-03 Thread Luke Small
I suspect that unless there is a solution that doesn't require lazy new
users to memorize more complicated named mirrors, you are going to run into
this problem over and over again.

>>  Raf Czlonka wrote:
>> - ftp.openbsd.org is, AFAIC, overloaded

> I haven't been following this thread fully, but I agree that
> ftp.openbsd.org shouldn't be used in examples. Many many people use the
> default mirror whenever possible.



-Luke


Re: [PATCH] uname, arch/machine -> %c, %a update in PKG_PATH

2016-02-03 Thread Luke Small
I didn't know about the miniroot program that edited installpath until I
had a network-assisted upgrade. Every time before, I just did it from disk.
I edited PKG_PATH to do that. From what I recall, I used a text editor, and
to do that I had to memorize the installpath to manually copy it into the
text editor. I am still unaware of a way to copy and paste to an editor
that is capable of running with root privileges.

I just took the claim that the main mirror was burdened at face value.

I wouldn't doubt that the simplest to remember is heavily burdened and the
longest is probably burdened the least.

-Luke

On Wed, Feb 3, 2016 at 9:12 PM, Stuart Henderson <st...@openbsd.org> wrote:

> On 2016/02/03 20:48, Luke Small wrote:
> > I suspect that unless there is a solution that doesn't involve lazy new
> > users to memorize more complicated named mirrors, you are going to run
> > into this problem over and over again.
>
> Why would they need to memorize them? In most cases the one they picked
> when they installed OpenBSD will be just fine, if not they can change
> pkg.conf to point at a new one from the mirrors list.
>
> > >>  Raf Czlonka wrote:
> > >> - ftp.openbsd.org is, AFAIC, overloaded
>
> Whenever I've checked speeds from ftp.openbsd.org they have been fairly
> consistent, this isn't the usual expected behaviour of an overloaded
> machine. (not super fast, but they have been consistent).
>
>


Re: I have a program I wish to submit for the base

2016-02-01 Thread Luke Small
1. You can pick a mirror relatively trivially, but when I ran the
program, the fastest mirror wasn't the one I had chosen manually. Also, it
can choose multiple mirrors at once, so presumably if there is a failure, it
will fall back to the next mirror(s) that it wrote to pkg.conf.

2. You are saying that the ftp protocol can be implemented trivially? You
are ridiculous, sir.

3. How do you suggest I filter out obviously bad choices? Add a perl
geolocation package that isn't available in a base install? How about I
just ftp download a smaller file to discover the latency?

4. How doesn't it meet standards? I wrote it according to the style man
page as far as I can tell, and I ran it through indent, even though I think
kernel normal form is less readable.

I think that there is some unwritten policy that nobody can get something
like this into the system. Why on earth hasn't this happened yet?

On Feb 1, 2016 10:48, "Dmitrij D. Czarkoff" wrote:

> Jorge Castillo said:
> > Why not make it a port?
>
> Making port for figuring out PKGPATH doesn't sound right.
>
> See, there are four problems with the program:
>
> 1.  It is not good enough in doing its job.  Which is funny, because
> picking right mirror is trivially done without any program.
> 2.  It uses external tools for tasks that could be trivially implemented
> in C.
> 3.  It doesn't filter out obviously bad choices, eg. users in Europe
> will test mirrors in North America.
> 4.  It doesn't meet OpenBSD's standards for code in base.
>
> Problems #2, #3 and #4 can be fixed, but problem #1 makes this
> discussion completely pointless.  Provided that all OpenBSD developers
> who cared to participate in this discussion pointed out this issue, I'd
> suggest to stop wasting time and bandwidth right here.
>
> Luke, if you disagree with my assessment, please publish your program on
> github and convince tech media to mention it.  And move to next thing.
> Thank you in advance.
>
> --
> Dmitrij D. Czarkoff
>


Re: I have a program I wish to submit for the base

2016-01-31 Thread Luke Small
I'm not merely experimenting with kqueue because I like the shiny bells and
whistles. I want to know how fast the same file downloads from each of the
different mirrors. ftp(1) is shitty for expediency. It does one of three
things: it fails fast, succeeds fast, or it could take FOREVERR!!! I
want to detect all three of these scenarios and stop it if it takes
forever. So I use kqueue to time how long ftp takes to run. If it takes
too long, I kill it. I don't know of any calls other than kqueue that can
do this. And in a fresh install with absolutely no packages, I think
the only way to do it is by using C.
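
The timing half of that idea can be sketched on its own. This is only an illustration, not the code from the attached program (which uses gettimeofday(2) and struct timeval): it assumes readings taken with clock_gettime(CLOCK_MONOTONIC, ...) before and after the download, and the helper name elapsed_seconds is made up for this example.

```c
#include <time.h>

/*
 * Seconds elapsed between two clock readings.
 * Hypothetical helper; both readings are assumed to come from
 * clock_gettime(CLOCK_MONOTONIC, ...), so wall-clock jumps do not matter.
 */
double
elapsed_seconds(const struct timespec *start, const struct timespec *end)
{
	time_t sec = end->tv_sec - start->tv_sec;
	long nsec = end->tv_nsec - start->tv_nsec;

	if (nsec < 0) {
		/* borrow one second, expressed in nanoseconds */
		--sec;
		nsec += 1000000000L;
	}
	return (double)sec + (double)nsec / 1e9;
}
```

A caller would take one reading right before fork() and one right after the child has been reaped, then store the result per mirror.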

-Luke

On Fri, Jan 29, 2016 at 6:44 AM, Jérémie Courrèges-Anglas <j...@wxcvbn.org>
wrote:

> Luke Small <lukensm...@gmail.com> writes:
>
> > I wanted to use kqueue. Name another script or programming language that
> > offers it from the base install. NONE!
>
> If you want to discover how to use kqueue, fine, but that's not how
> design decisions are done in OpenBSD land.
>
> > Why should I write it in another language. I already did it in C. Is
> > there another way other than kqueue that you can wait for the ftp call
> > to quit, while being able to kill it if it takes too long?
>
> Yes, there are other ways. There are also ways that don't involve
> ftp(1), sed(1) and uname(1).
>
> Luke, sorry if it sounds blunt but your code is just not good enough to
> be accepted into base.  You've probably learned some things when writing
> this program, and maybe it fits your use case, but that's all.
>
> Aside from that I've never felt the need for such kind of program, and
> I don't feel like there's much demand from others.
>
> Cheers,
> --
> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
>


Re: I have a program I wish to submit for the base

2016-01-31 Thread Luke Small
I fixed the uname(1) call by replacing it with uname(3). I read the style
man page and ran the program through indent.
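
For readers following along, the uname(3) replacement amounts to something like the sketch below. The function name release_and_arch and the "release/machine" output format are assumptions made for this example; the attached program may build its path differently.

```c
#include <sys/utsname.h>

#include <stdio.h>

/*
 * Write "release/machine" (for example "5.8/amd64") into buf without
 * forking uname(1). Returns 0 on success, -1 on failure or truncation.
 * Hypothetical helper; the output format is an assumption.
 */
int
release_and_arch(char *buf, size_t len)
{
	struct utsname u;
	int n;

	if (uname(&u) == -1)
		return -1;
	n = snprintf(buf, len, "%s/%s", u.release, u.machine);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output did not fit */
	return 0;
}
```

No pipe, fork or fixed-size parsing loop is needed, which is the point of the reviewers' remark.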

I ran it through sed because it reduces code complexity. Why reinvent
the wheel?

I use C because I can use kqueue from a fresh install. You have to use
unaudited packages to use perl or python kqueue. I want the program to be
safe to run as root.

I use kqueue because I like it, but also because the mirror ftp calls need
a wait() that can both collect the exit status and enforce a timeout
period. ftp can be a bitch that runs without stopping if you let it. I'm
not willing to let it run for hours unless the user specifically sets the
timeout period to hours, which I've written it to allow.
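
For comparison, here is a rough sketch of one way to get "collect the status, but enforce a timeout" without kqueue. It is not what the attached program does: it polls waitpid(2) with WNOHANG every 10 ms, so the timeout is only approximate, and the name run_with_timeout and its return codes are invented for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <signal.h>
#include <time.h>
#include <unistd.h>

/*
 * Run argv[0] and wait at most about timeout_sec seconds for it.
 * Returns the exit status (0-255), -1 on fork/wait failure or abnormal
 * termination, or -2 if the child had to be killed after the timeout.
 * Hypothetical helper; polling trades precision for portability.
 */
int
run_with_timeout(char *const argv[], double timeout_sec)
{
	const struct timespec tick = { 0, 10000000L };	/* 10 ms */
	double waited = 0.0;
	int status;
	pid_t pid, r;

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127);		/* exec failed */
	}
	for (;;) {
		r = waitpid(pid, &status, WNOHANG);
		if (r == pid)
			return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
		if (r == -1)
			return -1;
		if (waited >= timeout_sec) {
			kill(pid, SIGKILL);
			waitpid(pid, &status, 0);	/* reap the child */
			return -2;
		}
		nanosleep(&tick, NULL);
		waited += 0.01;
	}
}
```

The kqueue approach avoids the polling interval entirely; this version only shows that the three outcomes (fast failure, fast success, hang) can be told apart with plain POSIX calls.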

-Luke

On Fri, Jan 29, 2016 at 2:19 AM, Nicholas Marriott <
nicholas.marri...@gmail.com> wrote:

> Firstly, I don't think we need this in base and I think there is little
> to no chance of it being taken, even if the code is improved.
>
> Secondly:
>
> - The code is still miles off style(9) and isn't really a consistent
>   style within itself either.
>
> - Forking uname(1)? What? No offence, but that is hilarious :-). Why
>   fork uname(1) for uname(3) but not date(1) for gettimeofday(2)?
>
> - Why would you fork sed either?
>
> I think C is the wrong tool for this. Why not write a shell, perl, or
> python script?
>
> Then if people start to use it you could make a port.
>
>
/*
 * Copyright (c) 2016 Luke N. Small
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */


/*
 * Special thanks to Dan Mclaughlin for the ftp to sed idea
 *
 * ftp -o - http://www.openbsd.org/ftp.html | \
 * sed -n \
 *  -e 's:$::' \
 * 	-e 's:	\([^<]*\)<.*:\1:p' \
 * 	-e 's:^\(	[hfr].*\):\1:p'
 */


#define EVENT_NOPOLL
#define EVENT_NOSELECT

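#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <sys/utsname.h>

#include <err.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>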

struct mirror_st {
	char   *country_title;
	char   *mirror;
	char   *install_path;
	double		diff;
	struct mirror_st *next;
};

int
ftp_cmp(const void *a, const void *b)
{
	struct mirror_st **one = (struct mirror_st **) a;
	struct mirror_st **two = (struct mirror_st **) b;

	if ((*one)->diff < (*two)->diff)
		return -1;
	if ((*one)->diff > (*two)->diff)
		return 1;
	return 0;
}

int
country_cmp(const void *a, const void *b)
{
	struct mirror_st **one = (struct mirror_st **) a;
	struct mirror_st **two = (struct mirror_st **) b;

	// list the USA mirrors first, it will subsort correctly
	int8_t temp = !strncmp("USA", (*one)->country_title, 3);
	if (temp != !strncmp("USA", (*two)->country_title, 3)) {
		if (temp)
			return -1;
		return 1;
	}
	return strcmp((*one)->country_title, (*two)->country_title);
}


double
get_time_diff(struct timeval a, struct timeval b)
{
	long		sec;
	long		usec;
	sec = b.tv_sec - a.tv_sec;
	usec = b.tv_usec - a.tv_usec;
	if (usec < 0) {
		--sec;
		usec += 1000000;	/* borrow one second, in microseconds */
	}
	return sec + ((double) usec / 1000000.0);
}

void
manpage(char *a)
{
	errx(1, "%s [-s timeout] [-n maximum_mirrors_written]", a);
}


int
main(int argc, char *argv[])
{
	pid_t		ftp_pid, sed_pid;
	int		ftp_to_sed[2];
	int		sed_to_parent[2];
	char		character;
	int		i;
	double		s = 7;
	int		position, num, c, n = 5000;
	FILE   *input;
	struct utsname	name;

	if (uname(&name) == -1)
		err(1, NULL);

	if (argc > 1) {
		if (argc % 2 == 0)
			manpage(argv[0]);

		position = 0;
		while (++position < argc) {
			if (strlen(argv[position]) != 2)
				manpage(argv[0]);

			if (!strcmp(argv[position], "-s")) {
				++position;
				c = -1;
				i = 0;
				while ((character = argv[position][++c]) != '\0') {
					if (character == '.')
						++i;

					if (((character < '0' || character > '9')
					    && character != '.') || i > 1) {


Re: I have a program I wish to submit for the base

2016-01-31 Thread Luke Small
Whoops, got rid of putting in a null character when I should have left it
in.

-Luke
/*
 * Copyright (c) 2016 Luke N. Small
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */


/*
 * Special thanks to Dan Mclaughlin for the ftp to sed idea
 *
 * ftp -o - http://www.openbsd.org/ftp.html | \
 * sed -n \
 *  -e 's:$::' \
 * 	-e 's:	\([^<]*\)<.*:\1:p' \
 * 	-e 's:^\(	[hfr].*\):\1:p'
 */


#define EVENT_NOPOLL
#define EVENT_NOSELECT

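#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <sys/utsname.h>

#include <err.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>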

struct mirror_st {
	char   *country_title;
	char   *mirror;
	char   *install_path;
	double		diff;
	struct mirror_st *next;
};

int
ftp_cmp(const void *a, const void *b)
{
	struct mirror_st **one = (struct mirror_st **) a;
	struct mirror_st **two = (struct mirror_st **) b;

	if ((*one)->diff < (*two)->diff)
		return -1;
	if ((*one)->diff > (*two)->diff)
		return 1;
	return 0;
}

int
country_cmp(const void *a, const void *b)
{
	struct mirror_st **one = (struct mirror_st **) a;
	struct mirror_st **two = (struct mirror_st **) b;

	// list the USA mirrors first, it will subsort correctly
	int8_t temp = !strncmp("USA", (*one)->country_title, 3);
	if (temp != !strncmp("USA", (*two)->country_title, 3)) {
		if (temp)
			return -1;
		return 1;
	}
	return strcmp((*one)->country_title, (*two)->country_title);
}


double
get_time_diff(struct timeval a, struct timeval b)
{
	long		sec;
	long		usec;
	sec = b.tv_sec - a.tv_sec;
	usec = b.tv_usec - a.tv_usec;
	if (usec < 0) {
		--sec;
		usec += 1000000;	/* borrow one second, in microseconds */
	}
	return sec + ((double) usec / 1000000.0);
}

void
manpage(char *a)
{
	errx(1, "%s [-s timeout] [-n maximum_mirrors_written]", a);
}


int
main(int argc, char *argv[])
{
	pid_t		ftp_pid, sed_pid;
	int		ftp_to_sed[2];
	int		sed_to_parent[2];
	char		character;
	int		i;
	double		s = 7;
	int		position, num, c, n = 5000;
	FILE   *input;
	struct utsname	name;

	if (uname(&name) == -1)
		err(1, NULL);

	if (argc > 1) {
		if (argc % 2 == 0)
			manpage(argv[0]);

		position = 0;
		while (++position < argc) {
			if (strlen(argv[position]) != 2)
				manpage(argv[0]);

			if (!strcmp(argv[position], "-s")) {
				++position;
				c = -1;
				i = 0;
				while ((character = argv[position][++c]) != '\0') {
					if (character == '.')
						++i;

					if (((character < '0' || character > '9')
					    && character != '.') || i > 1) {

						if (character == '-')
							errx(1, "No negative numbers.");
						errx(1, "Incorrect floating point format.");
					}
				}
				errno = 0;
				strtod(argv[position], NULL);
				if (errno == ERANGE)
					err(1, NULL);
				if ((s = strtod(argv[position], NULL)) > 1.0)
					errx(1, "-s should be less than or equal to 1");
			} else if (!strcmp(argv[position], "-n")) {
				++position;
				if (strlen(argv[position]) > 3)
					errx(1, "Integer should be <= 3 digits long.");
				c = -1;
				n = 0;
				while ((character = argv[position][++c]) != '\0') {
					if (character < '0' || character > '9') {
						if (character == '.')
							errx(1, "No decimal points.");
						if (character == '-')
							errx(1, "No negative numbers.");
						errx(1, "Incorrect integer format.");
					}
					n = n * 10 + (int) (character - '0');
   

Re: I have a program I wish to submit for the base

2016-01-29 Thread Luke Small
I wanted to use kqueue. Name another script or programming language that
offers it from the base install. NONE!

Why should I write it in another language. I already did it in C. Is there
another way other than kqueue that you can wait for the ftp call to quit,
while being able to kill it if it takes too long?

-Luke

On Fri, Jan 29, 2016 at 3:42 AM,  wrote:

> Fri, 29 Jan 2016 08:19:14 +0000 Nicholas Marriott
> > Firstly, I don't think we need this in base and I think there is little
> > to no chance of it being taken, even if the code is improved.
>
> Many folks tried this part (advising Luke), he takes none and keeps
> repeating wrong concepts, his assignment looks misaligned somewhat.
>
> > Secondly:
> >
> > - The code is still miles off style(9) and isn't really a consistent
> >   style within itself either.
>
> This comes from an apprentice wannabe, probably best to recommend him
> further self help.
>
> > - Forking uname(1)? What? No offence, but that is hilarious :-). Why
> >   fork uname(1) for uname(3) but not date(1) for gettimeofday(2)?
>
> The kid knows nothing of UNIX, ask him book reading comprehension
> questions in private please.
>
> > - Why would you fork sed either?
>
> Hint: suggest another list@
>
> > I think C is the wrong tool for this. Why not write a shell, perl, or
> > python script?
>
> C is the wrong tool for that person, knows nothing of shell too.  So
> best pick learning shell first.  Typical, but never hopeless (still)?
>
> > Then if people start to use it you could make a port.
>
> Without thought at design stage, barely usable for private learning
> projects homework.  The result is reiterations on misc@ where ideas
> spark in developer heads after some kid starts asking noisily without
> listening or prior knowledge.
>


Re: I have a program I wish to submit for the base

2016-01-28 Thread Luke Small
I think I addressed all your suggestions. I don't strictly adhere to kernel
normal form in my comments, and I parse command-line arguments without
using getopt(3), but the method is robust.

-Luke

/*
 * Copyright (c) 2016 Luke N. Small
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */
 

/* Special thanks to Dan Mclaughlin for the ftp to sed idea
 * 
 * ftp -o - http://www.openbsd.org/ftp.html | \
 * sed -n \
 *  -e 's:$::' \
 * 	-e 's:	\([^<]*\)<.*:\1:p' \
 * 	-e 's:^\(	[hfr].*\):\1:p'
 */
 
 
#define EVENT_NOPOLL
#define EVENT_NOSELECT

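#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

#include <err.h>
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>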

struct mirror_st
{
	char * country_title;
	char * mirror;
	char * install_path;
	double diff;
	struct mirror_st * next;
};

int ftp_cmp (const void * a, const void * b)
{
	struct mirror_st ** one = (struct mirror_st **)a;
	struct mirror_st ** two = (struct mirror_st **)b;

	if ( (*one)->diff < (*two)->diff )
		return -1;
	if ( (*one)->diff > (*two)->diff )
		return 1;
	return 0;
}

int country_cmp (const void * a, const void * b)
{
	struct mirror_st ** one = (struct mirror_st **)a;
	struct mirror_st ** two = (struct mirror_st **)b;
	
	// list the USA mirrors first, it will subsort correctly
	int8_t temp = !strncmp("USA", (*one)->country_title, 3);
	if (temp != !strncmp("USA", (*two)->country_title, 3))
	{
		if (temp)
			return -1;
		return 1;
	}

	return strcmp( (*one)->country_title, (*two)->country_title ) ;
}


double get_time_diff(struct timeval a, struct timeval b)
{
	long sec;
	long usec;
	sec = b.tv_sec - a.tv_sec;
	usec = b.tv_usec - a.tv_usec;
	if (usec < 0)
	{
		--sec;
		usec += 1000000;	/* borrow one second, in microseconds */
	}
	return sec + ((double)usec / 1000000.0);
}

void manpage(char * a)
{
	errx(1, "%s [-s timeout] [-n maximum_mirrors_written]", a);
}


int main(int argc, char *argv[])
{
	pid_t ftp_pid, sed_pid, uname_pid;
	int ftp_to_sed[2];
	int sed_to_parent[2];
	int uname_to_parent[2];
	char uname_r[5], uname_m[20], character;
	int i;
	double s = 7;
	int position, num, c, n = 5000;
	FILE *input;

	if (argc > 1)
	{
		if (argc % 2 == 0)
			manpage(argv[0]);
		
		position = 0;
		while (++position < argc)
		{
			if (strlen(argv[position]) != 2)
				manpage(argv[0]);

			if (!strcmp(argv[position], "-s"))
			{
				++position;
				c = -1;
				i = 0;
				while ((character = argv[position][++c]) != '\0')
				{
					if (character == '.')
						++i;

					if ( ((character < '0' || character > '9') && character != '.') || i > 1 )
					{
						if (character == '-')
							errx(1, "No negative numbers.");
						errx(1, "Incorrect floating point format.");
					}
				}
				errno = 0;
				strtod(argv[position], NULL);
				if (errno == ERANGE)
					err(1, NULL);
				if ((s = strtod(argv[position], NULL)) > 1.0)
					errx(1, "The argument should be less than or equal to 1");
			}
			else if (!strcmp(argv[position], "-n"))
			{
				++position;
				if (strlen(argv[position]) > 3)
					errx(1, "Integer should be less than or equal to 3 digits long.");
				c = -1;
				n = 0;
				while ((character = argv[position][++c]) != '\0')
				{
					if ( character < '0' || character > '9' )
					{
						if (character == '.')
							errx(1, "No decimal points.");
						if (character == '-')
							errx(1, "No negative numbers.");
						errx(1, "Incorrect integer format.");
					}
					n = n * 10 + (int)(character - '0');
				}
			}
			else
				manpage(argv[0]);
		}
	}

	struct kevent ke[1];
	struct timespec timeout;
	
	timeout.tv_sec = (int)s;
	timeout.tv_nsec = (long)((s - (double)timeout.tv_sec) * 1000000000);

	int kq = kqueue();
	if (kq == -1)
		errx(1, "kq!");

	if (pipe(uname_to_parent) == -1)
		err(1, NULL);
	

	// "uname -rm" returns version and architecture like: "5.8 amd64\n" to standard out
	uname_pid = fork();
	if (uname_pid == (pid_t) 0)
	{			/* uname child */
		close(uname_to_parent[0]);
		dup2(uname_to_parent[1], STDOUT_FILENO); /*attaching to pipe(s)*/
		execl("/usr/bin/uname","/usr/bin/uname", "-rm", NULL);
		errx(1, "uname execl() failed.");
	}
	if (uname_pid == -1)
		err(1, NULL);
	
	close(uname_to_parent[1]);


	EV_SET(ke, uname_to_parent[0], EVFILT_READ, EV_ADD | EV_ONESHOT, 0, 0, NULL);
	if (kevent(kq, ke, 1, NULL, 0, NULL) == -1)
	{
		kill(uname_pid, SIGKILL);
		errx(1, 

I have a program I wish to submit for the base

2016-01-28 Thread Luke Small
pkg_ping [-s timeout] [-n maximum_mirrors_written]

It scrapes each mirror's location and URL from openbsd.org/ftp.html and
tests the package repository for the version and architecture of the
machine. Using kqueue, it kills the ftp(1) and sed(1) processes it spawns
from C if they take too long. It calls uname as well, and I put kqueue on
it too, in case there is a chance uname can stall like ftp.
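
The "ftp piped into sed, read by the parent" plumbing described above can be sketched as follows. This is an illustration only: the function name pipeline_output is made up, the kqueue timeout is left out entirely, and the commands are passed in as argument vectors rather than hard-coded.

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <unistd.h>

/*
 * Run "producer | filter" and copy up to len - 1 bytes of the filter's
 * output into out (always NUL-terminated). Returns the number of bytes
 * stored, or -1 on failure. Hypothetical helper; no timeout handling.
 */
ssize_t
pipeline_output(char *const producer[], char *const filter[],
    char *out, size_t len)
{
	int p2f[2], f2p[2];
	pid_t a, b;
	ssize_t n, total = 0;

	if (len == 0 || pipe(p2f) == -1)
		return -1;
	if (pipe(f2p) == -1) {
		close(p2f[0]);
		close(p2f[1]);
		return -1;
	}
	if ((a = fork()) == 0) {	/* producer: stdout -> first pipe */
		dup2(p2f[1], STDOUT_FILENO);
		close(p2f[0]);
		close(p2f[1]);
		close(f2p[0]);
		close(f2p[1]);
		execvp(producer[0], producer);
		_exit(127);
	}
	if ((b = fork()) == 0) {	/* filter: first pipe -> second pipe */
		dup2(p2f[0], STDIN_FILENO);
		dup2(f2p[1], STDOUT_FILENO);
		close(p2f[0]);
		close(p2f[1]);
		close(f2p[0]);
		close(f2p[1]);
		execvp(filter[0], filter);
		_exit(127);
	}
	/* the parent keeps only the read end of the second pipe */
	close(p2f[0]);
	close(p2f[1]);
	close(f2p[1]);
	while ((size_t)total < len - 1 &&
	    (n = read(f2p[0], out + total, len - 1 - total)) > 0)
		total += n;
	out[total] = '\0';
	close(f2p[0]);
	if (a > 0)
		waitpid(a, NULL, 0);
	if (b > 0)
		waitpid(b, NULL, 0);
	return (a == -1 || b == -1) ? -1 : total;
}
```

Closing every unused pipe end in each process matters here: if the parent kept a write end open, the filter would never see end-of-file and the read loop would hang.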

After install, it can write download mirrors to /etc/pkg.conf. The user can
have it write one or many mirrors, ranked by timing the download of the
nearly 700 KB SHA256 file from each mirror.
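
The ranking step can be pictured with a small sketch. It is not the attachment's code (which sorts an array of struct mirror_st pointers with ftp_cmp): struct mirror, by_time and pick_fastest are names invented here, and the array is sorted by value for brevity.

```c
#include <stdlib.h>

/* Hypothetical record for this sketch. */
struct mirror {
	const char *url;
	double seconds;		/* measured download time */
};

/* qsort(3) comparator: fastest mirror first. */
static int
by_time(const void *a, const void *b)
{
	const struct mirror *x = a, *y = b;

	return (x->seconds > y->seconds) - (x->seconds < y->seconds);
}

/* Sort fastest-first and return how many entries to write (at most max). */
size_t
pick_fastest(struct mirror *m, size_t count, size_t max)
{
	qsort(m, count, sizeof(*m), by_time);
	return count < max ? count : max;
}
```

With the -n option as the max argument, the caller would then write the first returned-count entries to the configuration file in order.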

I think that if pkg_add can't find a suitable mirror, pkg_ping could be
called to find the fastest available mirror(s), especially if their mirror
of choice goes down, or they put off upgrading so long that their mirror of
choice deletes their system's repository.

I think I'm done with it. It is absolutely a coincidence that it is 666
lines. I'm not changing it.

-Luke N Small
/*
 * Copyright (c) 2016 Luke N. Small
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */
 

/* Special thanks to Dan Mclaughlin for the ftp to sed idea
 * 
 * ftp -o - http://www.openbsd.org/ftp.html | \
 * sed -n \
 *  -e 's:$::' \
 * 	-e 's:	\([^<]*\)<.*:\1:p' \
 * 	-e 's:^\(	[hfr].*\):\1:p'
 */
 
 
#define EVENT_NOPOLL
#define EVENT_NOSELECT

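#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

#include <err.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>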

struct mirror_st
{
	char * countryTitle;
	char * mirror;
	char * installPath;
	double diff;
	struct mirror_st * next;
};

int ftp_cmp (const void * a, const void * b)
{
	struct mirror_st ** one = (struct mirror_st **)a;
	struct mirror_st ** two = (struct mirror_st **)b;

	if( (*one)->diff < (*two)->diff )
		return -1;
	if( (*one)->diff > (*two)->diff )
		return 1;
	return 0;
}

int country_cmp (const void * a, const void * b)
{
	struct mirror_st ** one = (struct mirror_st **)a;
	struct mirror_st ** two = (struct mirror_st **)b;
	
	// list the USA mirrors first, it will subsort correctly
	int8_t temp = !strncmp("USA", (*one)->countryTitle, 3);
	if(temp != !strncmp("USA", (*two)->countryTitle, 3))
	{
		if(temp)
			return -1;
		return 1;
	}

	return strcmp( (*one)->countryTitle, (*two)->countryTitle ) ;
}


double getTimeDiff(struct timeval a, struct timeval b)
{
	long sec;
	long usec;
	sec = b.tv_sec - a.tv_sec;
	usec = b.tv_usec - a.tv_usec;
	if (usec < 0)
	{
		--sec;
		usec += 1000000;	/* borrow one second, in microseconds */
	}
	return sec + ((double)usec / 1000000.0);
}

void manpage(char * a)
{
	errx(1, "%s [-s timeout] [-n maximum_mirrors_written]", a);
}


int main(int argc, char *argv[])
{
	pid_t ftpPid, sedPid, unamePid;
	int ftpToSed[2];
	int sedToParent[2];
	int unameToParent[2];
	char unameR[5], unameM[20], Char;
	int i;
	double S = 7;
	int position, num, c, N = 5000;
	FILE *input;

	if(argc > 1)
	{
		if(argc % 2 == 0)
			manpage(argv[0]);
		
		position = 0;
		while(++position < argc)
		{
			if(strlen(argv[position]) != 2)
				manpage(argv[0]);

			if(!strcmp(argv[position], "-s"))
			{
				++position;
				c = -1;
				i = 0;
				while((Char = argv[position][++c]) != '\0')
				{
					if(Char == '.')
						++i;

					if( ((Char < '0' || Char > '9') && Char != '.') || i > 1 )
					{
						if(Char == '-')
							errx(1, "No negative numbers.");
						errx(1, "Incorrect floating point format.");
					}
				}
				errno = 0;
				strtod(argv[position], NULL);
				if(errno == ERANGE)
					err(1, NULL);
				if((S = strtod(argv[position], NULL)) > 1.0)
					errx(1, "The argument should be less than or equal to 1");
			}
			else if(!strcmp(argv[position], "-n"))
			{
				++position;
				if(strlen(argv[position]) > 3)
					errx(1, "Integer should be less than or equal to 3 digits long.");
				c = -1;
				N = 0;
				while((Char = argv[position][++c]) != '\0')
				{
					if( Char < '0' || Char > '9' )
					{
						if(Char == '.')
							errx(1, "No decimal points.");
						if(Char == '-')
							errx(1, "No negative numbers.");
						errx(1, "Incorrect integer format.");
					}
					N = N * 10 + (int)(Char - '0');
				}
			}
			else
				manpage(argv[0]);
		}
	}

	struct kevent ke[2];
	struct timespec timeout;
	
	timeout.tv_sec = (int)S;
	timeout.tv_nsec = (int)(   (S - (double)timeout.tv_sec) * 

Re: I have a mirror testing program for you. - Mirror down

2016-01-23 Thread Luke Small
Ok. I added a lot of security fixes, added a feature that takes a custom
floating point timeout as an argument, and got rid of the 8 mirror limit. It
now writes every mirror that neither exceeded the timeout period nor had a
download error. It should be safe to run as root, I guess. There is
nothing that can be done to make it core dump. The only thing, I suspect,
that can go wrong is a man-in-the-middle attack on the ftp.html download. Is
there even a hash value for ftp.html?

-Luke

On Thu, Jan 21, 2016 at 1:18 AM, Luke Small <lukensm...@gmail.com> wrote:

> The real reason I wrote this is to have an automated way to set up the
> pkg_add mirrors especially for folks that don't care to set them up
> manually (After all, that's what computers are for!). Before I wrote this, I
> had a PKG_PATH mirror go down and I didn't know what was going on. At least
> this could get some failover that would work for everyone running the
> release or older at least. I put in a minor edit that kills ftp and sed if
> the buffer gets too full, as well as exiting.
>
>>
>>
>>
>>> > > I have a 500 line program I wrote that reads openbsd.org/ftp.html and
>>>
>>> Here's a simple alternative that will often be good enough.
>>>
>>> ftp -o- -V http://www.openbsd.org/cgi-bin/ftplist.cgi |
>>> sed -e 's, .*,/%m/,' -e 's,^,pkgpath = ,' -e q
>>>
>>> The C program is too trusting with its fixed-size buffers and unchecked
>>> mallocs etc, it's not something to run as root as-is.
>>>
>>
>>
>
/*
 * Copyright (c) 2016 Luke N. Small
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */
 

/* Special thanks to Dan Mclaughlin for the ftp to sed idea
 * 
 * ftp -o - http://www.openbsd.org/ftp.html | \
 * sed -n \
 *  -e 's:$::' \
 * 	-e 's:	\([^<]*\)<.*:\1:p' \
 * 	-e 's:^\(	[hfr].*\):\1:p'
 */
 
 
#define EVENT_NOPOLL
#define EVENT_NOSELECT

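#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

#include <err.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>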

struct mirror_st
{
	char * countryTitle;
	char * mirror;
	char * installPath;
	double diff;
	struct mirror_st * next;
};

int ftp_cmp (const void * a, const void * b)
{
	struct mirror_st ** one = (struct mirror_st **)a;
	struct mirror_st ** two = (struct mirror_st **)b;

	if( (*one)->diff < (*two)->diff )
		return -1;
	if( (*one)->diff > (*two)->diff )
		return 1;
	return 0;
}

int country_cmp (const void * a, const void * b)
{
	struct mirror_st ** one = (struct mirror_st **)a;
	struct mirror_st ** two = (struct mirror_st **)b;
	
	// list the USA mirrors first, it will subsort correctly
	int8_t temp = !strncmp("USA", (*one)->countryTitle, 3);
	if(temp != !strncmp("USA", (*two)->countryTitle, 3))
	{
		if(temp)
			return -1;
		return 1;
	}

	return strcmp( (*one)->countryTitle, (*two)->countryTitle ) ;
}


double getTimeDiff(struct timeval a, struct timeval b)
{
	long sec;
	long usec;
	sec = b.tv_sec - a.tv_sec;
	usec = b.tv_usec - a.tv_usec;
	if (usec < 0)
	{
		--sec;
		usec += 1000000;	/* borrow one second, in microseconds */
	}
	return sec + ((double)usec / 1000000.0);
}

// Can take one argument which sets a positive floating point timeout
int main(int argc, char *argv[])
{
	pid_t ftpPid, sedPid;
	int ftpToSed[2];
	int sedToParent[2];
	int unameToParent[2];
	char unameR[5], unameM[20], Char;
	int i, c;
	register int position, num;

	FILE *input;

	if(argc > 1)
	{
		c = -1;
		i = 0;
		while((Char = argv[1][++c]) != '\0')
		{
			if(Char == '.')
				++i;

			if( ((Char < '0' || Char > '9') && Char != '.') || i > 1 )
			{
				if(Char == '-')
					errx(1, "No negative numbers.");
				errx(1, "Incorrect floating point format.");
			}
		}
		errno = 0;	/* so a stale ERANGE is not mistaken for a strtod() failure */
		strtod(argv[1], NULL);
		if(errno == ERANGE)
			err(1, NULL);
	}


	pipe(unameToParent);
	

	// "uname -rm" returns version and architecture like: "5.8 amd64" to standard out
	
	if(fork() == (pid_t) 0)
	{			/* uname child */
		close(unameToParent[0]);
		dup2(unameToParent[1], STDOUT_FILENO); /*attaching to pipe(s)*/
		execl("/usr/bin/uname","/usr/bin/uname", "-rm", NULL);
errx(1, "execl() f

Re: I have a mirror testing program for you. - Mirror down

2016-01-20 Thread Luke Small
The real reason I wrote this is to have an automated way to set up the
pkg_add mirrors especially for folks that don't care to set them up
manually (After all, that's what computers are for!). Before I wrote this, I
had a PKG_PATH mirror go down and I didn't know what was going on. At least
this could get some failover that would work for everyone running the
release or older at least. I put in a minor edit that kills ftp and sed if
the buffer gets too full, as well as exiting.

>
>
>
>> > > I have a 500 line program I wrote that reads openbsd.org/ftp.html and
>>
>> Here's a simple alternative that will often be good enough.
>>
>> ftp -o- -V http://www.openbsd.org/cgi-bin/ftplist.cgi |
>> sed -e 's, .*,/%m/,' -e 's,^,pkgpath = ,' -e q
>>
>> The C program is too trusting with its fixed-size buffers and unchecked
>> mallocs etc, it's not something to run as root as-is.
>>
>
>
/*
 * Copyright (c) 2016 Luke N. Small
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */
 

/* Special thanks to Dan Mclaughlin for the ftp to sed idea
 * 
 * ftp -o - http://www.openbsd.org/ftp.html | \
 * sed -n \
 * 	-e 's:	\([^<]*\)<.*:\1 :p' \
 * 	-e 's:^\(	[hfr].*\):\1:p'
 */
 
 
#define EVENT_NOPOLL
#define EVENT_NOSELECT


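#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

#include <err.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>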

struct mirror_st
{
	char * countryTitle;
	char * mirror;
	char * installPath;
	double diff;
	struct mirror_st * next;
};

int ftp_cmp (const void * a, const void * b)
{
	struct mirror_st **one = (struct mirror_st **)a;
	struct mirror_st **two = (struct mirror_st **)b;

	if( (*one)->diff < (*two)->diff )
		return -1;
	if( (*one)->diff > (*two)->diff )
		return 1;
	return 0;
}

int country_cmp (const void * a, const void * b)
{
	struct mirror_st **one = (struct mirror_st **)a;
	struct mirror_st **two = (struct mirror_st **)b;
	
	// list the USA mirrors first, it will subsort correctly
	int8_t temp = !strncmp("USA", (*one)->countryTitle, 3);
	if(temp != !strncmp("USA", (*two)->countryTitle, 3))
	{
		if(temp)
			return -1;
		return 1;
	}

	return strcmp( (*one)->countryTitle, (*two)->countryTitle ) ;
}


double getTimeDiff(struct timeval a, struct timeval b)
{
	long sec;
	long usec;

	sec = b.tv_sec - a.tv_sec;
	usec = b.tv_usec - a.tv_usec;
	if (usec < 0)
	{
		/* borrow one second: tv_usec counts microseconds */
		--sec;
		usec += 1000000;
	}
	return sec + ((double)usec / 1000000.0);
}

int main()
{
	pid_t ftpPid, sedPid;
	int ftpToSed[2];
	int sedToParent[2];
	int unameToParent[2];
	char unameR[5], unameM[20];
	int i = 0, c;
	register int position, num;

	FILE *input;



	pipe(unameToParent);
	

	// "uname -rm" returns version and architecture like: "5.8 amd64" to standard out
	
	if(fork() == (pid_t) 0)
	{			/* uname child */
		close(unameToParent[0]);
		dup2(unameToParent[1], STDOUT_FILENO); /*attaching to pipe(s)*/
		execl("/usr/bin/uname","/usr/bin/uname", "-rm", NULL);
err(1, "execl() failed\n");
	}
	
	close(unameToParent[1]);

	input = fdopen (unameToParent[0], "r");

	num = 0;
	position = -1;
	while ((c = getc(input)) != EOF)
	{
		if(num == 0)
		{
			if(c != ' ')
			{
				/* leave room for the terminating '\0' */
				if(position + 2 >= (int)sizeof(unameR))
					errx(1, "unameR[] got too long!");
				unameR[++position] = c;
			}
			else
			{
				unameR[position + 1] = '\0';
				num = 1;
				position = -1;
			}
		}
		else
		{
			if(c != '\n')
			{
				/* leave room for the terminating '\0' */
				if(position + 2 >= (int)sizeof(unameM))
					errx(1, "unameM[] got too long!");
				unameM[++position] = c;
			}
			else
				unameM[position + 1] = '\0';
		}
	}
	fclose (input);
	close(unameToParent[0]);





	pipe(ftpToSed); /*make pipes*/

	struct kevent ke[2];

	int kq = kqueue();
	if (kq == -1)
		err(1, "kq!");

	int kqProc = kqueue();
	if (kqProc == -1)
		err(1, "kqProc!");
		
		
	ftpPid = fork();
	if(ftpPid == (pid_t) 0)
	{			/*ftp child*/
		close(ftpToSed[0]);
		dup2(ftpToSed[1], STDOUT_FILENO); /*attaching to pipe(s)*/
		execl("/usr/bin/ftp","ftp", "-Vo", "-", "http://www.openbsd.org/ftp.html", NULL);
		err(1, "execl() failed");
	}
	/* the udata argument was lost in the archive; NULL is assumed here */
	EV_SET(ke, ftpPid, EVFILT_PROC, EV_ADD | EV_ONESHOT, NOTE_EXIT, 0, NULL);
	if (kevent(kqProc, ke, 1, NULL, 0, NULL) == -1)
		err(1, "kevent register fail.");

	close(ftpToSed[1]);

	pipe(sedToParent);
	

	sedPid = fork();
	if(sedPid == (pid_t) 0)
	{			/* sed child */
		close(sedToParent[0]);
		dup2(ftpToSed[0], STDIN_FILENO); /*attaching to pipe(s)*/
	

Re: I have a mirror testing program for you.

2016-01-20 Thread Luke Small
< The C program is too trusting with its fixed-size buffers and unchecked
< mallocs etc, it's not something to run as root as-is.

I realize I got a little lazy by not checking the mallocs, but that is
fixed.
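
For readers following along, here is a minimal sketch of what checking every allocation can look like. The helper names xmalloc() and copy_string() are made up for illustration and are not taken from the attached program.

```c
#include <err.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate or exit with a diagnostic, never return NULL. */
void *
xmalloc(size_t size)
{
	void *p = malloc(size);

	if (p == NULL)
		err(1, "malloc(%zu)", size);
	return p;
}

/* Copy a string into storage sized from its real length, not a fixed buffer. */
char *
copy_string(const char *src)
{
	size_t len = strlen(src) + 1;	/* include the terminating NUL */
	char *dst = xmalloc(len);

	memcpy(dst, src, len);
	return dst;
}
```

Callers can then use the returned pointer without a NULL test at every call site, and free() it as usual.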

I wrote this to be resource-light and thorough. No half-ass bullshit. If
somebody doesn't update their system for over a year, it will still find the
remaining mirrors, no matter where they are. I got rid of the fixed-size
buffers with the "totalLength" integer.

I also gave the mirrors that fail immediately a larger diff than the ones
that merely took too long, so the slow ones still rank ahead of the broken
ones in case they are the only ones available. Maybe I could add an argument
to raise the timeout beyond 7 seconds.
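
The ranking idea can be sketched roughly as follows. This is only an illustration: the sentinel values, the outcome enum, and the names score(), result_cmp() and rank() are assumptions, not what the attached program uses. Only the 7 second timeout comes from the message.

```c
#include <stdlib.h>

#define TIMEOUT_SECS	7.0			/* the timeout mentioned above */
#define DIFF_TIMED_OUT	(TIMEOUT_SECS + 1.0)	/* hypothetical sentinel */
#define DIFF_FAILED	(TIMEOUT_SECS + 2.0)	/* hypothetical sentinel */

enum outcome { FETCH_OK, FETCH_TIMED_OUT, FETCH_FAILED };

struct result {
	const char *mirror;
	double diff;
};

/* Map a fetch outcome to the value the mirrors are sorted by. */
double
score(enum outcome o, double elapsed)
{
	switch (o) {
	case FETCH_TIMED_OUT:
		return DIFF_TIMED_OUT;
	case FETCH_FAILED:
		return DIFF_FAILED;
	default:
		return elapsed;
	}
}

/* qsort comparator: smallest diff first. */
int
result_cmp(const void *a, const void *b)
{
	const struct result *one = a;
	const struct result *two = b;

	if (one->diff < two->diff)
		return -1;
	if (one->diff > two->diff)
		return 1;
	return 0;
}

/* Sort so that working mirrors come first, then timeouts, then failures. */
void
rank(struct result *r, size_t n)
{
	qsort(r, n, sizeof(r[0]), result_cmp);
}
```

Because both sentinels sit above the timeout, any mirror that answered in time sorts ahead of them, and a timed-out mirror still sorts ahead of one that failed outright.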
/*
 * Copyright (c) 2016 Luke N. Small
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */
 

/* Special thanks to Dan Mclaughlin for the ftp to sed idea
 * 
 * ftp -o - http://www.openbsd.org/ftp.html | \
 * sed -n \
 * 	-e 's:	\([^<]*\)<.*:\1 :p' \
 * 	-e 's:^\(	[hfr].*\):\1:p'
 */
 
 
#define EVENT_NOPOLL
#define EVENT_NOSELECT


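/* header names were stripped in the archive; inferred from the calls used below */
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <err.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>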

struct mirror_st
{
	char * countryTitle;
	char * mirror;
	char * installPath;
	double diff;
	struct mirror_st * next;
};

int ftp_cmp (const void * a, const void * b)
{
	struct mirror_st **one = (struct mirror_st **)a;
	struct mirror_st **two = (struct mirror_st **)b;

	if( (*one)->diff < (*two)->diff )
		return -1;
	if( (*one)->diff > (*two)->diff )
		return 1;
	return 0;
}

int country_cmp (const void * a, const void * b)
{
	struct mirror_st **one = (struct mirror_st **)a;
	struct mirror_st **two = (struct mirror_st **)b;
	
	// list the USA mirrors first, it will subsort correctly
	int8_t temp = !strncmp("USA", (*one)->countryTitle, 3);
	if(temp != !strncmp("USA", (*two)->countryTitle, 3))
	{
		if(temp)
			return -1;
		return 1;
	}

	return strcmp( (*one)->countryTitle, (*two)->countryTitle ) ;
}


double getTimeDiff(struct timeval a, struct timeval b)
{
	long sec;
	long usec;

	sec = b.tv_sec - a.tv_sec;
	usec = b.tv_usec - a.tv_usec;
	if (usec < 0)
	{
		/* borrow one second: tv_usec counts microseconds */
		--sec;
		usec += 1000000;
	}
	return sec + ((double)usec / 1000000.0);
}

int main()
{
	pid_t ftpPid, sedPid;
	int ftpToSed[2];
	int sedToParent[2];
	int unameToParent[2];
	char unameR[5], unameM[20];
	int i = 0, c;
	register int position, num;

	FILE *input;





	pipe(unameToParent);
	

	// "uname -rm" returns version and architecture like: "5.8 amd64" to standard out
	
	if(fork() == (pid_t) 0)
	{			/* uname child */
		close(unameToParent[0]);
		dup2(unameToParent[1], STDOUT_FILENO); /*attaching to pipe(s)*/
		execl("/usr/bin/uname","/usr/bin/uname", "-rm", NULL);
err(1, "execl() failed\n");
	}
	
	close(unameToParent[1]);

	input = fdopen (unameToParent[0], "r");

	num = 0;
	position = -1;
	while ((c = getc(input)) != EOF)
	{
		if(num == 0)
		{
			if(c != ' ')
unameR[++position] = c;
			else
			{
unameR[position + 1] = '\0';
num = 1;
position = -1;
			}
		}
		else
		{
			if(c != '\n')
unameM[++position] = c;
			else
unameM[position + 1] = '\0';
		}
	}
	fclose (input);
	close(unameToParent[0]);





	pipe(ftpToSed); /*make pipes*/

	struct kevent ke[2];

	int kq = kqueue();
	if (kq == -1)
		err(1, "kq!");

	int kqProc = kqueue();
	if (kqProc == -1)
		err(1, "kqProc!");
		
		
	ftpPid = fork();
	if(ftpPid == (pid_t) 0)
	{			/*ftp child*/
		close(ftpToSed[0]);
		dup2(ftpToSed[1], STDOUT_FILENO); /*attaching to pipe(s)*/
		execl("/usr/bin/ftp","ftp", "-Vo", "-", "http://www.openbsd.org/ftp.html", NULL);
		err(1, "execl() failed");
	}
	/* the udata argument was lost in the archive; NULL is assumed here */
	EV_SET(ke, ftpPid, EVFILT_PROC, EV_ADD | EV_ONESHOT, NOTE_EXIT, 0, NULL);
	if (kevent(kqProc, ke, 1, NULL, 0, NULL) == -1)
		err(1, "kevent register fail.");

	close(ftpToSed[1]);

	pipe(sedToParent);
	

	sedPid = fork();
	if(sedPid == (pid_t) 0)
	{			/* sed child */
		close(sedToParent[0]);
		dup2(ftpToSed[0], STDIN_FILENO); /*attaching to pipe(s)*/
		dup2(sedToParent[1], STDOUT_FILENO);
		execl("/usr/bin/sed","sed","-n","-e", "s:\t\\([^<]*\\)<.*:\\1 :p",
		"-e", "s:^\\(\t[hfr].*\\):\\1:p", NULL);
		kill(ftpPid, SIGKILL);
err(1, "execl() failed\n");
	}
	/* the udata argument was lost in the archive; NULL is assumed here */
	EV_SET(ke, sedPid, EVFILT_PROC, EV_ADD | EV_ONESHOT, NOTE_EXIT, 0, NULL);
	if 

Re: I have a mirror testing program for you. - Mirror down

2016-01-20 Thread Luke Small
OK, there, I put in error checks, so that the index used to write into the
arrays can't get too big.

-Luke
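
As an illustration of that kind of check, here is a minimal sketch that splits "uname -rm" style output into two fixed-size buffers and refuses anything that would not fit. The function name split_uname() and its return convention are assumptions made for this example; the attached program does the work inline in main().

```c
#include <stddef.h>

/*
 * Split a line such as "5.8 amd64\n" into release and machine strings.
 * Returns 0 on success, -1 if a field is missing or would not fit.
 */
int
split_uname(const char *line, char *rel, size_t relsz, char *mach, size_t machsz)
{
	size_t i = 0, n = 0;

	/* release: everything up to the first space */
	while (line[i] != '\0' && line[i] != ' ') {
		if (n + 1 >= relsz)	/* keep one byte for the NUL */
			return -1;
		rel[n++] = line[i++];
	}
	if (line[i] != ' ' || n == 0)
		return -1;
	rel[n] = '\0';
	i++;

	/* machine: everything up to the newline or the end of the string */
	n = 0;
	while (line[i] != '\0' && line[i] != '\n') {
		if (n + 1 >= machsz)	/* keep one byte for the NUL */
			return -1;
		mach[n++] = line[i++];
	}
	if (n == 0)
		return -1;
	mach[n] = '\0';
	return 0;
}
```

The test is made before each write, and it reserves the last byte of each buffer for the terminator, so a five-byte buffer holds at most four characters.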

On Wed, Jan 20, 2016 at 4:27 AM, Stuart Henderson <st...@openbsd.org> wrote:

> On 2016/01/20 10:38, Benjamin Baier wrote:
> > Important thing first, the mirror http://openbsd.cs.fau.de/pub/OpenBSD/
> > seems to be down.
>
> +cc maintainer, could you take a look please Simon? Down for v4+v6,
> traceroute stops at informatik.gate.uni-erlangen.de (131.188.20.38 /
> 2001:638:a000::3341:41) with !A on v6.
>
> > On Tue, 19 Jan 2016 22:19:42 -0600
> > Luke Small <lukensm...@gmail.com> wrote:
> >
> > > I have a 500 line program I wrote that reads openbsd.org.ftp.html and
>
> Here's a simple alternative that will often be good enough.
>
> ftp -o- -V http://www.openbsd.org/cgi-bin/ftplist.cgi |
> sed -e 's, .*,/%m/,' -e 's,^,pkgpath = ,' -e q
>
> The C program is too trusting with its fixed-size buffers and unchecked
> mallocs etc, it's not something to run as root as-is.
>
/*
 * Copyright (c) 2016 Luke N. Small
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */
 

/* Special thanks to Dan Mclaughlin for the ftp to sed idea
 * 
 * ftp -o - http://www.openbsd.org/ftp.html | \
 * sed -n \
 * 	-e 's:	\([^<]*\)<.*:\1 :p' \
 * 	-e 's:^\(	[hfr].*\):\1:p'
 */
 
 
#define EVENT_NOPOLL
#define EVENT_NOSELECT


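/* header names were stripped in the archive; inferred from the calls used below */
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <err.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>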

struct mirror_st
{
	char * countryTitle;
	char * mirror;
	char * installPath;
	double diff;
	struct mirror_st * next;
};

int ftp_cmp (const void * a, const void * b)
{
	struct mirror_st **one = (struct mirror_st **)a;
	struct mirror_st **two = (struct mirror_st **)b;

	if( (*one)->diff < (*two)->diff )
		return -1;
	if( (*one)->diff > (*two)->diff )
		return 1;
	return 0;
}

int country_cmp (const void * a, const void * b)
{
	struct mirror_st **one = (struct mirror_st **)a;
	struct mirror_st **two = (struct mirror_st **)b;
	
	// list the USA mirrors first, it will subsort correctly
	int8_t temp = !strncmp("USA", (*one)->countryTitle, 3);
	if(temp != !strncmp("USA", (*two)->countryTitle, 3))
	{
		if(temp)
			return -1;
		return 1;
	}

	return strcmp( (*one)->countryTitle, (*two)->countryTitle ) ;
}


double getTimeDiff(struct timeval a, struct timeval b)
{
	long sec;
	long usec;

	sec = b.tv_sec - a.tv_sec;
	usec = b.tv_usec - a.tv_usec;
	if (usec < 0)
	{
		/* borrow one second: tv_usec counts microseconds */
		--sec;
		usec += 1000000;
	}
	return sec + ((double)usec / 1000000.0);
}

int main()
{
	pid_t ftpPid, sedPid;
	int ftpToSed[2];
	int sedToParent[2];
	int unameToParent[2];
	char unameR[5], unameM[20];
	int i = 0, c;
	register int position, num;

	FILE *input;



	pipe(unameToParent);
	

	// "uname -rm" returns version and architecture like: "5.8 amd64" to standard out
	
	if(fork() == (pid_t) 0)
	{			/* uname child */
		close(unameToParent[0]);
		dup2(unameToParent[1], STDOUT_FILENO); /*attaching to pipe(s)*/
		execl("/usr/bin/uname","/usr/bin/uname", "-rm", NULL);
err(1, "execl() failed\n");
	}
	
	close(unameToParent[1]);

	input = fdopen (unameToParent[0], "r");

	num = 0;
	position = -1;
	while ((c = getc(input)) != EOF)
	{
		if(num == 0)
		{
			if(c != ' ')
			{
				/* leave room for the terminating '\0' */
				if(position + 2 >= (int)sizeof(unameR))
					errx(1, "unameR got too long!");
				unameR[++position] = c;
			}
			else
			{
				unameR[position + 1] = '\0';
				num = 1;
				position = -1;
			}
		}
		else
		{
			if(c != '\n')
			{
				/* leave room for the terminating '\0' */
				if(position + 2 >= (int)sizeof(unameM))
					errx(1, "unameM got too long!");
				unameM[++position] = c;
			}
			else
				unameM[position + 1] = '\0';
		}
	}
	fclose (input);
	close(unameToParent[0]);





	pipe(ftpToSed); /*make pipes*/

	struct kevent ke[2];

	int kq = kqueue();
	if (kq == -1)
		err(1, "kq!");

	int kqProc = kqueue();
	if (kqProc == -1)
		err(1, "kqProc!");
		
		
	ftpPid = fork();
	if(ftpPid == (pid_t) 0)
	{			/*ftp child*/
		close(ftpToSed[0]);
		dup2(ftpToSed[1], STDOUT_FILENO); /*attaching to pipe(s)*/
execl("/usr/bin/ftp","ftp", "-Vo", "-", "http://www.openbsd.

I have a mirror testing program for you.

2016-01-19 Thread Luke Small
I have a 500 line program I wrote that reads openbsd.org/ftp.html, strips
the HTML, and records every http and ftp mirror in memory as an http mirror
without duplicates. It then downloads the SHA256 file from the package
folder for the appropriate version and machine architecture. It times the
mirrors one at a time and uses kqueue to kill any laggy ftp calls. It uses
ftp(1) for all its networking, so it shouldn't be too much of a security
issue, I'd guess. It writes the top 8 mirrors into /etc/pkg.conf, erasing
all the installpath entries while leaving everything else in the file. It
can run as an unprivileged user, but of course then it won't rewrite
/etc/pkg.conf.
-Luke N Small
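
The "erase the installpath entries, keep everything else" step can be pictured with a rough sketch like the one below. It is only an illustration under stated assumptions: the function names are invented here, the match is a plain case-sensitive prefix test, and the attached program may well do this differently.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if the line sets installpath, ignoring leading blanks. */
int
is_installpath_line(const char *line)
{
	static const char key[] = "installpath";

	while (*line == ' ' || *line == '\t')
		line++;
	return strncmp(line, key, sizeof(key) - 1) == 0;
}

/*
 * Copy every line from in to out except installpath settings.
 * Returns the number of lines dropped.
 */
int
strip_installpath(FILE *in, FILE *out)
{
	char buf[256];
	int at_start = 1, skipping = 0, dropped = 0;

	while (fgets(buf, sizeof(buf), in) != NULL) {
		if (at_start) {
			skipping = is_installpath_line(buf);
			if (skipping)
				dropped++;
		}
		if (!skipping)
			fputs(buf, out);
		/* a chunk without '\n' means the line continues in the next read */
		at_start = (strchr(buf, '\n') != NULL);
	}
	return dropped;
}
```

A caller would write the filtered copy to a temporary file, append the new installpath lines, and only then replace the real file, so a failure part way through leaves the original untouched.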
/*
 * Copyright (c) 2016 Luke N. Small
 *
 * Permission to use, copy, modify, and distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */
 

/* Special thanks to Dan Mclaughlin for the ftp to sed idea
 * 
 * ftp -o - http://www.openbsd.org/ftp.html | \
 * sed -n \
 * 	-e 's:	\([^<]*\)<.*:\1 :p' \
 * 	-e 's:^\(	[hfr].*\):\1:p'
 */
 
 
#define EVENT_NOPOLL
#define EVENT_NOSELECT


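/* header names were stripped in the archive; inferred from the calls used below */
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <err.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>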

struct mirror_st
{
	char * countryTitle;
	char * mirror;
	char * installPath;
	double diff;
	struct mirror_st * next;
};

int ftp_cmp (const void * a, const void * b)
{
	struct mirror_st **one = (struct mirror_st **)a;
	struct mirror_st **two = (struct mirror_st **)b;

	if( (*one)->diff < (*two)->diff )
		return -1;
	if( (*one)->diff > (*two)->diff )
		return 1;
	return 0;
}

int country_cmp (const void * a, const void * b)
{
	struct mirror_st **one = (struct mirror_st **)a;
	struct mirror_st **two = (struct mirror_st **)b;
	
	// list the USA mirrors first, it will subsort correctly
	int8_t temp = !strncmp("USA", (*one)->countryTitle, 3);
	if(temp != !strncmp("USA", (*two)->countryTitle, 3))
	{
		if(temp)
			return -1;
		return 1;
	}

	return strcmp( (*one)->countryTitle, (*two)->countryTitle ) ;
}


double getTimeDiff(struct timeval a, struct timeval b)
{
	long sec;
	long usec;

	sec = b.tv_sec - a.tv_sec;
	usec = b.tv_usec - a.tv_usec;
	if (usec < 0)
	{
		/* borrow one second: tv_usec counts microseconds */
		--sec;
		usec += 1000000;
	}
	return sec + ((double)usec / 1000000.0);
}

int main()
{
	pid_t ftpPid, sedPid;
	int ftpToSed[2];
	int sedToParent[2];
	int unameToParent[2];
	char unameR[5], unameM[20];
	int i = 0, c;
	register int position, num;

	FILE *input;





	pipe(unameToParent);
	

	// "uname -rm" returns version and architecture like: "5.8 amd64" to standard out
	
	if(fork() == (pid_t) 0)
	{			/* uname child */
		close(unameToParent[0]);
		dup2(unameToParent[1], STDOUT_FILENO); /*attaching to pipe(s)*/
		execl("/usr/bin/uname","/usr/bin/uname", "-rm", NULL);
err(1, "execl() failed\n");
	}
	
	close(unameToParent[1]);

	input = fdopen (unameToParent[0], "r");

	num = 0;
	position = -1;
	while ((c = getc(input)) != EOF)
	{
		if(num == 0)
		{
			if(c != ' ')
unameR[++position] = c;
			else
			{
unameR[position + 1] = '\0';
num = 1;
position = -1;
			}
		}
		else
		{
			if(c != '\n')
unameM[++position] = c;
			else
unameM[position + 1] = '\0';
		}
	}
	fclose (input);
	close(unameToParent[0]);





	pipe(ftpToSed); /*make pipes*/

	struct kevent ke[2];

	int kq = kqueue();
	if (kq == -1)
		err(1, "kq!");

	int kqProc = kqueue();
	if (kqProc == -1)
		err(1, "kqProc!");
		
		
	ftpPid = fork();
	if(ftpPid == (pid_t) 0)
	{			/*ftp child*/
		close(ftpToSed[0]);
		dup2(ftpToSed[1], STDOUT_FILENO); /*attaching to pipe(s)*/
		execl("/usr/bin/ftp","ftp", "-Vo", "-", "http://www.openbsd.org/ftp.html", NULL);
		err(1, "execl() failed");
	}
	/* the udata argument was lost in the archive; NULL is assumed here */
	EV_SET(ke, ftpPid, EVFILT_PROC, EV_ADD | EV_ONESHOT, NOTE_EXIT, 0, NULL);
	if (kevent(kqProc, ke, 1, NULL, 0, NULL) == -1)
		err(1, "kevent register fail.");

	close(ftpToSed[1]);

	pipe(sedToParent);
	

	sedPid = fork();
	if(sedPid == (pid_t) 0)
	{			/* sed child */
		close(sedToParent[0]);
		dup2(ftpToSed[0], STDIN_FILENO); /*attaching to pipe(s)*/
		dup2(sedToParent[1], STDOUT_FILENO);
		execl("/usr/bin/sed","sed","-n","-e", "s:\t\\([^<]*\\)<.*:\\1 :p",
		"-e", "s:^\\(\t[hfr].*\\):\\1:p", NULL);
		kill(ftpPid, SIGKILL);
err(1, "execl() failed\n");
	}
	/* the udata argument was lost in the archive; NULL is assumed here */
	EV_SET(ke, sedPid, EVFILT_PROC, EV_ADD | EV_ONESHOT, NOTE_EXIT, 0, NULL);
	if (kevent(kqProc, ke, 1,